Software Engineer, Inference - Multi Modal (AI)
Job description
TL;DR
Software Engineer, Inference - Multi Modal (AI): Build and optimize infrastructure for serving multimodal AI models at scale, with an emphasis on high-throughput, low-latency delivery of audio and image inputs. The role focuses on collaborating with researchers and product teams to enable state-of-the-art capabilities.
Location: Onsite in San Francisco
Company
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and implement inference infrastructure for large-scale multimodal models.
- Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.
- Enable experimental research workflows to transition into reliable production services.
- Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.
- Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.
Requirements
- Experience building and scaling inference systems for LLMs or multimodal models.
- Familiarity with GPU-based ML workloads and performance dynamics of large models.
- Comfort working with systems that span networking, distributed compute, and high-throughput data handling.
- Familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems.
- Ability to own problems end-to-end and operate in ambiguous, fast-moving spaces.