Infra Engineer (AI)
Job description
TL;DR
Infra Engineer (AI/k8s): Build and scale a production API for frontier foundation models, with an emphasis on low-latency video streaming, GPU orchestration, and multi-region Kubernetes deployments. The work centers on designing stateful request routing, optimizing GPU session lifecycles, and scaling GPU fleets to thousands of units.
Location: On-site, 5 days/week in office (NYC, Stockholm, London, Paris, or Geneva)
Salary: $250,000 – $400,000
Company
Frontier research lab dedicated to building foundation models for environments that require deep spatial and temporal reasoning.
What you will do
- Own and develop the production API, ensuring low latency, high availability, and billing-grade reliability.
- Orchestrate video streaming protocols to efficiently route frames from clients to servers.
- Manage the runtime layer, including stateful request routing, GPU session lifecycles, and inference orchestration.
- Scale and lead Kubernetes deployments across multiple geographical regions.
- Execute GPU hosting strategies to scale the fleet from dozens to thousands of GPUs across providers.
- Partner with product engineering on observability, metering, and performance optimization.
Requirements
- Proven track record of personally scaling high-traffic, low-latency APIs in production.
- Deep expertise in Kubernetes and multi-region deployments.
- Strong proficiency in capacity planning and managing SLOs.
- Strong ownership instinct with experience taking systems from design to production end-to-end.
- Must be based in or able to work from NYC, Stockholm, London, Paris, or Geneva (5 days/week in-office).
Nice to have
- Experience with streaming video or audio inference models.
- Background in low-latency game streaming or video streaming infrastructure.
- Experience scaling GPU fleets across providers (e.g., GCP, CoreWeave, Lambda).
- Experience with frontier model inference (LLMs, world models, multimodal).
- Knowledge of on-device or edge inference (ExecuTorch, Core ML).
Culture & Benefits
- Competitive salary and meaningful equity.
- Comprehensive medical, dental, and vision coverage.
- 401(k) plan and a learning and development stipend.
- Wellness perks including Wellhub membership and mental health support via Spring Health and Headspace.
- Paid parental leave, generous PTO, and 11 paid company holidays.
- Daily meals and commuter benefits for those at the NYC HQ.