Infra Engineer (AI)

$250,000 – $400,000

Work format

onsite

Employment type

fulltime

Grade

senior

English

Country

France/UK/US +2 more

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Infra Engineer (AI/k8s): Build and scale a production API for frontier foundation models, with an emphasis on low-latency video streaming, GPU orchestration, and multi-region Kubernetes deployments. The role centers on designing stateful request routing, optimizing GPU session lifecycles, and scaling GPU fleets to thousands of units.

Location: On-site in NYC, Stockholm, London, Paris, or Geneva (in-office, 5 days/week).

Salary: $250,000 – $400,000

Company

Frontier research lab dedicated to building foundation models for environments that require deep spatial and temporal reasoning.

What you will do

Own and develop the production API, ensuring low latency, high availability, and billing-grade reliability.
Orchestrate video streaming protocols to efficiently route frames from clients to servers.
Manage the runtime layer, including stateful request routing, GPU session lifecycles, and inference orchestration.
Scale and lead Kubernetes deployments across multiple geographical regions.
Execute GPU hosting strategies to scale the fleet from dozens to thousands of GPUs across providers.
Partner with product engineering on observability, metering, and performance optimization.

Requirements

Proven track record of personally scaling high-traffic, low-latency APIs in production.
Deep expertise in Kubernetes and multi-region deployments.
Strong proficiency in capacity planning and managing SLOs.
Strong ownership instinct with experience taking systems from design to production end-to-end.
Must be based in or able to work from NYC, Stockholm, London, Paris, or Geneva (5 days/week in-office).

Nice to have

Experience with streaming video or audio inference models.
Background in low-latency game streaming or video streaming infrastructure.
Experience scaling GPU fleets across providers (e.g., GCP, Coreweave, Lambda).
Experience with frontier model inference (LLMs, world models, multimodal).
Knowledge of on-device or edge inference (ExecuTorch, Core ML).

Culture & Benefits

Competitive salary and meaningful equity.
Comprehensive medical, dental, and vision coverage.
401(k) plan and learning and development stipend.
Wellness perks including Wellhub membership and mental health support via Spring Health and Headspace.
Paid parental leave, generous PTO, and 11 paid company holidays.
Daily meals and commuter benefits for those at the NYC HQ.
