Company hidden
13 hours ago

Infra Engineer (AI)

$250,000 – $400,000
Work format
onsite
Employment type
fulltime
Grade
senior
English
b2
Country
France/UK/US +2 more
Listing from Hirify.Global, a list of international tech companies

Job description

TL;DR

Infra Engineer (AI/k8s): Building and scaling a production API for frontier foundation models, with an emphasis on low-latency video streaming, GPU orchestration, and multi-region Kubernetes deployments. The role centers on designing stateful request routing, optimizing GPU session lifecycles, and scaling GPU fleets to thousands of units.

Location: On-site (NYC, Stockholm, London, Paris, or Geneva). In-office, 5 days/week

Salary: $250,000 – $400,000

Company

Frontier research lab dedicated to building foundation models for environments that require deep spatial and temporal reasoning.

What you will do

  • Own and develop the production API, ensuring low latency, high availability, and billing-grade reliability.
  • Orchestrate video streaming protocols to efficiently route frames from clients to servers.
  • Manage the runtime layer, including stateful request routing, GPU session lifecycles, and inference orchestration.
  • Scale and lead Kubernetes deployments across multiple geographical regions.
  • Execute GPU hosting strategies to scale the fleet from dozens to thousands of GPUs across providers.
  • Partner with product engineering on observability, metering, and performance optimization.

Requirements

  • Proven track record of personally scaling high-traffic, low-latency APIs in production.
  • Deep expertise in Kubernetes and multi-region deployments.
  • Strong proficiency in capacity planning and managing SLOs.
  • Strong ownership instinct with experience taking systems from design to production end-to-end.
  • Must be based in or able to work from NYC, Stockholm, London, Paris, or Geneva (5 days/week in-office).

Nice to have

  • Experience with streaming video or audio inference models.
  • Background in low-latency game streaming or video streaming infrastructure.
  • Experience scaling GPU fleets across providers (e.g., GCP, CoreWeave, Lambda).
  • Experience with frontier model inference (LLMs, world models, multimodal).
  • Knowledge of on-device or edge inference (ExecuTorch, Core ML).

Culture & Benefits

  • Competitive salary and meaningful equity.
  • Comprehensive medical, dental, and vision coverage.
  • 401(k) plan and learning and development stipend.
  • Wellness perks including Wellhub membership and mental health support via Spring Health and Headspace.
  • Paid parental leave, generous PTO, and 11 paid company holidays.
  • Daily meals and commuter benefits for those at the NYC HQ.
