Distributed Systems Engineer (AI)

$120,000 - $400,000

Work format

On-site

Employment type

Full-time

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Distributed Systems Engineer (AI): Build a platform that schedules, routes, and operates AI workloads across thousands of nodes, with an emphasis on distributed scheduling, resource allocation, reliability, and fault tolerance. The role focuses on designing orchestration systems, handling failure modes, and optimizing performance across compilers, runtimes, and heterogeneous hardware.

Location: San Francisco, CA or New York City, NY

Salary: $120,000-$400,000

Company

An AI infrastructure startup with an $80M Series A and deployments at Fortune 500 and AI-native companies, working with foundation labs and hyperscalers.

What you will do

Build distributed scheduling and orchestration systems for large-scale AI workloads.
Implement resource allocation across thousands of nodes in production.
Design reliability, fault tolerance, and failure handling mechanisms.
Work across the stack with compilers, runtimes, and hardware for performance and correctness.

Requirements

Proven ownership of distributed systems in production.
Strong Kubernetes experience.
Deep understanding of concurrency, failure modes, and system tradeoffs.
Strong programming skills in Go, C++, or Python.

Nice to have

Experience with ML inference systems or performance-critical workloads.
Familiarity with scheduling, queues, or resource management systems.
