Company hidden
2 days ago

Staff Software Engineer (AI)

$207,000–$275,000
Work format
hybrid
Employment type
full-time
Grade
senior/lead
English
B2
Country
US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Software Engineer (AI): Build and scale Kubernetes-native research cluster platforms and sandbox infrastructure for agentic training, with an emphasis on distributed systems, workload orchestration, and ML infrastructure. The focus is on designing high-performance tools that let researchers run large-scale training jobs without managing complex infrastructure.

Location: Must be based in or able to work from Sunnyvale, CA or Bellevue, WA

Salary: $207,000–$275,000

Company

hirify.global is a specialized cloud provider delivering high-performance infrastructure for AI and large-scale compute workloads.

What you will do

  • Design and build a complete research cluster experience including CLI, job configuration schemas, and Kubernetes operators.
  • Own the Python SDK for sandbox infrastructure to enable RL training runs and agent rollouts at scale.
  • Collaborate with customers and internal teams to translate researcher requirements into robust system designs.
  • Develop documentation for running popular OSS training frameworks on the platform.
  • Solve complex challenges in code distribution, checkpoint-triggered evaluation, and cross-cluster scheduling.

Requirements

  • 8–12+ years of experience in distributed systems, ML infrastructure, or developer platforms.
  • Deep expertise in Kubernetes including custom controllers, operators, scheduling, and CRDs.
  • U.S. work authorization (U.S. citizen, permanent resident, or eligible for export control compliance).
  • Proven track record of shipping production-grade infrastructure systems.
  • Strong communication skills for working directly with customers and translating their needs into technical requirements.

Nice to have

  • Experience building internal ML platforms or research clusters at scale.
  • Familiarity with agentic AI, RL training, and sandbox isolation techniques.
  • Background with Slurm, Ray, or similar workload orchestration tools.
  • OSS contributions to Kubernetes SIGs, Ray, or PyTorch.

Culture & Benefits

  • Comprehensive medical, dental, and vision insurance (100% paid).
  • 401(k) with generous employer match.
  • Flexible PTO and casual work environment.
  • Family-forming support, childcare assistance, and mental wellness benefits.
  • Equity awards and discretionary bonus program.
