Company hidden
Posted 2 days ago

Staff Software Engineer (Kubernetes)

$207,000–$275,000
Work format: onsite
Employment type: full-time
English: B2
Country: US

Job description

TL;DR

Staff Software Engineer (Kubernetes): Building and advancing hirify.global’s orchestration platform, including SUNK (Slurm on Kubernetes), with an emphasis on distributed systems design, GPU cluster scheduling, and performance at hyperscale. Focus on defining long-term architectural strategy, mentoring senior engineers, and solving complex reliability and observability challenges for AI training and inference workloads.

Location: Must be based in or able to work from Sunnyvale, CA or Bellevue, WA

Salary: $207,000–$275,000

Company

hirify.global is a specialized cloud provider built for AI, delivering high-performance infrastructure and tools to scale AI training and inference for labs and enterprises.

What you will do

  • Lead the architectural direction for the orchestration platform and managed services.
  • Drive cross-organizational initiatives in scheduling, quota enforcement, and scaling to hyperscale.
  • Build and optimize systems to eliminate infrastructure bottlenecks for AI workloads.
  • Mentor senior engineers and establish organizational best practices for reliability and observability.
  • Collaborate on evolving the orchestration layer to meet the demands of next-generation AI.

Requirements

  • Must be a U.S. person or eligible to access export-controlled information.
  • 8–12 years of professional software engineering experience.
  • Deep expertise in Slurm and Kubernetes internals.
  • Advanced proficiency in Go and distributed systems design.
  • Proven track record of designing and operating large-scale distributed systems in production.
  • Experience setting technical direction and influencing cross-team architecture.

Nice to have

  • Experience with orchestration technologies like Ray, Kubeflow, Kueue, Istio, or Argo Workflows.
  • Knowledge of GPU-based applications, ML pipelines, or HPC.
  • Familiarity with reliability practices including SLOs and post-incident reviews.

Culture & Benefits

  • Comprehensive medical, dental, and vision insurance (100% paid).
  • 401(k) with generous employer match.
  • Flexible PTO and casual work environment.
  • Support for family-forming, mental wellness, and childcare.
  • Catered lunches in office locations.
