Назад
Company hidden
9 часов назад

Staff HPC Systems Software Engineer (AI)

225 000 - 275 000$
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Staff HPC Systems Software Engineer (Go/Python): Defining the technical direction and evolution of a Slurm-based HPC platform for a GPU cloud with an accent on cloud-native integration and automation. Focus on designing scalable cluster lifecycle models, optimizing GPU scheduling, and ensuring high-performance networking using InfiniBand and RDMA.

Location: Must be based in the US

Salary: $225,000 - $275,000 USD

Company

hirify.global is a high-performance GPU cloud provider engineered specifically to support AI startups and large enterprise customers.

What you will do

  • Own and evolve the technical direction for HPC systems domains, including Slurm platform architecture and scheduler integrations.
  • Define architectural patterns for packaging, automating, and exposing Slurm implementations as a service.
  • Drive cross-team design for integrations between Slurm, Kubernetes-adjacent systems, and infrastructure APIs.
  • Establish shared standards for automation, observability, reliability, and supportability across the HPC platform.
  • Lead technically critical initiatives spanning multiple engineering teams to reduce technical divergence.
  • Contribute hands-on code to de-risk critical work and provide technical escalation for complex domain issues.

Requirements

  • Extensive experience designing production software for HPC systems and Slurm-based environments.
  • Strong proficiency in Go, Python, or similar languages for building resilient and testable software.
  • Deep practical understanding of GPU-backed infrastructure and HPC networking (InfiniBand, RoCE, RDMA).
  • Proven ability to define technical direction across multiple teams or complex system domains.
  • Experience integrating HPC systems with cloud-native platforms and service delivery models.
  • Must be based in the United States

Nice to have

  • Experience with Kueue or other batch system schedulers.

Culture & Benefits

  • Highly competitive US compensation package including base salary, bonus, and equity.
  • Performance reviews conducted every 12 months with a dynamic progression plan.
  • Human-First Flexibility policy providing autonomy to shape your own workday.
  • Comprehensive benefits package including medical, dental, vision, and retirement plan participation.
  • Opportunity to directly shape how global AI capacity is planned and deployed.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →