Company hidden
13 hours ago

Staff Infrastructure Engineer, Cluster Infrastructure (AI)

320 000 - 4 050 000$
Work format: hybrid
Employment type: fulltime
Grade: senior
English: b2
Country: US

Job description

TL;DR

Staff Infrastructure Engineer (Cluster Infrastructure/AI): Design and scale the full lifecycle of compute clusters across cloud providers and datacenters, with an emphasis on agent-driven automation and high-bandwidth connectivity. The role focuses on setting technical strategy for cluster scalability, homogeneity, and fault tolerance at hyperscale.

Location: Hybrid. Must be based in or able to work from San Francisco, CA; New York City, NY; or Seattle, WA (minimum 25% office presence)

Salary: $320,000 - $4,050,000 USD per year

Company

hirify.global is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems.

What you will do

  • Own the technical strategy and roadmap for agent-driven cluster lifecycle management, including provisioning, updates, and decommissioning.
  • Collaborate with cloud providers and internal research, inference, and product teams to shape long-term compute and infrastructure strategy.
  • Ensure clusters are provisioned secure-by-default and leverage cloud solutions for high-bandwidth inter-cluster connectivity.
  • Define and drive strategies for cluster scalability, homogeneity, and fault tolerance.
  • Establish operational-excellence practices, including incident response and a healthy on-call culture.
  • Provide technical mentorship and coaching to support the growth of surrounding engineers.

Requirements

  • Deep expertise in distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP/Azure).
  • Strong proficiency in Rust, Go, or Python, and experience with Terraform.
  • Proven track record of leading complex, multi-quarter technical initiatives spanning multiple teams.
  • Ability to build alignment across senior stakeholders and communicate effectively.
  • Must be based in or able to work from one of the designated US offices.

Nice to have

  • 8+ years of software engineering experience, including time as a technical lead.
  • Experience operating hyperscale compute infrastructure (100+ clusters, 10K+ nodes).
  • Depth in Kubernetes internals, cluster orchestration systems (e.g., Mesos, Borg), or cloud networking (VPC, BGP, eBPF).
  • Experience with cluster security, pod security standards, RBAC, and container hardening.
  • Expertise with workflow orchestration tools like Temporal or Argo Workflows.

Culture & Benefits

  • Competitive compensation with optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours and collaborative office spaces.
  • Highly collaborative "big science" approach to AI research.
  • Visa sponsorship available for eligible candidates.
