Company hidden
Updated 18 hours ago

Staff Infrastructure Engineer (AI)

$340,000 - $425,000
Work format: hybrid
Employment type: full-time
Grade: senior/lead
English: B2
Country: US

Job description

TL;DR

Staff Infrastructure Engineer (AI): Design and implement large-scale infrastructure to support AI scientist training, evaluation, and deployment, with an emphasis on distributed environments, performance optimization, and container orchestration. The role focuses on building scalable VM and sandboxing architectures, optimizing data pipelines, and enabling stable reinforcement learning workflows.

Location: Hybrid with at least 25% office presence in San Francisco, USA

Salary: $340,000 - $425,000 USD annually

Company

hirify.global is a public benefit corporation focused on creating reliable, interpretable, and steerable AI systems that are safe and beneficial for society.

What you will do

  • Design and implement large-scale infrastructure for AI scientist training, evaluation, and deployment across distributed systems
  • Identify and resolve infrastructure bottlenecks impacting scientific AGI progress
  • Develop robust evaluation frameworks for scientific AGI measurement
  • Build scalable VM/sandbox/container architectures for safe execution of long-horizon AI tasks
  • Collaborate to translate experimental requirements into production-ready infrastructure
  • Develop and optimize large-scale data pipelines and reinforcement learning training/inference workflows

Requirements

  • 6+ years of experience in infrastructure engineering, with expertise in large-scale distributed systems
  • Strong communication and collaboration skills
  • Deep knowledge of performance optimization and system architectures for high-throughput ML workloads
  • Experience with containerization (Docker, Kubernetes) and orchestration at scale
  • Proven track record building large-scale data pipelines and distributed storage systems
  • Ability to diagnose and resolve complex infrastructure challenges in production
  • Experience working across the full ML stack from data pipelines to performance optimization
  • Experience collaborating with researchers to scale experimental ideas

Nice to have

  • Experience with language model training infrastructure and distributed ML frameworks (PyTorch, JAX)
  • Background in AI research lab infrastructure or large-scale ML organizations
  • Knowledge of GPU/TPU architectures and language model inference optimization
  • Experience with cloud platforms (AWS, GCP) at enterprise scale
  • Familiarity with VM and container orchestration
  • Experience with workflow orchestration and experiment management systems
  • Experience working with large-scale reinforcement learning
  • Comfort with large-scale data pipelines (Beam, Spark, Dask)

Culture & Benefits

  • Competitive compensation including equity and benefits
  • Generous vacation and parental leave
  • Flexible working hours and hybrid work policy
  • Visa sponsorship available with immigration lawyer support
  • Collaborative and impact-driven research environment

Hiring process

  • Evaluation of relevant experience and skills
  • Technical interviews focusing on infrastructure and distributed systems
  • Assessment of communication and collaboration abilities
