Company hidden
2 days ago

Network Engineer (Supercomputing)

$350,000 - $475,000
Work format: onsite
Employment type: full-time
Grade: senior
English: B2
Country: US
Relocation: US

Job description

TL;DR

Network Engineer (Supercomputing): Own the lowest layers of the network stack for large-scale AI training and inference, with an emphasis on RDMA/RoCE fabric reliability and NVLink/NVSwitch interconnects. The role focuses on debugging production collectives, building instrumentation tooling, and driving technical resolutions with cloud providers to ensure fleet reliability at multi-thousand-GPU scale.

Location: Must be based in San Francisco, California

Compensation: $350,000 - $475,000 USD

Company

hirify.global is an AI research organization dedicated to advancing collaborative general intelligence and building accessible tools for the AI community.

What you will do

  • Validate and reason about GPU network fabric design across large-scale deployments.
  • Debug RDMA/RoCEv2, NCCL failures, and congestion control behavior across NIC vendors.
  • Manage NVLink/NVSwitch interconnects, including fabric manager health and link error diagnostics.
  • Develop host-level network instrumentation, dashboards, and automated alerts.
  • Triage complex issues across NIC, driver, kernel, switch, and workload boundaries.
  • Drive escalations with cloud-provider networking teams to ensure end-to-end resolution.

Requirements

  • Must be based in San Francisco, California
  • Bachelor’s degree or equivalent experience in computer science or engineering.
  • Proficiency in Python or Rust.
  • Experience operating large-scale clusters and container orchestration systems like Kubernetes or Slurm.
  • Ability to own projects end-to-end and thrive in a cross-functional environment.
  • Visa sponsorship is available for qualified candidates.

Culture & Benefits

  • Generous health, dental, and vision insurance coverage.
  • Unlimited PTO and paid parental leave.
  • Relocation support provided as needed.
  • Opportunity to work on cutting-edge AI infrastructure at massive scale.
