Company hidden
Posted 2 hours ago

Senior Systems Engineer (AI)

Work format: onsite
Employment type: full-time
Grade: senior
English: B2
Country: US

Job description

TL;DR

Senior Systems Engineer (AI): Leading hands-on bringup of network clusters in data center environments, with an emphasis on node, rack, and network deployment validation. Focus on tuning high-speed fabrics, debugging performance issues, and building repeatable, scalable infrastructure processes.

Location: Must be based in the US and comfortable working onsite in data center environments.

Company

hirify.global is a startup building next-generation AI infrastructure, focused on delivering performant and scalable network clusters for frontier AI workloads.

What you will do

  • Execute end-to-end bringup of network nodes and racks from installation to production.
  • Validate BIOS, BMC, firmware configurations, and overall network health.
  • Bring up and validate high-speed network fabrics including InfiniBand, RoCE, and Ethernet.
  • Configure leaf/spine connectivity and run cluster-wide burn-in and stress testing.
  • Troubleshoot hardware, firmware, and fabric-level issues to optimize performance.
  • Automate provisioning processes and improve deployment documentation.

Requirements

  • 5–8+ years in infrastructure engineering, hardware deployment, or data center operations.
  • Hands-on experience deploying GPU servers such as NVIDIA HGX or DGX platforms.
  • Deep understanding of high-speed networking fabrics (InfiniBand, RoCE, Ethernet).
  • Strong Linux systems knowledge.
  • Proven ability to troubleshoot distributed systems performance issues.
  • Must be able to work onsite in data center environments as required.

Nice to have

  • Experience in AI/ML infrastructure or HPC environments.
  • Familiarity with NCCL, CUDA, and RDMA.
  • Proficiency with automation tooling such as Python, Ansible, Terraform, or Bash.
  • Experience managing high-density power and cooling environments.

Culture & Benefits

  • Fast-paced startup environment with an emphasis on urgency and ownership.
  • Opportunity to build core AI infrastructure from the ground up.
  • Direct collaboration with networking, systems software, and data center teams.
  • Focus on developing repeatable, high-scale systems and processes.
