Company hidden
13 hours ago

Lead Systems HPC Engineer (AI)

$170,000 - $300,000
Work format
remote (USA only)
Employment type
fulltime
Grade
lead
English
B2
Country
US
Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Lead Systems HPC Engineer (AI): Build and optimize a hyperscaler platform and large-scale GPU clusters, with an emphasis on the intersection of hardware and system software. The role focuses on identifying performance bottlenecks, tuning distributed communication layers, and optimizing system behavior across the full stack.

Location: Remote (Must be based in the United States)

Salary: $170k–$300k OTE + equity

Company

hirify.global is leading a new era in cloud computing to serve the global AI economy by creating infrastructure tools for AI/ML transformation.

What you will do

  • Optimize the performance of large-scale GPU clusters at the intersection of hardware and software.
  • Investigate and troubleshoot performance issues of GPU clusters under real training and inference workloads.
  • Evaluate and integrate new hardware, system configurations, and tuning approaches across the software stack.
  • Operate across the full stack, including networking (InfiniBand/RoCE), virtualization (KVM/QEMU), and communication layers (MPI, NCCL).
  • Collaborate with internal infrastructure teams and hardware vendors such as NVIDIA, Mellanox, and Intel.
  • Contribute to hardware and cluster qualification to ensure systems meet performance expectations.

Requirements

  • 5+ years of professional experience in system-level software development focused on performance optimization and low-level programming.
  • 3+ years of hands-on experience with Linux systems administration, troubleshooting, and performance tuning.
  • In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and HPC systems.
  • Strong proficiency in languages such as C/C++, Go, or Python, with a focus on performance.
  • Must be based in the United States.

Culture & Benefits

  • 100% company-paid medical, dental, and vision coverage for employees and families.
  • 401(k) plan with up to 4% company match and immediate vesting.
  • Generous parental leave: 20 weeks for primary caregivers and 12 weeks for secondary caregivers.
  • Remote work reimbursement of up to $85/month for mobile and internet.
  • Company-paid short-term and long-term disability and life insurance coverage.

Hiring process

  • The process includes coding interviews to evaluate technical proficiency.
