TL;DR
Software Engineer (GPU Networking): Build and optimize high-performance GPU networking and distributed systems for AI inference, with an emphasis on integrating RDMA capabilities and co-optimizing communication alongside computation. You will architect the software fabric that unifies thousands of GPUs, enable serverless-grade startup speeds for LLMs, and deep-dive into bleeding-edge hardware performance.
Location: Onsite in San Francisco, US
Salary: $150,000–$250,000 annually, with equity
Company
hirify.global is a fast-growing product company that powers mission-critical AI inference for leading AI companies.
What you will do
- Integrate RDMA/RoCE/InfiniBand capabilities directly into the inference stack to achieve order-of-magnitude improvements in bandwidth and latency.
- Implement and tune networking layers for efficient Disaggregated KV Cache Offload and Wide Expert Parallelism (WideEP) for MoE models.
- Enable sub-10-second startup for trillion-parameter models by working deeply with checkpointing and storage mechanisms.
- Characterize and validate networking performance on bleeding-edge GPU clusters (H100/H200, B200/B300, GB200/GB300 NVL72).
- Design tools to visualize packet flow, congestion, and effective bandwidth across GPU interconnects for diagnosing distributed system behaviors.
- Work with communication libraries (NCCL, NVSHMEM) and potentially write custom communication kernels to overlap compute and data transfer.
Requirements
- Deep experience with high-performance networking protocols (InfiniBand, RoCE v2).
- Proficiency in C++ or Python, with the ability to bridge high-level logic and hardware.
- Deep understanding of the memory hierarchy in modern NVIDIA architectures (H100/Blackwell) and optimization skills.
- Ability to deep-dive into TensorRT-LLM source code, write custom C++/Python bindings, or debug NVLink topology issues.
- Proven ability to build custom solutions when off-the-shelf tools are insufficient for performance needs.
- Work onsite in San Francisco, US.
Nice to have
- Deep knowledge of NCCL, NVSHMEM, and UCX.
- Experience with GPUDirect Storage (GDS) or high-performance filesystems like Weka or 3FS.
- Familiarity with TensorRT-LLM, vLLM, or SGLang.
- Experience running low-level benchmarks to qualify new hardware clusters.
Culture & Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and their dependents.
- Generous PTO policy, including a company-wide Winter Break.
- Paid parental leave.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
- Opportunity to work with bleeding-edge hardware, such as NVIDIA's Blackwell (B200/B300) and Rubin architectures.