TL;DR
Senior Systems Engineer (AI): Lead the hands-on bringup and deployment of GPU clusters for large-scale AI training, with an emphasis on rack integration, network fabric validation, and performance tuning. The role focuses on building repeatable deployment systems and optimizing GPU infrastructure for production readiness.
Location: Must be based in Seattle, US
Company
hirify.global is a startup building next-generation AI infrastructure, delivering highly performant and scalable GPU clusters purpose-built for large-scale AI training and inference.
What you will do
- Execute end-to-end bringup of GPU nodes and racks, from physical installation to production readiness.
- Configure and validate high-speed network fabrics including InfiniBand and RoCE.
- Perform GPU-to-GPU and node-to-node performance validation using NCCL and RDMA.
- Troubleshoot hardware, firmware, and fabric-level issues to ensure stability.
- Contribute to automation efforts for provisioning and cluster validation processes.
- Collaborate with networking, systems software, and data center teams to support rapid scaling.
Requirements
- 5–8+ years of experience in infrastructure engineering, hardware deployment, or data center operations.
- Hands-on experience deploying GPU servers such as HGX or DGX platforms.
- Proficiency with high-speed networking including InfiniBand, RoCE, and Ethernet fabrics.
- Strong Linux systems knowledge and experience troubleshooting distributed systems.
- Must be comfortable working onsite in data center environments as needed.
Nice to have
- Experience in AI/ML infrastructure or HPC environments.
- Familiarity with CUDA and performance tuning tools.
- Automation proficiency using Python, Ansible, Terraform, or Bash.
- Experience working in high-density power and cooling environments.
Culture & Benefits
- Fast-paced startup environment with an emphasis on ownership and bias for action.
- Opportunity to build foundational infrastructure for frontier AI workloads.
- Direct impact on scaling AI capabilities through hands-on technical contribution.