Company hidden
3 months ago

Principal Deployment Engineer (AI)

Work format
onsite
Employment type
fulltime
Grade
senior
English
b2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Principal Deployment Engineer (AI): Architect and lead the bringup of large-scale GPU clusters. You will define how we deploy, validate, and scale AI superclusters across sites, with an emphasis on rack design, fabric architecture, cluster validation frameworks, and production readiness standards, including the technical standards for node, rack, and full-cluster bringup.

Location: Onsite in Seattle, US

Company

We are building AI infrastructure for frontier-scale workloads.

What you will do

  • Define technical standards for node, rack, and full-cluster bringup.
  • Lead large-scale GPU cluster deployments.
  • Architect high-performance network fabrics optimized for AI workloads.
  • Establish cluster-level acceptance criteria and validation frameworks.
  • Design repeatable deployment models for multi-site expansion.
  • Serve as the escalation point for complex bringup and performance issues.

Requirements

  • 10+ years of experience in large-scale infrastructure or HPC environments.
  • Proven experience bringing up large GPU clusters (hundreds+ GPUs).
  • Deep expertise in high-speed networking (InfiniBand, RoCE, Ethernet fabrics).
  • Strong understanding of server architecture (PCIe, NUMA, memory hierarchy).
  • Experience debugging performance issues across compute and network layers.
  • Strong automation and systems-level thinking.

Nice to have

  • Experience scaling AI training clusters for frontier models.
  • Experience with liquid cooling or ultra-high-density deployments.
  • Knowledge of distributed storage systems (Lustre, Ceph, NVMe-oF).
  • Experience defining infrastructure standards in a fast-growing organization.

Culture & Benefits

  • Move fast, operate with ownership.
  • Technical leaders are expected to define standards, not just follow them.
