Назад
Company hidden
1 день назад

HPC Cluster Architect

Формат работы
remote (только United_kingdom)
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
UK
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

HPC Cluster Architect (NVIDIA GPU/HPC): Own end-to-end cluster architecture for large-scale dedicated GPU deployments from customer requirements to production handover with an accent on compute, networking, storage, and physical design. Focus on designing high-performance network fabrics, engaging with vendors, and providing technical oversight during deployment and validation.

Location: UK (Remote)

Company

hirify.global is behind Hyperstack, a full-stack AI cloud delivering on-demand and private GPU infrastructure to AI researchers and enterprises running compute-intensive workloads.

What you will do

  • Own end-to-end cluster architecture for large-scale NVIDIA GPU deployments, including rack layouts, BOM, power, cooling, and production handover
  • Design high-performance network fabrics across compute (InfiniBand, RDMA, NVLink/NVSwitch), storage, and WAN with topology and scaling strategies
  • Engage directly with OEMs and vendors to validate hardware, review quotes, and optimize designs commercially
  • Provide technical oversight during deployment, hardware validation, performance testing, and issue escalation
  • Act as senior technical leader across Solutions Architecture, Cloud Engineering, and data centre partners to build standardized reference designs

Requirements

  • Proven experience designing and delivering GPU-based HPC or AI clusters at scale across full lifecycle
  • Deep hands-on knowledge of NVIDIA GPU platforms (H100/H200/B-series) and reference architectures
  • Strong InfiniBand/RDMA design experience including topology and performance tuning
  • Solid grounding in Linux systems, PCIe topology, NUMA alignment, and server performance
  • Background from OEM, hyperscaler, neo-cloud, or HPC environment with full design-to-deployment exposure
  • Confident engaging with customers, vendors, and teams as technical authority

Nice to have

  • Experience with Spectrum-X or next-generation Ethernet fabrics
  • Prior involvement in 1,000+ GPU cluster deployments and benchmarking (NCCL, MLPerf)
  • Exposure to air-cooled/liquid-cooled HPC and infrastructure-as-code

Culture & Benefits

  • Competitive salary and annual discretionary bonus
  • Employee wellbeing benefits and 25 days holiday plus public holidays
  • Flexible working (remote or hybrid depending on role and location)
  • Real ownership, autonomy, and career progression in a fast-growing company
  • Collaborative international culture with trust, transparency, and mission-driven colleagues

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →