AI Infrastructure Engineer (HPC)

Employment type

Full-time

Grade

Senior

English

Country

Singapore

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Infrastructure Engineer (GPU/HPC): Operate and optimize GPU clusters and implement elastic scheduling for inference and training, with an emphasis on high-throughput serving, distributed communication stacks, and unified orchestration. The role focuses on tuning vLLM/SGLang runtimes, optimizing NCCL/RDMA communication, and building comprehensive observability for GPU utilization.

Location: Singapore, SG

Company

hirify.global is a world-leading technology company specializing in Bitcoin mining solutions and AI cloud services, operating a global portfolio of HPC datacenters.

What you will do

Operate and optimize GPU clusters using Kubernetes, Slurm, and Ray across multiple regions.
Implement elastic scheduling and unified orchestration for inference and training jobs using Kueue, NVIDIA KAI Scheduler, or KEDA.
Manage and tune vLLM and SGLang runtimes for high-throughput, low-latency serving, focusing on continuous batching and KV-cache paging.
Optimize distributed communication stacks, including NCCL/RCCL, RDMA over RoCEv2, and InfiniBand.
Benchmark and profile performance across various model sizes (7B to 70B+) and precisions (FP8, AWQ, GPTQ).
Build observability stacks with Prometheus, Grafana, and OpenTelemetry to monitor GPU utilization and latency.

Requirements

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field (PhD preferred).
4–8+ years of experience in backend engineering, distributed systems, platform engineering, or applied AI.
Strong proficiency in Python, plus experience with Go, TypeScript, Rust, or C++.
Hands-on experience with Kubernetes, Slurm, or Ray.
Strong background in PyTorch/JAX and distributed communication stacks (NCCL/RCCL, RDMA).
Fluent in English.

Nice to have

Experience with major cloud platforms and designing production-grade architectures.
Familiarity with retrieval systems, embeddings, and vector stores like Qdrant, Chroma, or pgvector.
Experience with agent frameworks, tool-calling, function orchestration, or MCP.

Culture & Benefits

Opportunity to work in a global environment with datacenters in the US, Bhutan, Norway, Canada, Malaysia, and Ethiopia.
Exposure to world-leading technology in ASIC chip design and HPC cloud capabilities.
Commitment to equal employment opportunities and a diverse, inclusive workplace.
