AI Platform Engineer (Training and Inference)

Work format

hybrid

Employment type

full-time

Grade

middle/senior

English

Country

Job description

TL;DR

AI Platform Engineer (Training and Inference): Build and operate the compute layer for training, evaluating, and serving AI models, with an emphasis on distributed training, the LLM inference mesh, and model promotion lifecycles. The role focuses on scaling Ray-based infrastructure, optimizing inference performance, and automating the full model deployment pipeline.

Location: Must be based in or able to work from San Francisco (Hybrid)

Company

hirify.global is a leader in identity security, providing an AI-powered platform that manages and governs access to applications, data, and business processes for Fortune 500 companies and government institutions.

What you will do

Manage the Ray ecosystem end-to-end, including KubeRay on GKE and distributed object stores.
Operate distributed training pipelines using Ray Train, TorchTrainer, and multi-node H100 clusters.
Build and scale the LLM inference mesh using vLLM, SGLang, and NVIDIA Triton.
Design and operate the model routing layer with cost-aware fallback between SLMs and LLMs.
Implement the full model promotion lifecycle, including shadow mode, A/B testing, and canary rollouts.
Integrate RAG retrieval into the inference mesh for context-aware LLM responses.

Requirements

Experience in ML engineering with a focus on ML platform or MLOps roles.
Production-level expertise with the Ray ecosystem (Ray Train, Serve, Core, Data).
Hands-on experience with LLM serving engines like vLLM, SGLang, or NVIDIA Triton.
Strong proficiency in Python, PyTorch, and ML orchestration tools like Flyte.
Knowledge of distributed training techniques such as DDP and FSDP, and of NCCL collectives.
Understanding of model lifecycle operations, including registry, shadow/canary patterns, and automated rollbacks.

Nice to have

Experience with model quantization techniques such as INT8/INT4/FP8 (GPTQ, AWQ, bitsandbytes).

Culture & Benefits

Competitive total rewards package.
Opportunities for career growth and advancement.
Eligibility for discretionary bonus plans based on individual and organizational performance.
Work in a high-impact environment focused on cutting-edge AI identity security.
