Company hidden
Posted 18 hours ago

AI Platform Engineer (Training and Inference)

Work format: hybrid
Employment type: full-time
Grade: middle/senior
English: B2
Country: US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Platform Engineer (Training and Inference): Build and operate the compute layer for training, evaluating, and serving AI models, with an emphasis on distributed training, the LLM inference mesh, and model promotion lifecycles. The role focuses on scaling Ray-based infrastructure, optimizing inference performance, and automating the full model deployment pipeline.

Location: Must be based in or able to work from San Francisco (Hybrid)

Company

The company (name hidden) is a leader in identity security, providing an AI-powered platform that manages and governs access to applications, data, and business processes for Fortune 500 companies and government institutions.

What you will do

  • Manage the Ray ecosystem end-to-end, including KubeRay on GKE and distributed object stores.
  • Operate distributed training pipelines using Ray Train, TorchTrainer, and multi-node H100 clusters.
  • Build and scale the LLM inference mesh using vLLM, SGLang, and NVIDIA Triton.
  • Design and operate the model routing layer with cost-aware fallback between SLMs and LLMs.
  • Implement the full model promotion lifecycle, including shadow mode, A/B testing, and canary rollouts.
  • Integrate RAG retrieval into the inference mesh for context-aware LLM responses.

Requirements

  • Experience in ML engineering with a focus on ML platform or MLOps roles.
  • Production-level expertise with the Ray ecosystem (Ray Train, Serve, Core, Data).
  • Hands-on experience with LLM serving engines like vLLM, SGLang, or NVIDIA Triton.
  • Strong proficiency in Python, PyTorch, and ML orchestration tools like Flyte.
  • Knowledge of distributed training techniques including DDP, FSDP, and NCCL collectives.
  • Understanding of model lifecycle operations, including registry, shadow/canary patterns, and automated rollbacks.

Nice to have

  • Experience with model quantization to INT8, INT4, or FP8, using methods and tools such as GPTQ, AWQ, and bitsandbytes.

Culture & Benefits

  • Competitive total rewards package.
  • Opportunities for career growth and advancement.
  • Eligibility for discretionary bonus plans based on individual and organizational performance.
  • Work in a high-impact environment focused on cutting-edge AI identity security.
