Company hidden
2 days ago

ML Platform & Infrastructure Engineer (AI)

Work format
onsite
Employment type
fulltime
Grade
senior
English
B2
Country
US
Relocation
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

ML Platform & Infrastructure Engineer (AI): Design and implement robust CI/CD pipelines for machine learning workflows and build scalable evaluation harnesses, with an emphasis on reliability, automation, and performance optimization. The role also covers research tooling, observability systems, and dashboards for model latency, GPU utilization, and system health.

Location: San Francisco office, full-time, on-site

Company

Stealth AI startup building trustworthy, consumer-grade AGI agents for human-AI collaboration, backed by tier-1 investors, with a team from Stanford, OpenAI, and DeepMind.

What you will do

  • Design and implement CI/CD pipelines for ML workflows, automating training runs, data ingestion, orchestration, checkpointing, and artifact management.
  • Build scalable evaluation infrastructure to benchmark models on every merge, optimizing latency and catching performance regressions.
  • Develop internal SDKs, CLIs, and lightweight UIs for researchers to inspect trajectories, visualize failures, curate datasets, and iterate efficiently.
  • Implement observability for model performance, GPU utilization, cluster health, and inference costs, with dashboards and alerting systems.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 3+ years of experience in software engineering, MLOps, or ML infrastructure.
  • Strong Python proficiency.
  • Experience building internal developer tools, CLIs, or dashboards.
  • Experience with cloud infrastructure (AWS or GCP) and containerization (Docker, Kubernetes).

Nice to have

  • Experience designing CI/CD pipelines for ML workflows.
  • Familiarity with LLM serving stacks such as vLLM or TGI.
  • Experience managing GPU clusters and optimizing distributed workloads.

Culture & Benefits

  • All in, in person — work moves faster face-to-face.
  • Ship by default — speed is the feature.
  • Radical candor, zero politics, help each other win.
  • Competitive company-sponsored medical, dental, and vision insurance.
  • Top-tier relocation and immigration support.

Hiring process

  • Send a link to or a 60-second video of something you built and why it matters, your resume or LinkedIn profile, and two sentences on the hardest problem you've cracked.
  • Every exceptional candidate hears back within 48 hours.
