ML Platform & Infrastructure Engineer (AI)
Job description
TL;DR
ML Platform & Infrastructure Engineer (AI): Design and implement robust CI/CD pipelines for machine learning workflows and build scalable evaluation harnesses, with an emphasis on reliability, automation, and performance optimization. Develop research tooling, observability systems, and dashboards for model latency, GPU utilization, and system health.
Location: San Francisco office, full-time, on-site
Company
Stealth AI startup building trustworthy consumer-grade agents for human-AI collaboration, backed by tier-1 investors, with a team from Stanford, OpenAI, and DeepMind.
What you will do
- Design and implement CI/CD pipelines for ML workflows, automating training runs, data ingestion, orchestration, checkpointing, and artifact management.
- Build scalable evaluation infrastructure to benchmark models on every merge, optimizing latency and catching performance regressions.
- Develop internal SDKs, CLIs, and lightweight UIs for researchers to inspect trajectories, visualize failures, curate datasets, and iterate efficiently.
- Implement observability for model performance, GPU utilization, cluster health, and inference costs, with dashboards and alerting systems.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
- 3+ years in Software Engineering, MLOps, or ML Infrastructure.
- Strong Python proficiency.
- Experience building internal developer tools, CLIs, or dashboards.
- Experience with cloud infrastructure (AWS or GCP) and containerization (Docker, Kubernetes).
Nice to have
- Experience designing CI/CD pipelines for ML workflows.
- Familiarity with LLM serving stacks such as vLLM or TGI.
- Experience managing GPU clusters and optimizing distributed workloads.
Culture & Benefits
- All in, in person — work moves faster face-to-face.
- Ship by default — speed is the feature.
- Radical candor, zero politics, help each other win.
- Competitive company-sponsored medical, dental, and vision insurance.
- Top-tier relocation and immigration support.
Hiring process
- Send a link or 60-second video of something you built and why it matters, your resume or LinkedIn, and two sentences on the hardest problem you've cracked.
- Every exceptional candidate hears back within 48 hours.