Senior Machine Learning Engineer (ML Infrastructure)
Job description
TL;DR
Senior Machine Learning Engineer (ML Infrastructure): Designing and building scalable, high-performance AI/ML platform infrastructure to support advanced AI research and intelligent driving technologies, with an emphasis on distributed training and resource optimization. Focus on scaling large ML models, maximizing GPU utilization across heterogeneous hardware, and optimizing training workflows.
Location: Remote within the US; however, candidates living near a GM hub are expected to report to the office three times a week. Willingness to travel to Sunnyvale, CA as needed is required.
Salary: $170,000 – $240,000
Company
The company is developing intelligent driving technologies to create a world with zero crashes, zero emissions, and zero congestion.
What you will do
- Design and develop scalable, reliable, high-performance ML frameworks to support model training at scale.
- Analyze and optimize model training performance to scale distributed workflows and maximize resource utilization.
- Improve system observability, debuggability, and overall operational excellence and user experience.
- Collaborate with cross-functional teams to integrate new features and technologies into the AI platform.
Requirements
- Must be based in the United States.
- Bachelor's degree in Computer Science or equivalent experience.
- 3+ years of professional software engineering experience.
- 2+ years of specialized experience in AI/ML infrastructure, specifically enabling distributed training for large models.
- Strong proficiency in Python and frameworks such as PyTorch or TensorFlow.
- Experience with distributed computing, GPU computing, and cloud environments (AWS, GCP, or Azure).
Nice to have
- Extensive experience with PyTorch 2.x+ and distributed training frameworks.
- Experience with FSDP, Pipeline Parallelism, and scalable solutions for large foundational models.
- Proven track record in profiling, analysis, debugging, and optimizing data loading performance.
- Strong communication skills for consensus building and risk management.
Culture & Benefits
- Comprehensive health and wellbeing programs including medical, dental, and vision.
- Financial security through retirement savings plans, HSA, and FSA.
- Generous paid vacation, holidays, and sickness/accident benefits.
- Professional growth support via tuition assistance programs.
- Employee perks including GM vehicle discounts.
- Eligible for relocation benefits.