Posted 2 days ago

Staff ML Engineer (Generative Model Performance & Efficiency)

$251,000 - $310,000

Work format

On-site

Employment type

Full-time

Grade

lead

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff ML Engineer (Generative Model Performance & Efficiency): Analyze and optimize generative model training and inference for low-latency, high-throughput serving, with an emphasis on performance bottlenecks, model compression, and scalable distributed execution. The role focuses on designing efficient serving and training pipelines, experimenting with partitioning and sharding strategies, and building tooling for profiling and debugging ML workloads.

Company

Waymo builds autonomous driving technology and the Waymo Driver for fully autonomous ride-hail.

What you will do

Analyze model architectures to identify bottlenecks in training and inference performance (memory bandwidth, compute, communication).
Develop and apply techniques for efficiency, including quantization (FP8/INT4), pruning, knowledge distillation, and efficient attention mechanisms.
Optimize model code for hardware accelerators (TPUs/GPUs) using compiler features (e.g., XLA) and low-level libraries.
Experiment with model partitioning and sharding strategies (data, tensor, pipeline, and expert parallelism) to improve scalability and efficiency.
Design and implement low-latency, high-throughput serving solutions for generative models and optimize training pipelines to reduce training time.
Build and maintain tools for performance analysis, profiling, and debugging (e.g., xprof).

Requirements

MS or PhD in Computer Science, Machine Learning, Robotics, or a related field.
5+ years of experience with deep learning architectures (Transformers, Diffusion Models, MoEs) and optimization techniques.
Proficiency in JAX and Flax; experience with TensorFlow/PyTorch is a plus.
Expertise with profiling tools (XProf, Perfetto, NVIDIA Nsight) for diagnosing performance issues in ML workloads.
Hands-on experience with quantization, pruning, distillation, and other model compression methods.
Strong programming skills in Python, and ideally C++, along with sound software development practices.

Culture & Benefits

On-site role in Mountain View, California.
Discretionary annual bonus program and equity incentive plan (subject to eligibility).
Generous company benefits program (subject to eligibility).

Hiring process

During the hiring process, the recruiter shares the specific salary range for the role's location (or for the preferred location, if remote work is possible).

Location: On Site — Mountain View, California

Salary: $251,000—$310,000 USD (base salary range across US locations)
