Research Scientist, RL for Autonomous Planning & World Modeling (AI)
Job description
TL;DR
Research and develop RL and distillation techniques for autonomous vehicle trajectory planning, with an emphasis on foundation world models and post-training evaluation. The role focuses on integrating emerging AI research into internal infrastructure and scaling high-impact ML methods.
Location: Hybrid (Mountain View, San Francisco, New York City, Kirkland). Must be based in the US.
Salary: $204,000–$259,000 USD
Company
An autonomous driving technology company building the Driver to improve mobility and safety through fully autonomous ride-hailing services.
What you will do
- Participate in the company's Foundation World Model post-training and evaluation.
- Research and develop cutting-edge RL and distillation techniques for Autonomous Vehicle Trajectory Planning.
- Integrate emerging research from the AI community into internal RL infrastructure through rigorous ablations.
- Partner with cross-functional engineering and research teams to scale post-training best practices.
Requirements
- PhD or Master's degree in Computer Science, Machine Learning, Robotics, or a similar technical field.
- 3+ years of industry or post-doc research experience in Reinforcement Learning or Foundation Models.
- Demonstrated original contributions via high-impact publications (NeurIPS, ICLR, CVPR) or significant open-source work.
- Proficiency in implementing scalable, distributed model training workflows (data parallelism, FSDP, and other sharding approaches).
- Willingness to work with the complexity of globally distributed inference infrastructure.
Nice to have
- PhD with a research focus on RL, Foundation Models, or Multi-Modal learning.
- Experience designing and deploying RL infrastructure, specifically for on-policy learning or alignment with human preferences.
- First-author publications at top-tier AI venues.
- Experience with many-machine training infrastructure and tensor-parallel inference techniques.
Culture & Benefits
- Discretionary annual bonus program.
- Equity incentive plan.
- Generous company benefits program.
- Opportunity to collaborate with other research teams across Alphabet.