Reinforcement Learning Engineer (AI)
Job description
TL;DR
Reinforcement Learning Engineer (AI): Developing and optimizing advanced RL workflows for self-improving agents, with an emphasis on data efficiency, training methodologies, and large-scale model optimization. The role focuses on open research problems such as sequence-level importance ratios and continuous learning, and uses massive GPU infrastructure to push the boundaries of AI agents.
Location: Remote within the US or hybrid (San Francisco/Bellevue). Must be a U.S. person or eligible for export-controlled information access.
Salary: $188,000–$275,000
Company
The company is an AI hyperscaler that provides high-performance GPU infrastructure and end-to-end platforms for developers to build, train, and scale AI models.
What you will do
- Generate and investigate research ideas to overcome obstacles in continuous learning.
- Collaborate with the OpenPipe team to validate research directions on real-world customer tasks.
- Engineer scalable and data-efficient training methods for self-improving agents.
- Utilize massive compute clusters to run large-scale training experiments and ablations.
- Build and maintain production services supporting the RL platform lifecycle.
Requirements
- Proven experience training LLMs to state-of-the-art (SOTA) performance on specific tasks.
- Strong engineering background with a focus on high-impact project results.
- Deep understanding of RL methods, specifically sequence-level or token-level importance ratios.
- Eligibility for access to export-controlled information (U.S. person or authorized status required).
- Proficiency in Python and familiarity with the stack (Kubernetes, Postgres, FastAPI).
Culture & Benefits
- 100% employer-paid medical, dental, and vision insurance.
- 401(k) retirement plan with a generous employer match.
- Comprehensive disability, life insurance, and mental wellness support (Spring Health).
- Flexible Paid Time Off policy to support work-life balance.
- Paid parental leave and family-forming support through Carrot.
- Hybrid work environment with quarterly team gatherings for collaboration.