Member Of Engineering (RL Infrastructure)
Job description
TL;DR
Member Of Engineering (RL Infrastructure): Building and scaling the infrastructure for reliable and efficient reinforcement learning training of LLMs, with an emphasis on improving reasoning and coding abilities. The role focuses on designing distributed RL pipelines, optimizing performance across the stack, and implementing novel exploration and training algorithms.
Location: Remote (EMEA/East Coast). Note: The team meets in-person once a month for 3 days and for longer offsites twice a year.
Company
is an AI research and engineering company focused on building AGI to serve as the engine behind economically valuable work and scientific progress.
What you will do
- Research and implement new RL exploration and training algorithms to improve LLM reasoning and coding.
- Design and scale robust, flexible, and distributed RL environments and pipelines.
- Develop methods for tuning training and inference end-to-end for high throughput.
- Build observability tooling to identify and debug system-level issues causing training regressions.
- Optimize performance across the stack, including networking, memory, compute scheduling, and I/O.
- Collaborate with the engineering team to plan and execute the RL infrastructure roadmap.
Requirements
- Experience with LLMs and model post-training workflows.
- Deep understanding of reinforcement learning principles and the main bottlenecks of RL training.
- Solid software engineering fundamentals, including testing, code review, and debugging complex systems.
- Proficiency in Python with expertise in concurrency, asynchronous programming, and multiprocessing.
- Familiarity with deep learning frameworks such as PyTorch or JAX.
- Experience designing and maintaining large-scale distributed RL training systems and inference stacks (e.g., vLLM).
Nice to have
- Open-source contributions to RL or distributed ML projects.
Culture & Benefits
- Fully remote work with flexible hours.
- Generous time off with 37 days of vacation and holidays per year.
- Health insurance allowance for employees and their dependents.
- Company-provided equipment and allowances for home office and continuous learning.
- A diverse, inclusive, people-first culture with frequent team get-togethers.
Hiring process
- Introductory call with a Founding Engineer.
- One or more technical interviews focused on engineering and RL.
- Team fit discussion with the People team.
- Final interview with a Founding Engineer.