Software Engineer, RL Training Infra (AI)
Job description
TL;DR
Software Engineer, RL Training Infra (AI): Keep reinforcement learning training runs fast, reliable, and unblocked for the frontier agents used in Codex, ChatGPT, and the API, with an emphasis on scaling, orchestration, and debugging distributed infrastructure. The role focuses on solving technical problems at the boundary between research and engineering, improving training reliability and efficiency, and turning recurring operational issues into better tools and systems.
Location: Must be based in San Francisco
Salary: $295K – $445K + equity
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Keep large-scale RL training runs moving by addressing urgent engineering and infrastructure problems.
- Debug issues across training systems, inference, orchestration, scaling, and distributed infrastructure.
- Solve technical problems related to scaling experiments, improving training reliability, and reducing latency and cost.
- Improve reliability and efficiency for RL training runs.
- Help researchers who are developing infra-heavy integrations, such as multi-agent capabilities or memory.
- Turn recurring operational issues into better tools, systems, processes, or abstractions.
Requirements
- Experience in some layer of ML infrastructure.
- Experience in RL, inference, scaling, training systems, or orchestration.
- Comfortable operating across unfamiliar layers and learning quickly.
- Strong debugging skills with high ownership and excellent communication.
- Ability to become useful quickly in ambiguous areas with tight timelines.
Nice to have
- Experience supporting large-scale model training, async RL systems, or high-throughput ML infrastructure.
- Experience debugging distributed systems across GPUs, networking, orchestration, or inference stacks.
- Background in performance optimization, scaling, or production-critical infrastructure.
- Experience working directly with researchers or fast-moving model teams.
Culture & Benefits
- Push the boundaries of AI systems and safely deploy them to the world through products.
- Be part of an equal opportunity employer that values different perspectives and experiences.
- Contribute to ensuring that general-purpose artificial intelligence benefits all of humanity.