Agent Post-Training Research (AI)
Job description
TL;DR
Agent Post-Training Research (AI): Develop and optimize frontier agentic models for Codex, ChatGPT, and the API, with an emphasis on RL, RLHF, and multi-agent coordination. The role focuses on building training signals, creating robust evaluations, and integrating advanced capabilities into production models.
Location: San Francisco
Salary: $295K – $445K
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and run experiments to improve agentic model behavior across coding, tool use, function calling, and multi-agent collaboration.
- Own end-to-end improvements to the post-training stack, including RL, data pipelines, reward signals, and model-behavior analysis.
- Build evals and environments to identify model failures and translate them into training data or new research directions.
- Partner with product teams to translate user needs into model improvements for Codex, API, and ChatGPT.
- Develop alignment interventions, synthetic data mixtures, and eval loops that shape downstream agent behavior.
- Optimize large-scale training machinery for increased velocity, reliability, observability, and production readiness.
Requirements
- Strong technical fundamentals in machine learning, software engineering, systems, or statistics.
- Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, or production ML systems.
- Ability to move from vague behavioral problems to concrete hypotheses and experiments.
- Comfort working across research, product, infrastructure, data, and safety boundaries.
- Must be based in San Francisco.
Culture & Benefits
- High-agency role where work lands directly in frontier models used by millions.
- Collaborative environment working with world-class researchers and engineers.
- Mission-driven culture focused on the safe deployment of AGI.
- Commitment to equal opportunity and diversity in the workplace.