Researcher, Agent Post-Training (AI)
Job description
TL;DR
Researcher, Agent Post-Training (AI): Improve the capabilities, reliability, and product fit of agentic models for power users and API developers, with an emphasis on post-training interventions and behavior improvement. The role focuses on designing evals from real developer workflows, building training environments, and optimizing tool use and long-horizon execution.
Location: San Francisco, USA
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and run experiments to improve model behavior in API and power-user workflows, including function calling, coding, and planning.
- Build evals, graders, and environments based on real developer workflows to convert failures into training data and hypotheses.
- Partner with API and product teams to identify behavior gaps and implement post-training interventions.
- Own end-to-end model behavior projects from qualitative failure analysis through data generation to launch readiness.
- Develop feedback loops using power-user traces and production-like environments to discover new model gaps.
- Improve the machinery for large-scale training, focusing on experiment velocity, reliability, and observability.
Requirements
- Must be based in San Francisco, USA.
- Strong technical fundamentals in ML, software engineering, systems, or statistics.
- Hands-on experience with LLMs, post-training, RL/RLHF/RLAIF, evals, or production ML systems.
- Proven ability to turn ambiguous model behavior problems into concrete progress.
- Experience with synthetic data, coding agents, or tool-using agents.
- Strong taste for model behavior with the ability to form hypotheses from traces and API interactions.