Agent Post-Training, API & Power Users (AI)
Job description
TL;DR
Agent Post-Training, API & Power Users (AI): Train and refine frontier agentic models for API developers and power users, with an emphasis on tool use, coding, and long-horizon execution. The role focuses on designing evals from real-world workflows, implementing RLHF and other post-training interventions, and improving model reliability in complex agentic systems.
Location: San Francisco (must be US-based)
Salary: $295K – $445K
Company
is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and run experiments to improve model behavior in API and power-user workflows, including function calling, tool use, and planning.
- Build evals, graders, and environments from real developer workflows to convert observed failures into training data.
- Partner with API and product teams to identify behavior gaps and implement post-training interventions.
- Develop feedback loops using power-user traces and API usage patterns to discover new model failures.
- Own end-to-end model behavior projects from qualitative failure analysis to launch readiness.
- Improve large-scale training machinery, focusing on experiment velocity, reliability, and production readiness.
Requirements
- Strong technical fundamentals in ML, software engineering, systems, or statistics.
- Hands-on experience with LLMs, post-training, RL/RLHF/RLAIF, evals, or synthetic data.
- Proven ability to analyze model traces and form concrete hypotheses for behavioral improvement.
- Experience building coding agents or tool-using agents in production ML systems.
- Must be based in the United States.
Nice to have
- Experience with multi-agent systems.
- Experience training models directly against production-like environments.
Culture & Benefits
- High-agency role with direct impact on frontier models used by millions.
- Collaborative environment working across research, engineering, and safety boundaries.
- Opportunity to push the boundaries of general-purpose AI capabilities.
- Commitment to equal opportunity and diversity in the workplace.