Job description
TL;DR
Researcher, Agent Post-Training, Personality (AI): Develop and train the behavioral layer of frontier agents to make them exceptional collaborators, with an emphasis on reward signals, evaluation frameworks, and RLHF. The role focuses on translating qualitative nuances of human trust and communication into scalable training data and model improvements.
Location: Must be based in San Francisco
Company
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Develop a rigorous understanding of what makes an agent a great collaborator across professional, creative, and technical domains.
- Translate qualitative judgments about model behavior into concrete hypotheses, evaluations, graders, and training interventions.
- Improve reward models and RL objectives to shape model personality, user trust, and user satisfaction.
- Collaborate with pretraining teams on data mixtures, synthetic data, and upstream choices shaping downstream behavior.
- Partner with product teams (ChatGPT, Codex) to validate model improvements in real-world workflows.
- Own projects end-to-end, from identifying behavioral failures through experimentation and training to launch.
Requirements
- Strong technical foundations in machine learning, software engineering, statistics, behavioral science, or HCI.
- Experience with LLMs, post-training, RL/RLHF, reward modeling, evals, or production ML systems.
- Ability to translate subjective product questions into falsifiable hypotheses and rigorous evaluations.
- Strong taste for model behavior with the ability to explain why specific responses are thoughtful and useful.
- Ability to work effectively across researchers, engineers, product teams, and designers.
Culture & Benefits
- Opportunity to shape how frontier agents communicate and build trust with millions of people.
- Work in a high-impact environment creating the next generation of proactive intelligence.
- Commitment to diversity and equal opportunity employment.
- Collaboration with world-class researchers and engineers in the field of AGI.