Researcher, Computer Use - Agent Post-Training (AI)
Job description
TL;DR
Researcher, Computer Use - Agent Post-Training (AI): Develop and train frontier agents that operate computers, browsers, and desktops, with an emphasis on RL, RLHF, and post-training signals. The role focuses on building robust environments, graders, and data pipelines to improve agent reliability and long-horizon execution.
Location: San Francisco, USA
Salary: $250,000 – $380,000 + Equity
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and run experiments to improve agentic model behavior for desktop and browser computer use.
- Own end-to-end post-training improvements including RL, data pipelines, reward signals, and evaluations.
- Build evaluation environments to identify model failures and convert them into training data or research directions.
- Collaborate with Codex and ChatGPT product teams to translate user needs into model improvements.
- Work on early-training interventions, including data mixtures, synthetic data, and alignment loops.
- Debug complex failures in shipped models and develop concrete hypotheses for technical fixes.
Requirements
- Strong technical fundamentals in machine learning, software engineering, systems, or statistics.
- Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, or production ML systems.
- Ability to move from vague behavioral problems to concrete, executable experiments.
- Comfort working across research, product, infrastructure, and safety boundaries.
- Experience with coding agents, tool-using agents, or synthetic data generation.
Culture & Benefits
- Opportunity to work on frontier models that land directly in global products.
- High-agency environment solving open-ended, complex AI problems.
- Competitive compensation package including equity.
- Inclusive work culture as an equal opportunity employer.