Researcher, Computer Use - Agent Post-Training (AI)

$250,000 - $380,000

Job type

Full-time

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

Text:

TL;DR

Researcher, Computer Use - Agent Post-Training (AI): Developing and training frontier agents capable of operating computers, browsers, and desktops, with an emphasis on RL, RLHF, and post-training signals. Focus on building robust environments, graders, and data pipelines to improve agent reliability and long-horizon execution.

Location: San Francisco, USA

Salary: $250,000 – $380,000 + Equity

Company

hirify.global is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

Design and run experiments to improve agentic model behavior for desktop and browser computer use.
Own end-to-end post-training improvements including RL, data pipelines, reward signals, and evaluations.
Build evaluation environments to identify model failures and convert them into training data or research directions.
Collaborate with Codex and ChatGPT product teams to translate user needs into model improvements.
Work on early-training interventions, including data mixtures, synthetic data, and alignment loops.
Debug complex failures in shipped models and develop concrete hypotheses for technical fixes.

Requirements

Strong technical fundamentals in machine learning, software engineering, systems, or statistics.
Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, or production ML systems.
Ability to move from vague behavioral problems to concrete, executable experiments.
Comfort working across research, product, infrastructure, and safety boundaries.
Experience with coding agents, tool-using agents, or synthetic data generation.

Culture & Benefits

Opportunity to work on frontier models that land directly in global products.
High-agency environment solving open-ended, complex AI problems.
Competitive compensation package including equity.
Inclusive work culture as an equal opportunity employer.
