Назад
Company hidden
обновлено 2 часа назад

Researcher, Artifacts - Agent Post-Training (AI)

250 000 - 380 000$
Формат работы
onsite
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Researcher, Artifacts - Agent Post-Training (AI): Training frontier models to create polished work products like documents, spreadsheets, and dashboards with an accent on RL, data pipelines, and reward signals. Focus on improving agentic model behavior, designing complex evals, and turning model failures into training data.

Location: San Francisco, USA

Salary: $250,000 – $380,000 USD + Equity

Company

hirify.global is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Design and run experiments to improve agentic model behavior for complex software and plugins.
  • Own end-to-end improvements to the post-training stack, including RL, data pipelines, reward signals, and diagnostics.
  • Build evals and environments to identify model failures and convert them into training data or research directions.
  • Partner with Codex and ChatGPT product teams to translate user needs into model improvements.
  • Implement early-training and alignment interventions using synthetic data and eval loops.
  • Debug complex failures in shipped models and develop concrete hypotheses and fixes.

Requirements

  • Strong technical fundamentals in machine learning, software engineering, systems, or statistics.
  • Hands-on experience with LLMs, RL, RLHF/RLAIF, and post-training.
  • Expertise in evals, graders, synthetic data, and production ML systems.
  • Ability to translate vague behavioral problems into concrete experiments and analysis.
  • Must be based in San Francisco, USA

Nice to have

  • Prior background in consulting, finance, marketing, operations, or data science.

Culture & Benefits

  • High-agency role with work landing directly in frontier models.
  • Collaborative environment spanning research, product, infrastructure, and safety teams.
  • Opportunity to shape the next generation of proactive intelligence.
  • Competitive compensation including equity offers.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →