Company hidden
Posted 3 days ago

Agent Post-Training, Frontier Evals and Environments Research (AI)

Salary: $295,000 - $445,000
Work format: onsite
Employment type: full-time
English: B2
Country: US

Job description

TL;DR

Agent Post-Training, Frontier Evals and Environments Research (AI): Build and optimize frontier agents and evaluation environments to drive progress toward AGI/ASI, with an emphasis on RL environments, model capabilities, and automated behavior exploration. The role focuses on designing scalable evaluation systems, creating self-improvement loops, and steering the largest training runs at hirify.global.

Location: San Francisco (Onsite)

Salary: $295K – $445K

Company

AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Create ambitious RL environments to measure frontier model capabilities, skills, and behaviors.
  • Develop new methodologies for automatically exploring the behavior of frontier models.
  • Study the science of measurement, focusing on the scalability, reliability, and variance of evaluation methodology.
  • Steer the largest training runs to guide future model capabilities.
  • Design scalable systems and processes to support continuous evaluation.
  • Build self-improvement loops to automate model understanding.

Requirements

  • Strong technical fundamentals in machine learning, software engineering, systems, or statistics.
  • Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, graders, synthetic data, or production ML systems.
  • Ability to turn vague behavioral problems into concrete experiments and pipelines.
  • Comfort working across research, product, infrastructure, data, evals, and safety boundaries.
  • Ability to communicate clearly with diverse technical and non-technical audiences.

Culture & Benefits

  • High-agency role with work landing directly in frontier models.
  • Collaboration with researchers, engineers, product teams, and safety/alignment partners.
  • Opportunity to steer the most ambitious training runs in the industry.
  • Commitment to equal opportunity and diverse perspectives.
