Назад
Company hidden
2 дня назад

Researcher, Alignment Oversight (AI)

250 000 - 445 000$
Формат работы
hybrid
Тип работы
fulltime
Английский
b2
Страна
US
Релокация
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Researcher, Alignment Oversight (AI): Designing and implementing oversight systems to improve the control and alignment of agentic AI models with an accent on model training, evaluation design, and scalable oversight. Focus on developing evaluations for alignment failure modes and translating research intuition into production-facing systems.

Location: Hybrid in San Francisco, CA (Relocation assistance provided)

Salary: $250,000 – $445,000 + Equity

Company

An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Design and implement alignment experiments focused on oversight systems for increasingly agentic AI models.
  • Deploy practical systems for action monitoring, red-teaming, and human-in-the-loop control.
  • Develop evaluations for frontier model failure modes such as overeagerness, covert actions, and scheming propensity.
  • Analyze deployment data to understand model failures and identify opportunities for training more aligned models.
  • Develop techniques for feeding oversight signals back into training while preserving process reliability.
  • Collaborate across research, product, security, and engineering teams to turn alignment ideas into working systems.

Requirements

  • Strong hands-on experience training, evaluating, or debugging large ML models, especially LLMs.
  • Experience with reinforcement learning, post-training, preference optimization, or scalable oversight.
  • Strong engineering execution to turn ambiguous research ideas into reliable tools and training pipelines.
  • Research intuition grounded in implementation details and empirical results.
  • Must be based in or be able to relocate to San Francisco, CA.

Culture & Benefits

  • Hybrid work model with 3 days in the office per week.
  • Relocation assistance provided for new employees.
  • Fast-paced, collaborative research environment where priorities shift based on evidence.
  • Culture that views AI safety and usefulness as coupled, essential goals.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →