Researcher, Alignment Science (AI)
Описание вакансии
TL;DR
Researcher, Alignment Science (AI): Design and implement experiments to ensure frontier models follow user intent and remain honest, with an emphasis on reinforcement learning, calibration, and robustness. Focus on developing scalable alignment methods, building evaluations for failure modes like hallucination and reward hacking, and integrating these techniques into model deployment.
Location: Hybrid in San Francisco, CA (3 days in office). Open to exceptional remote candidates. Relocation assistance provided.
Salary: $250,000 – $445,000 + Equity
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Design and implement alignment experiments focused on intent following, honesty, calibration, and robustness.
- Train and evaluate models using reinforcement learning and other empirical ML methods.
- Develop evaluations for failure modes such as hallucination, reward hacking, covert actions, and scheming.
- Build monitoring and inference-time interventions to ensure compliant behavior.
- Investigate how alignment methods scale with model capability, compute, data, and adversarial pressure.
- Integrate successful alignment techniques into model training and deployment workflows.
Requirements
- Strong hands-on experience training, evaluating, or debugging large ML models, especially LLMs.
- Excellent engineering skills in Python and modern ML frameworks such as PyTorch.
- Mathematical rigor and ability to turn ambiguous research questions into measurable experiments.
- Experience with reinforcement learning, post-training, preference optimization, or scalable oversight.
- Ability to operate with high independence in a fast-paced, collaborative research environment.
- Strong record in technical problem solving (e.g., competitive programming, math contests, or rigorous engineering projects).
Culture & Benefits
- Hybrid work model (3 days in office per week).
- Relocation assistance for new employees.
- Opportunity to produce externally publishable research that advances the science of alignment.
- Collaborative environment working across post-training, RL, and safety teams.
- Focus on building trustworthy, honest, and reliable AI systems for high-stakes settings.