2 days ago

Research Engineer (RL Scaling Science)

£375,000 - £640,000 GBP

Work format

Hybrid

Employment type

Full-time

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Engineer (RL Scaling Science): Design and run large-scale RL experiments to develop training recipes for frontier models, with an emphasis on scaling laws, compute efficiency, and task horizons. The role also covers building benchmarks for long-horizon RL and resolving complex bottlenecks at the intersection of research and infrastructure.

Location: Hybrid; must be based in London, UK (minimum 25% office presence)

Salary: £375,000 - £640,000 GBP

Company

Anthropic is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems that are safe and beneficial for society.

What you will do

Design, run, and interpret large-scale RL experiments to understand scaling behavior across model size and compute.
Build and maintain benchmarks for long-horizon RL to ensure progress is measurable and reproducible.
Translate validated research findings into production training recipes for frontier models.
Debug complex failures occurring at the seam where research meets large-scale infrastructure.
Collaborate with adjacent RL teams to advance the overall RL stack and capabilities.

Requirements

Strong empirical research skills in Reinforcement Learning or large-scale ML training.
Ability to own large experiments end-to-end, from design through interpretation.
Proficiency in Python and experience with distributed ML systems.
Comfort operating and debugging at the research/systems boundary.
Bachelor's degree or equivalent combination of education and professional experience.

Nice to have

Published or shipped work in long-horizon RL or RL fundamentals.
Experience translating research findings into production training recipes.
Demonstrated large-scale industry impact via RL interventions.
Experience working on frontier-scale training runs with long trajectories.

Culture & Benefits

Collaborative "big science" environment focusing on high-impact research over small puzzles.
Competitive compensation with optional equity donation matching.
Generous vacation and parental leave policies.
Flexible working hours and high-quality collaborative office space.
Visa sponsorship available for eligible candidates.
