2 days ago

Research Engineer, Code RL (AI)

$500,000 - $850,000 USD

Work format

Hybrid

Employment type

Full-time

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Engineer, Code RL (AI/RL): Advancing models' ability to write, edit, test, and debug software using reinforcement learning, with an emphasis on designing RL environments and reward signals. Focus on building scalable RL infrastructure, enhancing model reasoning, and enabling long-horizon autonomous engineering.

Location: Hybrid (San Francisco, CA or New York City, NY); staff are expected to be in one of the offices at least 25% of the time

Salary: $500,000 - $850,000 USD

Company

Anthropic is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems that are safe and beneficial for society.

What you will do

Design RL environments and coding tasks to advance models' ability to ship real software end-to-end.
Build reward signals and verifiers that define and capture the criteria for high-quality code.
Execute training experiments on frontier models and diagnose performance bottlenecks.
Improve the speed and reliability of training pipelines to enable fast iteration on RL research.
Collaborate with alignment and frontier red teams to ensure systems are both capable and safe.

Requirements

Strong software engineering skills and deep Python expertise, including async/concurrent programming.
Ability to own systems end-to-end and debug across the entire stack.
Capacity to balance research exploration with rigorous engineering implementation and experimental design.
Deep commitment to code quality, testing, and system performance.
Bachelor’s degree or equivalent combination of education and professional experience in a relevant field.

Nice to have

Experience with RL, RLHF, post-training, or LLM finetuning.
Experience building coding agents, code-execution sandboxes, or developer tooling.
Background in program analysis, compilers, verification, or formal methods.
Proficiency with PyTorch, large-scale distributed training, and ML system optimization.
Experience with CUDA/GPU/TPU kernels and accelerator-performance intuition.

Culture & Benefits

Collaborative "big science" research environment focusing on high-impact goals.
Competitive compensation with optional equity donation matching.
Generous vacation and parental leave policies.
Flexible working hours and modern office spaces for collaboration.
Visa sponsorship is available for qualified candidates.
