Research Scientist - RL Training (AI)

$200,000 - $325,000

Work format

Remote (USA only) / hybrid

Employment type

Full-time

Level

Senior

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Scientist - RL Training (AI): Developing reinforcement learning techniques and data pipelines to steer LLM behavior, with an emphasis on reward modeling and preference datasets. Focus on implementing RLHF, DPO, and GRPO to produce high-quality training corpora for frontier AI labs.

Location: Hybrid in Redwood City or San Francisco, CA, or Remote within the United States

Salary: $200,000 - $325,000 USD

Company

hirify.global helps enterprises transform expert knowledge into specialized AI at scale by focusing on the data used to build AI systems.

What you will do

Research and implement RL techniques (GRPO, RLHF, RLAIF, DPO) to create data products like preference datasets and reward signals.
Design and build data pipelines for high-quality RL training signals and AI-assisted annotation to improve model generalization.
Prototype end-to-end RL training recipes to inform data-as-a-service deliveries.
Collaborate with research, engineering, and delivery teams to translate RL research into customer-ready data products.
Stay current with multi-node LLM training, alignment research, and scalable RL methods.
Contribute to research publications and the internal knowledge base.

Requirements

Deep expertise in RLHF, reward modeling, and credit attribution.
Experience training or fine-tuning 30B+ large language models at scale using distributed training infrastructure.
Proficiency in Python, PyTorch, Hugging Face, and RL frameworks such as verl and SkyRL.
Strong software engineering fundamentals to build extensible research prototypes.
Familiarity with AWS, GCP, Kubernetes, or Slurm.
Ph.D. in machine learning, reinforcement learning, or a related field strongly preferred.

Culture & Benefits

Opportunity to shape priorities and strategic decisions in a rapidly scaling company.
Support for deepening technical expertise and exploring leadership opportunities.
Stability backed by robust funding, combined with the excitement of high growth.
Inclusive work environment committed to diversity and equal employment opportunities.
