Posted 11 hours ago

Research Engineer (AI)

Work format
Hybrid
Employment type
Full-time
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Engineer (AI): Build RL environments and data strategies that improve Claude's performance in specialized domains, with an emphasis on reward design and synthetic data sourcing. Develop QA frameworks to mitigate reward hacking and run generalization experiments to scale model capabilities.

Location: Hybrid (San Francisco, CA or New York City, NY)

Company

Anthropic is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems.

What you will do

  • Own end-to-end data strategy for knowledge work verticals, from task sourcing through RL training.
  • Design RL environments, identify high-value tasks, and create reward signals.
  • Manage technical relationships with external data vendors and evaluate data quality.
  • Collaborate with domain experts to build data pipelines and evaluations.
  • Develop QA frameworks to prevent reward hacking and ensure environment quality.
  • Run generalization experiments to measure the impact of data strategies on model capabilities.

Requirements

  • Experience with fine-tuning LLMs for specific domains or real-world use cases.
  • Experience with reinforcement learning, reward design, or training data curation for LLMs.
  • Ability to manage technical vendor relationships and iterate quickly on feedback.
  • Strong cross-functional collaboration skills.
  • Bachelor’s degree in a relevant field or equivalent professional experience.
  • Must be based in or able to work from San Francisco or New York City.

Nice to have

  • Experience training production ML systems.
  • Experience designing evaluations or benchmarks for LLMs.
  • Domain expertise in finance, healthcare, or legal verticals.
  • Experience working with external technical partners.

Culture & Benefits

  • Collaborative "big science" environment focused on high-impact, large-scale research.
  • Competitive compensation and optional equity donation matching.
  • Generous vacation and parental leave policies.
  • Flexible working hours and high-quality office spaces for collaboration.
  • Visa sponsorship available for qualifying candidates.
