Software Engineer, RL Data (AI)

$320,000 - $485,000

Work format

hybrid

Employment type

full-time

Grade

senior

English

Country

UK/US

This vacancy comes from Hirify Global, a list of international tech companies

Job description

TL;DR

Software Engineer, RL Data (AI): Build and optimize high-quality reinforcement learning data pipelines and human feedback tooling for Claude, with an emphasis on data collection, execution environments, and QA frameworks. The role centers on hardening sandboxed environments, iterating on prompt pipelines, and scaling data collection to ensure model trustworthiness.

Location: Hybrid; must be based in or near London, San Francisco, Seattle, or New York City (minimum 25% office presence required)

Salary: $320,000 - $485,000 USD

Company

hirify.global is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems that are safe and beneficial for society.

What you will do

Own the end-to-end technical architecture and operational success of RL data stacks.
Build and iterate on data collection pipelines, prompts, and evaluation graders.
Develop QA frameworks to prevent reward hacking and ensure environment quality.
Create efficient interfaces for human data collection to streamline feedback.
Harden execution environments, including sandboxing and snapshotting, to ensure stability at training scale.
Collaborate with domain experts and manage technical relationships with external data vendors.

Requirements

Proficiency in modern programming languages, specifically Python and TypeScript.
Experience designing and running backend systems or infrastructure.
Must be able to work from one of the office hubs (London, SF, Seattle, NYC) at least 25% of the time.
Willingness to own problems end-to-end, including operational and non-engineering tasks.
Ability to iterate quickly in ambiguous, fast-changing situations.
Effective use of AI tools in day-to-day professional work.

Nice to have

Experience with LLM-powered systems, prompt pipelines, or RL on LLMs.
Background as a founder or early startup engineer.
Experience with containers, Kubernetes, or simulation infrastructure.
Experience handling sensitive data or working under tight security controls.
Familiarity with AI safety or security research.

Culture & Benefits

Competitive compensation and optional equity donation matching.
Generous vacation and parental leave.
Flexible working hours and a collaborative research-driven environment.
Visa sponsorship is available for eligible candidates.
