Research Engineer / Scientist – Reinforcement Learning (RL, AI)

Work format

onsite

Employment type

full-time

English

Country

Job description

TL;DR

Research Engineer/Scientist (Reinforcement Learning): Advancing RL capabilities for decision-making in critical industries like healthcare, manufacturing, and energy, with an emphasis on real-world deployment and optimization. Focus on developing RL methods for complex planning tasks, building experimental infrastructure, and transitioning research to production platforms.

New York City; Boston

Company

hirify.global transforms critical institutions with applied AI through forward-deployed expertise, its in-house Mosaic toolkit for agentic workflows, and partnerships with Anthropic, McKinsey, AWS, and General Catalyst.

What you will do

Identify real-world challenges tractable for RL-guided decision making.
Develop RL methods for complex tasks in planning, decision-making, or optimization.
Build and maintain experimental infrastructure including simulation environments, data pipelines, training, and evaluation frameworks.
Conduct large-scale in-the-wild evaluations driving significant business value.
Partner with applied AI engineers to integrate research into Mosaic platform features.
Communicate research outcomes to technical and non-technical stakeholders.

Requirements

MS/PhD in Computer Science, ML, or related field, or equivalent experience.
Track record of effective RL work.
Motivated by impact in critical industries including healthcare, supply chains, energy, and finance.
Experience performing rigorous RL experimentation.
Strong ownership mindset.
Belief in AI's transformative potential for critical industries.

Nice to have

Experience with high-performance, large-scale distributed systems.
Experience with large-scale LLM or RL training.
Strong Python programming skills.
Implementing LLM post-training algorithms.
Experience with vLLM/SGLang, Ray, Kubernetes (or AWS EKS).
Distributed checkpointing, multi-node/multi-GPU training, custom KV-caching.
Asynchronous training/inference with VeRL, ROLL, SkyRL, AReaL, or CleanRL.

Culture & Benefits

Dream bigger: Tackle ambitious problems with optimism and responsibility.
Heart in the game: Commit fully to meaningful work without hour monitoring.
Win for the customer: Focus on delivering outcomes over outputs.
Make the call: Empower high-agency decisions with open communication.
Intensity with kindness: Excel in execution, feedback, and prioritization while building trust through kindness.
