AI Research Engineer (Reinforcement Learning)

Work format

Remote (Global)

Employment type

Full-time

English

A vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Research Engineer (Reinforcement Learning): Driving innovation in reinforcement learning approaches for advanced AI models, optimizing decision-making and adaptive behavior. Focus on curating simulation environments, strengthening policy performance, and resolving bottlenecks in the reinforcement learning process to unlock superior AI performance in dynamic environments.

Company

Tether is pioneering a global financial revolution with cutting-edge solutions that empower businesses to seamlessly integrate reserve-backed tokens across blockchains.

What you will do

Develop and implement state-of-the-art reinforcement learning algorithms to optimize decision-making in simulated and real-world settings.
Build, run, and monitor controlled reinforcement learning experiments, tracking key performance indicators and documenting iterative results.
Identify and curate high-quality simulation environments and training datasets tailored to specific domain challenges.
Systematically debug and optimize the reinforcement learning pipeline by analyzing computational efficiency and learning performance metrics.
Collaborate with cross-functional teams to integrate reinforcement learning agents into production systems, defining clear success metrics and ensuring continuous monitoring.

Requirements

A degree in Computer Science or a related field, ideally a PhD in NLP, Machine Learning, or a similar discipline.
Proven experience with large-scale reinforcement learning experiments, including online RL techniques such as Group Relative Policy Optimization (GRPO).
Deep understanding of reinforcement learning algorithms, including state-of-the-art online RL methods and other gradient-based optimization approaches like policy gradients, actor-critic, and GRPO.
Strong expertise in PyTorch and relevant reinforcement learning frameworks is a must.
Demonstrated ability to apply empirical research to overcome reinforcement learning challenges such as sample inefficiency, exploration-exploitation tradeoffs, and training instability.

Culture & Benefits

A global team of top talent working remotely from every corner of the world.
Opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards in the fintech space.
