Helix AI Engineer, Reinforcement Learning (AI)
Job description
TL;DR
Helix AI Engineer, Reinforcement Learning (AI): Develop learning systems that enable robots to acquire skills through interaction, feedback, and experience, with an emphasis on improving policy performance, robustness, and long-horizon decision-making in embodied systems. The role focuses on applying and advancing reinforcement learning across simulation and real-world environments.
Location: Requires 5 days/week in-office collaboration in San Jose, CA
Company
An AI robotics company developing autonomous general-purpose humanoid robots.
What you will do
- Design and implement reinforcement learning algorithms for embodied agents operating in real-world and simulated environments.
- Train policies that learn from interaction, feedback, and large-scale experience across diverse tasks.
- Develop reward modeling, credit assignment, and exploration strategies for complex, long-horizon behaviors.
- Improve policy robustness to real-world challenges such as noise, partial observability, and environment variability.
- Collaborate closely with pretraining, video, generative, agent, and robot learning teams to integrate RL into the full autonomy stack.
- Build scalable training systems for RL, including distributed rollouts, simulation infrastructure, and experiment management.
Requirements
- Experience developing and applying reinforcement learning algorithms in complex environments.
- Strong understanding of RL fundamentals (e.g., policy optimization, value methods, model-based RL).
- Experience training policies in simulation and/or real-world systems.
- Proficiency in Python and deep learning frameworks such as PyTorch.
- Experience with large-scale experimentation and distributed training systems.
- Solid software engineering skills and ability to build scalable, reliable systems.
Nice to have
- Experience applying RL to robotics, control systems, or embodied AI.
- Experience with large-scale RL infrastructure (distributed rollouts, simulation at scale).
- Background in offline RL, imitation learning, or hybrid learning approaches.
- Experience with reward modeling or human-in-the-loop learning.
- Familiarity with robotics systems, simulation environments, or real-world deployment constraints.