Company hidden
2 days ago

Research Internship Reinforcement Learning (AI)

Work format
Hybrid
Employment type
Full-time
Grade
Trainee
English
B2
Country
France
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Internship Reinforcement Learning (AI): Developing cutting-edge RL and LLM techniques, with a focus on self-distillation and RLVR for code and agentic tasks. The role bridges theoretical mathematical modeling with production-oriented implementation to advance the state of the art in model training.

Location: Hybrid in Paris, France

Company

hirify.global is an AI research lab building frontier models for developers and enterprises to power content generation, semantic search, RAG, and agents.

What you will do

  • Implement state-of-the-art RL and self-distillation algorithms.
  • Execute experiments on code generation and agentic tasks to evaluate proposed methods.
  • Develop and maintain codebases for theoretical modeling and practical implementations.
  • Collaborate with researchers to analyze results and prepare findings for publication.
  • Design mechanisms to handle extremely large rollouts, such as summarization and hierarchical sub-agents.
  • Document methodologies and project outcomes comprehensively.

Requirements

  • Currently pursuing a Master's or PhD in Computer Science, Machine Learning, or a related field.
  • Strong background in reinforcement learning and deep learning.
  • Proficiency in Python and ML frameworks (PyTorch, TensorFlow).
  • Familiarity with LLM training paradigms.
  • Must be able to work in a hybrid setup in Paris, France.

Nice to have

  • Prior experience with RLVR, self-distillation, or large-scale ML experiments.
  • Experience with coding tasks, unit testing, or compiler tools.

Culture & Benefits

  • Opportunity to work on the cutting edge of AI research within an inclusive culture.
  • Full health and dental benefits, including a dedicated mental health budget.
  • Generous vacation policy with 30 working days per year.
  • Weekly lunch stipend and in-office snacks.
  • 100% Parental Leave top-up for up to 6 months.
  • Personal enrichment benefits for arts, culture, fitness, and workspace improvement.
