Research Internship Reinforcement Learning (AI)
Job description
TL;DR
Research Internship Reinforcement Learning (AI): Develop cutting-edge RL and LLM techniques, with a focus on self-distillation and RLVR (reinforcement learning with verifiable rewards) for code and agentic tasks. The role bridges theoretical mathematical modeling and production-oriented implementation to advance the state of the art in model training.
Location: Hybrid in Paris, France
Company
is an AI research lab building frontier models for developers and enterprises to power content generation, semantic search, RAG, and agents.
What you will do
- Implement state-of-the-art RL and self-distillation algorithms.
- Execute experiments on code generation and agentic tasks to evaluate proposed methods.
- Develop and maintain codebases for theoretical modeling and practical implementations.
- Collaborate with researchers to analyze results and prepare findings for publication.
- Design mechanisms to handle extremely large rollouts, such as summarization and hierarchical sub-agents.
- Document methodologies and project outcomes comprehensively.
Requirements
- Currently pursuing a Master's or PhD in Computer Science, Machine Learning, or a related field.
- Strong background in reinforcement learning and deep learning.
- Proficiency in Python and ML frameworks (PyTorch, TensorFlow).
- Familiarity with LLM training paradigms.
- Must be able to work in a hybrid setup in Paris, France.
Nice to have
- Prior experience with RLVR, self-distillation, or large-scale ML experiments.
- Experience with coding tasks, unit testing, or compiler tools.
Culture & Benefits
- Opportunity to work on the cutting edge of AI research within an inclusive culture.
- Full health and dental benefits, including a dedicated mental health budget.
- Generous vacation policy with 30 working days per year.
- Weekly lunch stipend and in-office snacks.
- 100% Parental Leave top-up for up to 6 months.
- Personal enrichment benefits for arts, culture, fitness, and workspace improvement.