Job description
TL;DR
AI Researcher (Post Training): Developing and optimizing the post-training pipeline for a natural-language software creation platform, with an emphasis on RFT/RLVR, preference optimization, and code generation. The role focuses on translating cutting-edge research into production-ready training recipes and building scalable evaluation systems.
Location: Stockholm, Sweden
Company
Lovable enables millions of users to transform raw ideas into real software products using plain language.
What you will do
- Own the full post-training lifecycle, from data curation and training runs through evaluation and deployment.
- Adapt reinforcement learning, preference optimization, and SFT to improve code generation, reasoning, and agentic reliability.
- Build evaluation and experimentation infrastructure to measure helpfulness, safety, latency, and reliability.
- Develop and operate production systems for large-scale training, including GPU orchestration and data pipelines.
- Collaborate with agent, product, and infrastructure engineers to translate model gains into user-facing improvements.
- Investigate and resolve failures end to end, whether they originate in training recipes, data, or serving regressions.
Requirements
- Hands-on experience running post-training jobs (RFT/RLVR, preference optimization) on LLMs.
- Ability to write reliable production-grade code.
- Proficiency in PyTorch or JAX and experience with distributed training and GPU clusters.
- Strong understanding of the mathematics behind reward modeling, alignment, and preference optimization.
- Experience building evaluation systems that capture real-world quality rather than just benchmarks.
- English: Required (company language)
Nice to have
- Experience with code generation or agentic use cases.
- History of owning the full loop from data curation to production monitoring.
- Ability to rapidly turn research papers into running prototype code.
- Experience with speculative decoding or other model efficiency techniques.
- Contributions to the open-source ML ecosystem or research publications.
Culture & Benefits
- Talent-dense team with a culture of extreme ownership and high velocity.
- Low-ego collaboration environment.
- Fast-paced atmosphere focused on shipping impact to users quickly.