2 months ago

AI Researcher (Post Training)

Employment type

Full-time

English

Country

Sweden

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Researcher (Post Training): Develop and optimize the post-training pipeline for a natural-language software creation platform, with an emphasis on RFT/RLVR, preference optimization, and code generation. The role focuses on translating cutting-edge research into production-ready training recipes and building scalable evaluation systems.

Location: Stockholm, Sweden

Company

Lovable enables millions of users to transform raw ideas into real software products using plain language.

What you will do

Own the full post-training lifecycle, from data curation and training runs through evaluation and deployment.
Adapt reinforcement learning, preference optimization, and SFT to improve code generation, reasoning, and agentic reliability.
Build evaluation and experimentation infrastructure to measure helpfulness, safety, latency, and reliability.
Develop and operate production systems for large-scale training, including GPU orchestration and data pipelines.
Collaborate with agent, product, and infrastructure engineers to translate model gains into user-facing improvements.
Investigate and resolve failures end to end, whether they originate in training recipes, data, or serving regressions.

Requirements

Hands-on experience running post-training jobs (RFT/RLVR, preference optimization) on LLMs.
Ability to write reliable production-grade code.
Proficiency in PyTorch or JAX and experience with distributed training and GPU clusters.
Strong understanding of the mathematics behind reward modeling, alignment, and preference optimization.
Experience building evaluation systems that capture real-world quality rather than just benchmarks.
English: Required (company language)

Nice to have

Experience with code generation or agentic use cases.
History of owning the full loop from data curation to production monitoring.
Ability to rapidly turn research papers into running prototype code.
Experience with speculative decoding or other model efficiency techniques.
Contributions to the open-source ML ecosystem or research publications.

Culture & Benefits

Talent-dense team with a culture of extreme ownership and high velocity.
Low-ego collaboration environment.
Fast-paced atmosphere focused on shipping impact to users quickly.

