Member of Technical Staff - Post Training (AI)

Work format

remote/hybrid

Employment type

full-time

Grade

senior

English

Country

US/Germany

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Member of Technical Staff - Post Training (AI): Own the post-training pipeline for multimodal generative models, with an emphasis on preference optimization, distillation, and alignment. The focus is on implementing SFT, RLHF, and DPO to drive measurable gains in model quality across image and video modalities.

Location: Hybrid (Freiburg or San Francisco) or Remote with a mandatory monthly in-person week

Company

The research lab behind Latent Diffusion, Stable Diffusion, and FLUX, focusing on cutting-edge generative systems and open science.

What you will do

Own the end-to-end post-training pipeline, including data curation, reward modeling, fine-tuning, and deployment.
Advance techniques such as SFT, RLHF, RLAIF, and DPO to align models with human intent and aesthetic judgment.
Develop post-training capabilities across multiple modalities: text-to-image, image editing, and video.
Build personalization and customization features allowing users to adapt models to specific creative styles.
Design and maintain high-throughput fine-tuning and evaluation infrastructure to support rapid research iteration.
Identify and resolve alignment gaps through rigorous evaluation and targeted engineering.

Requirements

Proven experience owning post-training for a frontier generative model through release.
Deep expertise in reward modeling, preference learning, and RLHF/RLAIF.
Strong PyTorch fluency and ability to write maintainable research code.
Experience working across multimodal systems (text-to-image, editing, and ideally video).
Track record of delivering measurable quality improvements on human-preference evaluations or standard benchmarks.

Nice to have

Experience with distillation techniques such as LADD, DMD, or consistency models.
Experience building high-throughput evaluation pipelines.

Culture & Benefits

Research-driven culture defined by obsession with quality, low ego, boldness, and kindness.
Distributed team structure with physical offices in Freiburg and San Francisco.
Coverage of reasonable travel costs for mandatory in-person collaboration periods.
Opportunity to work on foundation models used by millions of creators worldwide.