Company hidden
2 months ago

Research Mid-Training (AI)

Work format: On-site
Employment type: Full-time
Grade: Senior
English: B2
Country: US

Job description

TL;DR

Research Mid-Training (AI): Own the late-stage training decisions that sharpen raw base-model capabilities into reliable reasoning foundations, with an emphasis on data mix, quality uplift, annealing schedules, context length extension, and synthetic data strategies. The focus is on capability injection across coding, math, and long-horizon reasoning, on evaluating interventions, and on scaling methodologies for AI agents like Devin.

Location: On-site in San Francisco Bay Area

Company

Applied AI lab building end-to-end software agents including Devin, the first AI software engineer, and Windsurf, an AI-native IDE. Small, talent-dense team from top AI organizations like Scale AI, Palantir, Cursor, and Google DeepMind.

What you will do

  • Design and iterate on high-quality data mixtures for late-stage and annealing training runs, developing methods for sourcing, filtering, and weighting data.
  • Drive targeted improvements in coding, mathematics, and long-horizon reasoning through curated data strategies and training interventions.
  • Develop and evaluate synthetic data pipelines that generate training signal at scale, understanding limits and failure modes.
  • Research and optimize multi-stage learning rate schedules, warmup strategies, and compute allocation across training phases.
  • Research and implement methods for extending effective context length without degrading short-context performance.
  • Build evals that distinguish real capability improvements from overfitting, and measure how interventions scale with compute and data.

Requirements

  • Deep familiarity with the LLM training pipeline end-to-end: pre-training data, optimization, architecture, mid-training, and post-training interactions
  • Hands-on experience with continual pre-training, annealing, or late-stage data mixing for large models
  • Strong intuition for data quality, filtering, and curation at scale, and for how data mix choices affect evals
  • Experience developing or evaluating synthetic data pipelines for capability improvement
  • Proficiency in Python and PyTorch; comfortable debugging distributed training at scale
  • Strong fundamentals in optimization, statistics, and ML theory; track record of original contributions

Culture & Benefits

  • Small, highly selective team where research and product move together; prototypes reach deployment quickly
  • Compute is not a constraint: large GPU allocations from day one
  • Environment rewards speed, autonomy, and technical depth with minimal process overhead
