Member of Technical Staff - Research Engineer (AI)
Job description
TL;DR
Member of Technical Staff - Research Engineer (AI): Develop and optimize large-scale training systems for multimodal generative models, with an emphasis on GPU performance, numerical stability, and distributed training. The role focuses on implementing custom kernels, building low-precision training paths, and debugging complex distributed training failures to enable frontier research.
Location: San Francisco (USA) or Freiburg (Germany). Hybrid (at least 2 days a week) or remote with a required monthly in-person week.
Salary: $180,000 - $290,000 + equity
Company
A frontier research lab behind foundational technologies like Stable Diffusion and FLUX, creating advanced generative models for images and video.
What you will do
- Optimize the performance, reliability, and numerical stability of production training runs for large multimodal generative models.
- Profile full training steps across model code, attention, kernels, data loading, and communication.
- Implement GPU-level optimizations using CUDA, Triton, CuTe, and CUTLASS.
- Develop and validate low-precision training paths including FP8, MXFP8, and FP4-style formats.
- Debug distributed training failures such as NaNs, loss spikes, and NCCL issues.
- Build benchmarking and profiling harnesses to validate performance across various hardware and configurations.
Requirements
- Deep experience with large-scale training systems and strong PyTorch fluency.
- Proficiency in distributed training concepts (FSDP, tensor/model parallelism, NCCL).
- Hands-on experience improving training throughput, memory footprint, or stability.
- Experience profiling GPU workloads with Nsight Systems, Nsight Compute, or torch profiler.
- Understanding of low-precision training and quantization tradeoffs (FP8, FP4).
- Must be based in or able to travel to San Francisco or Freiburg for monthly in-person weeks.
Nice to have
- Experience co-owning training for a shipped frontier foundation model.
- Proven ability to write or substantially improve forward/backward GPU kernels.
- Experience with Hopper or Blackwell-class GPUs.
- Background in diffusion, flow matching, DiT, or LLM training systems.
Culture & Benefits
- Distributed team with physical offices in SF and Freiburg.
- Company covers reasonable travel costs for required in-person weeks.
- Culture based on scientific obsession, low ego, boldness, and kindness.
- Equity compensation provided alongside base salary.