Posted 10 days ago
Member Of Technical Staff (AI)
Job description
TL;DR
Member Of Technical Staff (AI): Build and scale the distributed training systems that power frontier model pre-training, with an emphasis on efficient training across thousands of GPUs using modern distributed training frameworks. The role focuses on optimizing throughput, stability, and efficiency for large model training workloads.
Location: San Francisco; London; New York
Company
Reflection’s mission is to build open superintelligence and make it accessible to all.
What you will do
- Build and scale distributed training systems for frontier model pre-training.
- Design and operate large-scale training runs for foundation models.
- Optimize training throughput, stability, and efficiency for large model training workloads.
- Translate experimental ideas into scalable, production-ready training systems.
- Improve performance of distributed training workloads through optimization of communication, memory usage, and GPU utilization.
- Build and maintain training pipelines that support large-scale datasets, checkpointing, and experiment iteration.
Requirements
- Experience building or operating distributed training systems for large machine learning models.
- Strong experience working with modern distributed training frameworks such as Megatron, DeepSpeed, or similar large-scale training systems.
- Familiarity with large-scale model parallelism strategies (data, tensor, pipeline, or expert parallelism).
- Experience optimizing training throughput and GPU utilization in large distributed environments.
- Familiarity with GPU communication libraries such as NCCL and performance tuning for distributed workloads.
- Experience working closely with ML researchers to productionize experimental training workflows.
Culture & Benefits
- Top-tier compensation with salary and equity.
- Comprehensive medical, dental, vision, life, and disability insurance.
- Fully paid parental leave for all new parents, including adoptive and surrogate journeys, plus financial support for family planning.
- Paid time off when you need it, relocation support, and more perks that optimize your time.
- Opportunities to connect with teammates: lunch and dinner provided daily, regular off-sites, and team celebrations.