Research Engineer (AI)
Job description
TL;DR
Research Engineer (AI): Building and optimizing distributed training infrastructure for large-scale AI models, with an emphasis on GPU cluster performance, experiment orchestration, and data pipelines. Focus on diagnosing bottlenecks, implementing advanced parallelism strategies, and keeping systems reliable under heavy training loads.
Location: On-site in the San Francisco Bay Area.
Company
An applied AI lab focused on building end-to-end software agents like Devin.
What you will do
- Build and own distributed systems for training large-scale models reliably across GPU clusters.
- Profile and improve end-to-end training throughput by identifying bottlenecks in compute and communication.
- Design and maintain experiment orchestration tools to maximize research velocity.
- Develop high-throughput, reliable data pipelines for model training and evaluation.
- Diagnose complex failures across GPUs, networking, and numerical stability.
- Implement and optimize diverse parallelism strategies including data, tensor, and pipeline parallelism.
Requirements
- Deep experience operating distributed training systems for large-scale models.
- Strong systems engineering fundamentals covering networking, storage, and distributed compute.
- Proficiency in Python and C++ with experience in PyTorch or JAX at a systems level.
- Hands-on experience with GPU performance profiling and memory optimization.
- Understanding of ML architectures to support research-specific infrastructure needs.
- Advanced degree (PhD) in CS, ML, Physics, or Mathematics, or equivalent industry experience.
Culture & Benefits
- Small, talent-dense team featuring world-class competitive programmers and researchers.
- High-impact environment where infrastructure directly accelerates frontier AI research.
- Access to massive GPU compute resources with minimal process overhead.
- Focus on speed, autonomy, and technical depth in a competitive problem space.