Member Of Technical Staff (AI)
Job description
TL;DR
Member Of Technical Staff (AI): Build and optimize distributed training infrastructure for large-scale GPU clusters, with an emphasis on compute efficiency and performance profiling. The role centers on designing software for distributed training parallelism, benchmarking performance bottlenecks, and collaborating with researchers to scale frontier AI models.
Location: Must be based in the San Francisco Bay Area and in the office 4 days a week.
Salary: $158,400 – $304,200 per year (depending on level and specific Bay Area/NYC location).
Company
A global technology leader driving innovation in artificial intelligence, cloud computing, and productivity software.
What you will do
- Design, implement, and optimize distributed training infrastructure using Python and C++.
- Develop telemetry systems to provide visibility into infrastructure performance, utilization, and cost metrics.
- Profile, benchmark, and debug performance bottlenecks across compute, memory, and networking subsystems.
- Drive architectural improvements for ML services to achieve measurable efficiency gains.
- Collaborate with hardware and research teams to optimize for next-generation accelerators such as NVIDIA GPUs and MAIA.
Requirements
- Must be local to the San Francisco Bay Area and able to work in the office 4 days a week.
- Bachelor’s Degree in Computer Science or related technical discipline.
- 6+ years of technical engineering experience coding in C, C++, C#, Java, JavaScript, or Python.
- Deep understanding of GPU architectures and distributed computing fundamentals.
- Experience profiling and analyzing performance in large-scale ML or Generative AI workloads.
- Strong background in distributed training parallelism and networking topologies.
Nice to have
- Master’s degree in Computer Science or related technical field.
- 10+ years of technical engineering experience for senior-level roles.
- Proficiency with low-level GPU programming (CUDA, Triton, NCCL).
- Experience with deep learning frameworks such as PyTorch or JAX.
Culture & Benefits
- Work on frontier-scale AI models impacting billions of users.
- Supportive culture centered on values of respect, integrity, and accountability.
- Opportunities for professional growth and impact within a startup-like team environment.
- Access to comprehensive benefits and compensation packages.