TL;DR
Software Development Engineer, Annapurna Labs, Trainium Collectives (AI/ML): Building and optimizing distributed AI/ML systems, focusing on the collective operations that let workloads scale across multiple accelerators and servers, with an emphasis on low-level C/C++, Linux kernels, and performant code. The role centers on solving complex concurrency challenges, designing high-speed networking solutions, and ensuring system reliability for the largest AI models and clusters.
Location: Must be based in Cupertino, California, USA
Salary: $165,200–$223,600 USD annually
Company
Annapurna Labs, an integral part of AWS, develops critical hardware and software components for EC2 infrastructure, specializing in optimizing the AWS experience for customers ranging from startups to Global 500 companies.
What you will do
- Work on distributed AI/ML systems, focusing on collective operations.
- Enable AI to scale across multiple accelerators and servers.
- Build networking solutions for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS.
- Collaborate with infrastructure experts, hardware engineers, RTL engineers, scientists, and architects.
- Contribute to features for the largest clusters, customers, and AI models.
Requirements
- 3+ years of non-internship professional software development experience.
- 2+ years of non-internship design or architecture experience (design patterns, reliability and scaling) of new and existing systems.
- Programming experience in at least one programming language.
- Solid knowledge of C/C++, Linux, kernels, and performant code.
Nice to have
- 3+ years of full software development life cycle experience.
- Bachelor's degree in computer science or equivalent.
- Experience with embedded systems.
- Experience with high-speed networking or HPC interconnects.
Culture & Benefits
- Offers flexibility in working hours and respects work-life balance as a core tenet.
- Emphasizes knowledge-sharing and mentorship for new and junior engineers.
- Provides opportunities for continuous learning and career growth in the fast-moving AI/ML field.
- Offers comprehensive benefits including health insurance, 401(k) matching, paid time off, and parental leave.
- Fosters an inclusive team culture through employee-led affinity groups and ongoing learning experiences.