This vacancy is archived
Updated 1 month ago
Inference Runtime, Engineering Manager (AI)
$455,000 – $555,000
Job description
TL;DR
Inference Runtime, Engineering Manager (AI): lead a team of engineers optimizing large AI models for high-volume, low-latency, high-availability production and research environments, with an emphasis on distributed systems, model architecture co-design, and performance tuning. The role focuses on introducing new techniques, tools, and architecture to improve inference-stack efficiency and on optimizing GPU utilization for complex AI systems.
Location: Onsite in San Francisco, USA.
Salary: $455,000 – $555,000 USD
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Lead a team of engineers specializing in distributed systems and model architecture.
- Collaborate with ML researchers, engineers, and product managers to deploy the latest AI technologies.
- Contribute across the entire stack, from infrastructure to performance tuning.
- Introduce new techniques, tools, and architecture to enhance model inference performance and efficiency.
- Develop tools for bottleneck visibility and implement solutions to address critical issues.
- Optimize code and GPU fleets to maximize hardware utilization.
Requirements
- Location: Onsite in San Francisco, USA.
- Understanding of modern ML architectures and intuition for inference optimization.
- At least 15 years of professional software engineering experience.
- Familiarity with PyTorch, NVIDIA GPUs, and software stacks such as NCCL, CUDA, InfiniBand, MPI, and NVLink.
- Experience architecting, building, observing, and debugging production distributed systems.
- Experience rebuilding or refactoring production systems under rapidly increasing scale.
Culture & Benefits
- Work in an outcome-oriented environment where everyone contributes across layers of the stack.
- Committed to providing reasonable accommodations to applicants with disabilities.
- Dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
- Focus on pushing the boundaries of AI systems and safely deploying them to the world.