Inference Runtime, Engineering Manager (AI)

$455,000 - $555,000

Work format

onsite

Employment type

fulltime

English

Country

Job description

Text:

TL;DR

Inference Runtime, Engineering Manager (AI): Lead a team of engineers optimizing large AI models for high-volume, low-latency, high-availability production and research environments, with an emphasis on distributed systems, model architecture co-design, and performance tuning. The role focuses on introducing new techniques, tools, and architecture to improve inference stack efficiency and on optimizing GPU utilization for complex AI systems.

Location: Onsite in San Francisco, USA.

Salary: $455,000 – $555,000 USD

Company

hirify.global is an AI research and deployment company dedicated to ensuring general-purpose artificial intelligence benefits all of humanity.

What you will do

Lead a team of engineers specializing in distributed systems and model architecture.
Collaborate with ML researchers, engineers, and product managers to deploy the latest AI technologies.
Contribute across the entire stack, from infrastructure to performance tuning.
Introduce new techniques, tools, and architecture to enhance model inference performance and efficiency.
Develop tools for bottleneck visibility and implement solutions to address critical issues.
Optimize code and GPU fleets to maximize hardware utilization.

Requirements

Location: Onsite in San Francisco, USA.
Understanding of modern ML architectures and intuition for inference optimization.
At least 15 years of professional software engineering experience.
Familiarity with PyTorch, NVIDIA GPUs, and software stacks like NCCL, CUDA, InfiniBand, MPI, NVLink.
Experience architecting, building, observing, and debugging production distributed systems.
Experience rebuilding or refactoring production systems due to rapidly increasing scale.

Culture & Benefits

Work in an outcome-oriented environment where everyone contributes across layers of the stack.
The company is committed to providing reasonable accommodations to applicants with disabilities.
Dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
Focus on pushing the boundaries of AI systems and safely deploying them to the world.