This vacancy is archived

Company hidden
updated 1 month ago

Inference Runtime, Engineering Manager (AI)

$455,000 – $555,000
Work format: onsite
Employment type: fulltime
English: B2
Country: US

Job description


TL;DR

Inference Runtime, Engineering Manager (AI): Lead a team of engineers optimizing large AI models for high-volume, low-latency, high-availability production and research environments, with an emphasis on distributed systems, model architecture co-design, and performance tuning. The role focuses on introducing new techniques, tools, and architecture to make the inference stack more efficient and on optimizing GPU utilization for complex AI systems.

Location: Onsite in San Francisco, USA.

Salary: $455,000 – $555,000 USD

Company

hirify.global is an AI research and deployment company dedicated to ensuring general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Lead a team of engineers specializing in distributed systems and model architecture.
  • Collaborate with ML researchers, engineers, and product managers to deploy the latest AI technologies.
  • Contribute across the entire stack, from infrastructure to performance tuning.
  • Introduce new techniques, tools, and architecture to enhance model inference performance and efficiency.
  • Develop tools that give visibility into bottlenecks, and implement solutions to address critical issues.
  • Optimize code and GPU fleets to maximize hardware utilization.

Requirements

  • Location: Onsite in San Francisco, USA.
  • Understanding of modern ML architectures and intuition for inference optimization.
  • At least 15 years of professional software engineering experience.
  • Familiarity with PyTorch, NVIDIA GPUs, and related software stacks and interconnect technologies such as NCCL, CUDA, InfiniBand, MPI, and NVLink.
  • Experience architecting, building, observing, and debugging production distributed systems.
  • Experience rebuilding or refactoring production systems due to rapidly increasing scale.

Culture & Benefits

  • Work in an outcome-oriented environment where everyone contributes across layers of the stack.
  • Committed to providing reasonable accommodations to applicants with disabilities.
  • Dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
  • Focus on pushing the boundaries of AI systems and safely deploying them to the world.