Company hidden
5 days ago

Staff AI Inference and Acceleration Engineer (Robotics)

$180,000 - $275,000
Work format
onsite
Employment type
fulltime
Grade
lead
English
b2
Country
US

Job description

TL;DR

Staff AI Inference and Acceleration Engineer (Robotics): Own the on-board inference architecture for humanoid robots, with an emphasis on mapping AI workloads to accelerators and optimizing for latency and power. The role focuses on optimizing inference toolchains, applying quantization and pruning, and collaborating with ML teams to define hardware-friendly model architectures.

Location: San Jose, CA

Salary: $180,000 - $275,000

Company

hirify.global is an AI robotics company developing autonomous general-purpose humanoid robots with human-level intelligence for home and commercial markets.

What you will do

  • Own the on-board inference architecture, mapping models to accelerators (NPU, GPU, DSP, CPU) based on latency, power, and memory budgets.
  • Partition inference workloads across heterogeneous compute resources to balance real-time performance with thermal constraints.
  • Optimize end-to-end inference toolchains from model export through runtime execution for target hardware.
  • Apply quantization (INT8, INT4, mixed-precision), pruning, and operator fusion to reduce compute, memory, and power footprints.
  • Profile inference pipelines to eliminate bottlenecks in latency, memory bandwidth, and power consumption.
  • Partner with AI/ML teams to define hardware-friendly model architecture constraints.

Requirements

  • M.S. or Ph.D. in Computer Engineering, Electrical Engineering, Computer Science, or equivalent industry experience.
  • At least 8 years of industry experience in hardware acceleration, ML systems, or compute architecture.
  • Deep understanding of AI/ML inference, model formats (ONNX, TFLite), and deployment pipelines.
  • Hands-on experience optimizing models for edge or embedded hardware using quantization and pruning.
  • Strong understanding of computer architecture, including memory hierarchies and heterogeneous compute.
  • Proficiency in C++ and Python.

Nice to have

  • Knowledge of real-time operating constraints and their impact on inference scheduling.
  • Track record of co-designing model architectures with ML teams to meet hardware constraints.

Culture & Benefits

  • Competitive US base salary range.
  • Opportunity to work on the frontier of general-purpose humanoid robotics.
  • Comprehensive compensation package with additional components based on role and experience.
