Inference Engineer (AI)

$140,000–$325,000

Employment type

Full-time

Grade

Middle/Senior

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Inference Engineer (AI): Build and optimize large-scale inference infrastructure for next-generation AI workloads, with an emphasis on runtime performance, memory efficiency, and distributed systems orchestration. The role focuses on KV cache management, request scheduling, and low-latency model serving under production load.

Location: Must be based in the United States (San Francisco, CA)

Salary: $140,000–$325,000

Company

hirify.global is partnering with an AI infrastructure company focused on building high-performance systems for large-scale AI model execution.

What you will do

Design and optimize large-scale inference pipelines for production environments.
Improve system latency, throughput, and concurrency under heavy real-world load.
Build and maintain inference runtimes and serving infrastructure.
Optimize request orchestration, batching, and scheduling strategies.
Manage KV cache allocation, reuse, and eviction strategies to maximize memory efficiency.
Profile and resolve performance bottlenecks across model, runtime, and distributed layers.

Requirements

Strong systems engineering fundamentals.
Experience building or scaling ML inference and model serving systems.
Deep understanding of performance optimization and memory behavior.
Proficiency with runtimes such as vLLM, TensorRT-LLM, or custom serving infrastructure.
Strong understanding of transformer architectures and attention mechanisms.
Strong Python and/or C++ engineering skills.

Culture & Benefits

Work on cutting-edge inference infrastructure and foundational AI systems.
Join a small, highly technical engineering team.
Significant ownership and opportunity for high technical impact.
Build systems designed for next-generation AI scale.
