Software Engineer (ML Inference)
Job Description
TL;DR
Software Engineer, ML Inference: Build and scale end-to-end inference systems for a next-generation AI cloud, with a focus on runtime, serving infrastructure, memory management, and hardware optimisation. The work centres on optimising latency, throughput, and concurrency; designing batching, scheduling, and queuing systems; improving KV cache efficiency; and debugging bottlenecks across the model, runtime, and hardware layers.
Location: San Francisco (On-Site)
Salary: $250,000–$320,000 base + equity
Company
Early-stage infrastructure company building a next-generation AI cloud — rethinking how frontier models run across heterogeneous compute environments.
What you will do
- Build and scale end-to-end inference systems from request to runtime to response
- Optimise latency, throughput, concurrency, and reliability under production workloads
- Design batching, scheduling, and queuing systems for high-performance serving
- Improve KV cache management and memory efficiency at scale
- Debug performance bottlenecks across model, runtime, and hardware layers
- Collaborate with systems, infrastructure, and ML teams to advance inference performance
Requirements
- Experience building ML inference or model serving systems
- Strong systems engineering or backend infrastructure fundamentals
- Experience with performance, scaling, memory, or distributed systems challenges
- Strong Python and/or C++ skills
- Must be based in San Francisco for on-site work
Nice to have
- Familiarity with modern inference frameworks and runtimes (vLLM, TensorRT-LLM, custom runtimes)
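As background on the KV cache management mentioned above: frameworks such as vLLM manage the KV cache in fixed-size blocks (PagedAttention) rather than contiguous per-sequence reservations. The sketch below is a toy block allocator in that spirit; every name in it is hypothetical and it is not taken from any real runtime.

```python
class KVBlockAllocator:
    """Toy paged KV-cache block allocator: the cache is split into fixed-size
    blocks that sequences acquire as they grow and release when they finish,
    avoiding large contiguous reservations. Illustrative sketch only; a real
    system would add preemption, eviction, and block sharing across sequences."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size          # tokens per block
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}                # seq_id -> list of block ids

    def blocks_needed(self, num_tokens: int) -> int:
        return -(-num_tokens // self.block_size)  # ceiling division

    def allocate(self, seq_id, num_tokens: int) -> bool:
        """Grow seq_id's block table to cover num_tokens; True on success."""
        table = self.block_tables.setdefault(seq_id, [])
        needed = self.blocks_needed(num_tokens) - len(table)
        if needed > len(self.free_blocks):
            return False  # out of cache; a real scheduler would preempt here
        for _ in range(needed):
            table.append(self.free_blocks.pop())
        return True

    def free(self, seq_id) -> None:
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
```

Block-level allocation is what makes the memory-efficiency work in this role tractable: fragmentation is bounded by one partially filled block per sequence, and freed blocks are immediately reusable by other requests.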