Company hidden
10 hours ago

Software Engineer (ML Inference)

$250,000–$320,000
Work format
On-site
Employment type
Full-time
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Software Engineer, ML Inference: Build and scale end-to-end inference systems for a next-generation AI cloud, with an emphasis on runtime, serving infrastructure, memory management, and hardware optimisation. The work centres on optimising latency, throughput, and concurrency; designing batching, scheduling, and queuing systems; improving KV cache efficiency; and debugging bottlenecks across the model, runtime, and hardware layers.

Location: San Francisco (On-Site)

Salary: $250,000–$320,000 base + equity

Company

Early-stage infrastructure company building a next-generation AI cloud — rethinking how frontier models run across heterogeneous compute environments.

What you will do

  • Build and scale end-to-end inference systems from request to runtime to response
  • Optimise latency, throughput, concurrency, and reliability under production workloads
  • Design batching, scheduling, and queuing systems for high-performance serving
  • Improve KV cache management and memory efficiency at scale
  • Debug performance bottlenecks across model, runtime, and hardware layers
  • Collaborate with systems, infrastructure, and ML teams to advance inference performance

Requirements

  • Experience building ML inference or model serving systems
  • Strong systems engineering or backend infrastructure fundamentals
  • Experience with performance, scaling, memory, or distributed systems challenges
  • Strong Python and/or C++ skills
  • Must be based in San Francisco for on-site work

Nice to have

  • Familiarity with modern inference frameworks and runtimes (vLLM, TensorRT-LLM, custom runtimes)
