Staff Software Engineer (ML Infrastructure)

$146,600 - $215,100

Work format

hybrid

Employment type

full-time

Grade

senior

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Software Engineer (ML Infrastructure): Building and scaling cloud-side ML infrastructure and applied ML research for intelligent home security products, with an emphasis on real-time computer vision inference and LLM/GenAI serving. The role focuses on designing high-throughput, low-latency distributed systems, optimizing GPU utilization, and establishing model lifecycle management.

Location: Hybrid in Boston, MA (expected in the office 2 core days per week)

Salary: $146,600 – $215,100 per year

Company

A high-tech home security company dedicated to keeping every home secure through a culture of collaboration and innovation.

What you will do

Drive architecture and technical direction for a Kubernetes-based ML platform using Ray, KServe, Triton, and vLLM.
Design and evolve cloud-side real-time computer vision inference systems to process live video and events.
Establish production infrastructure for LLM/GenAI serving, including KV-cache and batching strategies.
Mentor engineers through design and code reviews and define best practices for model lifecycle management.
Lead incident response and define SLOs and observability standards for critical ML services.

Requirements

8+ years of software engineering experience with a track record of building large-scale distributed systems.
Deep expertise in high-throughput, low-latency services and operational experience running them at scale.
Strong production experience with Kubernetes, AWS (EKS, S3, IAM), Kafka, and CI/CD.
Proficiency in Python; experience with Go, C++, or Rust is a plus.
Staff-level technical leadership skills to drive ambiguous, cross-cutting initiatives.
Must be based in or able to work hybrid in Boston, MA.

Nice to have

Hands-on experience with Ray, KServe, Triton, or vLLM serving stacks.
Experience with LLM serving in production (TensorRT-LLM, SGLang, etc.).
Experience building real-time video or streaming pipelines using Kafka, Kinesis, or Flink.
Expertise in GPU-based inference systems and GPU-aware scheduling.
Familiarity with ML lifecycle tooling such as MLflow or Weights & Biases.

Culture & Benefits

Mission-driven, inclusive, and "no ego" collaborative work environment.
Comprehensive total rewards package including medical, retirement, and lifestyle benefits.
Free home security system from the company and professional monitoring for your home.
Hybrid work model providing a balance between office collaboration and home flexibility.
Employee Resource Groups (ERGs) for networking and professional advocacy.
