Staff Machine Learning Engineer (ML Infrastructure)

$183,500 - $269,100

Work format

hybrid

Employment type

full-time

Grade

lead

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Machine Learning Engineer (ML Infrastructure): Designing and operating high-scale ML infrastructure for home security products, with an emphasis on real-time computer vision inference and LLM/GenAI serving. Focus on optimizing Kubernetes-based platforms using Ray, reducing deployment friction, and scaling GPU utilization for production workloads.

Location: Hybrid in Boston, MA (office attendance required on two core days)

Salary: $183,500–$269,100 per year

Company

A high-tech home security company focused on keeping every home secure through innovation and a collaborative, no-ego culture.

What you will do

Drive architecture decisions for a Kubernetes-based ML platform using Ray, KServe, Triton, and vLLM across real-time and batch workloads.
Design and evolve cloud-side inference systems that process live video and events from security devices in real time.
Establish LLM/GenAI serving infrastructure, including model serving patterns, KV-cache, and evaluation pipelines.
Mentor engineers through design and code reviews to elevate the technical bar across the Cloud ML team.
Define SLOs and observability standards while leading incident response and postmortems for critical ML systems.

Requirements

8+ years of software/ML engineering experience with a track record of operating production ML systems at scale.
Deep expertise in cloud ML infrastructure on Kubernetes, including hands-on production experience with Ray.
Strong production experience with AWS (EKS, S3, IAM) and Kafka.
Proven ability to design high-throughput, low-latency inference systems with GPU-aware scheduling and autoscaling.
Proficiency in Python and strong staff-level technical leadership skills.
Must be based in Boston, MA to support the hybrid work model.

Nice to have

Hands-on production experience with LLM serving (vLLM, TGI, TensorRT-LLM, SGLang).
Experience with real-time video or streaming ML pipelines (Kafka, Kinesis, Flink).
Background in production CV workloads, model formats, and GPU/accelerator tradeoffs.
Experience with model lifecycle tooling such as MLflow or Weights & Biases.
Open source contributions to the ML infrastructure ecosystem.

Culture & Benefits

Hybrid work model allowing teams to split time between a state-of-the-art office and home.
Comprehensive total rewards package including medical, retirement, and lifestyle benefits.
Annual bonus program and equity participation.
Free home security system from the company and professional monitoring for your home.
Inclusive environment with Employee Resource Groups (ERGs) for networking and mentorship.
