Posted 8 hours ago

Staff Machine Learning Engineer (ML Efficiency)

Work format
remote (Europe only)
Employment type
full-time
Grade
senior
English
B2
Country
UK/Netherlands

Job description

TL;DR

Staff Machine Learning Engineer (ML Efficiency): Design and build systems that improve the efficiency of ML training and inference workloads, including tooling for debugging, profiling, optimization, and monitoring. Focus on GPU/resource utilization, distributed training and serving performance, and building benchmarking frameworks and dashboards to accelerate experimentation while reducing infrastructure costs.

Company

Reddit is a community platform built on shared interests, trust, and open conversations.

What you will do

  • Design and build systems that improve the efficiency of ML training and inference workloads.
  • Develop tooling to help ML engineers debug, profile, optimize, and monitor model performance.
  • Improve GPU and resource utilization via scheduling, resource management, caching, and workload optimization.
  • Optimize distributed training infrastructure, data pipelines, and model serving architectures.
  • Build benchmarking frameworks and performance dashboards for training and serving systems.
  • Lead cross-functional initiatives and drive technical strategy for ML platform scalability, reliability, and cost efficiency.

Requirements

  • BS, MS, or PhD in Computer Science or a related field.
  • 5+ years of software engineering experience.
  • Strong proficiency in Python.
  • Experience building distributed systems at scale.
  • Experience with machine learning infrastructure, training systems, or model serving platforms.
  • Location: must be able to work remotely from the UK or the Netherlands.

Nice to have

  • Experience with large-scale recommendation, ranking, generative AI, or foundation model systems.
  • Experience with distributed training frameworks such as PyTorch Distributed, Ray, TensorFlow, or Spark.
  • Familiarity with GPU architectures and performance analysis tools.
  • Experience optimizing cloud infrastructure costs across large ML workloads.
  • Experience building real-time ML inference applications.

Culture & Benefits

  • Flexible-first workforce with remote work from the UK or the Netherlands.
  • Global benefit programs covering workspace, professional development, and caregiving support.
  • Private pension plan with employer matching and a 100% employer-sponsored group medical plan.
  • Flexible vacation and paid volunteer time off, plus generous paid parental leave.
  • Family planning support, gender-affirming care, and mental health & coaching benefits.

Hiring process

  • Interviews may be recorded, transcribed, and summarized by AI in select roles/locations; opt-out is available.
  • Interviews involve collecting categories of personal information used to evaluate candidates for employment or independent contractor roles.
  • Recordings are deleted promptly after a hiring decision.
