Job description
TL;DR
Staff Machine Learning Engineer (ML Efficiency): Design and build systems that improve the efficiency of ML training and inference workloads, including tooling for debugging, profiling, optimization, and monitoring. Focus on GPU/resource utilization, distributed training and serving performance, and building benchmarking frameworks and dashboards to accelerate experimentation while reducing infrastructure costs.
Company
Reddit is a community platform built on shared interests, trust, and open conversations.
What you will do
- Design and build systems that improve the efficiency of ML training and inference workloads.
- Develop tooling to help ML engineers debug, profile, optimize, and monitor model performance.
- Improve GPU and resource utilization via scheduling, resource management, caching, and workload optimization.
- Optimize distributed training infrastructure, data pipelines, and model serving architectures.
- Build benchmarking frameworks and performance dashboards for training and serving systems.
- Lead cross-functional initiatives and drive technical strategy for ML platform scalability, reliability, and cost efficiency.
Requirements
- BS, MS, or PhD in Computer Science or a related field.
- 5+ years of software engineering experience.
- Strong proficiency in Python.
- Experience building distributed systems at scale.
- Experience with machine learning infrastructure, training systems, or model serving platforms.
- Location: must be able to work remotely from the UK or the Netherlands.
Nice to have
- Experience with large-scale recommendation, ranking, generative AI, or foundation model systems.
- Experience with distributed training frameworks such as PyTorch Distributed, Ray, TensorFlow, or Spark.
- Familiarity with GPU architectures and performance analysis tools.
- Experience optimizing cloud infrastructure costs across large ML workloads.
- Experience building real-time ML inference applications.
Culture & Benefits
- Flexible-first workforce with remote work from the UK or the Netherlands.
- Global benefit programs covering workspace, professional development, and caregiving support.
- Private pension plan with employer matching and a 100% employer-sponsored group medical plan.
- Flexible vacation and paid volunteer time off, plus generous paid parental leave.
- Family planning support, gender-affirming care, and mental health & coaching benefits.
Hiring process
- Interviews may be recorded, transcribed, and summarized by AI in select roles/locations; opt-out is available.
- Personal information is collected during interviews to evaluate candidates for employment or independent contractor roles.
- Recordings are deleted promptly after a hiring decision.