Company hidden
Posted 2 months ago

Senior / Lead Machine Learning Engineer (AI Serving)

Job type: Full-time
Grade: Senior/Lead
English: C1
Country: Germany
Relocation: US

Job description

TL;DR

Senior / Lead Machine Learning Engineer (AI Serving): Develop real-time multimodal models and a high-performance orchestration platform, with an emphasis on inference optimization and model acceleration. The focus is on squeezing performance out of NVIDIA GPUs, implementing quantization and distillation, and scaling distributed systems to handle thousands of concurrent connections.

Location: Must be based in Germany. Potential relocation to the San Francisco Bay Area (USA) may be available in the future.

Company

hirify.global is a product-oriented research lab developing real-time multimodal models and an orchestration platform optimized for high-frequency queries.

What you will do

  • Optimize inference using modern serving frameworks such as vLLM or TensorRT-LLM (TRT-LLM).
  • Implement model acceleration techniques, including quantization, distillation, caching strategies, and speculative decoding.
  • Build high-performance systems using C++, CUDA, Rust, or highly optimized Python to maximize GPU utilization.
  • Scale distributed systems using Kubernetes and Ray for multi-GPU/multi-node inference.
  • Take ownership of the full cycle from research model hand-off to containerization and stable production deployment.

Requirements

  • Deep expertise in inference optimization and modern serving frameworks.
  • Proficiency in high-performance languages (C++, CUDA, Rust, or optimized Python).
  • Experience with Kubernetes, Ray, and handling thousands of concurrent connections.
  • PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.
  • Professional fluency in English (written and spoken) is required for daily collaboration with US-based teams.
  • Must be based in Germany.

Culture & Benefits

  • Research-driven environment with a flat structure, fast iterations, and minimal process theater.
  • Engineering culture that prioritizes impact and shipping stable code over purely theoretical optimizations.
  • High degree of autonomy and ownership over architectural decisions to solve latency and throughput problems.
  • Support for open-source contributions that advance the field of AI.
  • Potential for full U.S. visa and relocation support to the San Francisco Bay Area, subject to business needs.
