Company hidden
4 months ago

Senior ML Engineer (AI)

Work format
remote (Europe only)
Employment type
full-time
Grade
senior
English
B2
Country
UK/CR/Netherlands +2 more
Job from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Senior ML Engineer (AI): Build a high-performance inference and fine-tuning platform for foundation models, with an emphasis on maximizing throughput, minimizing latency, and optimizing cost-per-token across tens of thousands of GPUs. The role focuses on identifying LLM inference bottlenecks, implementing novel speculative decoding architectures, and productionizing low-precision training and inference pipelines.

Location: Remote - Europe, with R&D hubs in Amsterdam, Berlin, London, Prague, and Israel.

Company

hirify.global is a cloud computing company that builds tools and resources for AI/ML solutions, so that customers can work without massive infrastructure costs or large in-house AI/ML teams.

What you will do

  • Optimize LLM inference to achieve production speedups and maximum performance for various LLM architectures at scale.
  • Implement novel speculative decoding architectures and contribute to open-source inference engines.
  • Design and productionize low-precision training and inference pipelines (FP8, NVFP4/MXFP4) with measurable gains.
  • Profile GPU workloads to identify bottlenecks and drive performance improvements.
  • Contribute to building a high-performance inference and fine-tuning platform designed to push foundation models to their hardware limits.

Requirements

  • Deep understanding of the theoretical foundations of machine learning and of the transformer architecture.
  • Experience profiling GPU workloads using Nsight, PyTorch profiler, or similar tools.
  • Understanding of GPU memory hierarchy and compute/memory tradeoffs.
  • Familiarity with key ideas in the LLM space, such as MHA, RoPE, KV-cache, Flash Attention, and quantisation.
  • Understanding of performance aspects of large neural network training (sharding strategies, custom kernels, hardware features).
  • Strong software engineering skills, primarily in Python.
  • Deep experience with modern deep learning frameworks.
  • Proficiency in contemporary software engineering approaches, including CI/CD, version control, and unit testing.
  • Strong communication and leadership abilities.

Nice to have

  • Experience working with open-source inference engines (vLLM, SGLang, TensorRT-LLM), including contributions.
  • Experience with kernel languages or DSLs such as Triton, CuTe, CUTLASS, CUDA.
  • Track record of building and delivering products in a dynamic startup-like environment.
  • Strong engineering skills, including experience in developing large distributed systems or high-load web services.
  • Open-source projects that showcase your engineering prowess.
  • Excellent command of the English language.

Culture & Benefits

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within hirify.global.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.
