Company hidden
1 day ago

Senior ML Engineer (AI)

Work format
remote (Europe only)
Employment type
fulltime
Grade
senior
English
b2
Country
UK, CR, Netherlands, Israel, Germany
Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Senior ML Engineer (AI): Building a high-performance inference and fine-tuning platform for foundation models, with an emphasis on maximizing throughput, minimizing latency, and optimizing cost-per-token across tens of thousands of GPUs. The role focuses on identifying LLM inference bottlenecks, implementing novel speculative decoding architectures, and productionizing low-precision training and inference pipelines.

Location: Remote - Europe, with R&D hubs in Amsterdam, Berlin, London, Prague, and Israel.

Company

hirify.global is a cloud computing company leading the AI economy by creating tools and resources for AI/ML solutions, without massive infrastructure costs or the need for large in-house AI/ML teams.

What you will do

  • Optimize LLM inference to achieve production speedups and maximum performance for various LLM architectures at scale.
  • Implement novel speculative decoding architectures and contribute to open-source inference engines.
  • Design and productionize low-precision training and inference pipelines (FP8, NVFP4/MXFP4) with measurable gains.
  • Profile GPU workloads to identify bottlenecks and drive performance improvements.
  • Contribute to building a high-performance inference and fine-tuning platform designed to push foundation models to their hardware limits.

Requirements

  • Profound understanding of theoretical machine learning foundations and transformer architecture.
  • Experience profiling GPU workloads using Nsight, PyTorch profiler, or similar tools.
  • Understanding of GPU memory hierarchy and compute/memory tradeoffs.
  • Familiarity with key concepts in the LLM space, such as MHA, RoPE, KV-cache, Flash Attention, and quantisation.
  • Understanding of performance aspects of large neural network training (sharding strategies, custom kernels, hardware features).
  • Strong software engineering skills, primarily in Python.
  • Deep experience with modern deep learning frameworks.
  • Proficiency in contemporary software engineering approaches, including CI/CD, version control, and unit testing.
  • Strong communication and leadership abilities.

Nice to have

  • Experience working with open-source inference engines (vLLM, SGLang, TensorRT-LLM), including contributions.
  • Experience with kernel languages or DSLs such as Triton, CuTe, CUTLASS, or CUDA.
  • Track record of building and delivering products in a dynamic startup-like environment.
  • Strong engineering skills, including experience in developing large distributed systems or high-load web services.
  • Open-source projects that showcase your engineering prowess.
  • Excellent command of the English language.

Culture & Benefits

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within hirify.global.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.
