Company hidden
1 day ago

GPU Engineer (AI)

Work format
Hybrid
Employment type
Full-time
English
B2
Country
France
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

GPU Engineer (AI): Build the fastest LLM inference engine on standard datacenter GPUs, with an emphasis on low-level GPU optimization and monokernel development. The role focuses on writing optimized CUDA/HIP kernels, implementing inter-GPU collectives, and scaling the stack for MoE models.

Location: Hybrid (Paris, France). At least 50% of working time must be spent in the office.

Company

hirify.global builds the fastest LLM inference engine on standard datacenter GPUs.

What you will do

  • Contribute to a monokernel pipeline covering the full decode pass across AMD and NVIDIA architectures.
  • Develop low-level GPU optimizations, including grid synchronizations and inter-GPU collectives.
  • Create optimized GEMM and attention kernels for specific batch sizes and context lengths.
  • Build profiling infrastructure with custom instrumentation and device-timestamp frameworks.
  • Scale the stack to support third-party MoE models such as DeepSeek v4 and Qwen 3.
  • Develop AI agents for autonomous GPU engineering research and kernel optimization.

Requirements

  • Proven experience writing GPU kernels where performance was the central constraint (code samples required).
  • Deep understanding of hardware below the framework level, including inline PTX or CDNA ISA.
  • Knowledge of latency-sensitive execution paths and GPU internals.
  • Must be based in Paris, France, or able to be present in the office at least 50% of the time.
  • A degree from a top engineering school or a PhD with concrete GPU work.

Culture & Benefits

  • Direct access to AMD and NVIDIA datacenter GPUs from day one.
  • High-impact environment where technical judgment and creativity shape key decisions.
  • Compensation aligned with top technical profiles in the Paris AI market, including equity.
  • Opportunity to solve critical problems directly influencing model execution speed.
