GPU Engineer (AI)

Work format

hybrid

Employment type

full-time

English

Country

France

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

GPU Engineer (AI): Build the fastest LLM inference engine on standard datacenter GPUs, with an emphasis on low-level GPU optimization and monokernel development. The role focuses on writing optimized CUDA/HIP kernels, implementing inter-GPU collectives, and scaling the stack for MoE models.

Location: Hybrid (Paris, France), with at least 50% of time spent in the office

Company

hirify.global builds the fastest LLM inference engine on standard datacenter GPUs.

What you will do

Contribute to a monokernel pipeline covering the full decode pass across AMD and NVIDIA architectures.
Develop low-level GPU optimizations, including grid synchronizations and inter-GPU collectives.
Create optimized GEMM and attention kernels for specific batch sizes and context lengths.
Build profiling infrastructure with custom instrumentation and device-timestamp frameworks.
Scale the stack to support third-party MoE models such as DeepSeek v4 and Qwen 3.
Develop AI agents for autonomous GPU engineering research and kernel optimization.

Requirements

Proven experience writing GPU kernels where performance was the central constraint (code samples required).
Deep understanding of hardware below the framework level, including inline PTX or CDNA ISA.
Knowledge of latency-sensitive execution paths and GPU internals.
Must be based in Paris, France, or able to be present in the office at least 50% of the time.
A degree from a top engineering school or a PhD with concrete GPU work.

Culture & Benefits

Direct access to AMD and NVIDIA datacenter GPUs from day one.
High-impact environment where technical judgment and creativity shape key decisions.
Compensation aligned with top technical profiles in the Paris AI market, including equity.
Opportunity to solve critical problems directly influencing model execution speed.
