TL;DR
AI Inference Engineer: Developing and optimizing APIs and systems for real-time AI model inference, with an emphasis on large-scale deployment, benchmarking, and reliability. The role focuses on implementing LLM inference optimizations, GPU kernel programming, and improving system observability.
Location: London, United Kingdom
Company
hirify.global is a product company specializing in AI and software development.
What you will do
- Develop APIs for AI inference serving both internal and external customers
- Benchmark and address bottlenecks in the inference stack
- Improve system reliability and observability, respond to outages
- Explore and implement novel LLM inference optimizations
Requirements
- Experience with ML systems and deep learning frameworks such as PyTorch, TensorFlow, or ONNX
- Familiarity with LLM architectures and inference optimization techniques such as continuous batching and quantization
- Understanding of GPU architectures or experience with CUDA kernel programming
- Location: Must be based in London or able to work onsite
- English: B2 level or higher required