TL;DR
AI Inference Engineer: Develop and optimize large-scale machine learning model deployment for real-time inference, with an emphasis on API development, performance benchmarking, and system reliability. Focus on exploring novel research, implementing LLM inference optimizations, and addressing system bottlenecks.
Location: San Francisco
Salary: $200K – $350K
Company
hirify.global is seeking an AI Inference Engineer to join its growing team.
What you will do
- Develop APIs for AI inference for both internal and external customers.
- Benchmark and address bottlenecks throughout the inference stack.
- Improve the reliability and observability of systems and respond to system outages.
- Explore novel research and implement LLM inference optimizations.
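The benchmarking work above typically starts with latency measurement. A minimal sketch of how inference latency percentiles might be collected (the workload and iteration count here are hypothetical stand-ins, not anything from this posting):

```python
import statistics
import time

def benchmark(fn, iterations=200):
    """Run fn repeatedly and return latency percentiles in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies),
        "p99_ms": latencies[int(0.99 * (len(latencies) - 1))],
        "mean_ms": statistics.fmean(latencies),
    }

if __name__ == "__main__":
    # Stand-in for a model forward pass.
    print(benchmark(lambda: sum(i * i for i in range(10_000))))
```

Tail percentiles (p99) rather than means are what usually expose bottlenecks in an inference stack, since a slow tail dominates user-visible latency.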
Requirements
- Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX).
- Familiarity with common LLM architectures and inference optimization techniques (e.g., continuous batching, quantization).
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA.
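To illustrate one of the optimization techniques named above, here is a hedged, pure-Python sketch of symmetric int8 post-training quantization. It is for intuition only; production systems would use framework tooling (e.g. PyTorch's quantization APIs) rather than code like this:

```python
def quantize_int8(weights):
    """Map float weights to int8 with a per-tensor symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 2.4]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each recovered weight lies within half a quantization step of the original.
assert all(abs(a - w) <= scale / 2 + 1e-9 for a, w in zip(approx, weights))
```

The appeal for inference is that int8 weights take a quarter of the memory of float32, cutting the memory bandwidth that usually bounds LLM decoding.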
Culture & Benefits
- Full-time U.S. employees enjoy a comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter, and dependent care accounts.
- Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.