Machine Learning Engineer (AI)

Work format

remote (Global)

Employment type

full-time

English

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Machine Learning Engineer (AI): Develop and optimize high-performance model inference systems with an emphasis on latency, throughput, and cost efficiency. The role focuses on profiling GPU/CPU pipelines, implementing quantization techniques, and productionizing cutting-edge model architectures at real-world scale.

Location: Remote (World)

Company

hirify.global is a Series A startup focused on pushing the limits of model inference performance at scale.

What you will do

Optimize inference latency, throughput, and cost for large-scale ML models in production.
Profile and resolve bottlenecks in GPU/CPU inference pipelines, including memory, kernels, batching, and IO.
Implement and tune advanced techniques such as quantization (fp16, bf16, int8, fp8), KV-cache optimization, and speculative decoding.
Collaborate with research engineers to transition new model architectures into production-grade systems.
Build and maintain inference-serving systems using Triton, custom runtimes, or bespoke stacks.
Benchmark performance across diverse hardware (NVIDIA/AMD GPUs, CPUs) and cloud environments.

Requirements

Strong experience in ML inference optimization or high-performance ML systems.
Deep understanding of deep learning internals, including attention mechanisms, memory layout, and compute graphs.
Hands-on proficiency with PyTorch and experience in model deployment.
Familiarity with GPU performance tuning using CUDA, ROCm, Triton, or kernel-level optimizations.
Proven experience scaling inference for real users beyond research benchmarks.
Ability to work in a fast-moving startup environment with high ownership and comfort with ambiguity.

Nice to have

Experience with LLM or long-context model inference.
Knowledge of inference frameworks such as TensorRT, ONNX Runtime, vLLM, or Triton.
Experience optimizing across different hardware vendors.
Contributions to open-source ML systems or inference tooling.
Background in distributed systems or low-latency services.

Culture & Benefits

Real ownership over performance-critical systems with direct impact on unit economics.
Competitive compensation package including meaningful equity at Series A.
Close collaboration with research, infrastructure, and product teams.
Engineering culture that prioritizes technical quality over hype.
