Company hidden
4 hours ago

Senior AI/ML Engineer (Inference)

Work format
remote (USA only)
Employment type
full-time
Grade
senior
English
B2
Country
US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior AI/ML Engineer (LLM Inference): Build state-of-the-art inference capabilities for generative AI models on the latest hardware, with an emphasis on LLM deployment, serving runtimes, and lifecycle management. The role focuses on developing high-scale production ML pipelines, optimizing performance across diverse environments, and integrating with frameworks like vLLM and SGLang.

Location: Fully distributed and remote-first

Company

Microsoft-backed startup developing a platform for deploying AI inference at scale to any cloud, edge, or on-prem environment.

What you will do

  • Partner with customers and hardware teams to solve inference problems and build capabilities.
  • Negotiate tradeoffs with product managers and break down features into incremental deliverables.
  • Mentor junior team members and propose workflow improvements.
  • Actively participate in discussions, provide proactive communication, and deliver continuous feedback.
  • Experiment with new ideas, adapt to evolving priorities, and bring clarity to complex challenges.

Requirements

  • Excellent verbal and written communication skills.
  • Strong Python experience, including packaging with pip, uv, etc.
  • Experience with vLLM, SGLang, and other modern serving frameworks.
  • Experience developing high-scale production ML pipelines.
  • Strong foundation in deploying state-of-the-art generative AI models.
  • At least four years of related experience.

Nice to have

  • Experience developing in Rust.
  • Experience with containers, Docker, and Kubernetes.
  • Experience in AWS, GCP, and Azure.
  • Experience with AzureML, Google Vertex, Databricks, or SageMaker.
  • Experience with GPU environments like CUDA, ROCm, or OpenVINO.
  • Prior startup or high-velocity environment experience.

Culture & Benefits

  • Fully distributed and remote-first team emphasizing self-motivation, collaboration, and fast-paced innovation.
  • Unlimited time off policy.
  • For US employees: medical, dental, and vision starting at $1, One Medical, life insurance, FSAs, pet insurance, 401(k).

Hiring process

  • Resume review and initial 45-min screen with technical/behavioral questions.
  • 90-min technical interview (possibly with take-home) by engineering team.
  • Behavioral interview on team fit, communication, and decision-making.
  • Hiring manager interview on perseverance, conflict resolution, and career growth.
  • Offer; candidates are encouraged to ask questions at each stage.
