Company hidden
updated 11 hours ago

Lead Engineer, Inference Platform (AI)

$137,000–$270,000
Work format
hybrid
Employment type
full-time
Grade
lead
English
B2
Country
US

Job description

TL;DR

Lead Engineer, Inference Platform (AI Engineering): Building the inference platform for embedding models that power semantic search, retrieval, and AI-native features in hirify.global Atlas, with an emphasis on real-time, high-scale, low-latency inference. The focus is on guiding technical direction, mentoring junior engineers, and delivering impactful features in a multi-tenant, cloud-native environment.

Location: Must be based in Palo Alto or Seattle for our hybrid working model.

Salary: $137,000—$270,000 USD

Company

hirify.global is redefining the database for the AI era with a unified database platform, hirify.global Atlas, available across AWS, Google Cloud, and Microsoft Azure.

What you will do

  • Partner with Search Platform and Voyage.ai AI engineers and researchers to productionize embedding models and rerankers, supporting batch and real-time inference.
  • Lead projects around performance optimization, GPU utilization, autoscaling, and observability for the inference platform.
  • Design and build components of a multi-tenant inference service that integrates with Atlas Vector Search.
  • Contribute to platform features like model versioning, safe deployment pipelines, latency-aware routing, and model health monitoring.
  • Collaborate with peers across ML, infra, and product teams to define architectural patterns and operational practices.
  • Guide decisions on model serving architecture using tools like vLLM, ONNX Runtime, and container orchestration in Kubernetes (see the illustrative sketch after this list).
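
By way of illustration, the sketch below shows one slice of this serving work: batch inference for an embedding model through ONNX Runtime, with mean pooling and L2 normalization so the vectors are ready for a vector search index. The model path, tokenizer checkpoint, and graph input/output layout are illustrative assumptions, not details from this posting.

```python
# Minimal sketch of batch embedding inference with ONNX Runtime.
# The model path, tokenizer checkpoint, and input names below are
# hypothetical; real values depend on how the model was exported.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
session = ort.InferenceSession(
    "embedding_model.onnx",  # hypothetical exported model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

def embed(texts: list[str]) -> np.ndarray:
    """Tokenize a batch and run one forward pass through the ONNX graph."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="np")
    outputs = session.run(
        None,
        {
            "input_ids": enc["input_ids"].astype(np.int64),
            "attention_mask": enc["attention_mask"].astype(np.int64),
        },
    )
    hidden = outputs[0]  # first output assumed to be token embeddings: (batch, seq, dim)
    # Mean-pool token embeddings into one vector per text, ignoring padding.
    mask = enc["attention_mask"][..., None].astype(np.float32)
    pooled = (hidden * mask).sum(axis=1) / mask.sum(axis=1)
    # L2-normalize so dot product equals cosine similarity in vector search.
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

print(embed(["semantic search", "retrieval-augmented generation"]).shape)
```

In a production platform of the kind described above, a function like this would sit behind request batching, autoscaling, and latency-aware routing rather than being called directly.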

Requirements

  • 8+ years of engineering experience in backend systems, ML infrastructure, or scalable platform development, including technical leadership.
  • Expertise in serving embedding models in production environments.
  • Strong systems skills in languages like Go, Rust, C++, or Python.
  • Comfortable working on cloud-native distributed systems, with a focus on latency, availability, and observability.
  • Familiarity with inference runtimes and vector search systems.
  • Proven ability to collaborate across disciplines and experience levels.
  • Experience with high-scale SaaS infrastructure, particularly in multi-tenant environments.

Nice to have

  • Prior experience working with model teams on inference-optimized architectures.
  • Background in hybrid retrieval, prompt-based pipelines, or retrieval-augmented generation (RAG).
  • Contributions to relevant open-source ML serving infrastructure.

Culture & Benefits

  • Shape the future of AI-native developer experiences.
  • Collaborate with ML experts from Voyage.ai.
  • Solve hard problems in real-time inference, model serving, and semantic retrieval.
  • Work in a culture that values mentorship, autonomy, and strong technical craft.
  • Competitive compensation, equity, and career growth in a hands-on technical leadership role.

