Company hidden
24 hours ago

Member Of Technical Staff - Model Serving / API Backend Engineer (AI)

$180,000–$300,000
Work format
remote/hybrid
Employment type
full-time
Grade
senior
English
B2
Country
US/Germany
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Member Of Technical Staff - Model Serving / API Backend Engineer (AI): Develop and optimize production-ready inference services for generative models, with an emphasis on GPU performance, API scalability, and low-latency serving. The role bridges the gap between frontier research checkpoints and scalable production endpoints so that new AI capabilities can be deployed rapidly.

Location: Hybrid (Freiburg, Germany or San Francisco, USA) or Remote with a mandatory monthly in-person week at the offices

Salary: $180,000–$300,000 USD

Company

Research lab behind foundational generative technologies like Stable Diffusion and FLUX, focusing on expanding human creativity through open science.

What you will do

  • Convert research checkpoints into production-ready inference services.
  • Design and maintain high-performance APIs serving millions of requests.
  • Optimize inference latency and throughput across GPU infrastructure.
  • Build scalable serving architectures to handle unpredictable traffic.
  • Implement reliability, monitoring, and observability for model-serving systems.
  • Rapidly prototype and ship demos that showcase new model capabilities.

Requirements

  • Experience building and operating ML inference services at meaningful scale.
  • Proficiency in Python, FastAPI, and async systems.
  • Expertise in GPU infrastructure, CUDA, and inference optimization.
  • Experience with Docker, Kubernetes, Redis, and Postgres.
  • Strong judgment regarding performance, reliability, and cost tradeoffs.
  • Must be able to commit to a monthly in-person week at company offices.

Nice to have

  • Experience with TensorRT, reduced precision, layer fusion, or model compilation.
  • Frontend demo tooling experience (Streamlit, Gradio, React).
  • CI/CD and automated testing for ML systems.
  • Knowledge of security best practices for API and model serving.

Culture & Benefits

  • Research-driven environment valuing deep science and beautiful products.
  • Low-ego culture where the best idea wins regardless of hierarchy.
  • Distributed team with a focus on meaningful in-person connection.
  • Reasonable travel costs covered for monthly office visits.
  • Opportunity to work on world-leading generative AI models used by millions.
