Member Of Technical Staff - Model Serving / API Backend Engineer (AI)
Job description
TL;DR
Member Of Technical Staff - Model Serving / API Backend Engineer (AI): Develop and optimize production-ready inference services for generative models, with an emphasis on GPU performance, API scalability, and low-latency serving. The role bridges the gap between frontier research checkpoints and scalable production endpoints so that new AI capabilities can be deployed rapidly.
Location: Hybrid (Freiburg, Germany or San Francisco, USA) or remote with a mandatory monthly in-person week at one of the offices
Salary: $180,000–$300,000 USD
Company
Research lab behind foundational generative technologies like Stable Diffusion and FLUX, focusing on expanding human creativity through open science.
What you will do
- Convert research checkpoints into production-ready inference services.
- Design and maintain high-performance APIs serving millions of requests.
- Optimize inference latency and throughput across GPU infrastructure.
- Build scalable serving architectures to handle unpredictable traffic.
- Implement reliability, monitoring, and observability for model-serving systems.
- Prototype and ship demos that showcase new model capabilities rapidly.
Requirements
- Experience building and operating ML inference services at meaningful scale.
- Proficiency in Python, FastAPI, and async systems.
- Expertise in GPU infrastructure, CUDA, and inference optimization.
- Experience with Docker, Kubernetes, Redis, and Postgres.
- Strong judgment regarding performance, reliability, and cost tradeoffs.
- Must be able to commit to a monthly in-person week at company offices.
Nice to have
- Experience with TensorRT, reduced precision, layer fusion, or model compilation.
- Frontend demo tooling experience (Streamlit, Gradio, React).
- CI/CD and automated testing for ML systems.
- Knowledge of security best practices for API and model serving.
Culture & Benefits
- Research-driven environment valuing deep science and beautiful products.
- Low-ego culture where the best idea wins regardless of hierarchy.
- Distributed team with a focus on meaningful in-person connection.
- Reasonable travel costs covered for monthly office visits.
- Opportunity to work on world-leading generative AI models used by millions.