Senior Software Developer (AI)
Job description
TL;DR
Senior Software Developer (AI): Building a next-generation platform for high-performance AI inference and model serving, with an emphasis on distributed systems and inference optimization. Focus on optimizing SOTA open-source models, implementing cache-aware routing, and scaling serving architectures for production workloads.
Location: Amsterdam, Netherlands. Hybrid setup (remote collaboration with in-person meetings every 1-2 months). Applicants must be authorized to work in the country in which they apply.
Company
is building a full-stack AI cloud platform for the global AI economy, specializing in GPU orchestration and high-performance inference optimization.
What you will do
- Onboard state-of-the-art open-source models (e.g., DeepSeek, GLM, Kimi) onto the Token Factory platform and serve them there.
- Implement advanced inference techniques including cache-aware routing, NUMA-aware deployments, and KV-cache offloading.
- Maintain and extend forks of leading inference frameworks such as vLLM and TRT-LLM.
- Build internal tooling for performance benchmarking, quality testing, and automated rollout pipelines.
- Collaborate with model builders, open-source communities, and hardware vendors to improve serving infrastructure.
Requirements
- Proven experience serving LLMs in production environments.
- Strong programming skills in Python and/or Go.
- Experience designing and operating highly scalable, highly available distributed services.
- Must be authorized to work in the country of application.
Nice to have
- Contributions to vLLM, SGLang, TRT-LLM, or NVIDIA ecosystem open-source projects.
- Deep understanding of KV cache management, speculative decoding, and quantization.
- Hands-on experience with Kubernetes and high-performance networking (InfiniBand, RoCE).
- Experience with LLM evaluation frameworks and performance benchmarking.
Culture & Benefits
- Competitive compensation package.
- Flexible work environment with high levels of ownership and autonomy.
- International team culture with regular in-person synchronization events.
- Opportunity to work on impactful AI infrastructure projects at scale.
- Collaborative and innovative engineering-driven culture.