Technical Product Manager (AI Compute Platform)
Job description
TL;DR
Technical Product Manager (AI Compute Platform): Build and scale a full-stack AI cloud platform with an emphasis on GPU orchestration, cluster lifecycle management, and developer experience. The role focuses on designing hyperscaler-quality APIs, driving cross-team execution, and translating complex customer requirements into robust infrastructure solutions.
Location: Amsterdam, Netherlands or Remote (Europe)
Company
The company is a leading AI cloud platform provider, headquartered in Amsterdam, that builds high-end, training-optimized infrastructure for the global AI economy.
What you will do
- Own end-to-end product strategy, roadmap, and delivery for a specific slice of the AI compute platform.
- Design and maintain platform contracts, including APIs, system events, and operational behaviors.
- Drive cross-team collaboration across engineering, networking, storage, and observability teams.
- Conduct structured discovery through customer interviews and usage analytics to identify and solve pain points.
- Engage with engineering leaders as a technical peer to debate system trade-offs and design quality.
- Define and track success metrics based on measurable customer and platform outcomes.
Requirements
- 6+ years of experience in Product Management, Platform PM, Infrastructure PM, or SRE/Engineering Lead roles.
- Strong technical foundation in cloud infrastructure, including API semantics and control-plane vs data-plane behavior.
- Experience building or operating GPU or HPC infrastructure at scale (thousands of nodes).
- Proven track record of shipping technically complex platform products with measurable impact.
- Strong analytical skills and experience leading discovery-heavy work.
- Must be authorized to work in the country of application.
Nice to have
- Direct experience with frontier AI customers, ML platform teams, or MLOps.
- Hands-on experience with NVIDIA reference architectures, CUDA, and NCCL.
- Background in Kubernetes lifecycle management or Slurm at scale.
- Experience with observability products like Grafana, Datadog, or Honeycomb.
- Familiarity with reliability engineering and fault-tolerant systems.
Culture & Benefits
- Competitive compensation package.
- International environment with talented engineering teams.
- Focus on career growth and continuous learning.
- Collaborative culture emphasizing ownership and bold thinking.
- Opportunity to work on impactful, large-scale AI projects.