Senior Site Reliability Engineer (Vehicle SW)
Job description
TL;DR
Senior Site Reliability Engineer (Vehicle SW): Ensuring the reliability and observability of an autonomous driving fleet operating on public roads, with an emphasis on incident response, fleet automation, and system hardening. The role focuses on designing observability stacks, reducing manual intervention through automation, and resolving performance bottlenecks in distributed vehicle software systems.
Location: Hybrid in Leonberg, Germany (minimum 3 days a week in office)
Company
A leading developer of Embodied AI technology, creating mapless and hardware-agnostic AI products for autonomous driving.
What you will do
- Own and improve the reliability, availability, and performance of vehicle software systems across the dev fleet.
- Operate monitoring, logging, and alerting tools to enable fast detection, diagnosis, and recovery.
- Lead incident response and translate root causes into durable fixes and preventative controls.
- Design and deliver automation for fleet operations, deployments, and repetitive workflows.
- Partner with Vehicle SW and platform teams to define SLOs and reliability metrics.
- Harden the production environment through capacity planning and change management.
Requirements
- Proven experience in an SRE or platform operations role for complex distributed systems.
- Strong Linux fundamentals and hands-on experience with Docker and Kubernetes.
- Proficiency in Python, C++, or Rust with a focus on automation.
- Deep troubleshooting skills across networking, distributed systems, and databases.
- Experience designing observability stacks using tools such as Datadog, Prometheus, Grafana, or Splunk.
- Must be based in or able to work from the Leonberg, Germany office (Hybrid).
Nice to have
- Cloud platform experience (AWS, GCP, or Azure) and infrastructure-as-code.
- Experience with real-time or safety-critical systems, hardware-in-the-loop, or robotics.
- Familiarity with fleet operations and telemetry pipelines on edge devices at scale.
- Experience defining and running SLOs/SLIs across multiple teams.
Culture & Benefits
- Hybrid working policy combining in-office collaboration with time spent working from home.
- Inclusive work environment that values diversity and new perspectives.
- Opportunity to contribute to groundbreaking Embodied AI that defines the future of autonomy.
- Fast-paced environment focused on solving complex challenges and delivering high impact.