Staff Software Engineer (Databases SRE)
Job description
TL;DR
Staff Software Engineer (Databases SRE): Ensure the reliability of Grafana Cloud's distributed databases (Mimir, Loki, Tempo, Pyroscope) across multiple cloud providers, with an emphasis on SLO definition, automation, and high-SLA customer environments. The role focuses on reducing SLO burn, designing fault-tolerant patterns, and leading complex incident responses.
Location: Remote (Must be based in the UK, Sweden, Spain, or Germany)
Salary: €94,025 - €112,830 (Spain)
Company
The company behind Grafana Cloud provides an open observability cloud platform used by millions of users and thousands of customers to monitor and ensure the reliability of their systems.
What you will do
- Partner with product engineering squads to own production reliability for high-SLA and complex customer environments.
- Define and evolve per-tenant SLOs and reliability models to proactively reduce budget burn and prevent repeat incidents.
- Design and implement automation to scale reliability practices and eliminate operational toil.
- Lead customer-impacting incident response and conduct high-quality post-incident reviews (PIRs).
- Influence feature design to ensure production scalability and operability across AWS, GCP, and Azure.
- Mentor other engineers and communicate SRE best practices early in the development lifecycle.
Requirements
- 8+ years of engineering experience, with 4+ years specifically in SRE, CRE, or production engineering.
- Strong expertise in Kubernetes across AWS, GCP, or Azure, and IaC tools such as Helm, Terraform, or Jsonnet.
- Proven experience operating multi-tenant systems in production and designing/implementing SLOs.
- Proficiency in programming languages such as Go, Python, or Java.
- Deep knowledge of Linux internals, networking, cloud storage, and system scaling.
- Must be based in the UK, Sweden, Spain, or Germany.
Culture & Benefits
- 100% remote work culture with a global, collaborative environment.
- Transparent communication, open decision-making, and high autonomy.
- Generous leave policy with 30 days of annual leave, including company-wide shutdown days.
- High-trust, low-ego environment that values outcomes over optics.
- Defined career growth pathways and supportive, approachable leadership.