Senior Software Engineer - Grafana Databases, Managed Services
Job description
TL;DR
Senior Software Engineer (Grafana Databases, Managed Services): Operating and evolving production-critical shared infrastructure powering Grafana Cloud's database products (Mimir, Loki, Tempo), with an emphasis on high-throughput streaming, analytical storage systems, and multi-cloud operations at massive scale. Focus on diagnosing cross-layer failures, designing safe upgrades and rollouts, improving observability and automation, and ensuring reliability under heavy workloads.
Location: Remote (UK time zones only)
Salary: GBP 91,755 - 110,106
Company
Grafana Labs is a remote-first open-source company with 20M+ users and 3,000+ enterprise customers using Grafana Cloud and Enterprise stacks for observability.
What you will do
- Operate and evolve 100+ multi-cloud WarpStream clusters and related database infrastructure for metrics, logs, and traces ingestion.
- Diagnose and eliminate cross-layer failure modes like object storage latency, noisy neighbors, and query performance issues.
- Design safe upgrade and rollout strategies at scale across production clusters.
- Improve observability, automation, operational ergonomics, and Kubernetes scheduling dynamics.
- Partner with database and platform teams on scaling, partitioning, fan-out, and performance optimizations.
- Serve as an escalation point for incidents, manage vendor relationships, and participate in on-call rotations.
Requirements
- 6+ years engineering experience in SRE, platform, production, infrastructure, or distributed systems roles
- Experience operating distributed systems in production (e.g., Kafka, WarpStream, Postgres, ClickHouse)
- Strong experience with Kubernetes on AWS/GCP/Azure and with IaC tools (Helm, Terraform, Jsonnet)
- Solid understanding of distributed systems design, Linux internals, networking, cloud storage
- Proficiency in Go (preferred) or another language; experience with blameless incident response and post-incident reviews (PIRs)
- Clear communication, autonomy, and collaboration across remote global teams
Culture & Benefits
- 100% remote global culture with high trust, autonomy, transparency, and innovation focus.
- RSUs for all roles, equity ownership, and balanced on-call aligned to 12 daylight hours.
- 30 days annual leave including 3 Grafana Shutdown Days; in-person onboarding.
- AI coding assistants with company budget (GPT, Claude, Gemini); modern developer tools.
- Open-source roots, career growth paths, approachable leadership, and empowered teams.