TL;DR
Senior Backend Engineer (Grafana/Distributed Systems): Building and optimizing shared, production-critical infrastructure for Grafana Cloud's next-generation database products (Mimir, Loki, Tempo) with an accent on operating 100+ multi-cloud streaming clusters and diagnosing cross-layer failure modes. Focus on designing safe upgrade strategies, improving observability and automation, and partnering with database and platform teams to ensure safe scaling and query performance for high-throughput, multi-consumer workloads.
Location: Remote (Canadian time zones only)
Salary: CAD 164,490–CAD 197,389
Company
hirify.global is a remote-first, open-source powerhouse, providing an open-source visualization tool and observability strategies for over 3,000 companies.
What you will do
- Operate and evolve 100+ multi-cloud streaming clusters and related database infrastructure.
- Diagnose and eliminate cross-layer failure modes.
- Design safe upgrade and rollout strategies at scale.
- Improve observability, automation, and operational ergonomics.
- Partner closely with database and platform teams to ensure safe scaling, partitioning, consumer fan-out, and query performance.
- Serve as a primary escalation point and on-call for relevant incidents.
Requirements
- 6+ years of engineering experience, including meaningful time in SRE, platform engineering, production engineering, infrastructure engineering, or distributed systems roles.
- Experience operating distributed systems in production (e.g., Kafka, Redpanda, WarpStream, Postgres, ClickHouse, Snowflake, or Cassandra).
- Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.).
- Solid understanding of distributed systems design and large-scale system trade-offs.
- Proficiency in at least one programming language (Go preferred).
- Working knowledge of Linux internals, networking, cloud storage, and performance/scaling behavior.
- Experience participating in blameless incident response and writing high-quality post-incident reviews.
- Must be in Canadian time zones only.
Culture & Benefits
- 100% Remote, Global Culture with talent from around the world.
- Tackle meaningful work in a high-growth, ever-evolving environment.
- Expect open decision-making and regular company-wide updates.
- Autonomy and support to ship great work and try new things.
- 30 days annual leave per annum (3 days reserved for Grafana Shutdown Days).
- In-person onboarding for new hires.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →