Posted 2 months ago

Staff Site Reliability Engineer (Site Experience)

Work format

Remote (United Kingdom only)

Employment type

Full-time

Grade

Senior

This vacancy is from Hirify RU Global, a list of companies with Eastern European roots.

Job description

TL;DR

Staff Site Reliability Engineer (SRE/Infrastructure): Lead reliability engineering for critical user-facing systems, with an emphasis on availability, latency, and scalability at internet scale. The role focuses on architecting for massive global load, reducing operational risk through automation, and driving engineering standards across the organization.

Location: Remote (United Kingdom)

Company

Reddit is one of the internet's largest platforms, serving as a community of communities built on shared interests and authentic conversations.

What you will do

Drive reliability and operational excellence for APIs, content delivery, feed generation, and real-time experiences.
Design highly available systems to handle massive global load, guiding decisions on failover, redundancy, and capacity planning.
Identify systemic risks and build proactive mitigation strategies to reduce incidents and improve service health.
Automate repetitive operational work to improve deployment safety and remediation workflows.
Lead complex incident response efforts and drive blameless postmortems to ensure sustainable long-term fixes.
Mentor engineers and define company-wide best practices for SLIs/SLOs and release engineering.

Requirements

8+ years of experience in SRE or Infrastructure Engineering operating large-scale distributed systems.
Must be based in the United Kingdom.
Strong programming skills in Go, Python, or similar languages.
Deep understanding of Linux systems, networking, and cloud-native architectures.
Strong experience with observability systems, including metrics, logging, tracing, and alerting.
Demonstrated ability to troubleshoot complex issues across applications, infrastructure, and services.

Nice to have

Experience with Kubernetes, containers, and modern deployment platforms.
Familiarity with Prometheus, Grafana, OpenTelemetry, Envoy, Kafka, ClickHouse, Cassandra, or Redis.
Experience with CDN optimization, edge reliability, or traffic engineering.
Contributions to open-source software or participation in technical communities.

Culture & Benefits

Global benefit programs including professional development and caregiving support.
Private medical and dental schemes.
Group Personal Pension Scheme with employer match.
Flexible vacation and paid volunteer time off.
Generous paid parental leave.
