Company hidden
2 days ago

Staff Site Reliability Engineer (Site Experience)

Work format
Remote (United Kingdom only)
Employment type
Full-time
Grade
Senior
English
B2
Country
UK
Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Staff Site Reliability Engineer (SRE/Infrastructure): Lead reliability engineering for critical user-facing systems, with an emphasis on availability, latency, and scalability at internet scale. The role focuses on architecting for massive global load, reducing operational risk through automation, and driving engineering standards across the organization.

Location: Remote (United Kingdom)

Company

hirify.global is one of the internet's largest platforms, serving as a community of communities built on shared interests and authentic conversations.

What you will do

  • Drive reliability and operational excellence for APIs, content delivery, feed generation, and real-time experiences.
  • Design highly available systems to handle massive global load, guiding decisions on failover, redundancy, and capacity planning.
  • Identify systemic risks and build proactive mitigation strategies to reduce incidents and improve service health.
  • Automate repetitive operational work to improve deployment safety and remediation workflows.
  • Lead complex incident response efforts and drive blameless postmortems to ensure sustainable long-term fixes.
  • Mentor engineers and define company-wide best practices for SLIs/SLOs and release engineering.

Requirements

  • 8+ years of experience in SRE or Infrastructure Engineering operating large-scale distributed systems.
  • Must be based in the United Kingdom.
  • Strong programming skills in Go, Python, or similar languages.
  • Deep understanding of Linux systems, networking, and cloud-native architectures.
  • Strong experience with observability systems, including metrics, logging, tracing, and alerting.
  • Demonstrated ability to troubleshoot complex issues across applications, infrastructure, and services.

Nice to have

  • Experience with Kubernetes, containers, and modern deployment platforms.
  • Familiarity with Prometheus, Grafana, OpenTelemetry, Envoy, Kafka, ClickHouse, Cassandra, or Redis.
  • Experience with CDN optimization, edge reliability, or traffic engineering.
  • Contributions to open-source software or participation in technical communities.

Culture & Benefits

  • Global benefit programs including professional development and caregiving support.
  • Private medical and dental schemes.
  • Group Personal Pension Scheme with employer match.
  • Flexible vacation and paid volunteer time off.
  • Generous paid parental leave.
