Company hidden
1 hour ago

Senior Site Reliability Engineer (Telecom)

Employment type
Full-time
Grade
Senior
English
B2
Country
Germany
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Site Reliability Engineer (DevOps/Telecom): Strengthen the stability, scalability, and reliability of global infrastructure and services across cloud and on-prem environments, with an emphasis on SLIs/SLOs, redundancy testing, and automated recovery. The role focuses on building self-healing workflows, enhancing observability with OpenTelemetry and Prometheus, and reducing operational toil.

Location: Must be based in Berlin, Germany

Company

A technology-driven global mobile communications provider delivering connectivity solutions via an in-house eSIM platform and core network.

What you will do

  • Define, measure, and maintain SLIs and SLOs for core infrastructure and customer-facing services.
  • Plan and execute redundancy and resilience testing across service, infrastructure, and networking layers.
  • Design and implement automated recovery mechanisms, self-healing workflows, and intelligent alerting systems.
  • Drive incident response, root-cause analysis, and blameless post-mortems to ensure continuous improvement.
  • Enhance observability using Prometheus, Grafana, Loki, and OpenTelemetry.
  • Contribute to cloud cost-optimization and perform capacity planning and resilience audits.

Requirements

  • Minimum 5 years of experience in Site Reliability, Systems, or Infrastructure Engineering, with 2+ years in a dedicated SRE role.
  • Strong expertise in Linux systems engineering, distributed systems, and networking (BGP, DNS, routing, load balancing).
  • Hands-on experience with Kubernetes, container orchestration, and service mesh architectures.
  • Proficiency in Python, Go, and Bash for automation and reliability tooling.
  • Experience with AWS (EKS, EC2, VPC) and Infrastructure as Code tools like Terraform.
  • Must be based in or be able to work from Berlin, Germany.

Nice to have

  • Experience in telecom or other carrier-grade, large-scale distributed systems environments.
  • Hands-on experience with chaos engineering and automated failure-scenario validation.
  • Background in capacity planning, traffic engineering, and multi-region failover.
  • Familiarity with security and resilience standards such as ISO 27001 or NIST SP 800-53.

Culture & Benefits

  • Rapid career growth in a company expanding over 100% year-on-year.
  • High-impact exposure to transactions shaping the future of the telco industry.
  • Collaboration with a talented international team and renowned external advisors.
  • Opportunities to work in different company offices worldwide.
  • Supportive and transparent environment with an open communication culture.
