TL;DR
Manager, Site Reliability Engineering: Build and lead a high-performing SRE team to ensure system reliability, scalability, and observability for the hirify.global Data Cloud, with an emphasis on driving SRE principles (SLIs/SLOs, error budgets) and incident excellence. The role focuses on contributing code, leading software-first reliability investments, and operationalizing reliability strategies.
Location: Office-based in the Czech Republic; remote work is possible for candidates located in the Czech Republic.
Company
hirify.global is the #1 global market leader in data resilience, providing data backup, recovery, portability, security, and intelligence to over 550,000 customers worldwide.
What you will do
- Hire, onboard, and grow your SRE team, fostering a psychologically safe, blameless culture.
- Establish and operationalize SLIs/SLOs and error budgets with service owners.
- Ensure incident response readiness, lead/coordinate major incidents, and drive fast, high-quality postmortems.
- Lead software-first reliability investments in observability, deployment safety, and resilience testing.
- Drive platform improvements (IaC, CI/CD, Kubernetes) and internal tools that scale operations.
- Track and cap toil, ensuring sustainable operational coverage and monitoring on-call health.
Requirements
- 7+ years in Software, Platform, and/or Reliability Engineering with 2+ years managing engineers.
- Demonstrable experience leading engineering teams to predictably deliver outcomes.
- Experience with public cloud (Azure preferred), Kubernetes, IaC (Terraform, Pulumi), CI/CD (GitHub Actions, Argo CD, Azure DevOps), and observability (OpenTelemetry, Elastic, Datadog, Prometheus, Grafana).
- Coding background with experience improving service reliability.
- Hands-on incident management and postmortem practice; excellent cross-geo communication.
- Willingness to participate in an on-call rotation (typically during daytime hours, including weekends/holidays).
Nice to have
- Demonstrated success leading SLO/error-budget adoption and reliability programs for cloud services.
- Experience operating a multi-region, follow-the-sun on-call model.
- Background in chaos/resilience/performance testing and release validation.
- Track record building or scaling SRE teams and influencing org-wide standards.
- Familiarity with compliance frameworks common to SaaS.
Culture & Benefits
- 25 vacation days, 4 sick days, 21 paid medical leave days, plus 4 extra global hirify.global Days for self-care and 24 paid volunteer hours annually.
- Premium private medical insurance for employees and dependents.
- Daily meal vouchers for restaurants and groceries (180 CZK per working day).
- Flexible cafeteria platform with thousands of lifestyle benefit options.
- Multisport Card for gym and wellness, with family add-on options.
- Annual public transport reimbursement up to a set limit.
- Corporate mobile plan with optional family tariff.
- Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events.