Senior Site Reliability Engineer

Work format

hybrid

Employment type

full-time

Grade

senior

English

Country

Spain/Europe

Relocation

Spain

Job description

TL;DR

Senior Site Reliability Engineer: strengthen the infrastructure and improve the ability to deploy, monitor, and scale SaaS platform systems, with an emphasis on cloud-native systems, Kubernetes, and infrastructure-as-code. The role focuses on defining SLOs, improving incident response, and embedding reliability principles into service design.

Location: Hybrid work model, with offices near the Bernabeu Stadium in Madrid.

Company

hirify.global is a leader in digital employee experience management software, providing unprecedented insight to IT leaders for proactive optimization of employee experience.

What you will do

Implement and manage cloud-native AWS systems and automate operations.
Operate and enhance Kubernetes clusters, deployment pipelines, and service meshes.
Design, build, and maintain the infrastructure for a multi-tenant SaaS platform, focusing on reliability, security, and scalability.
Define and maintain SLOs, SLAs, and error budgets, and proactively address availability and performance issues.
Participate in a shared on-call rotation, acting as Incident Commander and refining incident response processes.
Work closely with software engineers to embed observability, fault tolerance, and reliability principles into service design.

Requirements

Bachelor’s degree in Computer Science, or equivalent practical experience.
5+ years of experience as a Site Reliability Engineer or Platform Engineer.
Strong hands-on experience with public cloud services (AWS) and supporting SaaS products.
Proficiency in programming/scripting (Python, Go, Bash) and infrastructure-as-code (Terraform).
Experience with Kubernetes, Docker, CI/CD pipelines, and monitoring solutions (Datadog).
Comfortable participating in a rotating on-call schedule and leading post-incident reviews.
Deep understanding of Linux systems, networking, and common troubleshooting practices.
Excellent written and verbal skills in English.

Nice to have

Exposure to compliance standards such as SOC 2, ISO 27001, or HIPAA; FedRAMP experience is a plus.
Experience with chaos engineering or resilience testing practices.

Culture & Benefits

Permanent contract and a competitive compensation package.
Private health insurance (Sanitas) and meal vouchers worth 11 EUR per day.
Hybrid work model and flexible hours, plus unlimited vacation, 23 holidays, and 3 volunteer days.
Up to 25 EUR per month for a gym subscription and flexible compensation for childcare & public transportation.
Reimbursement of up to 50% of the cost of English & Spanish classes.
Regular company and team events.
A relocation package for candidates coming from another country.