Senior Site Reliability Engineer

Work format

Remote (USA only)

Employment type

Full-time

Grade

Senior

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Site Reliability Engineer: building tools, automation, and observability for resilient, high-scale systems that support fan engagement platforms, with an emphasis on metrics, alerting, and incident response. The role focuses on defining SLIs/SLOs, streamlining CI/CD pipelines, automating reliability checks, and driving operational excellence through blameless postmortems and capacity planning.

Location: Remote (US-based, US work authorization required). Hybrid/flexible work environment.

Company

Growth-stage company providing fan engagement platforms for high school sports, including ticketing, streaming, fundraising, and more, trusted by thousands of US schools.

What you will do

Assess and improve system visibility by reviewing dashboards, metrics, and logs and implementing targeted enhancements.
Refine monitoring, alerting, and dashboards for critical services to enable faster issue detection and response.
Integrate observability and telemetry into build, deploy, and release processes.
Define SLIs/SLOs for core user flows and align teams on reliability standards.
Streamline incident response, automate routine tasks, and participate in on-call rotations.
Partner with engineering teams to implement reliability best practices, release automation, and proactive incident prevention.

Requirements

Solid experience in Python for automation and operational tasks
Proficiency in at least one of Java, C++, or Go
Strong knowledge of Linux, cloud infrastructure (AWS, GCP, Azure), Docker, Kubernetes, Terraform
Experience with CI/CD pipelines, version control, automated testing, and observability tools (Prometheus, Grafana, ELK, Datadog)
Proven experience with SLAs/SLOs, critical user journeys, incident facilitation, and cross-functional collaboration
Problem-solving mindset that treats reliability as a shared responsibility

Nice to have

Experience with end-to-end/integration tests, performance testing, and chaos engineering
Contributions to developer tooling or reliability frameworks
Exposure to security, compliance, and change management
Relevant certifications

Culture & Benefits

Accountability, collaboration, growth, and fairness-focused culture
Multiple medical, dental, vision, life, and disability insurance plans
401(k) with company match, company equity (stock options), Employee Emergency Fund
Open PTO policy
Must be full-time employee for health benefits eligibility
