Назад
Company hidden
8 часов назад

Lead Site Reliability Engineer (SaaS)

136 000 - 177 000$
Формат работы
remote (только USA)
Тип работы
fulltime
Грейд
lead
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Lead Site Reliability Engineer (SaaS): Owning reliability outcomes for a modern split-plane, multi-region SaaS platform with an accent on system design, reliability strategy, and cross-team execution. Focus on driving SLO attainment, MTTR reduction, and scaling reliability engineering across the platform.

Location: Must be based in the United States

Salary: $136,000–$177,000

Company

hirify.global is a leading analytics platform provider focused on transforming how businesses leverage data, automation, and AI.

What you will do

  • Define and drive reliability strategy for control-plane and data-plane systems, including multi-region resilience and failover design.
  • Establish and operationalize SLOs, SLAs, and error budgets to guide engineering tradeoffs.
  • Lead initiatives to improve MTTR, incident prevention, and overall service health.
  • Own end-to-end incident management, driving systemic fixes and long-term reliability improvements.
  • Lead architecture reviews to ensure scalability, reliability, and cost efficiency.
  • Mentor senior engineers and champion automation, including AI-driven reliability improvements.

Requirements

  • 6+ years of experience leading the delivery of complex, distributed systems or SaaS platforms.
  • Strong experience with multi-region, split-plane architectures.
  • Proficiency in Python, Java, C++, or JavaScript.
  • Deep expertise in Kubernetes (multi-cluster), CI/CD, GitOps, and Infrastructure as Code.
  • Proven track record in SLO/SLA design, observability, and incident management.
  • Must be eligible to work in compliance with U.S. export controls.

Nice to have

  • Experience with chaos engineering and large-scale reliability automation.
  • Expertise in modern observability platforms like Datadog or Grafana.

Culture & Benefits

  • Comprehensive benefits package including medical, retirement, and wellness programs.
  • Commitment to a diverse, equitable, and inclusive workplace.
  • Support for a growth mindset and career development at all stages.
  • Flexible time off and employee discounts.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →