This job posting has been archived

Company hidden
1 month ago

Senior Site Reliability Engineer (Resilience) - Platform Resilience

Salary: $154,800 to $195,600 USD
Work format: remote (USA only)
Employment type: full-time
Grade: senior
English: B2
Country: US

Job description

TL;DR

Senior Site Reliability Engineer (Resilience) - Platform Resilience: Build and maintain highly reliable, scalable multi-cloud infrastructure that powers mission-critical SaaS services, with an emphasis on automation, observability, and incident prevention. The role focuses on designing tools and frameworks, conducting root cause analysis, and improving system resilience in large-scale distributed environments.

Location: Remote in the United States

Salary: $154,800 to $195,600 USD

Company

Global Platform Engineering organization supporting large-scale distributed SaaS and platform services across multi-cloud environments.

What you will do

  • Design, build, and maintain reliable multi-cloud platform infrastructure for large-scale SaaS services
  • Lead initiatives on automation, reliability engineering, and system resilience improvements
  • Develop tools, software, and automation frameworks to boost infrastructure efficiency
  • Respond to incidents and follow through with root cause analysis, problem management, and prevention measures
  • Participate in a global on-call rotation using a follow-the-sun model
  • Collaborate with engineering teams on infrastructure challenges and observability enhancements
  • Drive infrastructure-as-code practices, documentation, and operational excellence

Requirements

  • Experience as a Site Reliability Engineer, Platform Engineer, or Software Engineer working on large-scale distributed systems
  • Strong software engineering background for designing automation and infrastructure solutions
  • Hands-on experience with public cloud platforms and managed Kubernetes environments
  • Proficiency in at least one programming language (e.g., Go, Python)
  • Strong knowledge of Linux systems administration, containerized environments (Docker), and cloud-native architectures
  • Familiarity with observability tools (Prometheus, Grafana), incident response, and reliability best practices
  • Strong communication skills for working with globally distributed teams

Nice to have

  • Experience with Infrastructure-as-Code tools such as Terraform or Crossplane
  • Experience operating or supporting SaaS platforms in production
  • Experience building or scaling Kubernetes across multiple cloud providers

Culture & Benefits

  • Competitive base salary with equity participation
  • Company-matched 401(k) up to 6%
  • Comprehensive health coverage, paid parental leave (minimum 16 weeks), and generous PTO
  • Remote-friendly global work environment with flexible arrangements
  • Focus on employee well-being, work-life balance, volunteer time off, and inclusive culture