This job posting has been archived
Posted 1 month ago
Senior Site Reliability Engineer (Resilience) - Platform Resilience
$154,800 - $195,600 USD
Job description
TL;DR
Senior Site Reliability Engineer (Resilience) - Platform Resilience: Building and maintaining highly reliable, scalable multi-cloud infrastructure that powers mission-critical SaaS services, with an emphasis on automation, observability, and incident prevention. The role focuses on designing tools and frameworks, conducting root cause analysis, and improving system resilience in large-scale distributed environments.
Location: Remote in the United States
Salary: $154,800 to $195,600 USD
Company
Global Platform Engineering organization supporting large-scale distributed SaaS and platform services across multi-cloud environments.
What you will do
- Design, build, and maintain reliable multi-cloud platform infrastructure for large-scale SaaS services
- Lead initiatives on automation, reliability engineering, and system resilience improvements
- Develop tools, software, and automation frameworks to boost infrastructure efficiency
- Respond to incidents and follow up with root cause analysis, problem management, and prevention
- Participate in a global on-call rotation using a follow-the-sun model
- Collaborate with engineering teams on infrastructure challenges and observability enhancements
- Drive infrastructure-as-code practices, documentation, and operational excellence
Requirements
- Experience as a Site Reliability Engineer, Platform Engineer, or Software Engineer working on large-scale distributed systems
- Strong software engineering background for designing automation and infrastructure solutions
- Hands-on experience with public cloud platforms and managed Kubernetes environments
- Proficiency in at least one programming language (e.g., Go, Python)
- Strong knowledge of Linux systems administration, containerized environments (Docker), and cloud-native architectures
- Familiarity with observability tools (Prometheus, Grafana), incident response, and reliability best practices
- Strong communication skills for globally distributed teams
Nice to have
- Experience with Infrastructure-as-Code tools such as Terraform or Crossplane
- Experience operating or supporting SaaS platforms in production
- Experience building or scaling Kubernetes across multiple cloud providers
Culture & Benefits
- Competitive base salary with equity participation
- Company-matched 401(k) up to 6%
- Comprehensive health coverage, paid parental leave (minimum 16 weeks), and generous PTO
- Remote-friendly global work environment with flexible arrangements
- Focus on employee well-being, work-life balance, volunteer time off, and inclusive culture