Sr. Site Reliability Engineer
Job description
TL;DR
Sr. Site Reliability Engineer (SRE): Creating and evolving systems that automatically run a suite of products and services reliably and consistently, with an emphasis on SLO/SLA success criteria, observability, and incident-driven reliability improvements. The role focuses on building recoverability, eliminating single points of failure, and delivering automation and developer-experience tooling across Kubernetes-based, distributed, event-driven platforms.
Location: Seattle, Washington, United States
Salary: $175,000-$200,000 annually
Company
is a Morningstar company building enterprise products and services.
What you will do
- Build and maintain internal platform services, Kubernetes operators, and observability tooling for enterprise reliability at scale.
- Define service level indicators (SLIs), service level objectives (SLOs), and error budgets; ensure systems consistently meet or exceed targets.
- Implement recoverability across services (DR, backups/recovery, multi-AZ/multi-region cloud constructs) and improve failover readiness.
- Design high-availability and scalability patterns (clustering, load balancing) for containerized cloud-native environments.
- Develop reusable observability systems (monitoring, telemetry, tracing), including alerting and dashboards.
- Operate and continuously improve reliability, scalability, performance, security, and uptime; participate in 24/7 on-call response.
Requirements
- 5+ years building and maintaining Linux/UNIX-based systems in cloud environments (preferably GCP & AWS).
- 5+ years in Reliability Engineering, DevOps, or infrastructure roles using infrastructure-as-code (Terraform, Puppet, Ansible, Chef).
- 5+ years coding in an object-oriented language such as Java, Python, Go, or Kotlin.
- 2+ years with containers and orchestration platforms including Kubernetes and Docker.
- Deep knowledge of infrastructure systems, networking, and security in cloud environments; experience with scalability, recoverability, and capacity planning.
- Must be authorized to work in the United States without visa sponsorship now or in the future.
Culture & Benefits
- Comprehensive health benefits plus life and disability insurance.
- Paid sabbatical after four years, paid family and paternity leave, and generous vacation/sick/volunteer days.
- Annual educational stipend and tuition reimbursement; robust training programs.
- 401(k) match and shared-ownership employee stock program; monthly transportation stipend.
- Role is expected to be in the office 5 days a week.
Hiring process
- Interviews to evaluate reliability engineering experience, system design, and operational ownership.
- Assessment of collaboration and communication in stressful incident scenarios.