Senior Site Reliability Engineer (SRE)
Job description
TL;DR
Senior Site Reliability Engineer (AWS/SRE): Build and evolve reliability practices for production systems at scale, with an emphasis on SLOs, incident response, and observability. The focus is on improving system availability and scalability and on reducing operational overhead through automation.
Location: Remote (Brazil, Mexico, Argentina, Colombia)
Company
is a professional services and talent sourcing company specializing in providing high-quality IT experts for diverse production environments.
What you will do
- Design, implement, and improve SRE practices across production environments to enhance reliability and scalability.
- Define and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.
- Lead incident response and incident command processes, including root cause analysis and postmortems.
- Build and evolve observability strategies encompassing monitoring, logging, alerting, and distributed tracing.
- Collaborate with engineering teams to improve application performance and production readiness.
- Develop automation solutions to reduce operational overhead and improve overall system health.
Requirements
- 5+ years of professional experience in SRE, DevOps, or Production Engineering roles.
- Proven experience in defining and managing SLOs, SLIs, and Error Budgets.
- Experience leading or actively participating in Incident Command and Response processes.
- Hands-on experience with monitoring, logging, alerting, and distributed tracing.
- Strong experience with cloud platforms, specifically AWS.
- Must be based in one of the supported LATAM locations (Brazil, Mexico, Argentina, Colombia).
Nice to have
- Experience with Kubernetes and containerized environments.
- Proficiency with Terraform or other Infrastructure as Code (IaC) tools.
- Experience building and maintaining CI/CD pipelines.
- Knowledge of distributed microservices architectures and performance engineering.
- Experience mentoring other engineers on reliability practices.