Senior Reliability Engineer (AWS/Kubernetes)
Описание вакансии
TL;DR
Senior Reliability Engineer (AWS/Kubernetes): Operate, observe, and improve the reliability of distributed systems on AWS and Kubernetes, with an emphasis on observability, operational maturity, and automated responses to system behavior. The role focuses on designing observability strategies, defining SLIs/SLOs, enhancing autoscaling and self-healing mechanisms, and performing root cause analysis for production incidents.
100% Remote
Company
Leading nearshore staff augmentation provider headquartered in New York with 600+ professionals based in Latin America, partnering with U.S. companies on digital transformation projects.
What you will do
- Design and improve observability strategies including metrics, logs, traces, alerts, and dashboards.
- Analyze system behavior in production to identify failure modes, bottlenecks, and risks.
- Maintain AWS CDK/CDK8s constructs and core platform components such as VPC, EKS, RDS, OpenSearch, and MSK.
- Operate Kubernetes addons such as ingress controllers, cert-manager, autoscalers, and monitoring stacks.
- Define SLIs, SLOs, alerting strategies, and automate operational responses including self-healing.
- Collaborate on incident investigations and root cause analysis, and on CI/CD pipelines for IaC and observability.
Requirements
- 5+ years in SRE, Platform Engineering, or Infrastructure with production systems experience.
- Strong operational experience with observability: metrics, logs, traces, dashboards, and alerts for complex systems.
- Hands-on with AWS (VPC, IAM, RDS, MSK, S3, CloudWatch) and Kubernetes (Helm, RBAC, ServiceAccounts).
- Fluency in Python and IaC with AWS CDK, CDK8s or equivalent.
- Experience with Prometheus, Grafana, alert tuning, and incident-driven monitoring improvements.
- Experience improving existing systems for operational excellence and reliability.
Nice to have
- Experience with Spark on Kubernetes, Argo, or Kafka-based batch pipelines.
Culture & Benefits
- 100% remote work with autonomy focused on results.
- Competitive USD compensation.
- Paid time off for well-being.
- Work with top U.S. companies on high-impact projects.
- Diverse multicultural team across 25+ countries emphasizing work-life balance.