TL;DR
Site Reliability Engineer (AWS/Kubernetes): Ensure the reliability, scalability, and performance of custom platforms running on AWS infrastructure and Kubernetes, with an emphasis on Tier 3 issue resolution and proactive improvements to platform stability. The role focuses on implementing automation, tooling, and process improvements to prevent recurring issues and continuously enhance the customer experience.
Location: Remote within Mexico
Company
hirify.global is a digital engineering and modernization partner to some of the world’s leading enterprises and digital-native companies.
What you will do
- Troubleshoot and resolve Tier 3 platform issues for AWS-based custom applications.
- Collaborate with engineering teams to prepare Operations for new releases and feature enhancements.
- Identify recurring issues and implement automation and tooling to prevent recurrence.
- Design and implement strategies to improve platform reliability, scalability, and performance.
- Participate in incident response, root cause analysis, and post-mortem reviews.
- Contribute to operational documentation, runbooks, and readiness plans.
Requirements
- Hands-on experience supporting and operating AWS cloud environments.
- Strong knowledge of Kubernetes and container orchestration concepts.
- Proficiency in Python or Go for automation and scripting.
- Experience with platform support, troubleshooting, and performance optimization.
- Familiarity with CI/CD pipelines, monitoring, and observability tools.
- Strong problem-solving abilities with an engineering-focused mindset.
Culture & Benefits
- hirify.global hires professionals based solely on their skills and qualifications, and does not discriminate.