16 hours ago
Site Reliability Engineer (AWS)
Job description
TL;DR
Site Reliability Engineer (SRE/DevOps): Maintain 24/7 service reliability and lead incident response for global software products, with an emphasis on operational automation and observability. Focus on engineering away repetitive toil, reducing MTTD/MTTR, and implementing self-healing infrastructure.
Location: Remote (United Kingdom)
Company
The company is a NASDAQ-listed global provider of AI, cloud, and digital software for customer experience and financial crime prevention.
What you will do
- Act as a primary or escalation responder in a 24x7 on-call rotation for major incident response and mitigation.
- Design and maintain alerting strategies and service health monitoring using Grafana, Prometheus, Datadog, Splunk, or CloudWatch.
- Automate repetitive operational tasks and develop scripts in Python, Bash, or Go to reduce manual toil.
- Support and troubleshoot Linux-based systems, Kubernetes, and cloud platforms (AWS, Azure, GCP).
- Drive blameless post-incident reviews (PIRs) and track corrective actions to improve system reliability.
- Partner with engineering teams to optimize system design for better operational readiness.
Requirements
- Must be based in the United Kingdom.
- Strong experience in Linux systems administration and production support.
- Proficiency with cloud infrastructure (AWS preferred) and container orchestration (Kubernetes, Docker).
- Scripting or programming experience in Python, Bash, Go, or similar.
- Solid understanding of networking fundamentals, including DNS, TCP/IP, and load balancing.
- Experience working in 24x7 NOC or production operations environments.
Nice to have
- Experience defining and operating according to SLIs and SLOs.
- Proficiency with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Exposure to security, compliance, or regulated environments.
- Prior experience migrating from a traditional NOC to an SRE model.
Culture & Benefits
- Remote work arrangement.
- Opportunity to work at a market leader whose products are used by 85 of the Fortune 100 companies.
- Ambitious, high-standard environment focused on challenging limits.
- Equal opportunity employer with a diverse global team across 30+ countries.