4 hours ago

Senior Site Reliability Engineer (SRE)

Work format

hybrid

Employment type

full-time

Grade

senior

English

Country

India

Job description

🚀 Hiring: Senior Site Reliability Engineer (SRE)
📍 Location: Bengaluru (Hybrid)
💼 Experience: 6–10 Years
⚠️ Important:
✔ Only local Bengaluru candidates will be considered
✔ Must be available for face-to-face interview on short notice
______________
🔎 Role Overview
We are looking for a hands-on Senior SRE with deep expertise in Observability, Kubernetes, and Cloud Platforms. This role focuses on building and operating highly reliable, scalable, and observable systems in GCP (preferred) and AWS environments.
______________
🔹 Key Responsibilities
Reliability & Operations
• Design and operate highly available Kubernetes-based systems
• Define & manage SLOs, SLIs, and Error Budgets
• Lead incident response, RCA, and blameless postmortems
• Improve platform reliability through automation
Observability (Core Focus)
• Build centralized observability platforms (metrics, logs, traces)
• Hands-on experience with Prometheus, Alertmanager, and Grafana is a must
• Logging/Tracing using ELK / OpenSearch, Loki, OpenTelemetry
• Cloud-native monitoring (GCP Monitoring preferred)
• Define actionable, low-noise alerting standards
Cloud & Platform Engineering
• Infrastructure on GCP (GKE preferred) / AWS (EKS)
• Kubernetes cluster operations
• Helm deployments & Docker workloads
• Infra automation using Terraform / Ansible / Packer
Automation & Tooling
• Strong Python coding for reliability tooling
• Build internal tools for SLO tracking & incident workflows
• Integrate observability into CI/CD (Jenkins)
Leadership
• Mentor engineers
• Influence reliability architecture
• Collaborate with platform & cloud teams
______________
✅ Mandatory Skills
SRE | Python (Coding) | Kubernetes | ELK | Prometheus | Grafana | AWS/GCP | Docker | Helm | Terraform | Linux | Jenkins CI/CD
⭐ Nice to Have
Splunk | Datadog | Cribl | Vector | OpenTelemetry | Multi-cloud | Platform Security
______________
📅 Project Highlights
✨ Build a centralized observability platform
📉 Reduce MTTR using SLO-driven engineering
🚨 Lead production incident response
⚡ Optimize performance, scalability & cloud cost
______________
📩 Interested?
Share your CV to

The vacancy text is reproduced without changes
