Technical Service Operations Lead (SRE)
Job description
TL;DR
Technical Service Operations Lead (SRE): Managing incident response, platform reliability, and operational workflows for a high-availability commerce and payments ecosystem, with an emphasis on incident coordination, blameless post-incident reviews, and observability. Focus on reducing MTTR/MTTD, driving continuous improvement through data analysis, and mentoring operations engineers in a 24/7 global support environment.
Location: Baku, Azerbaijan
Company
A global commerce company providing tools and services that help video game developers fund, distribute, market, and monetize their games worldwide.
What you will do
- Serve as Incident Commander for major incidents, coordinating response teams and ensuring resolution within SLA targets.
- Own incident communications, providing clear updates to leadership and external partners.
- Facilitate blameless Post-Incident Reviews (PIRs) to identify root causes and assign corrective actions.
- Analyze production incident trends to identify patterns and proactively report recommendations to product teams.
- Enforce incident management frameworks, severity models, and escalation procedures across the organization.
- Mentor and coach Operations Engineers on shift, conducting knowledge transfer and operational audits.
Requirements
- 6+ years of experience in incident management, SRE, NOC leadership, or technical operations.
- Expert-level English communication skills (written and verbal) for executive-level updates and technical reporting.
- Strong ITIL foundation with practical experience in incident, problem, and change management.
- Proficiency with observability stacks (Datadog, Grafana, or Splunk) and APM concepts.
- Experience managing operations for high-availability systems (payments, e-commerce, SaaS, or gaming).
- Ability to participate in 24/7 shift-based operations and rotating weekend on-call.
Nice to have
- Experience in the gaming, payments, or fintech industries.
- Hands-on experience building operations functions from scratch.
- Knowledge of Kubernetes, GCP cloud infrastructure, and microservices architecture.
- Familiarity with AI/ML-assisted operations and anomaly detection.
- ITIL Foundation or higher certification.
Culture & Benefits
- Fast-paced and collaborative environment supporting a global gaming platform.
- Opportunity to influence platform reliability at scale.
- Involvement in a 24/7 follow-the-sun operational model.
- Focus on professional development and building out operational governance.