Technical Service Operations Lead (TSO Lead)
Job description
TL;DR
Technical Service Operations Lead (TSO Lead): Coordinating incident response and driving continuous improvement for high-availability platforms, with an emphasis on incident management, ITIL knowledge, and observability/monitoring expertise. Focus on identifying trends in production issues, improving partner communication during incidents, and ensuring platform reliability at scale.
Location: Kuala Lumpur
Salary: RM240,000 - RM300,000 a year
Company
A global commerce company that provides tools and services for video game developers to fund, distribute, market, and monetize their games.
What you will do
- Serve as Incident Commander for major incidents, coordinating cross-functional response teams and driving investigation.
- Own incident communications, drafting updates for leadership, customer success, and partners.
- Facilitate Post-Incident Reviews (PIRs) to identify root causes and track corrective actions.
- Analyze incident trends and recurring issues to provide recommendations to product and engineering teams.
- Enforce the incident management framework across the organization.
- Oversee and mentor the Operations Engineer, providing coaching and knowledge transfer.
Requirements
- 6+ years of experience in incident management, SRE, NOC leadership, or technical operations supporting high-availability systems.
- Proven incident management experience, coordinating multi-team response and communicating with executive stakeholders.
- Excellent written and verbal communication skills in English.
- Strong ITIL foundation with practical experience implementing ITIL-aligned workflows.
- Technical depth across the observability stack, including experience with Datadog (or equivalent).
- Analytical mindset with the ability to identify trends and translate them into actionable recommendations.
Nice to have
- Experience in the gaming, payments, or fintech industry.
- Experience with customer/partner-facing incident communications and status page management.
- Jira Service Management administration experience.
- Familiarity with Datadog Service Catalog, scorecards, and SLOs.
- Background in Kubernetes, cloud infrastructure (GCP preferred), microservices architecture, or distributed systems.
Culture & Benefits
- Latest Mac workstations and additional hardware.
- Free training and participation in specialized conferences.
- Health insurance (medical, dental, and optical) for employees and dependants.
- Flexible hours.
- No dress code.