Technical Service Operations Lead (TSO Lead)
Job description
TL;DR
Technical Service Operations Lead (TSO Lead): Coordinate incident response and drive continuous improvement for a global commerce company, with an emphasis on incident management, ITIL knowledge, and observability/monitoring expertise. The role focuses on ensuring the reliability and uptime of commerce and payment solutions.
Location: Remote
Salary: $120,000 - $150,000 a year (British Columbia)
Company
A global commerce company that provides tools and services for video game developers to fund, distribute, market, and monetize their games.
What you will do
- Serve as Incident Commander for major incidents, coordinating cross-functional response teams and driving investigation.
- Own all incident communications, drafting and sending timely updates to leadership and customer contacts.
- Facilitate blameless Post-Incident Reviews (PIRs) for major incidents, identifying root causes and tracking corrective actions.
- Analyze incident trends and recurring issues, creating Problem tickets and reporting findings to product and engineering teams.
- Enforce the incident management framework across the organization.
- Oversee and mentor the Operations Engineer, coaching on triage, investigation, and documentation.
Requirements
- 6+ years of experience in incident management, SRE, NOC leadership, or technical operations supporting high-availability systems.
- Proven incident management experience, coordinating multi-team response and communicating with executive stakeholders.
- Excellent written and verbal communication skills in English, including the ability to draft clear, concise executive updates.
- Strong ITIL foundation with practical experience implementing ITIL-aligned workflows.
- Technical depth across the observability stack, with the ability to interpret logs, traces, and metrics in Datadog (or equivalent).
- Hands-on experience with incident tooling: Datadog, PagerDuty or OpsGenie, JIRA or JIRA Service Management, Slack, and Confluence.
- Analytical mindset with the ability to identify trends and translate them into actionable recommendations.
- Experience with SLA/SLO-driven operations.
- Experience with or strong interest in AI/ML-assisted operations.
- Comfort with 24x7 shift-based operations as part of a follow-the-sun model. Rotating weekend on-call is required.
Nice to have
- Experience in the gaming, payments, or fintech industry.
- Experience with customer/partner-facing incident communications and status page management.
- JIRA Service Management administration experience.
- Familiarity with Datadog Service Catalog, scorecards, and SLOs.
- Experience building an operations function from scratch.
- Background in Kubernetes, cloud infrastructure (GCP preferred), microservices architecture, or distributed systems.
- ITIL certification (Foundation or higher) is a plus but not required.
Culture & Benefits
- Comprehensive Benefits Program including medical, dental, and vision.
- PTO.
- Personalized career roadmap for each employee.
- Professional development through training and educational opportunities.
- Supportive environment fostering creativity and collaboration.