Operations Engineer (SRE)
Job description
TL;DR
Operations Engineer (SRE): Monitoring and investigating production issues across a global gaming commerce platform, with an emphasis on observability, incident response, and performance analysis. The role focuses on driving critical system incidents to resolution, automating operational processes, and improving system reliability.
Location: Must be based in British Columbia, Canada
Salary: $90,000 - $115,000 per year
Company
A global commerce company providing robust tools and services to help game developers fund, distribute, market, and monetize their games worldwide.
What you will do
- Monitor the GTO Operational Dashboard in Datadog, correlating signals across APM, logs, and metrics to detect anomalies.
- Triage and investigate production incidents, determining root causes and routing issues to the appropriate engineering teams.
- Manage lower-severity incidents end-to-end, executing runbooks and mitigation procedures.
- Draft incident communications for stakeholders and update status pages during active incidents.
- Analyze incident trends and recurring bugs to compile findings for product and engineering teams.
- Build and maintain operational automation scripts and contribute to runbook development.
Requirements
- Must reside in British Columbia
- 4+ years of experience in SRE, DevOps, or NOC environments supporting high-availability platforms
- Strong troubleshooting skills with the ability to trace issues through logs, metrics, and network paths
- Proficiency in at least one scripting language (Python, Go, or Bash)
- Hands-on experience with observability platforms like Datadog, Grafana, or New Relic
- Fluent in English (written and verbal) for incident communications
- Working knowledge of Kubernetes and cloud infrastructure (GCP/AWS/Azure)
Nice to have
- Experience in gaming, payments, or fintech industries
- Knowledge of database and messaging system operations (MySQL, PostgreSQL, Redis, Kafka)
- Familiarity with CI/CD pipelines (GitLab CI, ArgoCD)
- Exposure to AI/ML-assisted operations and automated remediation
Culture & Benefits
- Comprehensive benefits program including medical, dental, and vision coverage
- Paid time off (PTO)
- Personalized career roadmap and training opportunities
- Collaborative environment valuing creativity and the transformative power of gaming