TL;DR
Site Reliability Engineer (SRE): ensure product reliability and predictability, and design and implement SLOs/SLIs, with a focus on capacity planning, performance optimization, and incident and problem management.
Location: Remote from Moscow
Company
We bring together ambitious professionals from all over Russia to solve breakthrough problems and build innovations together. The team already numbers more than 1,700 people.
What you will do
- Manage service reliability by designing, implementing, and maintaining SLOs/SLIs and error budgets.
- Develop metrics, alerts, dashboards, and runbooks.
- Forecast load and plan resources (capacity planning).
- Identify and eliminate bottlenecks to improve performance.
- Lead incident resolution as Incident Commander and conduct root cause analysis (RCA).
- Automate routine tasks using IaC (Terraform/Ansible) and Python/Go/Bash.
Requirements
- Expertise in SRE practices: deep understanding of SLOs/SLIs, error budgets, toil reduction, and automation.
- Ability to conduct code reviews to assess whether new features and services are ready for production.
- Hands-on experience building and implementing quality gates in CI/CD pipelines to manage deployment risk.
- Expert-level Linux administration, including kernel-level diagnostics.
- Deep knowledge of networking at layers L2-L7.
- Experience with Kubernetes and understanding of its internals for diagnosing complex problems.
Nice to have
- Systems thinking and the ability to analyze complex failure scenarios, identify root causes, and find ways to eliminate them.
- Experience writing and reviewing technical documentation (runbooks, postmortems, etc.).
- Experience communicating with developers and business stakeholders, explaining trade-offs between reliability and feature development.
Culture and benefits
- Work alongside ambitious professionals from all over Russia.
- Opportunity to solve breakthrough problems and build innovations.
- Be part of a team of more than 1,700 people.