Lead Site Reliability Developer (AWS/Kubernetes)
Job description
TL;DR
Lead Site Reliability Developer (AWS/Kubernetes): Lead reliability consulting work across multiple teams to improve resilience and sustainable engineering practices, with an emphasis on SLO governance, error budgets, and systemic risk reduction. Focus on designing reusable reliability mechanisms, improving observability strategies, and mentoring senior engineers to scale SRE maturity.
Location: Remote (UK)
Company
The company is the world's largest ticket marketplace and a leading global provider of enterprise tools and services for the live entertainment business.
What you will do
- Lead consulting engagements from discovery to delivery, aligning stakeholders on priorities and measurable outcomes.
- Define reliability targets and trade-offs using SLOs and error budgets in collaboration with product and platform teams.
- Design and implement reusable reliability mechanisms, templates, and tooling for adoption across the organization.
- Drive the observability strategy by improving signal quality, alerting philosophy, and operational dashboards.
- Lead complex incident investigations to identify systemic risks and ensure durable architectural fixes.
- Mentor senior engineers and other consultants through pairing, reviews, and structured coaching.
Requirements
- Deep practical expertise in SRE principles, including SLO governance and error budget policy.
- Strong experience designing and troubleshooting distributed systems with cross-service failure modes.
- Advanced proficiency with Kubernetes and AWS, including governance and cost optimization.
- Solid software engineering fundamentals with the ability to deliver high-quality changes in enterprise codebases.
- Proven ability to lead cross-team technical initiatives and influence stakeholders without direct authority.
- Must be based in the United Kingdom.
Culture & Benefits
- Inclusive environment committed to diversity, equality, and belonging.
- Focus on a sustainable work pace that discourages hero culture and prevents burnout.
- Support for professional growth and personal aspirations within a global business.
- Commitment to psychological safety and respectful, direct feedback.
- Flexible remote work arrangements (Work From Home).