Site Reliability Technical Lead (AWS)
Job description
TL;DR
Site Reliability Technical Lead (AWS/DevOps): Designing and managing robust, scalable cloud infrastructure for school management tools, with an emphasis on system architecture, observability, and automation. Focus on optimizing platform reliability for high-traffic systems, leading Root Cause Analysis, and mentoring engineers.
Location: Remote (UK). Unable to provide visa sponsorship.
Salary: £80,000 - £90,000
Company
The company provides MIS and school management tools to transform the way schools work, focusing on reducing burnout and leveraging data for better educational outcomes.
What you will do
- Define and guide system architecture, balancing scalability, maintainability, and security to meet business goals.
- Champion platform reliability and performance by ensuring systems meet SLOs and are fully observable.
- Lead Root Cause Analysis (RCA) and optimize the incident response process and framework.
- Drive automation initiatives across the team to reduce operational toil and improve system efficiency.
- Mentor and coach engineers, promoting high coding standards and automated testing.
- Collaborate with Product and Engineering Managers to align technical direction with product strategy.
Requirements
- Extensive professional experience in SRE, DevOps, or Platform Engineering on complex, scalable systems.
- Deep expertise with AWS and distributed cloud architectures.
- Proven experience operating platforms serving a high volume of requests (~1000 req/sec).
- Advanced proficiency with Terraform, Python, Go, and containerization (Docker, Kubernetes, ECS).
- Deep experience with observability platforms such as DataDog or Prometheus.
- Must have the right to work in the UK; the company is unable to provide visa sponsorship.
Nice to have
- Experience with chaos engineering and reliability testing.
- Knowledge of security best practices and compliance frameworks.
- Background in agile and lean methodologies (Scrum/Kanban).
- Contributions to open-source projects or the SRE community.
Culture & Benefits
- 32 days holiday plus Bank Holidays (including company-wide days over Easter, Summer, and Christmas).
- Comprehensive health benefits: AIG Smart Health virtual GP, Bupa private dental insurance, and mental health support.
- Financial security: life assurance (3x annual salary) and a salary sacrifice pension.
- Family-friendly policies: enhanced maternity (20 weeks) and paternity (6 weeks) leave at full pay.
- Growth and wellbeing: Dedicated professional development budget and a dedicated wellbeing team.
- Flexible working arrangements and dog-friendly offices.
Hiring process
- Phone screen.
- First stage interview.
- Second stage interview.