Senior SRE - Platform (MKI)
Job description
TL;DR
Senior SRE - Platform (MKI): Design, build, scale, and mature a multi-cloud platform for hosting internal and external services, with an emphasis on reliability automation, incident and problem management, and operational excellence. The role focuses on leading technical initiatives, improving alerting and major-incident processes, and operating Kubernetes infrastructure at scale with strong Linux and cloud engineering practices.
Location: Canada
Salary: $148,300 to $185,600 CAD
Company
The company enables real-time search and AI-powered answers at scale through its cloud-based platform for search, security, and observability.
What you will do
- Lead technical initiatives to automate system engineering efforts and ensure the reliability of the company's global infrastructure.
- Grow and maintain the global Platform infrastructure by developing and operating software, tooling, and automations to meet scaling demands.
- Respond to customer-impacting incidents and prevent their recurrence through major-incident response and prioritized problem management, as part of a follow-the-sun on-call rotation.
- Improve alerting and major incident management processes, metrics, and systems to diagnose issues and quantify impact for stakeholders.
- Collaborate with engineers to identify, implement, and deliver solutions, ideally using Golang.
- Support team effectiveness through coaching, mentoring, and inclusive communication in a globally distributed environment.
Requirements
- Background in software engineering and experience collaborating with engineers to deliver production solutions, ideally using Golang.
- Production experience with public cloud service providers and managing Kubernetes infrastructure at scale.
- Experience operating a SaaS product in a public cloud using Infrastructure-as-Code tooling (e.g., Crossplane or Terraform).
- Strong Linux system administration skills on distributed systems at scale.
- Proven experience leading and improving alerting and major incident management processes, metrics, and systems (e.g., Stack, Prometheus, Influx).
- Experience diagnosing or designing solutions with the Stack.
Culture & Benefits
- Follow-the-sun on-call rotation, with shifts falling mostly within local working hours.
- Competitive pay and eligibility for an employee stock program.
- Health coverage for you and your family in many locations.
- Flexible schedules and vacation time, plus parental leave (minimum 16 weeks).
- RRSP match up to 6% of eligible earnings and additional benefits focused on employee well-being.