Senior Site Reliability Engineer II - Infrastructure (AI Native)
Job description
TL;DR
Senior Site Reliability Engineer II (Infrastructure, AI): Building and maintaining scalable infrastructure platforms supporting 200+ backend services, with an emphasis on AI-native automation and platform resilience. The role focuses on designing agentic systems, optimizing Kubernetes clusters at scale, and ensuring high availability for massive request volumes.
Location: Remote (must be based in Canada)
Salary: 163,000 CAD – 194,000 CAD
Company
Family safety and location-sharing platform serving approximately 97.8 million monthly active users across 180 countries.
What you will do
- Scale and maintain infrastructure and services using AI (Claude Code) as a first-class collaborator in the daily workflow.
- Design and build scalable and resilient Platform Infrastructure for Kubernetes clusters with 40,000+ cores.
- Own and resolve complex infrastructure failures, including Kubernetes scheduling edge cases and AWS platform issues.
- Drive cloud cost efficiency by right-sizing resources and building tooling to surface cost anomalies.
- Lead and mentor other engineers on the team while providing technical direction and strategy.
- Develop agentic systems to automate operational workflows and incident response.
Requirements
- Expert-level experience (5+ years) managing large-scale AWS deployments.
- 3+ years of experience programming in Java, Python, or other formal languages.
- Strong Kubernetes experience (3+ years) deploying and managing 10k+ containers at scale.
- Proficiency with Infrastructure as Code tools like Terraform and CloudFormation, and config management tools like Ansible or Chef.
- Strong knowledge of Linux administration, shell scripting, networking, and load-balancer technologies.
- Hands-on experience with AI coding tools such as Claude Code, Cursor, or GitHub Copilot.
Nice to have
- Database knowledge.
Culture & Benefits
- Medical, dental, vision, life, and disability insurance plans.
- RRSP plan with DPSP company matching program.
- Flexible PTO and synchronized company-wide shutdowns during winter and summer.
- Equipment, tools, and reimbursement support for a productive remote environment.
- Free Platinum Membership and Tile products.
Hiring process
- Interview process may include demonstrating proficiency with AI tools or completing exercises without AI assistance, depending on the role.