Senior Site Reliability Engineer
Job description
TL;DR
Senior Site Reliability Engineer (SRE): Design, build, and operate shared platform foundations on GCP and Kubernetes, with an emphasis on observability, incident response, and making deployments safe and repeatable. The role focuses on diagnosing complex distributed systems at high request volume and raising the reliability bar through standards, automation, and on-call readiness.
Location: Remote in the United States (East Coast Time Zone)
Company
The company builds an AI-powered content operating system for modeling, creating, and automating content workflows.
What you will do
- Design, build, and operate shared platform foundations: GCP infrastructure, Kubernetes, networking/routing, CI/CD, and observability.
- Diagnose and troubleshoot complex distributed systems running at high request volume.
- Ensure observability and analyze stack behavior.
- Modernize edge, caching, and gateway layers (Fastly) and tighten platform observability.
- Improve reliability via dashboards, alert severity, paging standards, on-call readiness, and incident response.
- Build “golden paths” for boring deployments: production readiness checks, safe rollouts, and automation; mentor engineers through reviews and pairing.
Requirements
- Based in the United States, with reasonable overlap with European engineering hours.
- 5+ years of experience as part of an SRE on-call rotation.
- Experience with SRE/DevOps tools, processes, and culture.
- Hands-on experience managing scalable, highly available, cloud-based applications with customer-facing uptime expectations.
- Experience with Kubernetes and building CI/CD pipelines.
- Experience with observability stacks (e.g., Prometheus) and working across CDNs/edge/gateways/caching layers.
Culture & Benefits
- Supportive, trust-based environment focused on long-term professional and personal growth.
- Comprehensive health plans and perks, plus a healthy work-life balance.
- Competitive stock options program and location-based salary.
- Real infrastructure scale with hands-on impact on how the platform runs.