Staff Site Reliability Engineer
Job description
TL;DR
Staff Site Reliability Engineer (SaaS): Operate and maintain Develocity instances and supporting infrastructure for high-scale production services, with an emphasis on reliability, performance, and availability. Focus on defining SRE standards, driving automation, building observability, and leading incident response across Kubernetes on AWS.
Location: Remote from anywhere in the EST time zone (North America)
Salary: $180-220k (US)
Company
AI-native company building Develocity, a toolchain observability and intelligence platform used by Netflix, Airbnb, Spotify, and others.
What you will do
- Operate and maintain Develocity instances and supporting services in production.
- Define and evolve SRE standards, practices, SLOs, on-call rotations, incident response, and postmortems.
- Lead incident response, blameless retrospectives, and reliability improvements using error budgets.
- Drive automation for deployments, monitoring, self-healing, and operational workflows.
- Build comprehensive observability with logging, metrics, tracing, and alerting.
- Mentor SREs, influence architecture reviews, and partner with engineering for operational excellence.
Requirements
- 7+ years in SRE, DevOps, or an equivalent role operating production services at scale
- Experience leading reliability initiatives and influencing without authority
- Strong experience running Kubernetes in production and AWS expertise (EKS, RDS, S3, EC2)
- Proficiency with observability tools (Prometheus, Grafana), IaC (Terraform), scripting (Python, Bash)
- Track record in 24/7 on-call, incident management, and SLOs/error budgets
- Strong written and verbal English communication skills
Nice to have
- Experience as a founding or early SRE at a growing SaaS company
- Familiarity with Develocity
- JVM languages (Java, Kotlin)
- Customer-facing incident communications
Culture & Benefits
- Remote-first environment with asynchronous communication and written documentation.
- Follow-the-sun on-call rotation.
- In-person meetings such as an annual offsite and team gatherings.
- Competitive salaries and equity grants.
- Culture that values automation over heroics, and ownership of outcomes.