Site Reliability Engineer
Job description
TL;DR
Site Reliability Engineer (DevOps): Own and improve the reliability, availability, and performance of a large-scale backend platform on AWS, with an emphasis on automation, observability, and incident management. Focus on designing resilient systems, reducing toil, and scaling infrastructure to support millions of users.
Location: San Francisco, CA, onsite 4–5 days per week
Salary: $230,000–$310,000 plus equity
Company
A product company building AI-powered creative tools for modern communication, valued at $2.1B with over $100M ARR and profitable since 2023.
What you will do
- Own reliability, availability, and performance of production systems primarily on AWS infrastructure
- Build observability infrastructure with metrics, logging, tracing, and alerting
- Design automation to reduce toil, improve deployment safety, and accelerate incident resolution
- Lead incident response, conduct blameless post-mortems, and drive systemic improvements
- Partner with engineering teams on architecture reviews, SLOs/SLIs, and reliability best practices
- Manage and optimize infrastructure including compute, networking, databases, and managed services
Requirements
- Must have 5+ years of experience in Site Reliability Engineering, DevOps, or systems engineering, with deep AWS expertise
- Strong programming skills in Python, Go, or TypeScript/Node.js for automation
- Experience with infrastructure-as-code tools like Terraform and CloudFormation
- Solid understanding of networking, distributed systems, containerization (Docker, Kubernetes), and database performance
- Strong incident management and debugging skills for complex production issues
- Work onsite in San Francisco 4–5 days per week
Nice to have
- Experience scaling SaaS applications to millions of users
- Background with real-time collaborative systems, Kafka, chaos engineering, or service mesh technologies
- AWS certifications or experience with security/compliance requirements (SOC 2, ISO 27001)
Culture & Benefits
- Warm, quirky, and curious culture valuing creativity and craftsmanship
- Small, passionate teams working on impactful projects
- Flexibility to work from home when focus matters most
- Fast-paced, creative, and occasionally chaotic environment