Senior Site Reliability Engineer (AWS)
Job description
TL;DR
Senior Site Reliability Engineer (AWS/SRE): Balance development velocity with system reliability for an Apple device management platform, with an emphasis on automation, AI-driven tooling, and observability. The role focuses on eliminating systemic toil, implementing AI agents for operations, and resolving complex production issues across AWS infrastructure.
Location: Must be a U.S. citizen physically located in the U.S. Remote, with periodic office visits required.
Salary: $113,300 - $205,520 USD
Company
The company extends the legendary Apple experience to the workplace, providing automated deployment, management, and security for Mac, iPad, and iPhone.
What you will do
- Partner with engineering teams to define SLOs, error budgets, and reliability indicators to inform prioritization.
- Investigate complex production issues across application, data, infrastructure, and network layers using AI-assisted correlation.
- Identify and eliminate systemic sources of toil through automation, AI agents, and process changes.
- Develop and maintain clear technical documentation, runbooks, architecture notes, and postmortems.
- Implement guardrails and context for AI agents to perform reliable work within the production environment.
- Drive cross-department collaboration on reliability initiatives and mentor engineers on SRE practices.
Requirements
- Must be a U.S. citizen physically located in the U.S.
- Minimum 5 years of experience in software engineering, SRE, or production operations.
- Strong hands-on experience operating production services on AWS (EC2, S3, EKS, RDS/Aurora, CloudFront).
- Production-level proficiency in a general-purpose language such as Python, Go, or Java.
- Experience with observability tools like Grafana, Prometheus, or LogicMonitor.
- Experience writing Infrastructure as Code (IaC) and working within Agile frameworks.
Nice to have
- Experience optimizing SQL queries and tuning database engines.
- Proficiency with CI/CD tooling such as GitHub Actions or Jenkins.
- Exposure to chaos engineering, fault injection, and disaster recovery exercises.
- Familiarity with FinOps practices.
Culture & Benefits
- Open, flexible culture based on respect and trust with a high priority on work-life balance.
- Opportunity to make a meaningful impact for over 75,000 global customers.
- Clear career growth paths under supportive leadership and management.
- Recognized as a "Best Company to Work For" by U.S. News.
- Collaborative environment with small, empowered teams and an engineering-driven culture.