Senior Site Reliability Engineer (AWS)
Job description
TL;DR
Senior Site Reliability Engineer (AWS): Building and optimizing reliable infrastructure for Apple device management solutions, with an emphasis on automation, agentic AI tooling, and observability. Focus on eliminating toil through AI agents, investigating complex production issues, and scaling SRE practices across the organization.
Location: Must be based in the Austin, TX; Eau Claire, WI; or Minneapolis, MN metro area. Must be a U.S. citizen physically located in the U.S.
Salary: $113,300 - $205,520 USD
Company
The company provides automation, management, and security solutions for Apple devices in workplace and educational environments.
What you will do
- Define SLOs, error budgets, and supporting indicators to inform prioritization and reliability investment.
- Investigate complex production issues across application, data, infrastructure, and network layers using AI-assisted correlation.
- Eliminate systemic toil through automation, AI agents, tooling, and process changes.
- Develop guardrails, repository context, and MCP servers to enable AI agents to perform reliable work.
- Mentor engineers on SRE practices and the effective integration of AI in reliability workflows.
- Collaborate across departments to influence roadmaps and advise leadership during critical escalations.
Requirements
- Minimum 5 years of experience in software engineering, SRE, or production operations.
- U.S. citizenship and physical residency in the U.S. are mandatory.
- Strong production troubleshooting skills across the full stack (profilers, heap/thread dumps, traces, logs).
- Hands-on experience operating production services on AWS (EC2, S3, EKS, RDS/Aurora, CloudFront).
- Proficiency with observability tools such as Grafana, Prometheus, or LogicMonitor.
- Experience with Infrastructure as Code (IaC) and automation in Python, Go, or Java.
- Practical experience using agentic development tools like Claude Code, Cursor, or Copilot.
Nice to have
- Experience optimizing SQL queries and database engine tuning.
- Proficiency with CI/CD tooling (GitHub Actions, Jenkins).
- Exposure to chaos engineering, fault injection, and disaster recovery exercises.
- Familiarity with FinOps practices.
Culture & Benefits
- Recognized as a "Best Company to Work For" by U.S. News.
- Culture based on trust, ownership, and a spirit of self-improvement.
- Opportunity to impact over 75,000 global customers.
- Clear career paths with supportive leadership and management.
- Collaborative environment within small, empowered engineering teams.