Principal Site Reliability Engineer (Azure)
Job description
TL;DR
Principal Site Reliability Engineer (Azure): Lead the reliability, scalability, and operational excellence of a cloud-based government software platform, with an emphasis on infrastructure modernization and containerization. The role focuses on defining SLOs, driving incident response, and mentoring engineering teams to ensure high availability and security in a large-scale SaaS environment.
Location: Must be based in the US
Salary: $160,000–$185,000
Company
The company provides a robust, cloud-based platform of government software solutions designed to improve efficiency, transparency, and citizen engagement for communities of all sizes.
What you will do
- Lead platform modernization initiatives, transitioning from VM-based architectures to containerized, cloud-native services.
- Define and operate service level objectives (SLOs), SLAs, and error budgets to drive risk-based decision making.
- Design and maintain automation and tooling to improve system reliability, scalability, and developer productivity.
- Drive Root Cause Analysis (RCA) for production incidents and facilitate blameless postmortems.
- Partner with Security and Compliance teams to ensure operations meet SOC 2, HIPAA, FedRAMP, and PCI-DSS standards.
- Mentor engineers across the organization and influence best practices in cloud engineering.
Requirements
- 8+ years of experience in SRE, Software Engineering, or Cloud Infrastructure within a SaaS environment.
- Must be based in the US.
- Hands-on experience operating large-scale SaaS platforms on Microsoft Azure.
- Technical leadership experience in containerized environments, specifically Kubernetes.
- Proficiency in automation and scripting using Python, PowerShell, or Bash.
- Strong expertise in modern observability platforms (metrics, logging, distributed tracing).
Nice to have
- Experience with Infrastructure-as-Code (Terraform) and configuration management (Ansible).
- Knowledge of GitOps methodologies (Argo CD or Flux).
- Experience with public-sector compliance frameworks (FedRAMP, StateRAMP).
- Familiarity with AI-assisted engineering tools like GitHub Copilot or Claude Code.
Culture & Benefits
- Comprehensive medical, dental, and vision plans.
- 401(k) retirement savings plan with company match.
- Flexible time off and family planning benefits.
- Health savings account (HSA) with company contributions.
- Commitment to diversity, equity, and inclusion in the workplace.