IT Infrastructure Operations Engineer - Lead
Job description
TL;DR
IT Infrastructure Operations Engineer - Lead (SRE): Building and operating highly reliable infrastructure and automation for physical security systems, with an emphasis on SLIs, SLOs, error budgets, and toil reduction. The role focuses on leading incident response, driving automation strategy, and implementing Infrastructure-as-Code for scalability and resilience.
Location: Atlanta, GA (onsite)
Salary: $114,000 - $180,000 USD
Company
IT services provider supporting physical security systems for clients like Google.
What you will do
- Lead and mentor a team of Automation Engineers, fostering ownership and continuous improvement.
- Define and track SLIs/SLOs with client IT leadership and manage error budgets.
- Handle Sev 1/2 incidents as primary escalation point, conduct blameless post-mortems, and drive reliability improvements.
- Manage project backlog, prioritize reliability and toil reduction initiatives.
- Drive automation to reduce manual tasks and oversee IaC with Terraform, Ansible, Puppet, or Chef.
- Ensure observability via monitoring, alerting, logging; manage 24x5 on-call rotations.
- Collaborate with stakeholders to integrate SRE practices and reduce MTTR.
Requirements
- 8+ years in SRE or Infrastructure Engineering, 3+ years in technical leadership managing teams
- Hands-on with Linux/Windows admin, IaC tools (Terraform, Ansible, Chef, Puppet), scripting (Python, Bash, PowerShell)
- Deep SRE knowledge: SLIs/SLOs, error budgets, toil elimination, observability (Prometheus, Grafana, ELK, Datadog)
- Proven incident management, post-mortems, reliability improvements
- Networking fundamentals, Cisco admin, NETCONF/RESTCONF, security compliance
- Strong leadership, communication, stakeholder management
Culture & Benefits
- Comprehensive medical (UHC PPO/HSA/Surest, Kaiser HMO for CA), dental, vision via UHC
- Flexible Spending Accounts, commuter benefits, 401k plan
- Life/disability insurance, critical illness coverage, tuition reimbursement after 6 months
- Paid Time Off (up to 120 hours), holidays, wellness days, EAP
- Continuing education via Udemy/Coursera, corporate wellness program