Senior Site Reliability Engineer (SRE) (AI)
Job description
TL;DR
Senior Site Reliability Engineer (SRE) (AI): Design, build, and maintain scalable infrastructure and automation tools for traditional and AI-based systems, with an emphasis on reliability, observability, and operational excellence. The role centers on implementing CI/CD pipelines, supporting AI/ML model lifecycles, and leading incident response.
Location: Atlanta, GA (onsite)
Salary: $99,090 - $123,860 USD
Company
Financial services company focused on providing access to financial opportunities for individuals and communities.
What you will do
- Design, build, and maintain scalable infrastructure and automation for traditional and AI systems.
- Develop software to improve reliability and reduce manual operations.
- Implement and manage CI/CD pipelines, including AI model deployment.
- Monitor performance, availability, and security with observability tools.
- Collaborate with data science and ML teams on model training, serving, and lifecycle management.
- Lead incident response, root cause analysis, and postmortems.
- Advocate SRE principles across engineering and AI teams.
Requirements
- 5+ years in SRE, DevOps, or software engineering.
- Strong programming skills in languages such as Python or Java.
- Experience with AI/ML workloads (model training, inference, GPU orchestration).
- Deep knowledge of Linux, cloud platforms (primarily Azure, AWS), container orchestration.
- Infrastructure-as-code and automation tooling (Terraform, Ansible, GitHub Actions).
- Monitoring/logging (Dynatrace), networking, security, distributed systems.
- Excellent communication and collaboration.
Nice to have
- AI model observability, drift detection, performance monitoring.
- Open-source contributions in SRE, DevOps, or ML infrastructure.
- Cloud platform certifications.
Culture & Benefits
- Competitive compensation and incentive opportunities.
- Health, dental, vision, life insurance.
- 401(k) with up to 6% company match; employer-paid retirement plan (4%).
- Tuition reimbursement up to $5,250/year.
- 20 days PTO, 9 company holidays, flexible Diversity Celebration Day.
- 40 hours paid volunteer time per year.