TL;DR
Cloud/Site Reliability Engineer: Evolving cloud-hosted environments to be more self-aware, self-healing, and scalable, ensuring high availability and performance of applications and services. Focus on proactive monitoring, root cause analysis, and automation-driven remediation.
Location: Providence, RI, US
Salary: $117,880 - $240,000
Company
Brightstar is an innovative, forward-thinking global leader in lottery, building on its renowned expertise in delivering secure technology and reliable, comprehensive solutions for its customers.
What you will do
- Design and refine monitoring strategies using tools like Dynatrace, Prometheus, and ELK.
- Develop and implement self-healing capabilities that proactively detect and remediate issues, minimizing manual intervention and downtime.
- Analyze operational workflows to identify repetitive tasks and transform them into scalable, automated solutions.
- Manage cloud infrastructure and services.
- Participate in a 24x7 on-call rotation, including after-hours support for critical incident response.
Requirements
- Hands-on experience in cloud operations or site reliability engineering.
- Practical experience managing public cloud infrastructure and services (Azure or AWS knowledge preferred).
- Proficiency in scripting and automation (Terraform, PowerShell, Python, Bash).
- Experience with Infrastructure as Code (IaC) and GitOps principles.
- Hands-on experience with Kubernetes (K8s) and container orchestration.
- Expertise in monitoring tools (Dynatrace, Datadog, Prometheus, ELK).
Nice to have
- Experience applying agentic AI techniques to drive intelligent automation, optimize cloud services, accelerate troubleshooting and root-cause analysis, and enhance system resilience and recoverability.
- Familiarity with AI/ML Ops or AI-assisted observability tools.
- Thorough understanding of Java application workloads and Java performance-related topics.
Culture & Benefits
- Be part of a forward-thinking Cloud Infrastructure Engineering, Operations & Automation team that values prevention over reaction, automation over repetition, and collaboration over silos.
- 401(k) Savings Plan with Company contributions.
- Health, dental, and vision insurance.
- Paid time off, wellness programs, and identity theft insurance.