Senior Site Reliability Engineer (SaaS)
Job description
TL;DR
Senior Site Reliability Engineer (SaaS): Manage and optimize production infrastructure for a global Unified Communications platform, with an emphasis on moving from reactive operations to a proactive, automation-first SRE model. The role focuses on eliminating toil through automation, driving incident response strategy, and maintaining platform reliability at scale.
Location: Manila, Philippines
Company
The company is a global provider of Unified Communications and CX solutions, empowering businesses with voice, fax, messaging, and collaboration services.
What you will do
- Own platform reliability across global UC infrastructure, driving the overall reliability strategy and incident response.
- Triage and resolve critical infrastructure failures and act as the senior escalation point for the NOC.
- Build automation to eliminate toil and redesign maintenance processes to prevent systemic failures.
- Lead blameless post-mortems and translate technical production events into business-readable communication.
- Define and track SLIs, SLOs, and SLAs to drive data-driven reliability investments.
- Mentor junior engineers and provide technical leadership for complex infrastructure projects.
Requirements
- 6+ years of experience in SRE, platform operations, or infrastructure engineering.
- Mastery of Linux systems administration and distributed systems.
- Hands-on experience with at least one major cloud provider (OCI, AWS, GCP, or Azure).
- Proficiency in Python or Bash for automation and log parsing.
- Strong incident response discipline and experience with on-call rotations.
- Demonstrated technical leadership and an AI-forward mindset.
Nice to have
- Experience with Oracle Cloud Infrastructure (OCI).
- Familiarity with VoIP and SIP infrastructure (registration, trunking, signaling).
- Knowledge of observability tools: Prometheus, Grafana, and PagerDuty.
- Experience with Ansible for configuration management and deployment automation.
Culture & Benefits
- Shared on-call rotation (approximately 1 week per month) with a supportive escalation culture.
- Blameless culture focusing on systemic improvements rather than individual errors.
- Collaborative environment working across NOC, Engineering, Sales, and Professional Services.
- Commitment to equal employment opportunities (EEO).