Company hidden
Updated 6 days ago

Site Reliability Engineer

Work format
remote (Global)
Employment type
full-time
Grade
senior
English
B2
Country
Malaysia
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (CloudBlue): Ensuring the reliability, scalability, and observability of multi-tenant SaaS platforms for cloud commerce, with an emphasis on monitoring, high availability, and incident response. Focus on designing fault-tolerant architectures, reducing toil through automation, and improving the resilience of Kubernetes-based systems.

Remote opportunity, welcoming applications globally but prioritizing candidates based in Malaysia due to team needs and coverage.

Company

Fast-growing web hosting company providing cloud services, website builders, and the CloudBlue platform for service providers worldwide.

What you will do

  • Define and implement SLIs, SLOs, and error budgets for critical services.
  • Design high-availability architectures with redundancy, failover, and disaster recovery.
  • Build and operate an observability stack using Datadog, Grafana, and Elastic Stack.
  • Lead incident response, postmortems, and reliability improvements.
  • Conduct capacity planning, load testing, and performance optimization.
  • Automate processes and promote SRE best practices across teams.

Requirements

  • 3+ years as SRE, DevOps, or Production Engineer with production ownership.
  • Experience with highly available multi-tenant SaaS platforms.
  • Hands-on with Datadog, Grafana, Elasticsearch/Kibana.
  • Strong knowledge of Linux, networking, and distributed systems.
  • Docker, Kubernetes, Python/Bash scripting.
  • On-call rotations and incident response experience.
  • Strong written and spoken English.

Nice to have

  • Defining SLIs/SLOs and error budgets at scale.
  • Hyperscale or service-provider platforms.
  • Cloud experience, preferably Azure.
  • Hybrid/on-premises integrations.
  • Chaos engineering and resilience testing.

Culture & Benefits

  • Competitive salary and career advancement opportunities.
  • Flexible work arrangements for work/life balance.
  • Friendly culture built on trust, respect, and diversity.
  • 24/7 award-winning customer support in four languages.
  • Professional development and growth in a rapidly expanding team.
