Site Reliability Engineer
Job description
TL;DR
Site Reliability Engineer (DevOps): Manage the reliability, availability, and performance of production systems for a distributed workflow orchestration platform, with an emphasis on Kubernetes, observability, and infrastructure automation. The role focuses on designing scalable systems, leading incident response, and optimizing performance for mission-critical workflows.
Location: Must be based in Australia or EMEA
Salary: $180,000–$250,000
Company
The company provides a platform for building reliable, scalable, event-driven applications based on the open-source Conductor project.
What you will do
- Own the reliability, availability, and performance of production systems in cloud environments.
- Define and monitor SLIs/SLOs and manage error budgets across the platform.
- Lead incident response efforts including detection, triage, mitigation, and postmortems.
- Improve observability through logging, monitoring, alerting, and dashboards.
- Automate operational workflows and reduce manual toil.
- Partner with engineering teams to improve system resiliency and scalability.
Requirements
- Must be based in Australia or EMEA.
- 5+ years of experience in SRE, DevOps, or Platform Engineering.
- Strong experience with cloud platforms (AWS, GCP, or Azure).
- Hands-on experience with Kubernetes and containerized environments.
- Strong understanding of distributed systems and microservices architecture.
- Proficiency with infrastructure automation and scripting (Terraform, Python, Bash).
Nice to have
- Experience supporting large-scale SaaS or cloud-native platforms.
- Familiarity with workflow orchestration technologies like Conductor, Temporal, or Camunda.
- Experience with Kafka or event-driven architectures.
- Knowledge of security best practices and cloud infrastructure hardening.
Culture & Benefits
- Remote-friendly culture with strong engineering autonomy.
- Comprehensive health coverage including medical, dental, and vision.
- Flexible PTO policy.
- Support for personal development.
- Opportunity to work on complex distributed systems at scale.