Company hidden
Updated 1 day ago

Site Reliability Engineer (AI)

Work format
remote (Saudi Arabia only)
Employment type
full-time
Grade
middle
English
B2
Country
SA
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (AI): Design and maintain highly available, fault-tolerant cloud infrastructure for an AI-native customer experience platform, with an emphasis on scalability, performance, and resilience. The role focuses on automating operational tasks, optimizing Kubernetes clusters, and implementing robust observability to keep the system stable under heavy load.

Location: Remote (based in Riyadh, Saudi Arabia)

Company

hirify.global is an AI-native platform for customer experience intelligence that manages entire customer lifecycles autonomously.

What you will do

  • Design and maintain highly available, scalable, and fault-tolerant cloud infrastructure.
  • Manage and optimize cloud environments across AWS, GCP, or Azure using Terraform.
  • Operate and scale Kubernetes clusters in production environments.
  • Implement and refine monitoring and observability systems using tools like Prometheus, Grafana, or Datadog.
  • Automate repetitive operational tasks and improve CI/CD deployment reliability.
  • Respond to incidents, lead root cause analysis, and proactively eliminate single points of failure.

Requirements

  • Approximately 3 years of experience in SRE, DevOps, or infrastructure engineering.
  • Hands-on experience with Kubernetes in production and cloud environments (AWS, GCP, or Azure).
  • Proficiency in Infrastructure as Code tools like Terraform.
  • Strong scripting skills in Python or Bash for workflow automation.
  • Solid understanding of networking, load balancing, and high-availability design.
  • Experience with CI/CD pipelines and monitoring tools like Prometheus, Grafana, or ELK.

Nice to have

  • Experience with RabbitMQ or Redis in production.
  • Familiarity with Ansible or AWX.
  • Exposure to multi-cloud or hybrid environments.
  • Relevant cloud or Linux certifications.

Culture & Benefits

  • Opportunity to work on a mission-critical AI platform.
  • Focus on building systems that simplify complexity and improve engineering effectiveness.
  • Collaborative environment working closely with DevOps and engineering teams.

Hiring process

  • Screening Interview with Talent Acquisition.
  • Technical Interview with SRE Lead.
  • Technical Task.
  • Final Interview with SRE Lead and Cloud DevOps Director.
