Company hidden
1 day ago

Staff Site Reliability Engineer (AWS/Kubernetes)

Work format
remote (Colombia only)/hybrid
Employment type
full-time
Grade
senior
English
B2
Country
Colombia
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Site Reliability Engineer (AWS/Kubernetes): Define and raise reliability standards across the healthcare platform, with an emphasis on systemic reliability risks and cross-cutting solutions. The role focuses on designing scalable SLO frameworks, leading complex incident response, and driving observability strategy across distributed systems.

Location: Must be based in Colombia (Virtual-first environment with hybrid options)

Company

hirify.global is a technology-enabled care platform transforming healthcare delivery by providing convenient, affordable, and effective care on a global scale.

What you will do

  • Define and evolve platform-wide reliability standards, patterns, and tooling.
  • Design and implement cross-cutting mechanisms such as circuit breakers, retry policies, and load shedding.
  • Establish scalable SLO frameworks and lead complex, multi-service incident response as an incident commander.
  • Drive observability strategies using metrics, logs, traces, and alerting systems to reduce time to resolution.
  • Collaborate with Platform Engineering to strengthen Kubernetes (EKS), networking, and data system reliability.
  • Mentor senior engineers through design reviews and promote a culture of proactive operational excellence.

Requirements

  • 8+ years of experience in SRE, infrastructure, or production engineering roles.
  • Deep expertise in AWS environments and Kubernetes (preferably EKS).
  • Hands-on experience with Infrastructure as Code tools such as Terraform or CDKTF.
  • Advanced understanding of distributed systems, networking, and failure modes.
  • Experience designing and managing observability stacks (Prometheus, Grafana, OpenTelemetry).
  • Must be located in Colombia to access local benefits and medical coverage.

Nice to have

  • Experience with service mesh technologies (e.g., Istio) and mTLS.
  • Familiarity with GitOps workflows (e.g., ArgoCD, Flux).
  • Experience working in compliance-driven environments subject to HIPAA, SOC 2, or FedRAMP.
  • Exposure to chaos engineering practices and FinOps (cost-aware infrastructure design).

Culture & Benefits

  • Virtual-first work environment with a hybrid allowance and work-life flexibility.
  • Comprehensive health benefits including Medical Plan Coverage by Colmédica and Pan American.
  • Generous leave policies: 18 weeks maternity leave and dedicated Mental Health Days.
  • Summer Fridays and an annual bonus program.
  • Professional growth opportunities via LinkedIn Learning and tuition reimbursement.
