Company hidden
2 days ago

Staff Site Reliability Engineer

Work format
remote (USA only)
Employment type
full-time
Grade
senior
English
B2
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Site Reliability Engineer (AWS/Kubernetes): Establish and evolve SRE best practices across the organization, including reliability principles, error budgets, incident response, and an observability strategy with an emphasis on SLIs/SLOs, alerting, dashboards, and automation. Focus on designing software-driven infrastructure solutions, leading large initiatives, and improving platform resilience, scalability, and developer workflows.

Location: Remote

Company

hirify.global is the fastest-growing healthcare technology company, building products that make prescriptions accessible and affordable, including the BlinkRx pharma-to-patient cloud and Quick Save for better access to medications.

What you will do

  • Establish and evolve SRE best practices, including reliability principles, error budgets, incident response, postmortems, and operational readiness.
  • Define and drive observability strategy for system health with SLIs/SLOs, alerting, dashboards, and service indicators.
  • Design and implement software-driven infrastructure solutions to automate processes and reduce toil.
  • Act as a technical leader, set priorities, and influence decisions across cloud infrastructure, reliability tooling, and platform architecture.
  • Own large, ambiguous initiatives from concept to delivery, aligning stakeholders in engineering, security, and product.
  • Improve platform resilience, scalability, performance, and compliance; identify risks and lead upgrades.
  • Partner with teams to enhance developer workflows, tooling, and operational maturity; provide mentorship and code reviews.
  • Lead incident response, escalation, postmortems, and knowledge sharing through documentation.

Requirements

  • Bachelor’s or Master’s in Computer Science or equivalent; 7+ years in SRE, infrastructure, or platform engineering at scale.
  • Expert troubleshooting across the full stack: application, kernel, network; strong Linux and OS fundamentals.
  • Advanced networking: load balancing, proxies, DNS, TCP/IP, NAT, service communication.
  • Experience in Python, Go, Bash; automating operations; building internal tools.
  • Deep cloud experience (AWS preferred; GCP/Azure acceptable), Kubernetes (EKS, Helm), observability systems, containers, microservices.
  • IaC with Terraform, Pulumi, CloudFormation, or Ansible; holistic infrastructure design for cost, reliability, security.

Culture & Benefits

  • Highly collaborative team of builders and operators finding new ways to innovate in healthcare.
  • Impact millions of patients at the intersection of healthcare and finance; build a generational company.
  • A relentlessly learning, curious, and aggressively collaborative cross-functional environment.
  • Equal opportunity employer valuing diversity.
