Company hidden
1 day ago

Site Reliability Engineer III (SaaS)

148,320 - 185,400 USD
Work format
remote (USA only)
Employment type
full-time
Grade
senior
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer III (SaaS): Own the reliability, scalability, and security of hirify.global's production infrastructure on AWS, with a focus on supporting a B2B SaaS platform that processes sensitive employee leave data for enterprise customers. The role centers on leading incident response, refining and testing disaster recovery plans, and contributing to SOC 2 audit readiness.

Location: Remote (United States)

Salary: 148,320 - 185,400 USD per year

Company

hirify.global transforms the employee experience with secure, intuitive technology that helps employers bring humanity, certainty, and efficiency to complex workplace moments.

What you will do

  • Architect, implement, and operate scalable, resilient, and secure AWS infrastructure.
  • Lead infrastructure-as-code initiatives to ensure reproducible, auditable, and consistently configured environments.
  • Design, maintain, and improve CI/CD pipelines using Jenkins and GitHub.
  • Own the Datadog observability platform, including dashboards, monitors, alerting thresholds, and log management.
  • Serve as a senior technical responder across the full incident lifecycle and lead blameless postmortems.
  • Refine, implement, and test disaster recovery plans to meet RTO/RPO objectives, while contributing to SOC 2 audit readiness.

Requirements

  • 5+ years of experience in SRE, DevOps, or a related engineering role, with advanced hands-on expertise in AWS production environments.
  • Strong proficiency in infrastructure-as-code tooling such as Terraform, CloudFormation, or CDK, paired with experience building and operating CI/CD pipelines using Jenkins and GitHub.
  • Proficiency in Python, Go, or Bash for automation, alongside hands-on experience with Datadog or a comparable observability platform.
  • Demonstrated experience leading incident response in complex, distributed systems, with working knowledge of SLO/SLI frameworks, error budgets, and disaster recovery planning against defined RTO/RPO objectives.
  • Familiarity with SOC 2 compliance frameworks and experience contributing to audit readiness, access controls, and security control evidence collection.
  • A collaborative, ownership-driven mindset with strong communication skills, a passion for mentoring junior engineers, and a commitment to reducing toil through automation and AI-assisted tooling.

Culture & Benefits

  • Remote-first and results-driven environment with the freedom and flexibility to do your best work.
  • Access to learning resources, leadership programs, and real opportunities to take on new challenges and expand your impact.
  • Comprehensive benefits, a performance-based bonus program, and equity opportunities.
  • Flexible time off, paid holidays, and flexible leave programs designed to support every season of life.
  • Inclusive culture where every voice is valued, collaboration is celebrated, and success is shared.
