Company hidden
1 day ago

Senior Site Reliability Engineer (AI)

$160,000 - $195,000
Work format
remote (USA only)
Employment type
full-time
Grade
senior
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Site Reliability Engineer (Python/AWS): Building and optimizing high-availability infrastructure for a public safety AI platform with an emphasis on system resilience and observability. Focus on diagnosing root causes of failures in Kubernetes, optimizing high-throughput messaging systems, and implementing AI-driven reliability improvements.

Location: Remote (New York or Boston); must be able to collaborate in person a few times per quarter.

Salary: $160,000 - $195,000

Company

hirify.global is a leading public safety AI company that provides mission-critical intelligence to first responders and security teams to enable faster emergency response.

What you will do

  • Own performance and reliability outcomes by optimizing connection pooling, database architecture, and traffic routing.
  • Design for system resilience through safer deployment patterns, failover strategies, and redundancy.
  • Implement deep observability using structured logging, metrics, and alerting to detect issues before escalation.
  • Manage production incidents from initial signal to final resolution and implementation of the root-cause fix.
  • Work across infrastructure-as-code, container orchestration, and application code to drive stability.

Requirements

  • 5+ years of professional engineering experience with deep expertise in Python.
  • Hands-on experience with AWS (networking, managed databases, IAM, DNS routing).
  • Production experience with Kubernetes or AWS container orchestration (EKS, ECS, or Fargate).
  • Strong understanding of distributed systems failure modes (resource exhaustion, replication lag).
  • Experience with high-throughput messaging (RabbitMQ, Kafka, SNS/SQS) and Terraform.
  • Must be based in or near New York or Boston for occasional in-person collaboration.

Nice to have

  • Experience with on-call rotations for mission-critical production systems.
  • Proficiency with Datadog (APM, alerting), Elasticsearch, or OpenSearch.
  • Experience with ArgoCD and GitOps deployments.
  • Experience modernizing legacy CI/CD pipelines (Jenkins, Concourse).

Culture & Benefits

  • Opportunity to work on a mission-driven product that saves lives globally.
  • Competitive salary and benefits package.
  • Equity participation (stock options).
  • Dynamic and flexible startup environment with a highly talented team.
