Company hidden
Posted 4 days ago

Staff Site Reliability Engineer (AI/Azure)

Salary: $183,400–$245,400
Work format: remote (USA)
Employment type: full-time
Grade: principal
English: B2
Country: US

Job description

TL;DR

Staff Site Reliability Engineer (AI/Azure): Leading the design and optimization of scalable, resilient infrastructure for cloud-native AI services on Azure, with an emphasis on continuous delivery, observability, and automation. Focus on architecting solutions, establishing best practices for SLIs/SLOs, and fostering a reliability culture across teams.

Location: US Remote

Salary: $183,400–$245,400 USD

Company

hirify.global is a product company building an AI platform that powers autonomous agents and real-time learning.

What you will do

  • Lead the design, implementation, and optimization of scalable, resilient infrastructure for cloud-native AI services on Azure.
  • Establish continuous delivery (CD) pipelines supporting blue-green deployments, automatic rollbacks, and progressive delivery patterns.
  • Champion observability excellence, defining best practices for metrics, tracing, logging, SLIs, SLOs, and error budgets.
  • Drive automation across the entire lifecycle: infrastructure provisioning, testing, deployment, and recovery.
  • Partner with the engineering team to design reliable, fault-tolerant services and perform resilience and capacity reviews.
  • Mentor engineers and foster a reliability culture across teams to enable self-healing, observable systems.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
  • Solid experience in SRE, DevOps, or infrastructure engineering, with strong hands-on expertise in Azure.
  • Proven experience designing and operating distributed systems at scale, with a strong understanding of reliability engineering principles (SLIs, SLOs, SLAs).
  • Deep proficiency with Terraform, Kubernetes, Docker, and modern IaC and container orchestration best practices.
  • Expertise in CI/CD automation and release engineering, capable of implementing blue-green, canary, and rollback mechanisms.
  • Advanced use of observability tools such as Mimir, Grafana, Prometheus, and the ELK stack.

Nice to have

  • Knowledge of SQL Server and PostgreSQL performance tuning and management in cloud environments.
  • Experience promoting GitOps workflows and tools such as Argo CD or Flux.

Culture & Benefits

  • Flexible time off with ample learning and development opportunities including leadership training.
  • Comprehensive onboarding program and recognition through Bonusly and peer-nominated awards.
  • Company-paid medical, dental, and vision (with 100% employer-paid options and 90% coverage for dependents), FSA, HSA, 401(k) match, and telehealth options including One Medical.
  • Parental leave and support, up to $20k in fertility services, surrogacy, and adoption reimbursement, maternity support through Maven Maternity, and free breast milk shipping through Maven Milk.
  • Pet insurance, legal advisory services, and financial planning tools.
  • Commitment to individuality, uniqueness, and diversity in a non-discriminatory environment.
