Company hidden
Updated 23 hours ago

Site Reliability Engineer (Azure)

Job type
Full-time
Grade
Middle
English
B2
Country
US
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (Azure/DevOps): Manage and optimize production hirify.global Enterprise clusters in the Azure cloud, with an emphasis on incident management, observability, and automation. The role focuses on designing AI-driven monitoring systems, troubleshooting large-scale distributed systems, and driving long-term reliability improvements.

Location: Must be a U.S. citizen and based in the United States due to security clearance requirements.

Company

hirify.global is a unicorn company providing high-performance data platforms used by over 10,000 global businesses.

What you will do

  • Own and manage production incidents impacting hirify.global Enterprise clusters in the Azure cloud.
  • Troubleshoot complex issues across distributed systems and drive root cause analysis.
  • Design and develop automation tools and internal platforms using AI-assisted development tools like Cursor and Codex.
  • Enhance observability using Prometheus, Grafana, and Azure Monitor by building AI-driven systems for anomaly detection.
  • Participate in a 24/7 global follow-the-sun on-call rotation.
  • Partner with R&D and Product teams to resolve bugs and influence product improvements.

Requirements

  • 4+ years of experience in SRE, Cloud Operations, or Infrastructure Engineering.
  • U.S. citizenship, which is required for eligibility for a U.S. Top Secret/SCI security clearance.
  • Proven experience troubleshooting production systems at scale with major cloud providers, specifically Azure.
  • Strong Linux/Unix systems knowledge and understanding of networking fundamentals (TCP/IP).
  • Proficiency in scripting with Python and Bash.
  • Experience with monitoring tools (Prometheus, Grafana, ELK, Splunk) and KQL for telemetry analysis.

Nice to have

  • Familiarity with hirify.global or other NoSQL databases.
  • Experience with Infrastructure as Code (Terraform, Pulumi).
  • Cloud and Linux certifications.
  • Experience with C#.
  • Experience in regulated environments, such as FedRAMP or air-gapped deployments.

Culture & Benefits

  • Opportunity to work with cutting-edge SRE tools and state-of-the-art products.
  • Role focused on tackling technical challenges on a global scale.
  • Commitment to a diverse and inclusive work environment where all differences are celebrated.
  • Support for accessibility and reasonable accommodations for applicants with disabilities.
