Principal Site Reliability Engineer (Cybersecurity)

$164,500 - $235,000

Work format

remote (USA only)/hybrid

Employment type

full-time

Grade

senior

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Principal Site Reliability Engineer (Cybersecurity): Building and optimizing high-availability, scalable cloud infrastructure across multi-cloud environments with an emphasis on automation and observability. Focus on reducing Mean Time to Mitigate (MTTM), implementing self-healing systems, and maturing architectural standards for a global security platform.

Location: Must be based in the USA. Hybrid (3 days a week in San Jose, CA) or Remote options available.

Salary: $164,500 – $235,000 USD

Company

hirify.global is an AI-forward enterprise providing a cloud-native Zero Trust Exchange platform to secure digital transformation for millions of users.

What you will do

Design and implement highly available, scalable infrastructure across AWS, Azure, GCP, and bare-metal environments.
Drive an "automation-first" culture by writing Python and Go code to eliminate manual toil.
Implement sophisticated observability using Prometheus, Grafana, and OpenTelemetry, defining SLIs/SLOs and error budgets.
Serve as a lead Incident Commander, develop response playbooks, and conduct deep-dive post-incident analyses.
Partner with Engineering teams to perform technical operability reviews.

Requirements

10+ years of experience managing reliability, scalability, and availability for large-scale production services.
Deep expertise in programming with Python, Go, or C/C++.
Strong background in networking protocols, Linux/FreeBSD systems, and distributed architecture.
Experience in high-stakes incident management and participation in a 24/7 on-call rotation.
Proficiency in ITIL frameworks to drive service maturity through systematic problem management.
Authorized to work in the US.

Nice to have

Extensive experience with Infrastructure-as-Code (Ansible, Terraform) and public clouds.
Experience with chaos engineering and disaster recovery planning at scale.
Expertise in global routing (BGP), traffic tunneling (GRE, IPSec), and L7 proxy architectures (HAProxy).

Culture & Benefits

Comprehensive and inclusive benefits program supporting families through all life stages.
A high-accountability culture that values customer obsession and impact over activity.
Environment based on transparency, trust, and constructive, honest debate.
Flexible work arrangements including remote and hybrid options.
