Company hidden
2 days ago

Staff Site Reliability Operations Engineer (GCP)

$136,000–$265,700
Work format
Remote (United States/Canada only)
Employment type
Full-time
English level
B2
Country
US/Canada

Job description

TL;DR

Staff Site Reliability Operations Engineer (GCP): Lead global platform reliability and observability strategy with an emphasis on intelligent, self-healing infrastructure. The role focuses on scaling enterprise-grade GKE topologies, managing high-throughput Kafka streams, and optimizing complex networking across all OSI layers.

Location: Must be based in the United States or Canada

Salary: $136,000–$265,700

Company

hirify.global is a cloud-first, AI-powered platform provider enabling Communication Service Providers to simplify operations and accelerate innovation.

What you will do

  • Architect and optimize complex networking infrastructure spanning Layer 1 through Layer 7.
  • Design and scale a unified observability platform using the Grafana Labs suite.
  • Deploy machine learning models and automated anomaly detection to reduce telemetry noise.
  • Drive the architecture, security, and networking of production GKE clusters.
  • Maintain high-throughput Apache Kafka pipelines and large-scale data environments.
  • Champion the long-term technical roadmap for distributed infrastructure and GCP standards.

Requirements

  • Must be based in the United States or Canada
  • 8+ years of experience in SRE, Production Engineering, or Distributed Systems roles.
  • Expert-level mastery of GKE internals, multi-cluster networking, and GitOps.
  • Deep technical knowledge of networking protocols (BGP, OSPF, TCP, QUIC, gRPC).
  • Proven experience managing high-throughput Kafka and large-scale data tiers (PostgreSQL, AlloyDB, BigQuery).
  • Advanced expertise in Terraform for multi-region GCP architecture and proficiency in Go or Python.

Nice to have

  • Deep knowledge of Google Cloud architectural best practices and Cloud SDN.
  • Understanding of Linux internals, eBPF-based monitoring, and packet analysis tools.
  • Exceptional written communication skills for asynchronous alignment.

Culture & Benefits

  • Fully remote work environment.
  • Opportunity to work on cutting-edge AIOps and cloud-native observability.
  • Collaborative distributed engineering organization across multiple time zones.
  • Comprehensive benefits package including potential bonus eligibility.
