Staff Site Reliability Operations Engineer (GCP)
Job description
TL;DR
Staff Site Reliability Operations Engineer (GCP): Lead global platform reliability and observability strategy with an emphasis on intelligent, self-healing infrastructure. Focus on scaling enterprise-grade GKE topologies, managing high-throughput Kafka streams, and optimizing complex networking across all OSI layers.
Location: Must be based in the United States or Canada
Salary: $136,000–$265,700
Company
A cloud-first, AI-powered platform provider that enables Communication Service Providers to simplify operations and accelerate innovation.
What you will do
- Architect and optimize complex networking infrastructure spanning Layer 1 through Layer 7.
- Design and scale a unified observability platform using the Grafana Labs suite.
- Deploy machine learning models and automated anomaly detection to reduce telemetry noise.
- Drive the architecture, security, and networking of production GKE clusters.
- Maintain high-throughput Apache Kafka pipelines and large-scale data environments.
- Champion the long-term technical roadmap for distributed infrastructure and GCP standards.
Requirements
- Must be based in the United States or Canada
- 8+ years of experience in SRE, Production Engineering, or Distributed Systems roles.
- Expert-level mastery of GKE internals, multi-cluster networking, and GitOps.
- Deep technical knowledge of networking protocols (BGP, OSPF, TCP, QUIC, gRPC).
- Proven experience managing high-throughput Kafka and large-scale data tiers (PostgreSQL, AlloyDB, BigQuery).
- Advanced expertise in Terraform for multi-region GCP architecture and proficiency in Go or Python.
Nice to have
- Deep knowledge of Google Cloud architectural best practices and Cloud SDN.
- Understanding of Linux internals, eBPF-based monitoring, and packet analysis tools.
- Exceptional written communication skills for asynchronous alignment.
Culture & Benefits
- Fully remote work environment.
- Opportunity to work on cutting-edge AIOps and cloud-native observability.
- Collaborative distributed engineering organization across multiple time zones.
- Comprehensive benefits package including potential bonus eligibility.