Posted 2 days ago
Staff Site Reliability Engineer (GCP)
Job description
TL;DR
Staff Site Reliability Engineer (GCP): Build and optimize resilient infrastructure for an agentic software creation platform, with an emphasis on observability, automation, and high availability. The role focuses on scaling Kubernetes clusters, designing sophisticated monitoring solutions, and reducing MTTR through automated incident response.
Location: Remote (Europe)
Company
The company is an agentic software creation platform that enables anyone to build applications using natural language.
What you will do
- Architect and implement comprehensive monitoring, logging, and tracing solutions to provide real-time visibility into system health.
- Define and drive reliability standards by implementing and tracking SLOs and SLIs across product teams.
- Lead incident management and response for high-impact events, conducting blameless post-mortems to prevent recurrence.
- Drive automation and Infrastructure as Code (IaC) using Terraform or Pulumi to eliminate operational toil.
- Optimize large-scale Kubernetes deployments on GCP to resolve performance bottlenecks and reduce latency.
- Provide staff-level guidance on system designs and mentor engineers to instill reliability as a core engineering value.
Requirements
- 8-10 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
- Strong programming skills in Python or Go with a focus on high-quality, tested code.
- Deep understanding of distributed systems and experience scaling production services.
- Expertise with Kubernetes and cloud-native technologies.
- Proven track record of designing sophisticated monitoring and observability solutions.
- Must be based in Europe.
Nice to have
- Deep experience with Google Cloud Platform (GCP) services and tools.
- Expert-level knowledge of Prometheus, Grafana, Datadog, or OpenTelemetry.
- Experience in rapid-growth startup environments.
- Ability to write company-facing technical blog posts and training materials.
Culture & Benefits
- Competitive salary and equity.
- 401(k) program with a 4% match.
- Comprehensive health, dental, vision, and life insurance.
- Flexible Time Off (FTO) and paid parental, medical, and caregiver leave.
- Autonomous work environment with wellness stipends and in-office setup reimbursement.
- Quarterly team gatherings to foster collaboration.