15 days ago

Site Reliability Engineer (Postgres)

Work format

remote (Global)

Employment type

full-time

Grade

senior

English

Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (Postgres/AWS): Establish reliability practices and frameworks that enable engineering teams to own their reliability, with an emphasis on SLIs, SLOs, and error budget policies. The role focuses on designing sustainable on-call practices, automating operational toil, and driving systemic fixes from incident postmortems.

Location: Fully Remote (Global)

Company

Supabase is the Postgres development platform providing a complete backend solution including Database, Auth, Storage, Edge Functions, Realtime, and Vector Search.

What you will do

Partner with service teams to define meaningful SLIs and SLOs and build error budget policies to guide engineering decisions.
Own and evolve the Operational Readiness Review (ORR) process for new services and major changes.
Strengthen the incident-to-improvement pipeline by connecting postmortem findings to systemic fixes.
Act as the reliability expert for architecture reviews, failure mode analysis, and resilience design.
Identify and quantify operational toil across the organization and advocate for automation to eliminate it.
Help teams design sustainable on-call practices, improving alert quality and reducing noise.

Requirements

7+ years of experience in SRE, production engineering, or reliability-focused roles.
Proven experience shaping SRE practices and driving adoption across engineering teams.
Software engineering mindset with the ability to write code and build tools.
Hands-on experience operationalizing SLOs/SLIs at scale and implementing error budget policies.
Deep expertise in incident response, postmortem facilitation, and systemic improvement.
Proficiency with AWS and infrastructure-as-code (Pulumi preferred, Terraform/CDK acceptable).

Nice to have

Experience with Kubernetes-based platform operations.
Familiarity with OpenTelemetry, VictoriaMetrics, Grafana, or similar observability tooling.
Experience building developer-facing reliability tooling such as SLO dashboards or DORA metrics tracking.

Culture & Benefits

Fully remote work with a WeWork membership or co-working allowance provided.
Equity ownership (ESOP) for every team member.
100% health insurance coverage for employees and 80% for dependents.
Annual company-wide off-sites in different cities.
Async-first work environment with a professional development allowance for learning.

Hiring process

Application review followed by a short intro video call.
Up to four interviews with team leads, peers, cross-functional partners, and leadership.
Final decision via follow-up questions or a direct offer.
