Назад
Company hidden
5 дней назад

SRE (Site Reliability Engineering) (GCP)

150 000 - 275 000$
Формат работы
onsite
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

SRE (Site Reliability Engineering) (GCP): Own reliability across GCP infrastructure, lead incident response, and drive scalable database and search operations with an accent on Kubernetes, Terraform-managed environments, and production observability. Focus on preventing recurrence through end-to-end incident handling, scaling MySQL/Postgres, and tuning Elasticsearch clusters under real-world load.

Location: New York City (On-site)

Salary: $150K–$275K (Offers Equity)

Company

hirify.global builds a large-scale platform for gaming clips and creator-powered gaming communities.

What you will do

  • Own reliability across GCP infrastructure, including Kubernetes clusters, managed services, and data pipelines to improve availability and latency.
  • Lead incident response end-to-end: on-call rotations, runbooks, postmortems, and follow-through to prevent recurrence.
  • Architect and execute database scaling strategies (sharding, replication, query optimization, capacity planning) for MySQL and Postgres.
  • Manage and evolve Terraform-managed GCP and Kubernetes configurations.
  • Own Elasticsearch end-to-end: capacity planning, sharding strategy, index lifecycle management, upgrades, and performance tuning.
  • Build observability (metrics, dashboards, alerting, tracing) and improve CI/CD reliability across GitHub Actions; harden IAM, secrets management, and network segmentation.

Requirements

  • Hands-on experience scaling and sharding relational databases (MySQL/Postgres) in production.
  • Strong GCP experience with Kubernetes, VPC, IAM, Cloud Logging, and managed services.
  • Fluency in Terraform with ownership of infrastructure-as-code at scale.
  • Production experience operating Elasticsearch and keeping clusters healthy.
  • Proven incident response skills: calm execution during P0s, clear communication, and postmortems that prevent recurrence.
  • Excellent communication skills to flag issues quickly and lead/write actionable postmortems.

Culture & Benefits

  • Comprehensive medical, dental, and vision coverage.
  • 401(k) and wellness/fitness perks (Wellhub membership) plus mental health resources.
  • Paid parental leave, fertility and maternal health benefits.
  • Generous PTO and daily meals and commuter benefits at the NYC HQ in Flatiron.
  • Learning and development stipend; competitive salary with meaningful equity.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →