Назад
Company hidden
1 день назад

Site Reliability Engineer, Infrastructure - Analytics Platform (AI)

230 000 - 385 000$
Формат работы
onsite
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Site Reliability Engineer, Infrastructure - Analytics Platform (DevOps/AI): Designing and operating critical data infrastructure for AI research with an accent on high-throughput pipelines and low-latency data systems. Focus on scaling ClickHouse clusters, optimizing Kafka ingestion, and ensuring production reliability for analytics platforms.

Location: San Francisco

Salary: $230k – $385k + Equity

Company

hirify.global is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Manage the infrastructure lifecycle across provisioning, upgrades, and scaling using an IaC-first approach.
  • Operate and scale ClickHouse clusters, focusing on sharding, replication, capacity planning, and performance tuning.
  • Optimize Kafka as the ingestion backbone to improve throughput, lag handling, and failure recovery.
  • Build robust monitoring and alerting systems using SLIs/SLOs, dashboards, and actionable runbooks.
  • Develop and implement incident response standards, on-call practices, and disaster recovery strategies.
  • Partner with software engineers to embed reliability into design, implementation, and release processes.

Requirements

  • Track record of owning production infrastructure for data-heavy, low-latency systems end to end.
  • Strong hands-on experience operating ClickHouse, Kafka, and large-scale data systems.
  • Practical experience with Snowflake workflows and cross-system data architecture.
  • Strong operational experience with Kubernetes, Terraform, and cloud infrastructure.
  • Ability to independently define and implement operational standards and runbooks.
  • Location: Must be based in San Francisco

Culture & Benefits

  • Opportunity to work on critical infrastructure that accelerates the progress toward AGI.
  • Environment emphasizing high personal rigor and organization in high-pressure production settings.
  • Collaborative culture working across engineering and research teams.
  • Competitive compensation package including equity.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →