Site Reliability Engineer, Infrastructure - Analytics Platform (AI)
Мэтч & Сопровод
Для мэтча с этой вакансией нужен Plus
Описание вакансии
TL;DR
Site Reliability Engineer, Infrastructure - Analytics Platform (DevOps/AI): Designing and operating critical data infrastructure for AI research with an accent on high-throughput pipelines and low-latency data systems. Focus on scaling ClickHouse clusters, optimizing Kafka ingestion, and ensuring production reliability for analytics platforms.
Location: San Francisco
Salary: $230k – $385k + Equity
Company
is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Manage the infrastructure lifecycle across provisioning, upgrades, and scaling using an IaC-first approach.
- Operate and scale ClickHouse clusters, focusing on sharding, replication, capacity planning, and performance tuning.
- Optimize Kafka as the ingestion backbone to improve throughput, lag handling, and failure recovery.
- Build robust monitoring and alerting systems using SLIs/SLOs, dashboards, and actionable runbooks.
- Develop and implement incident response standards, on-call practices, and disaster recovery strategies.
- Partner with software engineers to embed reliability into design, implementation, and release processes.
Requirements
- Track record of owning production infrastructure for data-heavy, low-latency systems end to end.
- Strong hands-on experience operating ClickHouse, Kafka, and large-scale data systems.
- Practical experience with Snowflake workflows and cross-system data architecture.
- Strong operational experience with Kubernetes, Terraform, and cloud infrastructure.
- Ability to independently define and implement operational standards and runbooks.
- Location: Must be based in San Francisco
Culture & Benefits
- Opportunity to work on critical infrastructure that accelerates the progress toward AGI.
- Environment emphasizing high personal rigor and organization in high-pressure production settings.
- Collaborative culture working across engineering and research teams.
- Competitive compensation package including equity.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →