Customer Reliability Manager (AI)
Мэтч & Сопровод
Для мэтча с этой вакансией нужен Plus
Описание вакансии
TL;DR
Customer Reliability Manager (AI): Leading a team of Customer Reliability Engineers to ensure high-quality support for AI infrastructure and deployment models with an accent on BYOC, hybrid, and SaaS Enterprise environments. Focus on reducing friction for developers, managing high-severity incidents, and scaling complex support motions for LLM-powered applications.
Location: On-site in San Francisco, New York City, or Seattle
Company
is an AI observability platform that helps developers understand and improve AI behavior in production by connecting evals and observability in one workflow.
What you will do
- Lead and grow a team of Customer Reliability Engineers across hybrid, BYOC, and enterprise SaaS deployment models.
- Own the primary after-hours on-call rotation for SEV1 incidents and lead the incident response and escalation process.
- Triage and scope deployment-related bugs and feature requests, implementing fixes where feasible.
- Lead new BYOC deployments and upgrades, validating data plane releases against standard hybrid deployments.
- Mentor the team on infrastructure debugging, deployment best practices, and customer ownership.
- Synthesize operational trends and customer feedback for Product and Engineering teams to improve reliability.
Requirements
- 5–10+ years of experience leading support for developer-facing products.
- Deep familiarity with deploying Terraform, Helm, and Kubernetes across major cloud providers.
- Ability to review, debug, and reason about backend services and infrastructure configurations.
- Strong ownership of customer-impacting issues and empathetic communication in high-stakes situations.
- Must be based in and work on-site in San Francisco, New York City, or Seattle.
Nice to have
- Familiarity with OpenAI, Anthropic, or similar LLM providers at a systems or integration level.
- Experience guiding teams working with datasets, evaluation metrics, or prompt engineering.
- Track record of building support tooling or product-led growth initiatives in high-growth startups.
- Experience leading support for self-hosted offerings and containerized environments.
Culture & Benefits
- Comprehensive medical, dental, and vision insurance.
- Daily lunch, snacks, and beverages provided in the office.
- Flexible time off policy.
- Competitive salary and equity packages.
- AI Stipend.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →