Company hidden

Updated 2 hours ago

Manager, Reliability Operations (AI)

Work format

hybrid

Employment type

full-time

Grade

lead

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Manager, Reliability Operations (AI): Lead a global 24/7 reliability operations function within the AI Resiliency Centre, ensuring high availability and rapid incident response for eCommerce and corporate services, with an emphasis on integrating agentic AI responders and maturing incident management. The role focuses on reducing MTTK/MTTR, implementing observability strategies, and scaling AI-driven automation to prevent customer-impacting incidents.

Location: Prague, Czechia (Flexible work model)

Company

hirify.global is a global travel technology platform powering a family of brands including Expedia, Hotels.com, and Vrbo.

What you will do

Lead a 24/7 global reliability operations team focused on monitoring, triage, and remediation of production systems.
Mature incident management practices to reduce mean time to detect (MTTD) and mean time to resolve (MTTR) across multiple domains.
Partner with engineering and SRE teams to define operational standards, runbooks, and low-level system design (LLD) for reliable operations.
Develop and manage observability strategies using monitoring, alerting, and logging to proactively identify reliability risks.
Build, coach, and mentor a high-performing reliability operations team, fostering a culture of accountability.
Integrate and operate AI/ML-enabled solutions for noise reduction, capacity forecasting, and automated workflows.

Requirements

Bachelor’s degree in Computer Science or Engineering, or equivalent practical experience operating large-scale systems.
Substantial experience leading 24/7 operational teams in reliability operations or SRE.
Proven track record implementing observability and incident management practices for distributed systems.
Hands-on familiarity with AI-driven automation tools and at least one scripting language (Python preferred).
Experience with monitoring and incident response tools such as Datadog, Splunk, Catchpoint, or PagerDuty.
Must be based in Prague, Czechia.

Nice to have

Experience leading reliability for complex, high-traffic, globally distributed systems.
Success in scaling AI/ML capabilities such as predictive capacity modeling or AI-assisted runbooks.
Depth in using AIOps platforms to correlate signals across logs, metrics, and traces.

Culture & Benefits

Flexible work model with access to modern office spaces.
Full benefits package including travel perks and generous time-off.
Parental leave and comprehensive career development resources.
Inclusive and welcoming community with an open company culture.

