Company hidden
1 day ago

Incident & Change Champion (AI Infrastructure)

Work format
remote (Global)
Employment type
fulltime
Grade
senior
English
b2
Country
UK/US/Norway +2 more

Job description

TL;DR

Incident & Change Champion (AI Infrastructure): Owning and optimizing Incident and Change Management processes for a GPU cloud platform, with an emphasis on operational discipline and tooling implementation. The focus is on reducing system downtime through disciplined major incident coordination, CAB leadership, and a blameless postmortem culture.

Location: Remote (Global)

Company

hirify.global is a GPU cloud engineered for AI, providing cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers.

What you will do

  • Develop and refine Incident and Change Management processes to v1.0, including severity declarations, SLA/SLO tables, and communication ladders.
  • Lead the migration and implementation of incident and change workflows within Jira Service Management.
  • Act as Incident Commander or Major Incident Manager for SEV-1 and complex SEV-2 events, coordinating internal and external communications.
  • Chair the Change Advisory Board (CAB) and manage the change calendar, including freeze windows for critical periods.
  • Train and certify a pool of Incident Commanders across Support and SRE teams and run quarterly tabletop exercises.
  • Define and report key operational metrics (MTTA, MTTR, change success rate) to the senior leadership team.

Requirements

  • 5+ years in ITSM / Service Management roles with direct ownership of Incident and Change Management processes.
  • Hands-on experience facilitating major incidents end-to-end as an Incident Commander in a 24/7 production environment.
  • Demonstrable experience running a Change Advisory Board or equivalent change-review forum.
  • Proven track record configuring Jira Service Management, ServiceNow, or equivalent ITSM tooling.
  • Strong technical writing skills for process documents, postmortems, and executive reports.
  • Comfortable holding the room under pressure with senior stakeholders, engineers, and customers at the same time.

Nice to have

  • Experience in cloud, hyperscaler, AI infrastructure, or HPC environments.
  • Familiarity with SRE concepts, including SLOs, error budgets, and runbook discipline.
  • Experience designing and running tabletop exercises and game days.
  • Experience operating processes for regulated or sovereign customer workloads.
  • Familiarity with Jira automation and JSM portals.

Culture & Benefits

  • Competitive compensation package including base salary and equity with annual reviews.
  • Remote-first work environment with high autonomy and human-first flexibility.
  • Opportunity to join a fast-growing tech startup pushing the boundaries of AI infrastructure.
  • Dynamic progression plan tailored to individual professional ambitions.
  • Collaborative and supportive environment focused on ownership, transparency, and accountability.
