Company hidden
1 month ago

Platform Operations Lead (AI Cloud)

Work format
Remote (United Kingdom only)
Employment type
Full-time
Grade
Lead
English
B2
Country
UK

Job description

TL;DR

Platform Operations Lead (AI Cloud): Scale the operational maturity of cloud infrastructure, with an emphasis on automation, tooling, and reliability. The role focuses on reducing the operational load on engineering teams by designing self-service solutions and improving incident response.

Location: Remote (UK)

Company

hirify.global is the company behind Hyperstack, a full-stack AI cloud providing on-demand and private GPU infrastructure for AI researchers and enterprises.

What you will do

  • Build and improve scalable infrastructure operations processes to support a growing cloud platform.
  • Develop secure automation, diagnostics, and tooling for customer-facing and operational teams.
  • Identify operational pain points and transform them into automated or self-service solutions.
  • Support the rollout of new infrastructure environments in collaboration with DevOps and Engineering.
  • Enhance observability, incident response, and operational documentation across production environments.
  • Design runbooks and escalation paths between technical and customer-facing teams.

Requirements

  • Strong background in infrastructure, cloud operations, DevOps, or platform engineering.
  • Experience supporting production environments with a focus on reliability and incident response.
  • Proven ability to design and implement automation that reduces manual operational workloads.
  • Strong scripting and workflow automation skills, with an emphasis on security and maintainability.
  • Ability to coordinate between multiple technical and non-technical teams.
  • Experience with observability, monitoring, and orchestration technologies.

Nice to have

  • Experience with GPU, high-performance compute, or managed infrastructure platforms.
  • Exposure to Kubernetes, OpenStack, Grafana, or the Windmill automation platform.
  • Experience maturing 24/7 support, NOC, or SRE capabilities.
  • Experience designing self-service tooling for support and operations.

Culture & Benefits

  • Competitive salary and annual discretionary bonus scheme.
  • Employee wellbeing benefits and 25 days of holiday plus public holidays.
  • Flexible working arrangements.
  • High degree of ownership, autonomy, and trust to experiment.
  • Collaborative, international culture built on transparency and ownership.
