1 month ago
Platform Operations Lead (AI Cloud)
Job description
TL;DR
Platform Operations Lead (AI Cloud): Scaling the operational maturity of cloud infrastructure with an emphasis on automation, tooling, and reliability. The role focuses on reducing the operational load on engineering teams by designing self-service solutions and improving incident response.
Location: Remote (UK)
Company
The employer is the company behind Hyperstack, a full-stack AI cloud providing on-demand and private GPU infrastructure for AI researchers and enterprises.
What you will do
- Build and improve scalable infrastructure operations processes to support a growing cloud platform.
- Develop secure automation, diagnostics, and tooling for customer-facing and operational teams.
- Identify operational pain points and transform them into automated or self-service solutions.
- Support the rollout of new infrastructure environments in collaboration with DevOps and Engineering.
- Enhance observability, incident response, and operational documentation across production.
- Design runbooks and escalation paths between technical and customer-facing teams.
Requirements
- Strong background in infrastructure, cloud operations, DevOps, or platform engineering.
- Experience supporting production environments with a focus on reliability and incident response.
- Proven ability to design and implement automation that reduces manual operational workloads.
- Strong scripting and workflow automation skills emphasizing security and maintainability.
- Ability to coordinate between multiple technical and non-technical teams.
- Experience with observability, monitoring, and orchestration technologies.
Nice to have
- Experience with GPU, high-performance compute, or managed infrastructure platforms.
- Exposure to Kubernetes, OpenStack, Grafana, or the Windmill automation platform.
- Experience maturing 24/7 support, NOC, or SRE capabilities.
- Experience designing self-service tooling for support and operations.
Culture & Benefits
- Competitive salary and annual discretionary bonus scheme.
- Employee wellbeing benefits and 25 days of holiday plus public holidays.
- Flexible working arrangements.
- High degree of ownership, autonomy, and trust to experiment.
- Collaborative, international culture built on transparency and ownership.