Infrastructure Support Engineer

Work format

onsite

Employment type

fulltime

English

Country



Job description

TL;DR

Infrastructure Support Engineer (GPU Cloud): Ensuring the efficiency, reliability, and scalability of data centre infrastructure, with an emphasis on monitoring, troubleshooting, and customer support. Focus on handling tickets, following runbooks, collaborating with engineering teams, and identifying automation opportunities.

Location: UK, with availability to travel to hirify.global or customer locations for deployments, troubleshooting, and operational tasks.

Company

GPU cloud engineered for AI, providing cost-effective, high-performance infrastructure for AI startups and enterprises.

What you will do

Join support duty rotation to handle tickets, alerts, and incidents, escalating appropriately and collaborating with engineering.
Manage and resolve tickets using the ticketing system, keeping all parties informed with clear notes and communications.
Follow runbooks for common issues, propose improvements, and contribute fixes; participate in monitoring, triage, and log capture.
Deliver tasks and projects to timelines, flag blockers early, and share knowledge through documentation and training materials.
Participate in incident reviews, identify automation opportunities, and collaborate with cross-functional teams including onsite operations.
Take on on-call and out-of-hours work when scheduled, and keep upskilling.

Requirements

Growth mindset: curious, dependable, collaborative, seeking feedback and investing in learning.
Platform/DC fundamentals: servers, networks, storage, and virtualization, gained from a support or operations background.
Linux fundamentals: CLI, systemd, filesystems, permissions, basic networking tools, troubleshooting.
Networking basics: IP addressing, subnets, VLANs, routing, DNS, firewalls.
Kubernetes exposure: core concepts, basic troubleshooting, runbooks.
GPU awareness: basic diagnostics like nvidia-smi.
Observability: dashboards, alerts.
Scripting/automation: Bash/Python snippets, Git; cloud/virtualization basics.

Nice to have

Hands-on Kubernetes administration, operators, storage/networking add-ons.
Deeper GPU/HPC: RDMA/InfiniBand, distributed workloads, NCCL.
Infrastructure as Code: Ansible, Terraform; GitOps, CI/CD.
Access/security tools like Teleport or Vault; relevant certifications.

Culture & Benefits

Collaborative, supportive, innovative environment with real impact in a fast-growing AI tech startup.
Highly competitive package (base + equity) with reviews every 12 months.
Dynamic progression plan tailored to ambitions, with autonomy and flexibility.
Human-first flexibility: shape your day around life's moments, in a culture of relentless innovation, ownership, and accountability.
