Company hidden

2 days ago

Infrastructure Support Engineer (GPUs)

Work format

onsite

Employment type

fulltime

Grade

middle

English

Country

Singapore

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Infrastructure Support Engineer (GPUs): Maintain and troubleshoot high-performance GPU cloud infrastructure for AI workloads, with an emphasis on service reliability and rapid incident response. The role focuses on managing Kubernetes clusters, Linux-based systems, and GPU-specific diagnostics to ensure seamless AI development for customers.

Location: Singapore (must be available to travel to hirify.global or customer locations)

Company

hirify.global is a GPU cloud provider engineered specifically for AI startups and large enterprises to reduce the complexity of AI development.

What you will do

Handle day-to-day tickets and alerts within the support duty rotation, escalating complex incidents to Engineering.
Resolve common issues using established runbooks, and contribute improvements and incremental fixes to them.
Monitor, troubleshoot, and triage platform issues, capturing logs and facts for efficient handover.
Collaborate with cross-functional teams and serve as the escalation point for onsite operations staff.
Document validated steps and contribute to training materials to build team capability.
Identify and implement automation opportunities to optimize support processes.

Requirements

2-4 years of experience in support, operations, or infrastructure engineering, ideally within cloud or data centre environments.
Proficiency in Linux CLI, system services, filesystems, permissions, and basic networking tools.
Solid grasp of networking basics: IP addressing, subnets, VLANs, routing, DNS, and firewalls.
Exposure to Kubernetes core concepts (nodes, pods, services, logs) and basic troubleshooting.
Familiarity with GPU diagnostics such as nvidia-smi.
Ability to write simple Bash or Python scripts and use Git for version control.
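
As a rough illustration of the level of scripting and GPU diagnostics the requirements above point to, here is a minimal Python sketch that reads per-GPU metrics from nvidia-smi and flags hot devices. It is not taken from the vacancy: the function names, the chosen query fields, and the 85 C threshold are all assumptions made for the example.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; this selection is an assumption for the example.
QUERY_FIELDS = [
    "index",
    "name",
    "temperature.gpu",
    "utilization.gpu",
    "memory.used",
    "memory.total",
]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(QUERY_FIELDS, (value.strip() for value in record))))
    return rows


def hot_gpus(rows, max_temp_c=85):
    """Return GPUs hotter than max_temp_c (hypothetical threshold, not a vendor limit)."""
    flagged = []
    for row in rows:
        try:
            temp = int(row["temperature.gpu"])
        except ValueError:
            continue  # value such as "[N/A]" cannot be compared
        if temp > max_temp_c:
            flagged.append(row)
    return flagged


def query_gpus():
    """Run nvidia-smi on the local node and return the parsed rows."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

On a node with the NVIDIA driver installed, `hot_gpus(query_gpus())` would return the rows worth attaching to a ticket before handover. The parsing is kept separate from the subprocess call so it can be exercised on captured output without a GPU.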

Nice to have

Hands-on experience with Kubernetes administration, operators, or specialized storage/networking add-ons.
Knowledge of RDMA/InfiniBand, HPC concepts, and NCCL for performance troubleshooting.
Experience with Infrastructure as Code tools like Ansible or Terraform.
Participation in GitOps and CI/CD pipelines using GitHub Actions.
Experience with security tooling such as Teleport or Vault.

Culture & Benefits

Culture of relentless innovation, ownership, and accountability.
Commitment to openness, transparency, and an open-source approach to build trust.
Dedicated focus on sustainability and reducing the environmental impact of AI technologies.
Fast, efficient, and respectful collaboration within a global team.
Inclusive environment with an equal opportunities statement for diverse backgrounds.

