Updated 5 days ago

Staff Infrastructure Engineer (Kubernetes)

Work format

onsite

Employment type

full-time

Grade

senior

Job description

TL;DR

Staff Infrastructure Engineer (Kubernetes): Design and evolve the Kubernetes control plane architecture for a multi-tenant, multi-region AI compute platform, with an emphasis on scalability, reliability, and operational ownership. The role focuses on multi-tenant cluster models, regional scaling strategies, networking integration, and production incident resolution.

Location: Las Vegas, Nevada. Candidates must be authorized to work in the United States.

Company

Cloud platform delivering seamless, secure AI compute at scale across multiple data centers.

What you will do

Design and evolve Kubernetes control plane architecture across regions, including multi-tenant models like vcluster or Kamaji.
Own platform reliability, on-call rotation, incident response, and lifecycle management of clusters.
Implement multi-region scaling, cluster topology, and failure domain strategies.
Design networking architectures, optimize the CNI (Cilium) and pod-to-pod traffic, and integrate with high-performance networking.
Enhance observability for the control plane and cluster health, and lead root cause analysis.
Collaborate with DevOps, infrastructure, compute, storage, and networking teams.

Requirements

7+ years in infrastructure, platform engineering, or distributed systems
Deep experience operating Kubernetes at scale in production across multiple clusters and regions
Strong Kubernetes internals knowledge: API server, scheduler, controller manager, etcd
Expertise in Linux systems and in troubleshooting Kubernetes, container runtimes, and networking
Experience with CNI plugins (Cilium preferred), resource isolation, and scheduling
Experience in cloud service provider (CSP), hyperscale, or other large-scale environments strongly preferred

Nice to have

Virtual cluster technologies (vcluster, Kamaji)
Supporting GPU workloads in Kubernetes
NUMA-aware scheduling, topology-aware workloads
RDMA and high-throughput networking
Observability platforms (Prometheus, Grafana)

Culture & Benefits

100% paid medical, dental, and vision insurance for employees
Company HSA contributions, 100% paid short/long-term disability
401(k), flexible PTO, paid holidays, parental leave
Flexible spending account, employee assistance program
Supplementary benefits: pet/legal insurance, virtual healthcare
Stock options, in-office perks

Staff Infrastructure Engineer (Kubernetes)

TensorWave
