Эта вакансия в архиве

Посмотреть похожие вакансии ↓

Company hidden

обновлено 20 часов назад

Kubernetes Platform Engineer (AI)

Формат работы

onsite

Тип работы

fulltime

Грейд

senior

Английский

b2

Страна

US

Описание вакансии

Текст:

/

TL;DR

Senior Kubernetes Platform Engineer (AI): Building and optimizing seamless, secure, reliable, and resilient AI infrastructure at scale, with an accent on maintaining stability and reliability of bare-metal Kubernetes infrastructure. Focus on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads.

Location: Las Vegas, Nevada, USA (Onsite)

Company

hirify.global Cloud is dedicated to building secure, reliable, and resilient AI infrastructure at scale to empower innovation.

What you will do

Own and troubleshoot operational issues within Kubernetes environments.
Maintain and monitor core services like Cilium, HAProxy, and Prometheus.
Ensure uptime, performance, and reliability of multi-tenant clusters.
Assist with Ingress/Egress connectivity and network debugging.
Support internal and customer teams in secure, isolated VPC environments.
Collaborate with senior engineers on automation and cluster lifecycle improvements.

Requirements

Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience.
5+ years experience in DevOps, SRE, or Linux infrastructure roles.
4+ years of hands-on experience with Kubernetes in production.
3+ years designing and operating multi-tenant Kubernetes platforms at CSP or hyperscaler scale.
Proven experience implementing production-grade cluster authentication (OIDC/SSO integration, RBAC policies) and advanced network design (CNI selection/configuration, network policies, service mesh architecture, cross-cluster networking).
Strong infrastructure-as-code mindset (Helm, Terraform, Ansible).
Solid experience with monitoring and logging tools (Prometheus, Grafana, Loki).

Nice to have

Experience with RKE2, Rancher, or similar platforms.
Experience troubleshooting or supporting AI or GPU-based workloads.
Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools.

Culture & Benefits

Mission-driven company with competitive salary and stock options.
100% paid Medical, Dental, and Vision insurance.
Flexible PTO and Paid Holidays.
401(k) and Parental Leave.
Flexible Spending Account, Short Term Disability Insurance, Life and Voluntary Supplemental Insurance.
Mental Health Benefits through Spring Health.

Похожие вакансии

1 день назад

Senior Engineer - HPC Operations (AI)

106 400 - 159 600$

2 дня назад

Senior Site Reliability Engineer (Kubernetes/Terraform)

150 000 - 200 000$

7 дней назад

AI Infrastructure Engineer

157 487 - 174 713$

2 дня назад

Senior Site Reliability Engineer (AI)

109 600 - 164 400$

2 дня назад

Staff DevOps Engineer (AI)

192 000 - 272 000$

12 часов назад

Senior DevOps Engineer (Kubernetes)

115 500 - 181 500$