Company hidden

Updated 6 days ago

Senior Software Engineer (AI)

Work format

onsite

Employment type

fulltime

Grade

senior

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer (AI): Building and automating large-scale GPU cluster provisioning and operations across bare metal, Kubernetes, and Slurm environments, with an emphasis on platform scalability and reliability. Focus on developing Kubernetes Operators, gRPC/REST APIs in Go, and end-to-end infrastructure automation pipelines.

Location: On-site in Las Vegas, Nevada

Company

hirify.global is a cloud platform provider delivering seamless, secure, and resilient AI compute at scale.

What you will do

Build and maintain fully automated pipelines for provisioning bare metal GPU clusters from zero to production.
Automate Slurm and Kubernetes cluster lifecycle, including bootstrapping, upgrades, and decommissioning at scale.
Develop infrastructure for GPU node configuration, including drivers and firmware.
Own cluster validation pipelines, automating health checks and GPU burn-in tests.
Build day-2 operations automation, including node remediation and rolling upgrades.
Own the full observability stack for automation services and cluster health systems.

Requirements

5+ years in infrastructure or platform engineering.
3+ years of experience writing production Go.
Deep understanding of Kubernetes internals (informers, controller-runtime, client-go, CRDs, Operators, and admission webhooks).
Experience building production-scale gRPC and REST APIs in Go.
Familiarity with bare metal infrastructure concepts (PXE, IPMI, BMC).
Authorization to work in the United States is required.

Nice to have

Knowledge of GPU workload infrastructure and RoCE networking automation.
Experience with GitOps tools like ArgoCD.
Experience with CI/CD tools such as GitHub Actions and Argo Workflows.
Experience with Ansible and Terraform.

Culture & Benefits

Stock options and competitive equity.
100% paid Medical, Dental, and Vision insurance for employees.
Company contributions to Health Savings Account (HSA).
401(k) and comprehensive disability and life insurance.
Flexible PTO and paid holidays.
Parental leave and Employee Assistance Program.

