Company hidden
Updated 6 days ago

Senior Software Engineer (AI)

Work format
on-site
Employment type
full-time
Grade
senior
English
B2
Country
US
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer (AI): Building and automating large-scale GPU cluster provisioning and operations across bare metal, Kubernetes, and Slurm environments, with an emphasis on platform scalability and reliability. The role focuses on developing Kubernetes Operators, gRPC/REST APIs in Go, and end-to-end infrastructure automation pipelines.

Location: On-site in Las Vegas, Nevada

Company

hirify.global is a cloud platform provider delivering seamless, secure, and resilient AI compute at scale.

What you will do

  • Build and maintain fully automated pipelines for provisioning bare metal GPU clusters from zero to production.
  • Automate Slurm and Kubernetes cluster lifecycle, including bootstrapping, upgrades, and decommissioning at scale.
  • Develop infrastructure for GPU node configuration, including drivers and firmware.
  • Own cluster validation pipelines, automating health checks and GPU burn-in tests.
  • Build day-2 operations automation, including node remediation and rolling upgrades.
  • Own the full observability stack for automation services and cluster health systems.

Requirements

  • 5+ years in infrastructure or platform engineering.
  • 3+ years of experience writing production Go.
  • Deep understanding of Kubernetes internals (informers, controller-runtime, client-go, CRDs, Operators, and admission webhooks).
  • Experience building production-scale gRPC and REST APIs in Go.
  • Familiarity with bare metal infrastructure concepts (PXE, IPMI, BMC).
  • Authorization to work in the United States is required.

Nice to have

  • Knowledge of GPU workload infrastructure and RoCE networking automation.
  • Experience with GitOps tools like Argo CD.
  • Experience with CI/CD tools such as GitHub Actions and Argo Workflows.
  • Experience with Ansible and Terraform.

Culture & Benefits

  • Stock options and competitive equity.
  • 100% paid Medical, Dental, and Vision insurance for employees.
  • Company contributions to Health Savings Account (HSA).
  • 401(k) plan, plus comprehensive disability and life insurance.
  • Flexible PTO and paid holidays.
  • Parental leave and Employee Assistance Program.
