Company hidden
2 days ago

Infrastructure Engineer – DevOps, Kubernetes & Automation (AI)

Work format: onsite
Employment type: fulltime
Grade: middle
English: B2
Country: US
Listing from Hirify.Global, a list of international tech companies

Job description

TL;DR

Infrastructure Engineer – DevOps, Kubernetes & Automation (DevOps): Deploy and maintain internal infrastructure automation and Kubernetes platform operations, with an emphasis on Ansible, Linux systems, and CI/CD tooling. The focus is on converting manual operational work into repeatable automation and keeping the AI compute environment operationally healthy.

Location: On-site in Las Vegas, Nevada. Authorization to work in the United States is required.

Company

A cloud platform provider delivering secure and resilient AI compute at scale.

What you will do

  • Deploy, maintain, and troubleshoot Kubernetes clusters, including node maintenance and configuration updates.
  • Investigate K8s issues related to pods, services, networking, storage, and ingress.
  • Develop and maintain Ansible roles and playbooks to standardize infrastructure automation across environments.
  • Improve CI/CD pipelines and Git-based workflows for infrastructure code.
  • Troubleshoot Ubuntu-based Linux systems, networking, and GPU node operating environments.
  • Document deployment procedures, troubleshooting steps, and operational standards.

Requirements

  • Linux system administration experience.
  • Basic to intermediate experience with Kubernetes.
  • Practical experience with Ansible and Git workflows.
  • Understanding of CI/CD concepts and basic networking (DNS, routing, firewalls).
  • Ability to troubleshoot services using logs, systemd, and command-line tools.
  • Work authorization for the United States.

Nice to have

  • Experience with Ubuntu server, RKE2, Rancher, or Cilium.
  • Proficiency with observability tools like Prometheus, Grafana, and Loki.
  • Experience with bare metal provisioning (MAAS, PXE) or data center infrastructure.
  • Experience supporting GPU, AI, or HPC compute environments.
  • Knowledge of Python or Go for operational tooling.

Culture & Benefits

  • Comprehensive insurance: 100% paid Medical, Dental, and Vision for employees.
  • Financial benefits: 401(k) and Company Health Savings Account contributions.
  • Insurance options: Short/Long Term Disability, Life, Pet, and Legal insurance.
  • Time off: Flexible PTO and paid holidays.
  • Family support: Parental leave and Employee Assistance Program.
  • Additional in-office perks.
