Назад
Company hidden
17 часов назад

Infrastructure Software Engineer (AI)

150 000 - 215 000$
Формат работы
remote (Global)
Тип работы
fulltime
Грейд
senior
Английский
b2
Страна
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Infrastructure Software Engineer (Python/AI): Building and maintaining the control plane, tooling, and automation for GPU cloud fleet operations with an accent on device provisioning and hardware lifecycle management. Focus on designing scalable workflow orchestration systems, optimizing AI/HPC infrastructure performance, and implementing robust automation for large-scale distributed systems.

Location: Remote (Global)

Salary: $150,000 - $215,000 USD

Company

hirify.global is a GPU cloud provider engineered for AI, providing high-performance infrastructure for start-ups and large enterprise customers.

What you will do

  • Architect and implement workflow automation systems for fleet and network operations, balancing reliability and maintainability.
  • Own end-to-end delivery of device provisioning, validation, and remediation workflows at scale.
  • Design and build workflow orchestration systems for GPU nodes and network switches.
  • Build production-grade Python systems for hardware lifecycle automation, utilizing AI tools to accelerate delivery.
  • Partner with Infrastructure, Platform, and SRE teams to translate operational needs into scalable automation.
  • Establish engineering standards for reliability, observability, and operational excellence across all services.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, or equivalent practical experience.
  • 5+ years of experience building large-scale infrastructure applications.
  • Proficiency in Python, C, C++, or Java for API design and unit testing.
  • Deep understanding of Linux operating systems and networking fundamentals (TCP/IP, BGP).
  • Experience with configuration management tools such as Ansible and Terraform.
  • Experience building and debugging large-scale distributed systems using tools like NetBox, OpenStack, MAAS, or Ironic.

Nice to have

  • Experience with AI/HPC infrastructure, including NVIDIA GPUs, InfiniBand, or SLURM.
  • Knowledge of advanced observability systems like Prometheus, Grafana, and OpenTelemetry.
  • Familiarity with cloud-native technologies including Kubernetes and Docker.
  • Ability to integrate AI tools to optimize and redesign workflows for measurable impact.

Culture & Benefits

  • Highly competitive compensation package including base salary and equity.
  • Remote-first culture with a focus on autonomy and human-first flexibility.
  • Dynamic progression plan tailored to individual ambitions.
  • Comprehensive benefits package including medical, dental, vision, and retirement plan participation.
  • Collaborative environment in a fast-growing tech startup at the intersection of cloud and AI.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →