Company hidden
14 hours ago

Senior Software Engineer - Fleet Management (AI)

Work format
remote (Global)
Employment type
full-time
Grade
senior
English
B2
Country
UK
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer (AI): Building and optimizing Python-based workflow automation systems for GPU node and network switch lifecycle management at scale, with an emphasis on device provisioning, burn-in testing, network configuration, and hardware health validation. Focus on designing foundational platform components, integrating with datacenter infrastructure, and driving technical strategy for reliability and operational excellence in distributed systems.

Location: Remote (Global)

Company

hirify.global is the GPU cloud engineered for AI, providing cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers.

What you will do

  • Build Python-based workflow automation systems for GPU node and network switch lifecycle management at scale.
  • Design foundational platform components with established software patterns.
  • Implement device provisioning, burn-in testing, network configuration, and hardware health validation workflows.
  • Integrate with datacenter infrastructure management systems, cloud orchestration platforms, and bare metal provisioning tools.
  • Build distributed workflow orchestration systems to coordinate complex automation tasks.
  • Drive technical strategy for reliability, observability, incident response, and operational excellence.

Requirements

  • 5+ years of software engineering experience building and operating production systems, with a focus on infrastructure automation or workflow tooling.
  • Strong proficiency in Python.
  • Proven ability to build distributed systems at scale, with an emphasis on infrastructure reliability, scalability, and security.
  • Ability to quickly understand systems design tradeoffs and to work effectively with rapidly evolving software systems.
  • Experience delivering automation systems from ambiguous requirements to operational systems in production, including day 2 operations (monitoring, incident response, performance optimisation).
  • Excellent communication skills to build consensus with stakeholders.

Nice to have

  • Experience with workflow orchestration tools like Temporal, Airflow, or Prefect.
  • Hands-on experience with infrastructure tooling like DCIMs, NetBox, OpenStack, or ERP systems.
  • Experience with bare metal provisioning and automation (MAAS, Ironic, IPMI, PXE boot, network automation).
  • Experience building hardware lifecycle automation.
  • GPU infrastructure experience (health monitoring, burn-in testing, cluster management).
  • Deep knowledge of Kubernetes, Infrastructure as Code (Terraform, Pulumi), AWS, and GCP.

Culture & Benefits

  • Collaborative, supportive, and innovative remote-first environment where contributions have a real impact.
  • Highly competitive compensation package (base + equity) with reviews every 12 months.
  • Opportunity to join a fast-growing tech startup and contribute to cutting-edge AI technology.
  • Dynamic progression plan tailored to individual ambitions, with full support for growth.
  • Human-First Flexibility: team members are trusted with the autonomy to shape their day around life's moments.

The vacancy text is reproduced without changes.