Company hidden
12 hours ago

Systems Engineer (HPC)

Work format
remote (USA only)/hybrid
Employment type
full-time
English
B2
Country
US/Canada
Job from Hirify.Global, a list of international tech companies

Job description

TL;DR

Systems Engineer (HPC): Design, operate, and scale high-performance infrastructure for AI platforms, with an emphasis on Linux environment management, HPC cluster reliability, and large-scale automation. The role focuses on scaling systems to thousands of nodes, managing petabyte-scale storage, and optimizing performance for research and production workloads.

Location: Must be based in the US or Canada (Montreal, Toronto, New York, Palo Alto, San Francisco).

Company

hirify.global is a pioneering startup building high-performance, open, and efficient AI systems to power the next generation of applications.

What you will do

  • Operate and maintain large-scale Linux environments across bare metal, clusters, and cloud.
  • Monitor system health, troubleshoot incidents, and ensure high availability for research and production workloads.
  • Scale infrastructure to support thousands of nodes and petabyte-scale storage systems.
  • Automate operational tasks and improve provisioning using Python, Bash, Ansible, or Terraform.
  • Collaborate with HPC, platform, and research teams to drive system architecture decisions.

Requirements

  • Must be based in the US or Canada.
  • Strong Linux systems administration experience.
  • Experience working in large-scale environments such as HPC clusters or cloud infrastructure.
  • Proficiency with job schedulers such as Slurm.
  • Solid troubleshooting skills across systems, hardware, and networks.

Nice to have

  • Experience with container orchestration tools such as Kubernetes.
  • Knowledge of storage systems such as Ceph, Lustre, or NFS.
  • Networking fundamentals including Ethernet and InfiniBand.
  • Experience with Infrastructure as Code and automation tooling.
  • Background in GPU or AI/ML infrastructure.

Culture & Benefits

  • Opportunity to shape data center operations from the ground up in a high-growth AI startup.
  • Collaborative, low-ego, and highly technical team environment.
  • Competitive compensation and benefits package.
  • Direct impact on scaling cutting-edge AI infrastructure.
