Company hidden
12 hours ago

Staff Engineer, HPC Systems Software (AI)

$100,000 - $500,000
Work format
hybrid
Employment type
fulltime
Grade
senior
English
b2
Country
US/Canada
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Staff Engineer, HPC Systems Software (AI): Architecting and maintaining the operating system foundation for global hardware design infrastructure, with an emphasis on bare-metal provisioning and configuration-as-code. Focus on scaling OS lifecycle management across hundreds of compute nodes and optimizing Linux kernel performance for AI hardware development.

Location: Hybrid. Must be based in Austin (TX), Santa Clara (CA), or Toronto (ON, Canada)

Salary: $100k - $500k

Company

hirify.global is a startup leading the industry in cutting-edge AI technology and high-performance RISC-V CPUs.

What you will do

  • Design and maintain automated OS deployment pipelines for global bare-metal HPC clusters.
  • Manage large-scale configuration using Ansible to ensure consistency across compute infrastructure.
  • Deploy RHEL and Ubuntu systems and manage their lifecycle across diverse hardware platforms.
  • Implement infrastructure-as-code for repeatable, version-controlled system configurations.
  • Troubleshoot OS-level issues and optimize kernel parameters to resolve performance bottlenecks.
  • Collaborate with hardware design teams to standardize system configurations and development environments.

Requirements

  • Experience with RHEL and Ubuntu administration in HPC or large-scale compute environments.
  • High proficiency in Ansible for automation across hundreds of nodes.
  • Experience with bare-metal provisioning systems such as MAAS, Foreman, Cobbler, or Warewulf.
  • Deep understanding of Linux internals, networking, kernel tuning, and performance troubleshooting.
  • Familiarity with HPC cluster architecture and infrastructure-as-code practices.
  • Must be eligible to access U.S. export-controlled technology (EAR compliance).

Nice to have

  • Hands-on experience with IBM Spectrum LSF or similar HPC workload managers.
  • Integration with commercial HPC storage platforms like Pure Storage, Weka, or Vast Data.
  • Exposure to EDA tools and hardware design workflows in semiconductor development.
  • Experience with container technologies including Docker, Singularity, or Podman.
  • Cluster monitoring skills using Prometheus, Grafana, and custom tooling.
  • Python and bash scripting for production-level infrastructure automation.

Culture & Benefits

  • Highly competitive compensation package including base and variable targets.
  • Collaborative environment with a focus on curiosity and solving hard technical problems.
  • Opportunity to work on revolutionary AI platforms and RISC-V CPU architecture.
  • Equal opportunity employer.
