Company hidden
3 days ago

Hardware Engineer (GPU & PCIe)

$102,000 – $145,000
Work format
remote (USA only)/hybrid
Employment type
full-time
Grade
middle
English
B2
Country
US
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Hardware Engineer (GPU & PCIe): Design, development, and optimization of server hardware infrastructure with an emphasis on GPU and PCIe troubleshooting. Focus on automating the server hardware lifecycle, performing failure analysis on H100/NVLink systems, and integrating observability platforms.

Location: Hybrid (New York, NY / Sunnyvale, CA / Bellevue, WA) or Remote for candidates located more than 30 miles from an office. Must be a U.S. person (Citizen, Green Card holder, etc.) due to export control regulations.

Salary: $102,000 – $145,000

Company

hirify.global is The Essential Cloud for AI, providing a platform of technology, tools, and teams that enables innovators to build and scale AI with superior infrastructure performance.

What you will do

  • Troubleshoot complex GPU- and PCIe-related failures and partner with external vendors on failure analysis.
  • Develop and maintain hardware/firmware management services and automate all aspects of the server hardware lifecycle.
  • Serve as the senior point of contact for hardware escalation and troubleshooting.
  • Collaborate with cross-functional teams to define hardware requirements, system architecture, and resolution playbooks.
  • Analyze hardware performance, identify bottlenecks, and propose improvements for enhanced efficiency.
  • Create and maintain detailed documentation of hardware designs, specifications, and test procedures.

Requirements

  • 2+ years of experience supporting and troubleshooting data-center-class GPUs (H100 or newer, including InfiniBand and NVLink).
  • Proficiency in Python and Ansible for programmatically interacting with server BMCs using Redfish or IPMI.
  • Experience automating GPU diagnostics and troubleshooting tools using observability platforms like Prometheus and Grafana.
  • In-depth knowledge of server hardware components, specifically GPUs and PCIe devices.
  • Must be a U.S. person (Citizen, Lawful Permanent Resident, Refugee, or Asylee) to comply with U.S. Government export regulations.

Culture & Benefits

  • 100% company-paid medical, dental, and vision insurance.
  • 401(k) with generous employer match and Employee Stock Purchase Program (ESPP).
  • Flexible PTO and company-paid life, short-term disability, and long-term disability insurance.
  • Comprehensive family support including paid parental leave and childcare support via Kinside.
  • Daily catered lunch provided at office and data center locations.
  • Casual work environment focused on innovative disruption.
