HPC Operations Lead (AI)
Job description
TL;DR
HPC Operations Lead (AI): Manage the reliability and operational excellence of HPC data center environments, with an emphasis on critical facility systems and AI-driven operational workflows. The role focuses on leading multi-site teams, designing preventative maintenance programs, and optimizing hardware break-fix functions.
Location: On-site 5 days/week in Chicago or New York; regular travel to HPC data center sites is a core requirement
Company
Group is a world-class research and trading firm that applies cutting-edge scientific research to global financial markets.
What you will do
- Lead and manage data center site leads and their teams across multiple HPC facilities.
- Develop and enforce operational standards for power, cooling, cabling, and hardware lifecycle.
- Design and own the preventative maintenance program to minimize unplanned downtime.
- Own the HPC data center monitoring strategy and lead critical incident response and root cause analysis.
- Manage hardware break-fix functions and maintain expertise in server and switch architectures (Arista, Cisco).
- Champion AI adoption to automate workflows and analyze telemetry data.
Requirements
- 7+ years of data center operations experience, with at least 3 years leading teams in 24/7 critical environments.
- In-depth knowledge of power distribution and cooling technologies (air and liquid cooling).
- Deep expertise in server hardware (GPUs, NVMe, BMC/IPMI) and network switch hardware.
- Strong Linux systems proficiency and knowledge of L2/L3 networking protocols (BGP, OSPF).
- Must be based in Chicago or New York and work on-site 5 days/week.
- Willingness and ability to travel regularly to data center sites.
Nice to have
- Programming or scripting experience, preferably in Python.
- Bachelor's degree.
- Knowledge of industry standards including ASHRAE and TIA-942.
Culture & Benefits
- Unique culture based on fearlessness, creativity, and intellectual honesty.
- Collaborative environment focused on winning together and unlocking individual talent.
- Opportunity to work with world-class research and cutting-edge technology in financial markets.
- High standards for work quality and operational discipline.