Engineering Manager, Datacenter Storage Engineering (AI)
Job description
TL;DR
Engineering Manager, Datacenter Storage Engineering (AI): Lead the team responsible for distributed storage infrastructure across all regions, with an emphasis on high-performance SAN and NFS systems for GPU-centric compute. The role focuses on designing and operating large-scale storage architectures, including VAST Data and parallel filesystems like Lustre, to support AI training and inference workloads.
Location: Must be based in the USA
Salary: $150,000 - $240,000 USD
Company
A rapidly growing, remote-first company that provides cloud infrastructure for full-stack AI applications, enabling developers to build and scale custom AI systems.
What you will do
- Define and operate global storage platforms supporting training, inference, and dataset access at scale.
- Manage and grow a team of storage and systems engineers, setting technical direction and operational standards.
- Design and operate large-scale SAN and NFS deployments for GPU clusters.
- Lead deployments of VAST Data and parallel filesystems like Lustre.
- Drive performance optimization across the stack, from NAND/NVMe media to client access patterns.
- Partner with Networking, GPU Platform, and SRE teams to ensure storage meets workload requirements.
Requirements
- Must be based in the USA
- 3+ years of experience managing storage, systems, or infrastructure engineering teams in production.
- 8+ years of experience designing and operating large-scale storage systems at multi-petabyte scale.
- Hands-on experience deploying and operating VAST Data in production.
- Experience with Lustre or comparable HPC parallel filesystems.
- Deep knowledge of Linux internals, storage controllers, and high-performance data paths like NFS over RDMA.
- Successful completion of a background check.
Nice to have
- Experience supporting AI training pipelines and large-scale model checkpointing.
- Familiarity with RDMA fabrics and multi-tenant isolation.
- Background in hyperscale, HPC, or AI-focused infrastructure.
Culture & Benefits
- Meaningful equity in a fast-growing company.
- Generous medical, dental, and vision plans.
- Flexible PTO policy.
- Remote-first work environment with collaborative team culture.
- Opportunity to work on cutting-edge AI infrastructure.