Senior Specialist Field Engineer (Compute Infrastructure)
Job description
TL;DR
Senior Specialist Field Engineer (Compute Infrastructure): Leading the technical delivery of large-scale GPU supercomputers for strategic AI customers, with an emphasis on bare-metal infrastructure, high-speed fabric validation, and cluster bring-up. Focus on designing and operationalizing complex HPC environments, ensuring performance at scale, and bridging the gap between facility design and production-ready compute.
Location: Hybrid (Must be based in or near Livingston, NJ; New York, NY; Sunnyvale, CA; San Francisco, CA; Bellevue, WA; or Dallas, TX). Must be a U.S. person (citizen, green card holder, refugee, or asylee) due to export control regulations.
Salary: $188,000 – $275,000
Company
The company is a specialized cloud provider built for AI, delivering high-performance infrastructure to leading AI labs and enterprises.
What you will do
- Serve as the primary technical point of contact for customers, managing the end-to-end delivery of bare-metal compute infrastructure.
- Lead the bring-up and acceptance of large-scale GPU clusters, including InfiniBand/RoCE fabric validation and HPC performance benchmarking.
- Define and operationalize models for managing bare-metal fleets, including IT service, break-fix, and firmware management.
- Partner with Data Center Operations, Fleet Operations, and Networking teams to ensure infrastructure readiness.
- Advise on technical contract terms, SLAs, and support boundaries for strategic customer environments.
- Identify opportunities for product enhancement and collaborate with internal engineering teams to shape the product roadmap.
Requirements
- 7+ years of experience as a Solutions Architect, Field Engineer, or Infrastructure Engineer in cloud or HPC environments.
- Expertise in bare-metal compute infrastructure and large-scale GPU cluster delivery.
- Deep knowledge of modern rack-scale GPU hardware (e.g., NVIDIA HGX/GB200), high-speed interconnects, and firmware/BMC/BIOS layers.
- Expert-level Linux system administration and networking fundamentals (routing, fabric topologies, TCP/IP).
- Hands-on experience with orchestration layers such as Kubernetes and Slurm.
- Must be a U.S. person to comply with U.S. Government export regulations.
Nice to have
- Experience operating security-sensitive or air-gapped environments.
- Proficiency in scripting and automation (Python, Bash, Ansible).
- Experience designing AI supercomputers starting from MEP (mechanical, electrical, and plumbing) designs.
- Background in multi-cloud or hybrid environment solutions.
Culture & Benefits
- Comprehensive medical, dental, and vision insurance (100% paid by company).
- 401(k) with employer match and Employee Stock Purchase Program (ESPP).
- Flexible PTO and generous paid parental leave.
- Family-forming support and mental wellness benefits.
- Catered lunches in office and data center locations.
- Casual work environment focused on innovation and hyper-growth.