Staff Infrastructure Engineer (Virtualization)
Job description
TL;DR
Staff Infrastructure Engineer (Virtualization): Design and implement a scalable virtualization platform for high-performance AI compute workloads, with an emphasis on KVM/QEMU-based architectures and Linux primitives. Focus on optimizing VM lifecycle management, performance isolation, GPU passthrough, and integration with high-throughput networking and storage systems.
Location: Las Vegas, Nevada. Authorization to work in the United States required.
Company
Versatile cloud platform delivering seamless, secure, reliable AI compute at scale across multiple data centers.
What you will do
- Design and implement a scalable virtualization platform supporting high-density compute and GPU workloads
- Lead the evolution from Proxmox toward KVM/QEMU-based architectures
- Define standards for VM lifecycle management, performance isolation, resource allocation, and resilience
- Optimize for high performance, including NUMA alignment, CPU pinning, PCIe awareness, and GPU passthrough
- Integrate with networking (SR-IOV, RDMA) and storage teams for high-throughput systems
- Build automation for hypervisor deployment, image pipelines, cluster scaling, and lifecycle management
- Troubleshoot system-level performance issues across compute, memory, storage, and network
- Contribute to long-term platform architecture and infrastructure strategy
Requirements
- 7+ years in infrastructure, systems, or platform engineering
- Deep experience with Linux-based virtualization: KVM/QEMU, libvirt or similar
- Strong knowledge of CPU scheduling, NUMA architectures, memory management, storage I/O
- Experience designing/operating virtualization platforms at scale (hundreds+ hosts)
- Solid networking fundamentals: Linux networking (bridges, bonding, VLANs), high-performance networking concepts
- Experience with infrastructure automation (Ansible, Terraform)
- Strong troubleshooting skills across distributed systems
Nice to have
- Experience in cloud/CSP environments
- GPU workloads and passthrough (VFIO)
- SR-IOV and advanced NIC features
- Integration with Kubernetes, bare metal provisioning (MAAS)
- Distributed storage (Ceph, Weka)
- High-performance or low-latency environments
Culture & Benefits
- Stock options
- 100% paid medical, dental, vision insurance for employees
- Company HSA contributions
- 100% paid short/long-term disability insurance
- Life, voluntary supplemental, pet, legal insurance options
- Flexible spending account, 401(k), employee assistance program
- Flexible PTO, paid holidays, parental leave
- In-office perks