Tech Lead, AI Compute Infrastructure (AI)
Job description
TL;DR
Tech Lead, AI Compute Infrastructure (AI): Build and scale the foundational compute infrastructure powering state-of-the-art generative video models, with an emphasis on GPU utilization optimization, large-scale AI job frameworks, and observability tools. The role focuses on developing distributed training and inference pipelines, integrating acceleration techniques such as custom CUDA kernels, and managing Kubernetes/Ray clusters for high-throughput video generation.
Location: Los Angeles, Palo Alto, San Francisco, Toronto, Singapore
Company
The company builds AI technology to make visual storytelling accessible through scalable video generation.
What you will do
- Optimize GPU and cluster utilization across thousands of devices for inference, training, data processing, and deployment of video generation models.
- Build scalable frameworks for managing massive heterogeneous compute jobs including multi-modal data processing, distributed training, and benchmarking.
- Develop observability, tracing, and visualization tools to ensure cluster reliability and diagnose performance bottlenecks.
- Integrate acceleration techniques like custom CUDA kernels and distributed training libraries into production pipelines.
- Champion adoption of Kubernetes and Ray for elastic, cost-efficient scaling of distributed systems.
Requirements
- Bachelor's in Computer Science, Engineering, or equivalent.
- 5+ years in large-scale MLOps, AI infrastructure, or HPC systems.
- Experience with Ray, Apache Spark, LanceDB.
- Strong proficiency in Python and C++.
- Deep hands-on experience with Kubernetes, Ray, PyTorch, TensorFlow, or JAX.
Nice to have
- Master's or PhD in Computer Science or related field.
- Tech Lead experience driving cross-functional projects to production.
- Experience with Generative AI models like diffusion models or LLMs.
- Background in petabyte-scale multi-modal data infrastructure.
- Expertise in GPU acceleration and low-level programming, such as CUDA and NCCL.
Culture & Benefits
- Competitive salary and benefits package.
- Dynamic, inclusive work environment.
- Opportunities for professional growth and advancement.
- Collaborative culture valuing innovation and creativity.
- Access to latest technologies and tools.