Software Engineer, AI Compute Infrastructure
Job Description
TL;DR
Software Engineer, AI Compute Infrastructure: Build and scale the foundational compute infrastructure for generative video models, with an emphasis on GPU optimization and high-throughput data pipelines. The focus is on designing scalable job frameworks, implementing distributed computing systems with Ray and Kubernetes, and optimizing CUDA kernels for model performance.
Location: Los Angeles, Palo Alto, San Francisco, Toronto, or Singapore
Company
An AI company dedicated to making visual storytelling accessible through state-of-the-art generative video technology.
What you will do
- Optimize GPU and cluster utilization across thousands of devices for inference, training, and large-scale deployment of video generation models.
- Build highly scalable frameworks for managing heterogeneous compute jobs and multi-modal data ingestion.
- Develop world-class observability, tracing, and visualization tools to diagnose performance bottlenecks.
- Collaborate with AI researchers to integrate acceleration techniques such as custom CUDA kernels into production pipelines.
- Manage and optimize cloud and container infrastructure using Kubernetes and Ray for elastic scaling.
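The job-fan-out pattern behind the first two bullets can be sketched without a cluster. The snippet below uses Python's standard-library thread pool as a stand-in for Ray workers; `run_inference` and the batch-count parameter are illustrative placeholders, not part of any real pipeline (in Ray itself, the equivalent would be a `@ray.remote` task submitted with `.remote()` and gathered with `ray.get()`):

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference(batch_id: int) -> str:
    # Placeholder for a GPU-bound video-model inference step.
    return f"batch-{batch_id} done"

def fan_out(num_batches: int) -> list[str]:
    # Submit one job per batch and gather results in order, analogous to
    # collecting Ray futures with ray.get() across a heterogeneous cluster.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_inference, range(num_batches)))
```

At cluster scale, the scheduler (Ray on Kubernetes, per the bullets above) replaces the local pool and handles GPU placement, elasticity, and fault tolerance.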
Requirements
- 5+ years of full-time industry experience in large-scale MLOps, AI infrastructure, or HPC systems.
- Strong proficiency in Python and a high-performance language such as C++.
- Deep hands-on experience with orchestration and distributed computing frameworks like Kubernetes and Ray.
- Experience with core ML frameworks including PyTorch, TensorFlow, or JAX.
- Proficiency with data frameworks such as Ray, Apache Spark, or LanceDB.
- Bachelor's degree in Computer Science, Engineering, or a related field.
Nice to have
- Master's or PhD in Computer Science or a related technical field.
- Demonstrated Tech Lead experience driving projects from conceptual design to production.
- Prior experience building infrastructure specifically for Generative AI models (diffusion, GANs, or LLMs).
- Expertise in GPU acceleration and low-level compute programming (CUDA, NCCL).
- Background in operating large-scale data infrastructure managing petabytes of multi-modal data.
Culture & Benefits
- Competitive salary and comprehensive benefits package.
- Dynamic and inclusive work environment.
- Opportunities for professional growth and career advancement.
- Collaborative culture that values innovation and creativity.
- Access to the latest technologies and tools.