Job description
TL;DR
Machine Learning Engineer (AI/Training Optimization): Designing and optimizing large-scale distributed training systems for multimodal and foundation models, with an emphasis on GPU utilization, memory efficiency, and communication overhead. Focus on implementing custom CUDA or Triton kernels and scaling training workflows using Megatron-LM, NeMo, and FSDP.
Location: Based in Beijing, China
Company
Canva is building a creative intelligence engine powered by AI to transform the future of AI-assisted design.
What you will do
- Design, implement, and optimize large-scale machine learning systems for training.
- Improve GPU utilization and memory efficiency, and minimize communication overhead.
- Collaborate with research and modeling teams to align systems with algorithmic needs.
- Evaluate and apply best practices for distributed training using frameworks like Megatron-LM and NVIDIA NeMo.
- Develop low-level optimizations, including custom CUDA or Triton kernels.
- Debug, profile, and fine-tune training workflows to enhance scalability.
Requirements
- Strong background in LLMs, multimodal AI, or diffusion models.
- Proficiency in Python; knowledge of C++ or Rust is a plus.
- Deep knowledge of PyTorch or JAX and distributed libraries such as Megatron-LM, NeMo, or DeepSpeed.
- Experience with optimization techniques like FSDP/ZeRO and gradient checkpointing.
- Hands-on experience writing custom GPU kernels in CUDA or Triton.
- Full proficiency in English is required.
- Must be based in Beijing, China.
Culture & Benefits
- Opportunity to work on foundational AI technologies at massive scale.
- Global collaboration within a high-impact research and engineering environment.
- Open to candidates of all experience levels, including graduates and interns.