Senior Software Engineer / Lead (AI/ML Platform)
Job description
TL;DR
Senior Software Engineer / Lead (AI/ML Platform): Design and build an AI/ML platform capable of high-throughput training and inference across local and cloud GPU environments, with an emphasis on systems architecture, GPU acceleration, performance engineering, and reliable operation of AI workloads at scale. Focus on optimizing GPU memory, compute throughput, and inter-node communication, and on leading engineering initiatives with ML and hardware teams.
Location: Singapore (onsite, full-time)
Company
Global gaming company revolutionizing the way the world games with hardware, software, and gamer-centric experiences across 5 continents.
What you will do
- Design and implement architecture for model training, fine-tuning, and serving in heterogeneous compute environments including GPUs, NPUs, and accelerators.
- Develop and optimize high-performance inference stacks using vLLM, SGLang, TensorRT-LLM, or Triton, along with APIs, CLI tools, and backend services for model lifecycle management.
- Build and operate GPU clusters in on-prem and cloud environments (AWS/GCP/Azure), optimizing GPU memory, throughput, PCIe/NVLink, and inter-node communication.
- Tune CUDA, cuDNN, and NCCL for performance and reliability, and deploy scalable GPU-based inference systems optimized for latency, throughput, and cost.
- Mentor engineers; drive design reviews, architecture discussions, and coding standards; collaborate with ML researchers and hardware teams on algorithms and deployment strategies.
- Coordinate engineering deliverables and provide technical direction for the team.
Requirements
- 8+ years in software engineering, ML infrastructure, or high-performance computing
- Hands-on experience designing and operating local GPU servers and cloud GPU environments
- Deep knowledge of transformer-based model inference and training optimization
- Experience building high-availability distributed systems with failover, replication, and autoscaling
- Strong proficiency in Python and at least one systems language (C++ or Go)
- Experience with deep learning frameworks (PyTorch, TensorFlow) and inference engines (vLLM, SGLang, TensorRT, Triton)
- Solid understanding of distributed systems, parallel computing, container orchestration (Docker, Kubernetes), GPU programming (CUDA or ROCm), memory hierarchy, and performance tuning
- Strong collaboration and communication skills; ability to lead engineering efforts
Nice to have
- Experience with on-device or edge-accelerated inference
- Familiarity with cloud-native GPU scheduling and autoscaling systems
- Experience with model compression, quantization, speculative decoding, or other inference-efficiency techniques
- Contributions to open-source AI infrastructure projects
- Master's degree or PhD in Computer Science, Electrical Engineering, or related field
Culture & Benefits
- Work across a global team on 5 continents with accelerated personal and professional growth.
- Gamer-centric #LifeAt experience in an inclusive, respectful, and fair workplace.
- Equal Opportunity Employer committed to diversity, providing reasonable accommodations where needed.