Framework Software Engineer (AI)
Job description
TL;DR
Framework Software Engineer (AI): Design and optimize high-performance inference frameworks for large-scale distributed serving of LLM workloads, with an emphasis on memory management, parallelism, and throughput. Focus on building scalable multi-node architectures and driving data-informed decisions for NPU/GPU-based systems.
Location: Seongnam, South Korea (Onsite)
Company
An AI semiconductor startup developing high-performance hardware and software solutions for accelerated AI inference.
What you will do
- Design and develop high-performance inference frameworks for large-scale distributed LLM serving.
- Optimize end-to-end serving performance metrics, including time to first token (TTFT), inter-token latency (ITL), and throughput.
- Implement advanced techniques like continuous batching, KV-cache management, and speculative decoding.
- Architect multi-node serving solutions involving prefill/decode disaggregation and distributed caching.
- Analyze runtime behavior, communication overhead, and memory usage across heterogeneous environments.
- Collaborate with infrastructure, compiler, and hardware teams to co-design end-to-end AI systems.
Requirements
- Master's degree or higher in CS, EE, or a related technical field.
- Strong proficiency in Python, C++, and PyTorch with deep knowledge of runtime internals.
- Hands-on experience with inference serving or high-performance ML systems.
- Solid understanding of Linux systems, profiling, and debugging performance bottlenecks.
- Ability to reason about system-level trade-offs and solve complex architectural problems.
- Clear communication skills and experience collaborating in fast-paced engineering teams.
Nice to have
- Experience with serving frameworks like vLLM, SGLang, or TensorRT-LLM.
- Deep understanding of attention mechanisms and memory-efficient inference.
- Experience with multi-node inference and tensor/pipeline parallelism.
- Proven record of open-source contributions to ML infrastructure projects.
Hiring process
- Document screening followed by an online interview.
- On-site interview including a technical assignment presentation.
- Culture-fit interview and final offer discussion.