Sr. Principal Software Engineer (Generative AI)
Job description
TL;DR
Sr. Principal Software Engineer (Generative AI): Optimizing and deploying high-performance LLM inference pipelines across data center, edge, and embedded platforms, with an emphasis on CUDA kernel development and quantization strategies. Focus on reducing latency, improving throughput, and eliminating dependency on external vendors for model deployment.
Location: Remote (USA)
Salary: $141,400 USD - $226,300 USD
Company
Global leader in AI for transportation, specializing in building AI and voice-powered companions for the automotive industry.
What you will do
- Optimize and deploy high-performance LLM inference pipelines across data center, edge, and embedded platforms.
- Own and tune inference runtimes including vLLM, TensorRT-LLM, llama.cpp, and QAIRT.
- Develop custom CUDA kernels and implement quantization strategies (INT8, INT4, FP4, FP8, AWQ, GPTQ).
- Optimize KV cache performance through paging, prefix caching, and memory layout design.
- Design and tune batching strategies and speculative decoding to minimize tail latency and improve tokens/sec.
Requirements
- Proven experience optimizing ML inference performance in production environments.
- Deep understanding of GPU architecture and memory hierarchies.
- Hands-on experience with CUDA and low-level performance tuning.
- Experience deploying models beyond research environments.
- Must be based in the USA.
Culture & Benefits
- Comprehensive insurance coverage (medical, dental, vision, life, and disability).
- Annual bonus opportunity and equity awards for certain levels.
- Paid time off and paid company holidays.
- Company contribution to retirement savings plans.
- Collaborative, customer-centric, and fast-paced work environment.