Audio Inference Engineer (Model Efficiency)
Job description
TL;DR
Audio Inference Engineer (Model Efficiency): Develop and optimize high-performance audio inference systems focusing on latency, throughput, and quality for real-time and streaming audio workloads. Focus on system bottleneck identification, creative solutions for audio processing, and seamless integration with training and serving infrastructure.
Location: Remote with preferred time zones EST and PST; offices in New York, San Francisco, Toronto, Montreal, London, Paris, and Seoul
Company
A leading AI company focused on training and deploying frontier models to power advanced AI systems for developers and enterprises.
What you will do
- Build and optimize audio inference serving systems to improve latency, throughput, and quality
- Identify system bottlenecks and deliver innovative solutions for audio processing and streaming workloads
- Collaborate closely with training and serving infrastructure teams for seamless model deployment
- Focus on real-time and streaming audio inference integration
Requirements
- Location: Remote with preference for EST and PST time zones
- Experience in developing high-performance audio or machine learning inference systems
- Proficiency in C++ and Python programming languages
- Hands-on experience with deep learning models for audio, speech, or language applications
- Strong results-oriented mindset and bias for action
Nice to have
- Experience with GPU programming and low-level system optimization
- Knowledge of model parallelization techniques across multiple GPUs
- Experience with duplex real-time streaming architectures
- Familiarity with the internals of machine learning frameworks (PyTorch, TensorFlow, specialized audio libraries)
- Experience with inference frameworks like vLLM, SGLang, TensorRT-LLM, or custom distributed inference systems
- Expertise in sequence modeling and end-to-end audio pipeline optimization
Culture & Benefits
- Inclusive and open work environment
- Work with a cutting-edge AI research team
- Weekly lunch stipend, in-office lunches, and snacks
- Full health and dental benefits including mental health budget
- 100% parental leave top-up for up to 6 months
- Personal enrichment benefits for arts, fitness, and workspace improvement
- Remote-flexible with multiple global offices and co-working stipend
- 6 weeks of vacation (30 working days)