Member Of Technical Staff, LLM Inference (AI Engineering)
Job description
TL;DR
Member Of Technical Staff, LLM Inference (AI Engineering): Responsible for building and maintaining the tools and systems that let AI researchers run models easily and efficiently, with an emphasis on optimizing compute efficiency and enabling cutting-edge research and production deployment. The work spans kernels, architecture co-design, distributed systems, and profiling and testing tools.
Location: Expected to work from a designated office at least four days a week if you live within 50 miles (U.S.) or 25 miles (non-U.S.) of that location.
Salary: USD $139,900 – $331,200 per year.
Company
AI (MAI) is dedicated to advancing Copilot and other consumer AI products and research.
What you will do
- Work alongside researchers and engineers to implement frontier AI research ideas.
- Introduce new systems, tools, and techniques to improve model inference performance.
- Build tools to debug performance bottlenecks, numeric instabilities, and distributed systems issues.
- Build tools and establish processes to enhance the team’s collective productivity.
- Find ways to overcome roadblocks and deliver work to users quickly and iteratively.
Requirements
- Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- Experience with generative AI.
- Experience with distributed computing.
- Python and Python ecosystem expertise.
Nice to have
- Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience.
- Experience with large scale production inference.
- Experience with GPU kernel programming.
- Experience benchmarking, profiling, and optimizing PyTorch generative AI models.
- Experience with open source inference frameworks like vLLM and SGLang.
- Working experience with, and fluency in, the material in the JAX scaling book.
Culture & Benefits
- Expected to work from a designated office at least four days a week if you live within 50 miles (U.S.) or 25 miles (non-U.S.) of that location.