TL;DR
Senior AI Inference Engineer (AI): Work on the C++ layer powering local AI, enhancing inference engines like llama.cpp and ONNX to run efficiently on edge devices, with an emphasis on runtime optimization, model loading speed, and performance across different hardware. The focus is on ensuring the inference layer is stable, optimized, and ready for integration with the rest of the stack, enabling private and fast on-device AI.
Company
Tether is building cutting-edge solutions that empower businesses to seamlessly integrate reserve-backed tokens across blockchains.
What you will do
- Deploy machine learning models to edge devices using frameworks like llama.cpp, ggml, and ONNX.
- Collaborate with researchers to assist in coding, training, and transitioning models from research to production.
- Integrate AI features into existing products.
Requirements
- Excellent programming skills in C++.
- Strong experience with the llama.cpp and ggml inference engines.
- Good understanding of deep learning concepts and model architectures.
- Experience with transformers and LLMs.
- Ability to rapidly assimilate new technologies and techniques.
- A degree in Computer Science, AI, Machine Learning, or a related field.
Culture & Benefits
- Join a global talent powerhouse, working remotely from anywhere in the world.
- Opportunity to collaborate with some of the brightest minds in the fintech space.
- Contribute to the most innovative platform on the planet.