This vacancy is archived
2 months ago
Senior Research Engineer (Multimodal & Video Foundation Model)
Job description
TL;DR
Senior Research Engineer (Multimodal & Video Foundation Model): Develop advanced multimodal and video-centric AI models and architectures, with an emphasis on scalable training pipelines, novel AI architectures, and generative video models. The role focuses on designing and optimizing large-scale multimodal systems, prototyping generative AI applications, and benchmarking model performance across diverse tasks.
Location: 100% remote worldwide
Company
A leading fintech product company pioneering blockchain-based financial solutions, including the world’s most trusted stablecoin, USDT, with a global remote team.
What you will do
- Pioneer multimodal and video-centric AI research that feeds into prototypes and scalable systems.
- Design and implement novel AI architectures integrating text, visual, and audio modalities.
- Engineer scalable training and inference pipelines optimized for large-scale multimodal datasets and distributed GPU systems.
- Optimize data processing, model execution, and pipeline throughput for efficiency.
- Build modular tools for preprocessing and managing multimodal data assets.
- Collaborate cross-functionally to translate model innovations into production-grade solutions.
- Prototype generative AI applications showcasing new multimodal foundation model capabilities.
- Develop benchmarking tools to evaluate model performance across diverse tasks.
Requirements
- Bachelor’s degree in Computer Science, Computer Engineering, or related field, or equivalent experience.
- Expertise in Python and PyTorch, with experience across the full development pipeline, from data processing to optimization.
- Experience with large-scale text data; bonus for interleaved audio, video, image, and text data.
- Hands-on experience developing or benchmarking LLMs, Vision Language Models, Audio Language Models, or generative video models.
- First-author publications at leading AI conferences (CVPR, ICCV, ECCV, ICML, ICLR, NeurIPS).
- Excellent English communication skills (C1+ required).
Nice to have
- PhD in Computer Vision, Machine Learning, NLP, Computer Science, Applied Statistics, or related field.
- Expertise in computer vision, video generation foundation models, and multimodal research.
Culture & Benefits
- Fully remote work from anywhere worldwide.
- Collaborate with a global team of top fintech and AI professionals.
- Opportunity to work on cutting-edge AI and blockchain technologies.
- Focus on innovation and pioneering new financial and AI solutions.