Multimodal Generative AI Researcher (LLM/VLM)

Work format

remote (Global)

Employment type

full-time

Grade

lead

English

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Multimodal Generative AI Researcher (LLM/VLM): Designing and fine-tuning large-scale Vision-Language Models (VLMs) and Language Models (LLMs) for multimodal tasks across vision, language, and 3D, bridging research breakthroughs with scalable engineering. Focus on building robust training and evaluation pipelines, analyzing model performance, and publishing impactful research.

Location: Remote

Company

hirify.global is a leading generative AI company focused on open-source AI models.

What you will do

Design and fine-tune large-scale VLMs/LLMs for tasks such as visual reasoning, retrieval, 3D understanding, and embodied interaction.
Build robust, efficient training and evaluation pipelines including data curation, distributed training, and scalable fine-tuning.
Conduct in-depth analysis of model performance, including ablations, bias/robustness checks, and generalization studies.
Collaborate across research, engineering, and 3D/graphics teams to bring models from prototype to production.
Publish impactful research and help establish best practices for multimodal model adaptation.

Requirements

PhD or equivalent experience in Machine Learning, Computer Vision, NLP, Robotics, or Computer Graphics.
Proven track record in fine-tuning or training large-scale VLMs/LLMs for real-world downstream tasks.
Strong engineering mindset to design, debug, and scale training systems end-to-end.
Deep understanding of multimodal alignment and representation learning (e.g., vision–language fusion, CLIP-style pre-training, retrieval-augmented generation).
Familiarity with recent trends, including video-language and long-context VLMs, spatio-temporal grounding, agentic multimodal reasoning, and Mixture-of-Experts (MoE) fine-tuning.
Hands-on experience with PyTorch, DeepSpeed, Ray, and distributed or mixed-precision training.

Nice to have

Experience integrating 3D and graphics pipelines into training workflows.
Research or implementation experience with vision-language-action models or multimodal agents.
Familiarity with efficient adaptation methods (e.g., LoRA, adapters, QLoRA, parameter-efficient fine-tuning, distillation for edge deployment).
Knowledge of video and 4D generation trends, latent diffusion/rectified flow, or multimodal retrieval and reasoning pipelines.
Background in GPU optimisation, quantisation, or model compression for real-time inference.
Open-source contributions or publication track record in top-tier ML/CV/NLP venues.

Culture & Benefits

Work remotely, pushing the frontier of multimodal AI models.
Collaborative environment across research, engineering, and 3D/graphics teams.
Commitment to equal employment opportunity.
