The role is clearly defined, with a focus on document intelligence and modern machine learning methods, but the listing does not state a salary, which makes it hard to fully assess how attractive the offer is.
Salary not specified · Modern stack · Product company · Clearly defined role
Full-time · Middle · 🌎 World · 💻 Development · 🏠 Remote
Job description
What you'll do:
Model Development & Evaluation
Build and maintain evaluation frameworks for document models, LLMs, OCR, and structured extraction.
Define metrics, benchmarks, and validation strategies for real-world document workloads.
Dataset & Pipeline Creation
Design and curate high-quality datasets for supervised training, fine-tuning, and validation.
Create scalable preprocessing pipelines for PDFs, scans, images, forms, and semi-structured documents.
Model Training & Fine-Tuning
Train and fine-tune transformer-based OCR, VLMs, layout models, and open-source LLMs for document understanding tasks.
Optimize models for reliability, accuracy, and cost efficiency in production environments.
Inference & Deployment
Deploy ML models with modern inference runtimes (vLLM, TGI, TensorRT, ONNX Runtime).
Build guardrails, monitoring, and fallback mechanisms to ensure safe and predictable model behavior.
RAG & Document Reasoning
Develop retrieval and chunking strategies tailored to document structures (tables, forms, multi-page PDFs).
Optimize end-to-end RAG pipelines for semantic search, Q&A, and workflow automation.
Cross-Functional Collaboration
Partner with PMs, backend engineers, and product designers to define AI opportunities and translate requirements into technical solutions.
Who you are:
We are expanding our AI/ML function with an ML Engineer who specializes in document intelligence, vision-language models, and LLM-based extraction and reasoning. You should be comfortable with both traditional document AI approaches and cutting-edge GenAI workflows. You thrive in fast-moving environments, are self-directed, and enjoy solving practical ML problems that directly impact customers.
We’re looking for someone with experience in:
Vision transformers, layout models, and OCR systems
Structured extraction from complex documents
RAG for document-heavy workloads
Optimizing LLM pipelines for cost, accuracy, and throughput
Deploying and benchmarking models in real production systems
Required Experience:
5+ years of Python experience
Experience training, fine-tuning, and deploying traditional computer vision models for document intelligence tasks (layout detection, table extraction, OCR, information extraction)
Hands-on experience with document understanding frameworks and models:
Traditional document AI models (LayoutLM, Donut, DocFormer)
Modern vision-language models with OCR capabilities (DeepSeek-OCR, LightOnOCR-1B, etc.)
Experience deploying and optimizing models using inference frameworks such as vLLM (preferred), TGI, TensorRT, or ONNX Runtime
Experience applying LLMs to document intelligence workflows, including both frontier models and open-source alternatives
Strong understanding of coordinate systems and spatial reasoning for absolute positioning and field detection in forms/documents
It would be awesome if you had:
Familiarity with PDF parsing libraries and document preprocessing pipelines
Experience fine-tuning open-source models for domain-specific document tasks
Knowledge of evaluation metrics for document understanding tasks (F1, exact match, etc.)
Benefits:
An honest, open culture that emphasizes feedback and promotes professional and personal development
An opportunity to work from anywhere — our team is distributed worldwide, from Lisbon to Manila, from Florida to California
6 self-care days
A competitive salary
And much more!
Apply for this job. Please mention "I found this job at Remocate!"
The job text is reproduced unchanged.
Source: Telegram channel (the channel name is shown after logging in).