Job description
TL;DR
Staff AI Engineer (Web3): Design and optimize post-training pipelines for large language models, with an emphasis on alignment, controllability, and reasoning capabilities. The role focuses on implementing DPO, GRPO, and RLAIF systems to improve model performance, and on production-grade inference deployment.
Location: APAC
Company
OKX is a leading crypto exchange and developer of OKX Wallet, providing global access to crypto trading and decentralized applications.
What you will do
- Lead the full post-training pipeline for LLMs, including supervised fine-tuning and preference optimization.
- Implement advanced training paradigms such as DPO (Direct Preference Optimization) and GRPO (Group Relative Policy Optimization).
- Develop domain-specific data recipes, curation strategies, and augmentation pipelines.
- Build and refine Reward Models and RLAIF closed-loop systems.
- Optimize inference efficiency and deploy models using low-latency frameworks like vLLM and SGLang.
- Evaluate model performance through automated benchmarks and human/AI feedback loops.
Requirements
- Bachelor's degree in Computer Science, AI, Machine Learning, or a related field.
- At least 8 years of industry experience.
- Hands-on experience across the full post-training pipeline for large models.
- Deep familiarity with preference learning and alignment techniques (DPO, GRPO, RL-based methods).
- Proven experience training specialized small models from scratch.
- Experience deploying models in production using vLLM, SGLang, or similar frameworks.
Culture & Benefits
- Competitive total compensation package.
- L&D programs and education subsidies for professional growth.
- Comprehensive healthcare schemes for employees and their dependants.
- Wellness and meal allowances.
- Various team building programs and company events.