Research Engineer, Post-Training (AI)
Job description
TL;DR
Research Engineer, Post-Training (AI): Scaling the post-training loop that turns expert feedback into high-performing models for legal work, with an emphasis on model-training experiments and reward systems. Focus on optimizing agent harnesses, designing reliable grading systems, and improving quality on long-horizon legal tasks.
Location: San Francisco
Salary: $231,000 – $340,000 USD (Offers Equity)
Company
The company is an AI platform transforming legal and professional services by combining frontier agentic AI with deep domain expertise.
What you will do
- Drive post-training experiments to push agent performance while balancing cost, latency, security, and governance.
- Optimize agent harnesses, including domain-specific skills, retrieval strategies, and validation loops for long-horizon legal work.
- Design and develop reliable grading and reward systems for high-stakes legal evaluation and iteration.
- Analyze agent behavior to identify successful patterns and convert findings into training data, evals, or harness changes.
- Collaborate with internal and external research partners to define experiments and execute model improvements.
Requirements
- Hands-on experience with post-training or model-training (SFT, preference optimization, RLHF/RLAIF, reward modeling, distillation).
- Strong judgment about model behavior, including the ability to inspect traces and outputs and to identify failure modes.
- Strong Python and research-engineering ability to build reliable systems for faster research iteration.
- Ability to self-manage ambiguous applied research projects and communicate effectively with cross-functional teams.
- Must be located in San Francisco.
Nice to have
- Experience building ML data or evaluation infrastructure, such as curation pipelines or experiment tracking dashboards.
- Experience with distributed training, inference systems, GPU workloads, or large-scale ML experimentation.
- Research publications, open-source contributions, or shipped industry work in LLMs and AI agents.
Culture & Benefits
- Competitive compensation including a high base salary and equity.
- Opportunity to contribute to a generational company at a true inflection point with strong product-market fit.
- High-intensity environment that values decisiveness, simplicity, and ownership.
- Collaborative culture working alongside world-class researchers and investors.