AI Evaluation Engineer (AI)
Job description
TL;DR
AI Evaluation Engineer (AI/ML): Developing and optimizing evaluation frameworks for LLM-based systems and agentic workflows, with an emphasis on data-driven iteration and model quality. Focus on designing feedback loops, curating high-leverage datasets, and analyzing model failure modes to improve contract understanding.
Location: Hybrid in San Francisco or New York City
Salary: $245,000 – $295,000
Company
A leading AI contracting platform that transforms legal agreements into intelligent assets for transformative organizations.
What you will do
- Analyze training and evaluation datasets to identify distributional gaps and labeling inconsistencies.
- Design and execute labeling campaigns, including the development of golden datasets and annotation guidelines.
- Build and maintain dashboards to track model accuracy, regression trends, and product-specific KPIs.
- Investigate failure modes via prompt clustering, error taxonomy development, and user intent classification.
- Operationalize feedback loops by mining product telemetry and human-in-the-loop reviews.
- Partner with engineers and PMs to run structured A/B tests and human evaluations for new features.
Requirements
- Bachelor's or Master's degree in a quantitative field (Statistics, Computer Science, Data Science, Applied Math).
- 8+ years of experience in applied ML or data science, preferably in NLP or LLM-based applications.
- Strong proficiency in SQL and Python, including experience with Pandas and experiment tracking tools.
- Must be based in San Francisco or New York City, or able to work there on a hybrid schedule.
- Ability to navigate ambiguity and communicate technical insights to cross-functional stakeholders.
Nice to have
- Familiarity with LLM eval techniques, Reinforcement Learning from Human Feedback (RLHF), or agentic system design.
- Experience with program management.
Culture & Benefits
- 100% health coverage for employees (medical, dental, and vision).
- Market-leading gender-neutral parental leave and compassionate leave policies.
- 401(k) plan with employer match for US employees.
- Monthly stipends for wellbeing and hybrid work.
- Mental health support through Modern Health, including therapy and coaching.