Senior Data Scientist - LLM Evaluation (AI)

$200,000 - $240,000

Work format

onsite

Employment type

fulltime

Grade

senior

English

Country

Job posting from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Senior Data Scientist (AI Engineering): Architect the LLM evaluation framework that defines the statistical standards for deployment in insurance claims processing, with an emphasis on statistical significance, inter-rater reliability, and robust experimental design. The focus is on turning subjective outputs into objective, measurable data for safe, accurate, and reliable AI products.

Location: Open to sponsoring candidates currently in the U.S. who need to transfer their active visa.

Salary: $200,000 - $240,000

Company

hirify.global delivers state-of-the-art technology that helps insurance claims teams make claims handling more accurate, fair, and efficient.

What you will do

Design and implement comprehensive scorecards and benchmarking suites for LLM-based extraction, summarization, and chat interfaces.
Act as the technical lead in working with Subject Matter Experts (SMEs) to codify their expertise into evaluation datasets.
Design statistical guardrails to scale human and automated labeling efforts.
Provide clear, data-driven "Go/No-Go" recommendations for model deployment based on error analysis and statistical confidence intervals.

Requirements

5+ years of experience in Data Science with a strong background in traditional statistics.
2+ years of focused experience working with LLMs, specifically in evaluation, benchmarking, and prompt auditing.
Master’s or PhD in Statistics, Mathematics, or a related quantitative field.
A quality mindset and a proven ability to work with non-technical SMEs to translate their qualitative feedback into quantitative metrics.
Proficient in Python (Pandas, Scikit-learn, Statsmodels) and SQL.

Nice to have

Deep knowledge of metrics like Cohen’s Kappa or Fleiss' Kappa to quantify agreement between SMEs and evaluate the clarity of labeling instructions.
Experience in Active Learning.
Experience with platforms like Labelbox, Snorkel, or Prodigy to manage the flow between human annotators and automated systems.

Culture & Benefits

Medical, dental, vision, short- and long-term disability, life insurance and AD&D, and 401(k) matching.
Paid time off and sick leave, 100% paid parental leave.
Catered lunches, happy hours, pet-friendly spaces, and a monthly technology stipend.
$1,000/year for each employee for professional development, as well as opportunities for tuition reimbursement.
