Company hidden
Updated 15 hours ago

Senior Data Scientist - LLM Evaluation (AI)

$200,000 - $240,000
Work format
onsite
Employment type
fulltime
Grade
senior
English
B2
Country
US
Vacancy from the Hirify.Global list (Hirify RU Global, a list of companies with Eastern European roots)

Job description

TL;DR

Senior Data Scientist (AI Engineering): Architect the LLM evaluation framework that defines the statistical standards for deploying models in insurance claims processing, with an emphasis on statistical significance, inter-rater reliability, and robust experimental design. The focus is on turning subjective outputs into objective, measurable data for safe, accurate, and reliable AI products.

Location: Open to sponsoring candidates currently in the U.S. who need to transfer their active visa.

Salary: $200,000 - $240,000

Company

hirify.global delivers state-of-the-art technology that helps insurance claims teams make claims handling more accurate, fair, and efficient.

What you will do

  • Design and implement comprehensive scorecards and benchmarking suites for LLM-based extraction, summarization, and chat interfaces.
  • Act as the technical lead in working with Subject Matter Experts (SMEs) to codify their expertise into evaluation datasets.
  • Design statistical guardrails to scale human and automated labeling efforts.
  • Provide clear, data-driven "Go/No-Go" recommendations for model deployment based on error analysis and statistical confidence intervals.

Requirements

  • 5+ years of experience in Data Science with a strong background in traditional statistics.
  • 2+ years of focused experience working with LLMs, specifically in evaluation, benchmarking, and prompt auditing.
  • Master’s or PhD in Statistics, Mathematics, or a related quantitative field.
  • A quality mindset and a proven ability to work with non-technical SMEs to translate their qualitative feedback into quantitative metrics.
  • Proficient in Python (Pandas, Scikit-learn, Statsmodels) and SQL.

Nice to have

  • Deep knowledge of metrics like Cohen’s kappa or Fleiss’ kappa to quantify agreement between SMEs and evaluate the clarity of labeling instructions.
  • Experience in Active Learning.
  • Experience with platforms like Labelbox, Snorkel, or Prodigy to manage the flow between human annotators and automated systems.

Culture & Benefits

  • Medical, dental, vision, short- and long-term disability, life insurance and AD&D, and 401(k) matching.
  • Paid time off and sick leave, 100% paid parental leave.
  • Catered lunches, happy hours, pet-friendly spaces, and a monthly technology stipend.
  • $1,000/year for each employee for professional development, as well as opportunities for tuition reimbursement.
