Company hidden
6 days ago

Machine Learning Engineer (LLM Evals)

$200,000 - $300,000
Work format
hybrid
Employment type
full-time
Grade
middle
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Machine Learning Engineer (LLM Evals): Build and maintain evaluation pipelines, LLM-powered judges, and observability infrastructure to measure and improve AI assistant quality, with an emphasis on large-scale evaluation, quality measurement, and agent observability. The role focuses on designing reliable evaluation datasets, scoring assistant responses, and integrating quality signals into product launches.

Location: Hybrid, 3-4 days a week in San Francisco Bay Area offices

Salary: $200,000 - $300,000 annually

Company

hirify.global is a Work AI platform delivering enterprise AI solutions, including intelligent search, AI assistants, and scalable AI agents, with a focus on secure, customizable AI infrastructure for large organizations.

What you will do

  • Design and curate evaluation datasets that provide representative coverage of real assistant behavior.
  • Build and maintain large-scale evaluation pipelines that measure assistant quality across thousands of queries.
  • Develop LLM-powered judges that score correctness, completeness, and response quality in line with human judgment.
  • Evaluate new models and product changes to provide quality signals that gate launches and prevent regressions.
  • Build observability infrastructure for AI agents, including trace enrichment, data pipelines, and dashboards.
  • Collaborate cross-functionally, integrating evaluation results and customer feedback to improve assistant behavior.

Requirements

  • Location: Must work hybrid in San Francisco Bay Area offices (3-4 days/week).
  • 2+ years of software engineering experience with strong coding skills.
  • Strong backend fundamentals in Go and Python; experience with distributed data pipelines.
  • Experience with LLM evaluation, reinforcement learning from human feedback, or NLP.
  • Analytical rigor in interpreting both offline metrics and real user experience.
  • Team player with a customer-focused mindset and a commitment to quality.

Culture & Benefits

  • Comprehensive benefits, including medical, vision, and dental coverage and a 401(k) plan.
  • Home office improvement stipend and annual education and wellness stipends.
  • Generous time-off policy and healthy daily lunches.
  • Inclusive and diverse company culture with regular events.