Machine Learning Engineer (LLM Evals)

$200,000 - $300,000

Work format

hybrid

Employment type

full-time

Grade

middle

English

Country

Job listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Machine Learning Engineer (LLM Evals): Build and maintain evaluation pipelines, LLM-powered judges, and observability infrastructure to measure and improve AI assistant quality, with an emphasis on large-scale evaluation, quality measurement, and agent observability. The role focuses on designing reliable evaluation datasets, scoring assistant responses, and integrating quality signals into product launches.

Location: Hybrid, 3-4 days a week in San Francisco Bay Area offices

Salary: $200,000 - $300,000 annually

Company

hirify.global is a Work AI platform delivering enterprise AI solutions, including intelligent search, AI assistants, and scalable AI agents, with a focus on secure, customizable AI infrastructure for large organizations.

What you will do

Design and curate evaluation datasets that give representative coverage of real assistant behavior.
Build and maintain large-scale evaluation pipelines measuring assistant quality across thousands of queries.
Develop LLM-powered judges to score correctness, completeness, and response quality aligned with human judgment.
Evaluate new models and product changes to provide quality signals that gate launches and prevent regressions.
Build observability infrastructure including trace enrichment, data pipelines, and dashboards for AI agents.
Collaborate cross-functionally, using evaluation results and customer feedback to improve assistant behavior.

Requirements

Location: Must work hybrid in San Francisco Bay Area offices (3-4 days/week)
2+ years of software engineering experience with strong coding skills.
Strong backend fundamentals in Go and Python; experience with distributed data pipelines.
Experience with LLM evaluation, reinforcement learning from human feedback, or NLP.
Analytical rigor in interpreting both offline metrics and real user experience.
Team player with a customer-focused mindset and a commitment to quality.

Culture & Benefits

Comprehensive benefits including medical, vision, and dental coverage, and a 401(k) plan.
Home office improvement stipend and annual education and wellness stipends.
Generous time-off policy and healthy daily lunches.
Inclusive and diverse company culture with regular events.
