Company hidden
3 hours ago

Machine Learning Engineer (LLM Evals)

$200,000 - $300,000
Work format
hybrid
Employment type
full-time
Grade
middle
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Machine Learning Engineer (LLM Evals & Observability): Design and curate evaluation datasets and pipelines for AI assistants and agents, with an emphasis on LLM-powered judges, quality metrics, and observability infrastructure. Focus on building scalable evals, evaluating model changes, and closing the loop for continuous improvement using customer feedback and automated techniques.

Location: Hybrid (3-4 days a week in one of our SF Bay Area offices)

Salary: $200,000 - $300,000 annually

Company

hirify.global is the Work AI platform powering intelligent enterprise search, AI Assistant, and scalable AI agents with over 100 SaaS connectors.

What you will do

  • Design and curate evaluation datasets with sampling strategies, query diversity, and golden sets for reliable coverage of assistant behavior.
  • Build and maintain large-scale evaluation pipelines measuring quality across thousands of real user queries.
  • Develop LLM-powered judges scoring correctness, completeness, and response quality aligned with human judgment.
  • Evaluate new models and product changes to gate launches and prevent regressions.
  • Build observability for AI agents including trace enrichment, data pipelines, and dashboards.
  • Close the quality loop using eval results, feedback, and techniques like automated prompt iteration.
  • Collaborate across teams to integrate evals into the shipping process.

Requirements

  • 2+ years of software engineering experience with strong coding skills
  • Strong backend fundamentals in Go and Python; comfortable with distributed data pipelines
  • Experience with LLM evaluation, RLHF, NLP, or large ML systems
  • Analytically rigorous mindset focused on metrics that predict user experience
  • Team player who thrives in a customer-focused, cross-functional environment
  • Deep care for quality in systems and product improvement

Culture & Benefits

  • Comprehensive benefits: Medical, Vision, Dental coverage, generous time-off, 401k contribution.
  • Home office improvement stipend, annual education and wellness stipends.
  • Vibrant culture with regular events and daily healthy lunches.
  • Commitment to diversity, inclusion, and AI fluency for all hires.

Hiring process

  • AI-focused exercise or discussion in interviews to assess AI thinking and usage.
