TL;DR
Machine Learning Engineer (LLM Evals): Build and maintain evaluation pipelines, LLM-powered judges, and observability infrastructure to measure and improve AI assistant quality, with an emphasis on large-scale evaluation, quality measurement, and agent observability. The role focuses on designing reliable evaluation datasets, scoring assistant responses, and integrating quality signals into product launches.
Location: Hybrid, 3-4 days a week in San Francisco Bay Area offices
Salary: $200,000 - $300,000 annually
Company
hirify.global is a Work AI platform delivering enterprise AI solutions, including intelligent search, AI assistants, and scalable AI agents, with a focus on secure, customizable AI infrastructure for large organizations.
What you will do
- Design and curate evaluation datasets ensuring representative coverage of real assistant behavior.
- Build and maintain large-scale evaluation pipelines measuring assistant quality across thousands of queries.
- Develop LLM-powered judges to score correctness, completeness, and response quality aligned with human judgment.
- Evaluate new models and product changes to provide quality signals that gate launches and prevent regressions.
- Build observability infrastructure including trace enrichment, data pipelines, and dashboards for AI agents.
- Collaborate cross-functionally to integrate evaluation results and customer feedback to improve assistant behavior.
Requirements
- Location: Must work hybrid in San Francisco Bay Area offices (3-4 days/week).
- 2+ years of software engineering experience with strong coding skills.
- Strong backend fundamentals in Go and Python; experience with distributed data pipelines.
- Experience with LLM evaluation, reinforcement learning from human feedback, or NLP.
- Analytical rigor in interpreting offline metrics and real user experience.
- Team player with a customer-focused mindset and a commitment to quality.
Culture & Benefits
- Comprehensive benefits, including medical, vision, and dental coverage and a 401(k) plan.
- Home office improvement stipend and annual education and wellness stipends.
- Generous time-off policy and healthy daily lunches.
- Inclusive and diverse company culture with regular events.