Staff Product Manager (AI Evals)

Work format

remote (USA only)

Employment type

full-time

Grade

principal

English

Country

Posting from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Staff Product Manager (AI Evals): Owns the evaluation framework for AI agents at hirify.global, covering both internal framework development and customer-facing tools for agent assessment. Focus on translating practitioner knowledge of AI/ML evaluation into product features, driving adoption, and defining metrics for agent quality and customer evaluation engagement.

Location: Palo Alto, California or remote within the USA

Company

hirify.global is a leader in enterprise orchestration, providing an AI-powered platform to streamline operations by connecting data, processes, applications, and experiences for 400,000 global customers.

What you will do

Define and own the evaluation framework for hirify.global's internal AI agent features, driving adoption across teams.
Build the customer-facing evaluation experience for builders to test, measure, and improve agents created on hirify.global.
Make critical decisions regarding evaluation complexity exposure, balancing rigor with approachability.
Partner with the Build Experience PM to integrate evaluation seamlessly into the builder journey.
Work with ML engineers and platform teams to ground the framework in technical reality while ensuring accessibility.
Establish metrics for internal agent quality and customer evaluation adoption, and understand customer struggles with agent performance assessment.

Requirements

7+ years in Product Management with hands-on experience writing evaluations for AI/ML systems (agents, LLMs, or similar).
Track record of shipping technical products to both internal and external users.
Experience driving adoption of frameworks or practices across engineering teams.
Strong written and verbal communication skills.
Bachelor's degree or equivalent experience.
Practitioner depth in evaluations, including building test suites, designing rubrics, and debugging agent underperformance.
Strong product management experience, including shipping products, driving roadmaps, and leading cross-functional teams.
Technical translation ability to make complex evaluation concepts accessible to business technologists without oversimplification.
Internal influence skills to drive adoption of frameworks and tools across teams and collaborate credibly with ML engineers.
Comfort defining products from ambiguity, scoping v1s, and iterating based on learnings.
B2B product sensibility, viewing enterprise conventions as problems to solve.

Nice to have

Experience with agent architectures, RAG systems, or LLM application development.
Background in ML engineering, solutions architecture, or technical program management.
Experience building developer tools or platform products.
Familiarity with evaluation frameworks (e.g., human eval pipelines, automated benchmarks, red-teaming).

Culture & Benefits

Flexible, trust-oriented culture that empowers full ownership of roles.
Driven by innovation and seeking team players to actively build the company.
Emphasis on balancing productivity with self-care.
Vibrant and dynamic work environment.
Multitude of benefits (detailed on careers page).
Recognized as a top enterprise startup and a leader for remote workers.

Job posting text reproduced from the source.