Company hidden
28 days ago

Software Engineer, Agent Evaluation and Quality (AI)

Job type
Full-time
English
B2
Listed in Hirify Global, a list of international tech companies

Job description

TL;DR

Software Engineer, Agent Evaluation and Quality (AI): Build the measurement, evaluation, and feedback-loop infrastructure that reliably improves hirify.global's core agent over time, with an emphasis on datasets, scorers, pipelines, analysis tooling, and user signals. The role focuses on designing AI evaluation systems, developing debugging workflows, interpreting agent behavior, and operationalizing quality metrics for reliability and guardrails.

Company

hirify.global AI is a flat, talent-dense organization building the best tool for professional programmers to automate coding through inventive research, design, and engineering.

What you will do

  • Design and build AI evaluation systems, including curated datasets, offline replay, scorers, regression alerts, and dashboards.
  • Design feedback loops from real usage by collecting, cleaning, and interpreting user signals to inform model and harness changes.
  • Develop analysis tooling and workflows for debugging agent behavior, deep-diving into failure modes, clustering themes, and surfacing insights.
  • Improve reliability and guardrails by defining what counts as a good, bad, or degraded session and by building alerting and triage primitives.

Requirements

  • Experience building and operating evaluation or measurement systems (such as AI evals, experimentation, ranking, or search quality) that turn ambiguous notions of quality into metrics, pipelines, and decisions.
  • Strong data acumen and ability to collaborate with data scientists and researchers.
  • Taste and strong opinions on model and agent behavior, and a habit of staying informed on emerging research and trends.
  • Strong software engineering fundamentals and experience shipping production systems.

Culture & Benefits

  • Flat organization that values truth-seeking, passion, creativity, spirited debate, crazy ideas, and shipping code.
  • Small, talent-dense team partnering closely with research, product, and infrastructure.
  • Impact that compounds across products and informs high-stakes decisions on models, quality, and cost.

Hiring process

  • Short technical interviews if there appears to be a fit.
  • Onsite visit to the office for a small project, idea discussions, and team meetings.