Company hidden

1 day ago

QA Lead (AI Brand Evaluation)

Work format

remote (Colombia only)

Employment type

fulltime

Grade

lead

English

Country

Colombia

Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

QA Lead (AI Brand Evaluation): Own the end-to-end QA and evaluation strategy for agentic and AI-powered product features, with an emphasis on turning ambiguous business requirements into robust evaluation criteria and scoring systems. Focus on building validation frameworks for non-deterministic AI outputs, curating golden test sets, and running adversarial and ambiguity testing at high volume.

Location: Remote within Colombia

Company

hirify.global is a design and technology company building intelligent, shoppable brand experiences using its proprietary platform.

What you will do

Lead end-to-end QA and evaluation strategy for multi-agent architectures and high-volume data pipelines.
Translate business objectives into data-driven evaluation criteria, operational rubrics, and scoring methodologies.
Oversee brand scoring architecture and ensure automated systems produce precise, reliable business metrics from massive datasets.
Build tools, frameworks, and validation pipelines for non-deterministic AI outputs, including golden test set curation.
Establish governance and risk-adaptive guardrails for data precision, PII compliance, and logical reasoning across agentic workflows.
Drive adversarial/red-teaming and ambiguity testing; monitor observability dashboards for latency, data drift, and accuracy trends.

Requirements

Location: Must be based in Colombia (remote)
8+ years of QA engineering, systems analysis, or software testing experience, including 2+ years in a lead/strategic role.
Proven ability to lead QA initiatives in fast-paced environments and to bring order to ambiguous requirements.
Strong data/statistical mindset: move beyond pass/fail toward trend analysis, composite scoring, and statistical evaluation.
Hands-on scripting and data querying experience with Python, JavaScript, or SQL to build automated evaluation pipelines.
Operational fluency with LLM behaviors and agentic workflows (e.g., hallucinations, context limits, instruction adherence).

Nice to have

Experience with LLM evaluation frameworks (e.g., Ragas, LangSmith, TruLens) or prompt engineering.
Familiarity with cloud data warehouses and analytics platforms (e.g., BigQuery, Databricks, Google Cloud).
Experience building internal QA automation tools and lightweight validation scripts.

Culture & Benefits

Remote role with work based in Colombia.
Inclusive, equal-opportunity hiring and barrier-free recruitment process.
Focus on building intelligent, brand-focused experiences using emerging technologies.

Hiring process

Interviews to assess QA leadership, evaluation strategy, and experience with AI/LLM evaluation.
Discussion of approach to building scoring systems, golden datasets, and adversarial testing frameworks.
