Company hidden
2 days ago

Member of Technical Staff - Model Evaluation (AI)

Work format: onsite
Employment type: fulltime
Grade: middle
English: b2
Country: UK
Listing from Hirify.Global, a list of international tech companies

Job description

TL;DR

Member of Technical Staff - Model Evaluation (AI): Develop and implement evaluation frameworks for Grok that measure performance on high-value enterprise use cases, with an emphasis on uncovering model weaknesses and tracking progress over time. The role focuses on building robust benchmarking infrastructure and collaborating with core modeling teams to drive iterative improvements in model quality.

Location: Must be based in or able to work from London, UK

Company

hirify.global is an AI research company focused on creating systems that understand the universe and aid humanity in its pursuit of knowledge.

What you will do

  • Identify and define high-value enterprise use cases for AI model application.
  • Develop and execute comprehensive assessments and benchmarks for AI models.
  • Analyze model training data and outputs to diagnose performance weaknesses.
  • Collaborate with modeling and data teams to design actionable improvement plans.
  • Build and maintain scalable infrastructure and frameworks for model evaluation.

Requirements

  • Experience in model assessment and the development of evaluation tasks, including public and in-house benchmarks.
  • Proficiency in collecting and synthesizing datasets for new evaluations.
  • Experience building infrastructure for model evaluation.
  • Familiarity with inference frameworks such as SGlang and vLLM.
  • Strong communication skills, with the ability to share technical findings accurately.

Culture & Benefits

  • Small, highly motivated team with a flat organizational structure.
  • Focus on engineering excellence and high-intensity execution.
  • Environment that prioritizes direct contribution and initiative.
  • Opportunity to shape the development of large-scale AI capabilities.
