TL;DR
Data Scientist (AI): Architect and maintain automated evaluation pipelines that assess answer quality for an LLM-first search engine, with an emphasis on designing evaluation sets for tool calls and developing VLM-based solutions that evaluate visual rendering. The role also involves continuously reviewing public benchmarks and directly shaping product changes through evaluation metrics.
Location: Hybrid in London, New York City, or Belgrade. USD salary ranges apply only to U.S.-based positions. International salaries are set based on the local market.
Salary: $210,000–$385,000
Company
hirify.global serves tens of millions of users daily with a reliable, high-quality LLM-first search engine and specialized data sources.
What you will do
- Architect and maintain automated evaluation pipelines to assess answer quality across hirify.global's products.
- Design evaluation sets and methods specifically to measure the impact of tool calls on final answer quality.
- Develop VLM-based solutions to programmatically evaluate how final answers render visually across platforms and devices.
- Continuously review and incorporate public benchmarks into regular performance measurements.
- Collaborate closely with technical leadership to measure and improve answer quality.
Requirements
- PhD or MS in a technical field or equivalent experience.
- 4+ years of experience in data science or machine learning.
- Strong proficiency in Python and SQL (expected to write production-grade code).
- Experience building within a modern cloud data stack, specifically AWS and Databricks.
- Comfortable with agentic coding workflows and using AI-assisted development tools.
Nice to have
- 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups.
- Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale.
- A strong research background, with experience applying research methods to real-world ML problems.
- Experience defining evaluation metrics and building ground truth datasets.
Culture & Benefits
- Comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter, and dependent care accounts for U.S. employees.
- Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.
- Operate within a small, high-impact team.
- Evaluation metrics directly shape product changes.