TL;DR
Applied AI Evaluation Engineer (AI): Designing and implementing evaluation systems for LLM capabilities across diverse customer use cases, with an emphasis on building robust evaluation infrastructure and developing novel methodologies. Focus on understanding model performance for enterprise customers and translating insights into model improvements.
Location: On-site in Paris
Company
hirify.global democratizes AI through high-performance, optimized, open-source, and cutting-edge models, products, and solutions, with a comprehensive AI platform designed to meet enterprise needs.
What you will do
- Design and implement comprehensive evaluation frameworks to measure LLM capabilities across diverse customer use cases.
- Build scalable evaluation infrastructure and pipelines that enable rapid, reproducible assessment of model performance.
- Develop novel evaluation methodologies to assess emerging capabilities or verticalized use cases.
- Create custom evaluation suites tailored to enterprise customers' specific needs.
- Collaborate with research teams to translate evaluation insights into model improvements.
- Partner with product teams to continuously improve evaluation tooling based on customer feedback.
Requirements
- Fluent in English (C1).
- 3+ years of experience in ML evaluation and benchmarking for LLMs or agentic systems.
- Proven experience in AI or machine learning product implementation with APIs and back-end.
- Deep understanding of concepts and algorithms underlying machine learning and LLMs.
- Strong technical coding skills in Python.
- Strong communication skills, with the ability to explain complex technical concepts to both technical and non-technical audiences.
Nice to have
- Contributions to open-source evaluation frameworks (e.g., LM Eval Harness, OpenAI Evals) or published research on LLM evaluation.
- Experience as a Customer Engineer, Forward Deployed Engineer, Sales Engineer, Solutions Architect or Technical Product Manager.
- Experience with ML frameworks (PyTorch, HuggingFace Transformers).
Culture & Benefits
- Focus on people and outputs, shipping results over time spent.
- Autonomous work with direct communication; the best idea wins regardless of seniority.
- Direct and timely feedback, low ego, high standards, and an unstructured environment.
- Full health insurance coverage for you and your family.
- Competitive PTO (25 days of holidays plus, on average, 8 to 10 RTT days).
- Comprehensive mobility allowance (€600 per year) covering public transportation and eco-friendly travel.