TL;DR
AI Engineer, Product (AI): Build and optimize an LLM evaluation and A/B testing framework, improve end-to-end observability, and run a reliable model release process. The focus is on shipping measurable improvements to quality, latency, safety, and reliability in partnership with the Science team.
Location: This role is primarily based at one of our European offices (Paris, France and London, UK). We prioritize candidates who live in Paris or are open to relocating there. Remote candidates based in France, the UK, Germany, Belgium, the Netherlands, Spain, or Italy are also considered, provided they spend the first week of onboarding at the Paris office and visit it at least 3 days per month.
Company
hirify.global is a product company developing high-performance, optimized, open-source, and cutting-edge AI models, products, and solutions for enterprises, including the AI assistant 'le Chat'.
What you will do
- Build and maintain an LLM evaluation framework.
- Define and track key metrics (task success, helpfulness, hallucination, safety, latency/cost).
- Run and analyze A/B tests for prompts, models, and system prompts.
- Set up observability for LLM calls (logging, tracing, dashboards, alerts).
- Operate the model release process, including canary and shadow traffic, sign-offs, and regression detection.
- Improve core LLM behaviors (memory, intent classification, follow-ups, routing, tool-call reliability).
Requirements
- Strong TypeScript or Python skills.
- Production LLM experience with prompts, tool/function calling, and system prompts.
- Hands-on experience with evaluations and A/B testing, including designing metrics and making data-driven rollout decisions.
- Experience with observability (logging, tracing, dashboards, alerting).
- Product mindset: forming hypotheses, running experiments, interpreting results, and iterating.
- Clear written and spoken English communication.
Nice to have
- Experience with safety systems: moderation, PII handling/redaction, guardrails.
- Experience with release operations: canary/shadowing, automated rollbacks, experiment platforms.
Culture & Benefits
- Competitive salary and equity.
- Health insurance, transportation allowance, sport allowance, and meal vouchers.
- Private pension plan and generous parental leave.
- Visa sponsorship provided.
- Dynamic, collaborative, low-ego, and team-spirited environment.
Hiring process
- Introduction call (30 min).
- Hiring Manager interview (30 min).
- Technical Rounds: Live-coding and AI Engineering interviews (45 min each).
- Culture-fit discussion (30 min).
- References.