TL;DR

Member of Technical Staff, LLM Evaluation (AI): Develop and implement advanced methodologies to evaluate Copilot's performance in real-world scenarios, with an emphasis on large language model evaluation, classifier training, and real-time performance monitoring. Focus on designing automated evaluation frameworks, solving complex AI challenges, and collaborating with user researchers and product leaders to improve AI systems.

Location: Mountain View, United States; onsite work is expected at least four days a week if you live within 50 miles

Salary: $158,400–$304,200 per year depending on location and role level

Company

%hirify_global% is a leading technology corporation focused on AI research and development, empowering users worldwide through innovative software solutions.

What you will do

Develop and implement evaluation frameworks for Copilot's AI performance across diverse scenarios and edge cases.
Leverage data mining, prompt engineering, and classifier training to identify failure modes and mitigation strategies.
Build automated testing systems and efficient model pipelines for real-time AI performance monitoring.
Collaborate with user researchers and product leaders to maintain a user-oriented perspective and validate approaches.
Track and adapt state-of-the-art AI research techniques to drive innovation in production systems.

Requirements

Location: Must be based in or near Mountain View, United States, with onsite presence expected.
Degree in Data Science, Mathematics, Statistics, Computer Science, or a related field: Bachelor’s with 5+ years of experience, Master’s with 3+ years, or Doctorate with 1+ year.
Experience with data science techniques, managing structured and unstructured data, and statistical analysis.
Experience working with large language models and writing production-quality Python code.
Demonstrated interest in Responsible AI and creative problem solving in complex AI environments.
English proficiency: at least B2 level.

Nice to have

Doctorate with 5+ years of experience, or equivalent experience, in data science.
Experience in prompt engineering and classifier training for LLM evaluation.

Culture & Benefits

Work in a leading global AI research and development environment.
Collaborate with diverse teams focused on innovation and inclusion.
Competitive salary with location-based adjustments.
Commitment to growth mindset, respect, integrity, and accountability.
Onsite work policy with flexibility subject to local laws.