TL;DR
Staff Applied Researcher (AI Quality): Design and implement next-generation evaluation frameworks for AI-powered developer experiences, with an emphasis on code generation, reasoning, and agentic workflows. The role focuses on developing scalable automatic metrics, LLM-judge systems, and human-in-the-loop evaluation pipelines that influence product decisions across hirify.global AI.
Location: Remote, United States
Salary: $140,400.00 - $372,300.00 USD/Yr
Company
hirify.global is the world’s leading AI-powered developer platform and the biggest open-source community, empowering 150 million developers.
What you will do
- Design and implement next-generation evaluation frameworks for AI-powered developer experiences, including code generation, reasoning, and agentic workflows.
- Develop scalable automatic metrics, LLM-judge systems, reward models, and human-in-the-loop evaluation pipelines.
- Build and optimize evaluation tooling, datasets, and benchmarking systems, creating new benchmarks for coding agents.
- Collaborate with engineering, product, and design teams to productionize research and accelerate model iteration.
- Own end-to-end quality insights for hirify.global Copilot and new AI features.
- Shape hirify.global’s strategy for model quality and alignment, mentoring other researchers and engineers.
Requirements
- Bachelor's degree with 8+ years, Master's with 6+ years, or Doctorate with 4+ years of experience in data science, computer science, or related fields.
- 3+ years of hands-on engineering experience in Python or TypeScript.
- Experience building production-grade evaluation or data/ML pipelines at scale.
- Proven track record of shipping research or evaluation systems in production environments.
- Strong cross-functional communication and influence skills.
- Location: Remote, United States.
Nice to have
- Experience with LLM-judge systems, reward modeling, alignment, or safety evaluations.
- Background in code generation, developer tools, or AI-assisted programming.
- Experience with large-scale experimentation and online/offline evaluation strategies.
- Open-source contributions or experience working with developer communities.
- Experience designing and leading complex research projects from ideation to implementation.
Culture & Benefits
- Remote-first work environment with competitive pay.
- Generous learning and growth opportunities.
- Excellent benefits package.
- Company values: customer-obsessed, ship to learn, growth mindset, own the outcome, better together, diverse and inclusive.