Member of Technical Staff, Evals (AI)
Job description
TL;DR
Member of Technical Staff, Evals (AI): Build and maintain an internal evaluations platform for frontier AI systems, with an emphasis on large-scale evaluation infrastructure and dataset quality measurement. The role focuses on auditing benchmarks, improving evaluation correctness, and developing frameworks that ensure trustworthy measurements for pre-training and RL.
Location: San Francisco, USA (Visa sponsorship and relocation support provided)
Salary: $200,000 – $550,000 per year + Equity
Company
The company is a small, fast-moving, and highly collaborative team dedicated to the safe deployment of AGI.
What you will do
- Build and maintain the internal evals platform used across Magic Design.
- Implement and validate eval tasks for pre-training, post-training, reinforcement learning, and product systems.
- Develop infrastructure for running large-scale evaluations and measuring dataset quality.
- Audit and improve public benchmarks and open-source evaluation methodologies.
- Partner with research, data, and product teams to define metrics reflecting model quality.
- Build tooling and frameworks that enable data-driven decisions based on trustworthy measurements.
Requirements
- Strong software engineering fundamentals with experience building production systems or internal platforms.
- Experience working with machine learning systems, evaluation frameworks, or research tooling.
- Ability to reason critically about benchmarks, metrics, and experimental methodology.
- Experience designing, implementing, or operating systems that run at scale.
- Strong debugging and investigative skills with a high bar for correctness.
- Must be based in or willing to relocate to San Francisco.
Culture & Benefits
- 401(k) plan with 6% salary matching.
- Generous health, dental, and vision insurance for employees and dependents.
- Unlimited paid time off.
- Visa sponsorship and relocation support for candidates moving to San Francisco.
- Culture based on integrity, hands-on building, and a singular focus on AGI.