TL;DR
Member of Technical Staff, Pretraining Evaluations (AI): Develop and improve methods for measuring base model progress in large language models, with an emphasis on implementing new evaluations and reducing noise in existing ones. The focus is on the statistical understanding of evaluations, improving the signal-to-noise ratio, and measuring model progress at all scales.
Location: Remote. No restrictions on where you can be located for this role.
Company
hirify.global is an AI company training and deploying frontier models for developers and enterprises building AI systems.
What you will do
- Deeply understand individual evaluation tasks in the base model evaluation suite, including their strengths and limitations.
- Suggest and implement improvements to the base model evaluation suite by adding new tasks or removing redundant ones.
- Improve the statistical understanding of evaluations and enhance the signal-to-noise ratio of the evaluation suite.
Requirements
- Familiarity with base model evaluations and how they differ from evaluations of post-trained models.
- Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance.
- Ability to convey statistical information effectively to a broad audience using visualizations.
- Extremely strong software engineering skills.
- Proficiency in a programming language such as Python and in ML frameworks (e.g., PyTorch, TensorFlow, JAX).
- Excellent communication skills to collaborate effectively with cross-functional teams and present findings.
- One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP).
Culture & Benefits
- Open and inclusive culture and work environment.
- Work closely with a team on the cutting edge of AI research.
- Weekly lunch stipend, in-office lunches & snacks.
- Full health and dental benefits, including a separate budget for mental health.
- 100% Parental Leave top-up for up to 6 months.
- Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement.
- Remote-flexible, with offices in Toronto, New York, San Francisco, London, and Paris, as well as a co-working stipend.
- 6 weeks of vacation (30 working days).