TL;DR
Research Engineer (AI): Improve the intelligence of public AI models by building and scaling training environments for reinforcement learning, with an emphasis on fine-tuning strategies, reward design, and data quality. The role focuses on exploring novel ways to create environments, developing QA frameworks, and translating capability goals into training systems.
Location: Hybrid (United States), with an expectation of being in one of the offices at least 25% of the time. Visa sponsorship is available.
Salary: $350,000–$850,000 USD
Company
hirify.global is a public benefit corporation focused on creating reliable, interpretable, and steerable AI systems.
What you will do
- Improve and execute fine-tuning strategies for adapting Claude to new domains and tasks.
- Manage technical relationships with external data vendors, including evaluation of data quality and reward design.
- Collaborate with domain experts to design data pipelines and evaluations.
- Explore novel ways of creating RL environments for high-value tasks.
- Develop and improve QA frameworks to catch reward hacking and ensure environment quality.
- Partner with other RL research teams and product teams to translate capability goals into training environments.
Requirements
- Experience with fine-tuning large language models for specific domains or real-world use cases.
- Experience with reinforcement learning, reward design, or training data curation for LLMs.
- Comfort with managing technical vendor relationships and iterating quickly on feedback.
- Strong project management and interpersonal skills.
- Passion for making AI more useful and accessible across different industries.
- At least a Bachelor's degree in a related field or equivalent experience.
Nice to have
- Experience training production ML systems.
- Familiarity with distributed systems and cloud infrastructure.
- Domain expertise in an area where models can be more useful.
- Experience working with external vendors or technical partners.
Culture & Benefits
- Competitive compensation and benefits, with optional equity donation matching.
- Generous vacation and parental leave, plus flexible working hours.
- Collaborative team focused on high-impact AI research.
- Emphasis on advancing steerable, trustworthy AI as an empirical science.
- Lovely office space in San Francisco for collaboration.