TL;DR
Staff Research Engineer (AI/LLMs): Building hirify.global-native foundation Large Language Models (LLMs) and defining Continual Pre-Training (CPT) strategies, with an emphasis on domain adaptation, multimodal data fusion, and scaling laws for graph-based data. The focus is on transforming generic foundation models into hirify.global-native experts and on designing continuous evaluation pipelines for real-time model monitoring.
Location: This role is fully remote within the United States. If you live near one of our offices (San Francisco, Los Angeles, New York City, or Chicago), our doors are open for you to come in as often as you'd like.
Salary: $230,000 – $322,000 USD
Company
hirify.global is a community of communities, built on shared interests, passion, and trust, and is one of the internet’s largest sources of information.
What you will do
- Architect and validate Continual Pre-Training (CPT) frameworks, focusing on domain adaptation techniques for hirify.global-native LLMs (a minimal training-step sketch follows this list).
- Lead research into fusing vision and language encoders for multimodal data processing of hirify.global’s rich media.
- Formulate data curriculum strategies to maximize community understanding while maintaining safety and reasoning capabilities.
- Conduct deep-dive research into Scaling Laws for Graph-based data, investigating impacts on model convergence.
- Design and scale continuous evaluation pipelines (“hirify.global Gym”) that monitor model reasoning and safety capabilities in real-time.
- Drive high-stakes architectural decisions regarding compute allocation, distributed training strategies, and checkpointing mechanisms on AWS Trainium/Nova clusters.
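To make the CPT item above concrete, here is a minimal sketch of one continual pre-training step that mixes in-domain batches with replayed general-domain batches to limit catastrophic forgetting. The tiny model, the random token batches, and the 0.2 replay weight are illustrative assumptions, not a description of hirify.global's actual stack.

```python
import torch
import torch.nn as nn

VOCAB, DIM, SEQ = 1000, 64, 32

class TinyCausalLM(nn.Module):
    """Stand-in for a pretrained decoder-only LM (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):
        # Causal mask: each position attends only to earlier positions.
        n = tokens.size(1)
        mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        return self.head(self.blocks(self.embed(tokens), mask=mask))

def lm_loss(model, tokens):
    # Standard next-token prediction loss.
    logits = model(tokens[:, :-1])
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))

def cpt_step(model, opt, domain_batch, replay_batch, replay_weight=0.2):
    # Mix the in-domain loss with a replayed general-domain loss; the
    # replay term is a simple hedge against catastrophic forgetting.
    opt.zero_grad()
    loss = ((1 - replay_weight) * lm_loss(model, domain_batch)
            + replay_weight * lm_loss(model, replay_batch))
    loss.backward()
    opt.step()
    return loss.item()

model = TinyCausalLM()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
domain = torch.randint(0, VOCAB, (8, SEQ))  # stand-in for community text
replay = torch.randint(0, VOCAB, (8, SEQ))  # stand-in for general web text
print(cpt_step(model, opt, domain, replay))
```

In this pattern the replay weight is the main knob: too low and the model drifts from its general capabilities, too high and domain adaptation stalls.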
Requirements
- 7+ years of experience in Machine Learning engineering or research, with a specific focus on LLM Pre-training, Domain Adaptation, or Transfer Learning.
- Expert-level proficiency in Python and deep learning frameworks (PyTorch or JAX), with a track record of debugging complex training instabilities at scale.
- Deep theoretical understanding of Transformer architectures and Pre-training dynamics (Catastrophic Forgetting, Knowledge Injection).
- Experience with Multimodal models (VLMs) and an understanding of how to align image/video encoders with language decoders (see the alignment sketch after this list).
- Experience implementing continuous integration/evaluation systems for ML models, measuring generalization and reasoning performance.
- Demonstrated ability to communicate complex technical concepts to leadership and coordinate efforts across Infrastructure and Data teams.
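As context for the multimodal requirement, the sketch below shows one common alignment pattern: a frozen vision encoder's patch features are projected into the language decoder's embedding space and prepended as soft tokens. The stand-in modules and dimensions are assumptions for illustration, not the team's actual architecture.

```python
import torch
import torch.nn as nn

TXT_DIM, IMG_DIM, VOCAB = 64, 128, 1000

class VisionLanguageBridge(nn.Module):
    """Prepend projected image features to text embeddings as soft tokens."""
    def __init__(self, vision_encoder, lm_head):
        super().__init__()
        self.vision = vision_encoder
        for p in self.vision.parameters():
            p.requires_grad_(False)              # vision tower stays frozen
        self.project = nn.Linear(IMG_DIM, TXT_DIM)  # trainable adapter
        self.lm_head = lm_head

    def forward(self, patch_feats, token_embeds):
        img_tokens = self.project(self.vision(patch_feats))  # (B, N, TXT_DIM)
        fused = torch.cat([img_tokens, token_embeds], dim=1)
        return self.lm_head(fused)               # logits over the joint sequence

# Stand-ins for a real vision tower and language decoder.
bridge = VisionLanguageBridge(nn.Linear(IMG_DIM, IMG_DIM),
                              nn.Linear(TXT_DIM, VOCAB))
patches = torch.randn(2, 16, IMG_DIM)   # 16 patch features per image
text = torch.randn(2, 8, TXT_DIM)       # 8 text token embeddings
print(bridge(patches, text).shape)      # torch.Size([2, 24, 1000])
```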
Nice to have
- Published research or open-source contributions in Continual Learning, Curriculum Learning, or Efficient Fine-Tuning (LoRA/PEFT); a minimal LoRA sketch follows this list.
- Experience with Graph Neural Networks (GNNs) or processing tree-structured data.
- Proficiency in low-level optimization (CUDA, Triton) or distributed training frameworks (Megatron-LM, DeepSpeed, FSDP).
- Familiarity with Safety alignment techniques (RLHF/DPO) and how they inform pre-training objectives.
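As a reference point for the Efficient Fine-Tuning item, here is a minimal LoRA layer: the pretrained weight stays frozen while a low-rank update, scaled by alpha/r as in the original LoRA paper, is learned on top. The layer sizes and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r               # standard LoRA scaling

    def forward(self, x):
        # Base projection plus the scaled low-rank correction; B starts at
        # zero, so training begins from the pretrained behavior.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(64, 64))
out = layer(torch.randn(4, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, trainable)  # only A and B are trainable
```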
Culture & Benefits
- Comprehensive Healthcare Benefits and Income Replacement Programs.
- 401k with Employer Match.
- Global Benefit programs covering workspace, professional development, and caregiving support.
- Family Planning Support, Gender-Affirming Care, and Mental Health & Coaching Benefits.
- Flexible Vacation & Paid Volunteer Time Off.
- Generous Paid Parental Leave.
Hiring process
- Interviews may be recorded, transcribed, and summarized by artificial intelligence (AI), with an opportunity to opt out.
- Personal information collected includes Identifiers, Professional and Employment-Related Information, and Sensory Information for application evaluation.