TL;DR
Senior Research Scientist (Language AI): Design, implement, and deploy cutting-edge research in reinforcement learning and post-training for large language models, with an emphasis on aligning models with human intent and enabling general capabilities. The role focuses on building and deploying state-of-the-art reinforcement learning pipelines at scale and bringing innovations into production in hirify.global's post-training stack.
Location: Hybrid in Berlin, Cologne, Hamburg, Munich, or London (office attendance required twice a week)
Company
hirify.global is a global communications platform powered by Language AI, focused on breaking down language barriers with human-sounding translations and intelligent writing suggestions for over 100,000 businesses worldwide.
What you will do
- Design, implement, and deploy cutting-edge research in reinforcement learning and post-training at scale.
- Build and deploy state-of-the-art reinforcement learning pipelines.
- Post-train large (multi-modal) models to align with human intent and enable reasoning capabilities.
- Manage the entire research and production lifecycle from idea conception to production deployment.
- Foster external collaborations with academic and industrial partners.
- Collaborate with Engineering, ML Platform, and HPC teams to deliver robust model updates.
Requirements
- Deep technical background, strong leadership skills, and a proven track record of taking reinforcement learning or large-scale model alignment work to production.
- Strong practical background, creative mindset, and passion for solving hard problems with real-world impact.
- Solid mathematical background (Master's, PhD, or equivalent industry experience in mathematics, physics, computer science, or a related field).
- Deep practical experience with Python and at least one modern machine learning framework (PyTorch, TensorFlow, or JAX).
- Track record of leading self-directed research projects that deliver tangible results.
- Hybrid work schedule, with team members coming into the office twice a week in Berlin, Cologne, Hamburg, Munich, or London.
Nice to have
- Experience working with large compute clusters and ML infrastructure.
- Expertise in deep reinforcement learning (RLHF/RLAIF/RLVR).
- Hands-on experience scaling and deploying LLMs or other foundation models in real-world systems.
Culture & Benefits
- Diverse and internationally distributed team (90+ nationalities).
- Open communication, regular feedback, and a culture valuing empathy and growth mindset.
- Flexible working hours and trust in productivity.
- Monthly full-day hacking sessions ("Hack Fridays").
- 30 days of annual leave and access to mental health resources.
- Competitive, location-tailored benefits package.
- Virtual Shares, linking employee contribution to hirify.global’s growth.