Company hidden
13 hours ago

ML Engineer (AI)

$120,000 - $250,000
Work format
Hybrid
Employment type
Full-time
English
B2
Country
France/UK
Relocation
France
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

ML Engineer (AI/LLMs): Train, post-train, and evaluate core LLMs for an AI safety platform, with an emphasis on SFT, RLHF, and DPO-style alignment. The role focuses on building large-scale data pipelines, running distributed multi-GPU training, and optimizing production inference.

Location: Hybrid in Paris or London (Relocation package available for Paris only)

Compensation: $120K – $250K + Equity

Company

AI safety company building a reliability and optimization layer for AI systems using natural-language policies to enforce model behavior.

What you will do

  • Train and post-train LLMs using SFT, RLHF, DPO, and related alignment methods.
  • Build reward models based on human and synthetic preference data.
  • Design and manage high-throughput data pipelines for collection, filtering, and quality control at scale.
  • Execute distributed training on multi-GPU clusters and debug performance issues.
  • Develop evaluation systems and benchmarks to drive training decisions.
  • Optimize models for production inference using quantization, speculative decoding, and vLLM/TensorRT.

Requirements

  • Hands-on experience with modern LLM post-training (SFT, RLHF, DPO) on models you have trained yourself.
  • Experience building large-scale data pipelines for training corpora and synthetic data.
  • Proficiency in PyTorch or JAX with experience in distributed multi-GPU training.
  • Deep understanding of model evaluation and the ability to build reliable benchmarks.
  • Experience with inference optimization tools like vLLM, TensorRT, or Triton.
  • Must be based in or able to relocate to Paris or London.

Nice to have

  • Public builder footprint (open-source models, datasets, or papers on HuggingFace/GitHub).
  • Experience at frontier or near-frontier AI labs.
  • Knowledge of advanced RL methods for LLMs (e.g., online RL, GRPO-style methods).
  • Experience with large-scale moderation, safety, or classification models.
  • Experience in multilingual model training.

Culture & Benefits

  • Paid time off in accordance with local regulations.
  • Comprehensive medical insurance for the France-based team.
  • Relocation support available for candidates moving to Paris.
  • Full provision of necessary hardware, AI agent subscriptions, and IDEs.
  • Bi-annual team off-sites in diverse locations.

Hiring process

  • Introductory call with HR (25 min).
  • Take-home technical task.
  • Technical interview with the Head of Applied Research (60 min).
  • Final conversation with the CEO (45 min).
