Company hidden
3 months ago

Member Of Technical Staff - Safety Lead (AI)

Work format
onsite
Employment type
fulltime
Level
lead
English
B2
Country
UK/US
Relocation
UK/US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Member of Technical Staff - Safety Lead (AI): Own the red-teaming and adversarial evaluation pipeline for Reflection’s models, with an emphasis on security, misuse, and alignment gaps. Develop scalable, automated safety benchmarks that evolve alongside model capabilities, moving beyond static datasets to dynamic adversarial testing.

Location: San Francisco, London, or New York. On-site.

Company

Reflection’s mission is to build open superintelligence and make it accessible to all.

What you will do

  • Own the red-teaming and adversarial evaluation pipeline for Reflection’s models.
  • Translate safety findings into concrete guardrails, ensuring models behave reliably under stress and adhere to deployment policies.
  • Validate that every release meets the lab’s risk thresholds before it ships.
  • Develop scalable, automated safety benchmarks that evolve alongside our model capabilities.
  • Research and implement state-of-the-art jailbreaking techniques and defenses.

Requirements

  • Graduate degree (MS or PhD) in Computer Science, Machine Learning, or related discipline, or equivalent practical experience in AI Safety.
  • Deep technical understanding of LLM safety, including adversarial attacks, red-teaming methodologies, and interpretability.
  • Strong software engineering capabilities with experience building automated evaluation pipelines or large-scale ML systems.
  • Willingness to make high-stakes decisions regarding model release and safety thresholds.
  • Passion for advancing the frontier of intelligence.

Nice to have

  • Experience with Reinforcement Learning (RLHF/RLAIF) and how it impacts model safety and alignment.

Culture & Benefits

  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
  • Comprehensive medical, dental, vision, life, and disability insurance.
  • Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
  • Financial support for family planning.
  • Paid time off when you need it and relocation support.
  • Lunch and dinner are provided daily.
