Updated 10 days ago

Researcher, Automated Red Teaming (AI Safety)

Salary: $295,000–$445,000
Work format: onsite
Employment type: full-time
Grade: senior
English: B2
Country: US

Job description

TL;DR

Researcher, Automated Red Teaming (AI Safety): Build scalable, research-driven systems that continuously discover failure modes in AI models and in their mitigations, with an emphasis on automated classifier jailbreak discovery and bio threat-development elicitation. The role centers on designing reproducible, interpretable experiments, building scalable red-teaming automation, and translating findings into actionable AI safety improvements.

Location: Onsite in San Francisco, US

Salary: $295,000–$445,000

Company

OpenAI is an AI research and deployment company focused on ensuring that general-purpose artificial intelligence benefits all of humanity.

What you will do

  • Own the research and technical direction for automated red teaming across catastrophic risk areas, with an initial emphasis on classifier jailbreak and bio threat-development elicitation.
  • Partner with vertical risk teams (Cyber, Bio, Loss of Control) to define threat models, prioritize targets, and land mitigations.
  • Collaborate with the Classifiers team to convert discovered attacks into training data, evaluations, and measurable robustness gains.
  • Engage with product, engineering, and safety stakeholders to ensure automated red teaming outputs are operationally useful.

Requirements

  • Strong motivation for AI safety and reducing real-world catastrophic risk.
  • Strong applied research instincts, especially in designing reproducible, interpretable evaluations.
  • Hands-on experience with LLMs and agents, including multi-turn behaviors and tool use.
  • Comfort with building scalable automation and turning red-teaming ideas into continuous pipelines.
  • Solid software engineering fundamentals (data structures, algorithms, testing discipline).
  • Ability to think in threat models and incentives, anticipating attacker actions and system failure under pressure.
  • Capacity to translate complex findings into actionable plans and drive alignment across research, engineering, product, and policy teams.
  • Focus on efficiency and prioritization, with a willingness to decline low-leverage work to maximize risk reduction.

Nice to have

  • Experience in adversarial ML.
  • Experience in security research/red teaming.
  • Experience in abuse prevention systems.
  • Experience in large-scale evaluation infrastructure.

Culture & Benefits

  • Dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
  • Committed to developing AI with safety and human needs at its core.
  • Values diverse perspectives, voices, and experiences.
  • Equal opportunity employer with commitments to non-discrimination and reasonable accommodations for disabilities.
