Researcher, Automated Red Teaming (AI Safety)
Job description
TL;DR
Researcher, Automated Red Teaming (AI Safety): Build scalable, research-driven systems that continuously discover failure modes in AI models and their mitigations, with an emphasis on automated classifier jailbreak discovery and bio threat-development elicitation. The role focuses on designing reproducible, interpretable experiments, building scalable automation for red teaming, and translating findings into actionable improvements for AI safety.
Location: Onsite in San Francisco, US
Salary: $295,000–$445,000
Company
OpenAI is an AI research and deployment company focused on ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Own the research and technical direction for automated red teaming across catastrophic risk areas, with an initial emphasis on classifier jailbreak discovery and bio threat-development elicitation.
- Partner with vertical risk teams (Cyber, Bio, Loss of Control) to define threat models, prioritize targets, and land mitigations.
- Collaborate with the Classifiers team to convert discovered attacks into training data, evaluations, and measurable robustness gains.
- Engage with product, engineering, and safety stakeholders to ensure automated red teaming outputs are operationally useful.
Requirements
- Strong motivation for AI safety and reducing real-world catastrophic risk.
- Strong applied research instincts, especially in designing reproducible, interpretable evaluations.
- Hands-on experience with LLMs and agents, including multi-turn behaviors and tool use.
- Comfort building scalable automation that turns red-teaming ideas into continuous pipelines.
- Solid software engineering fundamentals (data structures, algorithms, testing discipline).
- Ability to think in threat models and incentives, anticipating attacker actions and system failure under pressure.
- Capacity to translate complex findings into actionable plans and drive alignment across research, engineering, product, and policy teams.
- Focus on efficiency and prioritization, including a willingness to decline low-leverage work to maximize risk reduction.
Nice to have
- Experience in adversarial ML.
- Experience in security research/red teaming.
- Experience in abuse prevention systems.
- Experience in large-scale evaluation infrastructure.
Culture & Benefits
- Dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
- Committed to developing AI with safety and human needs at its core.
- Values diverse perspectives, voices, and experiences.
- Equal opportunity employer with commitments to non-discrimination and reasonable accommodations for disabilities.