Model Policy, Frontier Cyber Risk (AI Cybersecurity)
Job description
TL;DR
Model Policy, Frontier Cyber Risk (AI Cybersecurity): Define model policies for high-risk cybersecurity contexts, with an emphasis on dual-use capabilities, threat models, and behavioral specifications. Focus on designing evaluation criteria, system mitigations, and safeguards across training, deployment, and monitoring.
Location: Based in the San Francisco office on a hybrid model (three days in the office per week, with optional work from home on Thursdays and Fridays). Relocation support is offered to new employees.
Salary: $207K – $295K + equity
Company
AI research and deployment company dedicated to safe AGI benefiting humanity.
What you will do
- Design and maintain model policies for cybersecurity and frontier-risk domains, focusing on dual-use and high-risk capabilities.
- Translate threat models into behavioral specifications, evaluation criteria, grading guidance, and mitigations.
- Define boundaries between legitimate security research, defensive workflows, and harmful assistance.
- Build policy artifacts for implementation in training, evaluation, deployment, monitoring, and escalation systems.
- Partner with research, engineering, safety, and product teams to operationalize policies into scalable safeguards.
- Analyze red-teaming, deployment data, failures, and edge cases to iterate on policies and evaluations.
- Identify emerging cyber risks where AI lowers misuse barriers or boosts malicious capabilities.
- Contribute to system cards, safety reports, and external communications on cyber risk mitigation.
Requirements
- Strong technical expertise in cybersecurity (offensive/defensive security, vulnerability research, malware analysis, incident response, threat intelligence, app/infra/cloud security).
- Strong judgment about how AI affects the cyber threat landscape, including dual-use and agentic risks.
- Ability to distinguish legitimate security uses from harmful assistance.
- Experience with threat models in complex, adversarial environments.
- Ability to translate security expertise into policy frameworks, evaluations, and enforcement.
- Ability to use empirical evidence (evaluations, red-teaming, failures) to inform policy decisions.
- Systems thinking across policy, evaluations, training, deployment, and monitoring.
- Cross-functional collaboration and strong written communication on technical concepts.
Culture & Benefits
- Hybrid workplace with height-adjustable desks, well-stocked kitchens, three daily meals, outdoor space, nap rooms, bike storage.
- Pragmatic safety approach balancing risk reduction with beneficial AI uses.
- Equal opportunity employer committed to diversity and accommodations for disabilities.
- Background checks per applicable law, considering fair chance ordinances.