Technical AI Policy Researcher, Frontier Risk - Trust and Safety
San Francisco
Monday, 25 May 2026
- Design and maintain multimodal generative AI policies across safety-relevant domains, including frontier risk in agentic modalities, loss of control, or deceptive misalignment.
- Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.
- Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes.
- Build policy artifacts that support model training, evaluation, and deployment.
- Partner with safety researchers, engineers, product teams, and other stakeholders to operationalize policy into scalable model behavior and measurable safeguards.
- Design end-to-end workflows spanning policy, pre-launch evaluation, and post-launch monitoring across safety-relevant domains, including golden set construction, labeling guidance, calibration, adjudication, and eval coverage analysis, to ensure policies can be reliably measured and improved.
- Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality over time.
- Identify emerging capability areas where frontier AI systems could create new safety, fairness, or bias challenges or lower barriers to harm.
- Monitor post-launch model activity to identify gaps in how our policy framework captures unsafe model behavior.
- Champion research to strengthen the defensibility and operability of policy positions, including working with Outreach and Partnerships to incorporate external expert input into relevant policy positions.
- Combine longer-horizon safety research with hands-on launch and deployment work.
- Contribute to risk reports, policy documentation, launch reviews, and AI governance reviews on the company's approach to building AI responsibly.
- Support regulatory teams as a subject matter expert on AI compliance-related initiatives.
Requirements:

Minimum Qualifications:
- 5 years in Trust & Safety, AI Safety Research, AI Ethics, or technical AI Governance, or equivalent experience.
- Advanced degree in Computer Science, Human-Computer Interaction, Engineering, Data Science, or quantitative Social Sciences.
- Direct experience in frontier risk research, AI evaluations, red-teaming, or AI governance work.
- Strong technical understanding of LLM, multimodal, or generative media model behavior, model failure modes, and safety risks.
- Demonstrated experience working with external experts and stakeholders, including civil society, government, and academia.
- Demonstrated success working in a fast-paced technology company or research organization conducting AI impact assessments, risk assessments, or algorithmic audits, and/or related experience in data science or product development.
- Ability to advocate for safety among a wide variety of business stakeholders, including Product Policy, Engineering, Public Policy, Legal, Communications, and Data Science.

Preferred Qualifications:
- Technical knowledge of high-efficiency on-device architectures, multimodal understanding, verification and validation (V&V) of AI systems, or retrieval-augmented generation (RAG) is beneficial but not required.
- Experience working with governments, frontier AI companies, or AI Safety organizations.
- Experience working in non-Western cultures, with a particular emphasis on the Global South.
- Understanding of Trust & Safety positioning in the entertainment media technology sector, with comfort learning internal tools and product workflows.