Harmful biases in large language models (LLMs) make AI less trustworthy and less secure. Auditing LLMs for bias can help identify potential solutions and inform better guardrails that make AI safer. In this podcast, Katie Robinson and Violet Turri, researchers in the SEI’s AI Division, discuss their recent work using role-playing game scenarios to identify biases in LLMs.
Publisher Statement
Audiovisual published 2024 via Software Engineering Institute, Carnegie Mellon University