Collaboration with the UK AI Security Institute’s Alignment Project to study AI models’ test awareness and develop methods to detect and reduce evaluation gaming and security risks in advanced AI systems.
Endorsements support ELLIS Institute Tübingen.
People – no linked people
Updated 05/18/26 · By grantmaking.ai
Grants Received – no grants recorded