BELLS benchmark

active

An open-source benchmark suite developed by CeSIA to evaluate and compare supervision and safeguard systems for large language models, measuring how reliably they detect problematic or unsafe behaviour in the models they monitor.

Endorsed by+1

Endorsements support CeSIA.

People: no linked people

Theory of Change

By providing standardised, open-source evaluations of how well different guardrail and monitoring systems detect harmful or non-compliant model behaviour, BELLS aims to raise the bar for AI supervision tools. It also aims to inform regulators, labs, and safety institutes about which approaches best mitigate real-world risks from advanced language models.

Grants Received: no grants recorded
