circuit-tracer is an open-source library for finding, visualizing, and intervening on feature circuits in language models using sparse autoencoder and transcoder features, developed by Anthropic fellows and maintained in collaboration with Decode Research.
Endorsements support Decode Research.
People
Updated 05/18/26 · By grantmaking.ai
Primary maintainer and lead author
Project Details
Updated 05/18/26 · By grantmaking.ai
circuit-tracer is a collaborative mechanistic interpretability project that implements tools for discovering and analyzing feature circuits in large language models. Building on sparse autoencoders and transcoders, it identifies features that are causally responsible for particular outputs, assembles them into directed attribution graphs, and exposes APIs for visualizing and editing these circuits. The library was initially developed by Anthropic fellows and released as open source. It is maintained in close collaboration with Decode Research, which hosts the decoderesearch/circuit-tracer repository and provides a Neuronpedia integration so that researchers can run circuit tracing through an interactive web interface without local setup. circuit-tracer has been used in downstream work on circuit tracing in both language and vision-language models and is part of a broader ecosystem of interpretability tools around Neuronpedia and SAELens.
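To make the idea of an attribution graph concrete, here is a minimal toy sketch in plain Python. It does not use the circuit-tracer API. Every name in it (`Edge`, `build_attribution_edges`, `prune`, `target_input`, and the feature labels) is hypothetical and chosen for illustration only. It assumes a single linear layer in which a feature's direct effect on an output is its activation times a connecting weight, then shows how weak edges can be pruned and how ablating one feature changes the output.

```python
# Toy illustration only: all names are hypothetical, not the circuit-tracer API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    weight: float  # direct effect of source on target: activation * linear weight


def build_attribution_edges(activations, weights):
    """Build one edge per (source, target) pair with a linear connection.

    activations: {feature_name: activation value}
    weights: {(source, target): linear weight}
    """
    return [
        Edge(src, dst, activations[src] * w)
        for (src, dst), w in weights.items()
    ]


def prune(edges, threshold):
    """Keep only edges whose absolute attribution meets the threshold."""
    return [e for e in edges if abs(e.weight) >= threshold]


def target_input(activations, weights, target, ablated=()):
    """Sum of direct effects on target, with ablated features set to zero."""
    return sum(
        (0.0 if src in ablated else activations[src]) * w
        for (src, dst), w in weights.items()
        if dst == target
    )


# Made-up feature activations and weights for a single output logit.
activations = {"feat_capital": 2.0, "feat_texas": 1.5, "feat_noise": 0.1}
weights = {
    ("feat_capital", "logit_Austin"): 1.0,
    ("feat_texas", "logit_Austin"): 2.0,
    ("feat_noise", "logit_Austin"): -0.5,
}

edges = build_attribution_edges(activations, weights)
kept = prune(edges, threshold=0.5)

baseline = target_input(activations, weights, "logit_Austin")
without_texas = target_input(
    activations, weights, "logit_Austin", ablated={"feat_texas"}
)

for edge in kept:
    print(f"{edge.source} -> {edge.target}: {edge.weight:+.2f}")
print(f"effect of ablating feat_texas: {without_texas - baseline:+.2f}")
```

In this linear toy, the drop in the output after ablating a feature equals that feature's edge weight, which is what lets an attribution edge be read as a causal claim. Real models are nonlinear and multi-layer, so this sketch captures only the basic bookkeeping.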
Theory of Change
Updated 05/18/26 · By grantmaking.ai
circuit-tracer is based on the view that understanding model behavior requires moving beyond neuron-level saliency to structured, causal circuits over interpretable features. By making it easy for researchers to discover and inspect these feature circuits, run attribution analyses, and test targeted interventions, circuit-tracer aims to reveal how models implement behaviors such as multi-step reasoning or hallucination suppression. Used together with Neuronpedia and SAELens, the library is intended to help the alignment community build more reliable tools for auditing, debugging, and ultimately controlling advanced models.
Grants Received
Updated 05/18/26 · By grantmaking.ai
No grants recorded.