A time‑bounded research program in Berkeley where participants worked with Redwood researchers on mechanistic interpretability of transformer models, using causal scrubbing and related tools to explain model behaviors.
Endorsements support Redwood Research.
People: no linked people
Updated 05/18/26 · By grantmaking.ai
Project Details
REMIX (the Redwood Research Mechanistic Interpretability Experiment) invited a cohort of researchers to Berkeley for an intensive program focused on mechanistic interpretability. Participants collaborated with Redwood staff to generate and test hypotheses about how transformer models implement specific behaviors, applying Redwood’s causal scrubbing methodology and related tools to build more reliable interpretability techniques for future safety work.
Grants Received: no grants recorded