Redwood Research Mechanistic Interpretability Experiment (REMIX)

active

A time‑bounded research program in Berkeley where participants worked with Redwood researchers on mechanistic interpretability of transformer models, using causal scrubbing and related tools to explain model behaviors.

Endorsements support Redwood Research.

People: no linked people

Project Details

REMIX (the Redwood Research Mechanistic Interpretability Experiment) invited a cohort of researchers to Berkeley for an intensive program focused on mechanistic interpretability. Participants collaborated with Redwood staff to generate and test hypotheses about how transformer models implement specific behaviors, applying Redwood’s causal scrubbing methodology and related tools to build more reliable interpretability techniques for future safety work.

Grants Received: no grants recorded
