Unlearning in genomic and protein language models

Raising $20,000 (active)

Implementing several unlearning methods for genomic and protein language models to remove sensitive biological information (e.g., pathogen virulence) while preserving predictive performance and scientific utility.

Project Details

This project will develop novel methods for removing sensitive information from genomic and protein language models without retraining them from scratch. The methods will focus on ESM3 and Evo 2 but are intended to transfer to other genomic and protein language models. We will study how to identify and remove targeted biological knowledge from these models while keeping their broader capabilities intact, and will measure the effect on performance across a range of benchmarks.
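To make the forget/retain trade-off concrete, here is a minimal toy sketch in Python. It is not the project's method. A two-weight linear model stands in for a language model, and the unlearning step shown is one common family (fine-tuning the forget set toward an uninformative target while rehearsing the retain set). Every name and number in it (`unlearn`, `retain`, `forget`, the neutral target of 0) is a hypothetical chosen for illustration.

```python
# Toy illustration only: a 2-weight linear model stands in for a language model.
# All names, data and hyperparameters are hypothetical, not taken from the project.

def predict(w, x):
    """Linear prediction w . x for a 2-dimensional input."""
    return w[0] * x[0] + w[1] * x[1]

def mse(w, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def mse_grad(w, data):
    """Gradient of the mean squared error with respect to the two weights."""
    g = [0.0, 0.0]
    for x, y in data:
        residual = predict(w, x) - y
        g[0] += 2 * residual * x[0] / len(data)
        g[1] += 2 * residual * x[1] / len(data)
    return g

def unlearn(w, retain, forget, neutral=0.0, lr=0.05, steps=300):
    """Gradient descent on: retain loss (true labels) + forget loss (neutral labels).

    The forget examples are relabelled with an uninformative target, so the
    model is pushed away from its original answers on them while the retain
    term keeps its answers on everything else in place.
    """
    forget_neutral = [(x, neutral) for x, _ in forget]
    w = list(w)
    for _ in range(steps):
        g_retain = mse_grad(w, retain)
        g_forget = mse_grad(w, forget_neutral)
        w = [wi - lr * (gr + gf) for wi, gr, gf in zip(w, g_retain, g_forget)]
    return w

# "Useful" knowledge depends on feature 0 only; "sensitive" knowledge also uses feature 1.
retain = [((1.0, 0.0), 2.0), ((2.0, 0.0), 4.0)]
forget = [((1.0, 1.0), 5.0), ((2.0, 2.0), 10.0)]

pretrained = [2.0, 3.0]  # fits both sets exactly
unlearned = unlearn(pretrained, retain, forget)

print("retain MSE before/after:", mse(pretrained, retain), mse(unlearned, retain))
print("forget MSE before/after:", mse(pretrained, forget), mse(unlearned, forget))
```

In this toy the retain error stays near zero while the error on the original forget labels rises sharply, which is the pattern a benchmark comparison would look for. Real genomic and protein language models entangle knowledge far more than two weights can, so separating the two sets cleanly is the hard part.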

The work will bring together three researchers with expertise in AI, genomics and biosafety: Dr. Georgakopoulos-Soares, Mr. Aris Karatzikos and Mr. Kimon Provatas. The team will design unlearning algorithms, test them on genomic and protein language models, and evaluate whether the models can successfully "forget" capabilities related to viral and bacterial virulence, transmissibility and toxicity without losing performance on useful downstream tasks.

The deliverables will include open-source code for unlearning on biological data and a set of case studies showing how unlearning can improve the safety of different genomic and protein language models.

Theory of Impact

Our proposed project reduces AI-related existential risk by lowering the likelihood that an individual or group uses genomic and protein language models for bioterrorist attacks. The software tools we develop will be designed to integrate with genomic and protein language models, and we hope that developers of such models will incorporate our unlearning approaches.

People

Ilias Georgakopoulos-Soares

Team Member

Grants Received: no grants recorded

Funding Asks

grantmaking.ai Launch Round

Applied

Minimum

$20,000

Ideal

$30,000

How the money will be spent

The money will be spent on the salaries of the three researchers, paid through our non-profit (https://biosafelabs.org/), and on computing. The work will be performed independently of our work at UT Austin, in our personal time, which will also help push our non-profit (BiosafeLabs) towards sustained, organic growth.
