Public Datasets Available on Savio

We provide several large public datasets used by certain workflows and software packages. These datasets are available on a read-only basis in subdirectories of /global/scratch/collections.

The datasets include:

| Dataset | Directory | Version | Update/Download | Details |
|---|---|---|---|---|
| BLAST | blastdb | 5.0 | monthly | various datasets including 'nr' and 'nt' |
| ColabFold databases | colabdb | 1.5.2 | every 3 months | UniRef30, BFD/MGnify, ColabFold DB |
| genome/RefSeq | genomesdb | Release 218 | yearly | fungi, invertebrate, plant, vertebrate_mammalian, vertebrate_other |
| AlphaFold | alphafolddb | 2.3.0 | every 3 months | BFD, MGnify, PDB70, PDB, PDB seqres, UniRef30, UniProt, UniRef90 |

For more details, please see the README files in the database-specific subdirectories of /global/scratch/collections.
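
As a convenience, the README files can be located with a short script. The following is a minimal sketch, not an official tool: the function name `find_readmes` is hypothetical, and it assumes the README files sit directly inside each dataset subdirectory with a name beginning with "README" (in any letter case). If the collections path is not mounted, it returns an empty result.

```python
from pathlib import Path

# Collections root as given in the text above.
COLLECTIONS_ROOT = Path("/global/scratch/collections")


def find_readmes(root: Path = COLLECTIONS_ROOT) -> dict[str, list[Path]]:
    """Map each dataset subdirectory name to the README files directly inside it."""
    readmes: dict[str, list[Path]] = {}
    if not root.is_dir():
        # Not on the cluster, or the path is not mounted: nothing to report.
        return readmes
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: README files are named README, README.md, readme.txt, etc.
        found = sorted(
            f for f in subdir.iterdir()
            if f.is_file() and f.name.lower().startswith("readme")
        )
        readmes[subdir.name] = found
    return readmes


if __name__ == "__main__":
    for name, files in find_readmes().items():
        listed = ", ".join(str(f) for f in files) or "(no README found)"
        print(f"{name}: {listed}")
```

Run on a login or compute node, this prints one line per dataset subdirectory. Because the collections are read-only, the script only lists files and never writes to them.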

AlphaFold 3 Model Parameters

The AlphaFold 3 trained model parameters are available free of charge for non-commercial use, in accordance with the AlphaFold 3 Model Parameters Terms of Use. You may only use the model parameters if you received them directly from Google. To request access to the AlphaFold 3 trained model parameters, please complete the linked request form. You must provide accurate and up-to-date information.

You can request additional datasets (or an update to existing datasets) through our Software/Data Request Form.