Public Datasets Available on Savio

We provide several large public datasets that certain workflows and software packages depend on. These datasets are available on a read-only basis in subdirectories of /global/scratch/collections.

The datasets include:

Dataset | Directory | Version | Update/Download | Details
BLAST | blastdb | 5.0 | monthly | various datasets including 'nr' and 'nt'
ColabFold databases | colabdb | 1.5.2 | every 3 months | UniRef30, BFD/MGnify, ColabFold DB
genome/RefSeq | genomesdb | Release 218 | yearly | fungi, invertebrate, plant, vertebrate_mammalian, vertebrate_other
AlphaFold | alphafolddb | 2.3.0 | every 3 months | BFD, MGnify, PDB70, PDB, PDB seqres, UniProt, UniRef30, UniRef90

For more details, please see the README files in the database-specific subdirectories of /global/scratch/collections.
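
As a rough illustration of locating those README files, the following Python sketch lists each dataset subdirectory and any README-like files found directly inside it. The function name find_readmes is hypothetical, and the sketch assumes that README files sit at the top level of each subdirectory and have names beginning with "README"; the actual naming and layout on the cluster may differ.

```python
from pathlib import Path

# Path taken from the text above; it exists only on the cluster.
COLLECTIONS_ROOT = Path("/global/scratch/collections")


def find_readmes(root):
    """Map each dataset subdirectory name to the README-like files directly inside it."""
    root = Path(root)
    readmes = {}
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: README files live at the top level of each subdirectory
        # and have names starting with "README" (any case, any extension).
        readmes[subdir.name] = sorted(
            f.name
            for f in subdir.iterdir()
            if f.is_file() and f.name.upper().startswith("README")
        )
    return readmes


if __name__ == "__main__":
    if COLLECTIONS_ROOT.is_dir():
        for dataset, files in find_readmes(COLLECTIONS_ROOT).items():
            print(dataset, files or "(no README found)")
    else:
        print(f"{COLLECTIONS_ROOT} not found; run this on the cluster.")
```

Because the collections are read-only, the sketch only lists and inspects file names; it never writes to the directory.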

You can request additional datasets (or an update to existing datasets) through our Software/Data Request Form.