
Public Datasets Available on Savio

We provide several large public datasets that are used by certain workflows and software packages. These datasets are available read-only in subdirectories of /global/scratch/collections.

The datasets include:

Dataset | Directory | Version | Date Downloaded | Details
BLAST | blastdb | 5.0 | 2023-03-24 | various datasets including 'nr' and 'nt'
ColabFold databases | colabdb | TBD | 2022-10-27 |
genome/RefSeq | genomes | TBD | 2023-04-11 |

For more details, please see the README files in the dataset-specific subdirectories of /global/scratch/collections.
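
As a rough illustration of how to locate those README files, the following Python sketch lists each dataset subdirectory and any README files found directly inside it. The function name find_readmes is hypothetical, and the README* file-name pattern is an assumption; the actual file names and their location within each subdirectory may differ.

```python
from pathlib import Path

# Collections root as given on this page.
COLLECTIONS_ROOT = Path("/global/scratch/collections")


def find_readmes(root=COLLECTIONS_ROOT):
    """Map each subdirectory name under root to the README files directly inside it."""
    root = Path(root)
    readmes = {}
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: READMEs are named README, README.md, README.txt, or similar.
        readmes[subdir.name] = sorted(
            f.name for f in subdir.glob("README*") if f.is_file()
        )
    return readmes


if __name__ == "__main__":
    if COLLECTIONS_ROOT.is_dir():
        for name, files in find_readmes().items():
            print(name, files if files else "(no top-level README found)")
    else:
        print(f"{COLLECTIONS_ROOT} not found; run this on a Savio node.")
```

The script only reads directory listings, so it works with the read-only permissions on the collections.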

You can request additional datasets (or an update to existing datasets) through our Software/Data Request Form.