
Storing Data


By default, each user on Savio is entitled to a 10 GB home directory, which receives regular backups. In addition, each research group using a Faculty Computing Allowance (FCA) receives 30 GB of project space, and each Condo research group receives 200 GB, to hold research-specific application software shared among the group's users. All users also have access to the large Savio high-performance scratch filesystem for working with non-persistent data.

Name | Location | Quota | Backup | Allocation | Description
HOME | /global/home/users/ | 10 GB | Yes | Per User | HOME directory for permanent data
GROUP | /global/home/groups/ | 30/200 GB | No | Per Group | GROUP directory for shared data (30 GB for FCA, 200 GB for Condo)
SCRATCH | /global/scratch/users | none | No | Per User | SCRATCH directory with Lustre FS. See below for details of purge policy.

For information on making your files accessible to other users (in particular members of your group), see these instructions.

Large input/output files and other files used intensively while running jobs should be placed in your scratch directory (rather than your home directory or a project directory) to avoid slowing down access to home directories for other users.

Scratch storage

Every Savio user has a scratch space located at /global/scratch/users/<username>. There are no limits on the use of the scratch space, though it is a 1.5 PB resource shared among all users.

Files stored in this space are not backed up; the recommended use cases for scratch storage are:

  • Working space for your running jobs (temporary files, checkpoint data, etc.)
  • High performance input and output
  • Large input/output files

Purge Policy

Please remember that global scratch storage is a shared resource. We strongly urge users to clean up their data in global scratch regularly to reduce scratch storage usage. Users with many terabytes of data may be asked to reduce their usage if global scratch becomes full.
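As a starting point for cleanup, it can help to see which parts of a scratch directory hold the most data. The following is a minimal sketch, not a Research IT tool: the function name `directory_usage` is hypothetical, and it sums apparent file sizes (`st_size`), which may differ from the space Lustre actually allocates.

```python
import os


def directory_usage(root):
    """Return {entry_name: total_bytes} for each top-level entry under root.

    Sizes are apparent file sizes (st_size), not allocated disk blocks.
    Symbolic links are not followed.
    """
    usage = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    try:
                        total += os.lstat(os.path.join(dirpath, name)).st_size
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
            usage[entry.name] = total
        else:
            usage[entry.name] = entry.stat(follow_symlinks=False).st_size
    return usage
```

For example, `sorted(directory_usage("/global/scratch/users/<username>").items(), key=lambda kv: kv[1], reverse=True)` would list the largest entries first. Walking a very large tree generates many metadata requests, so it is worth running this sparingly.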

The following purge policy is effective August 12, 2021 for the newly upgraded Scratch storage system.

  • Research IT will purge files that have not been accessed in 120 days
  • Research IT will run a weekly check to identify files that have not been accessed in the last 120 days and determine which files to delete
  • Research IT will notify users which files will be purged via a file in their user directory
  • As the system fills, a more aggressive purge policy may be required to maintain system functionality
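To anticipate which of your files might fall under the 120-day rule, you could scan your scratch directory yourself. The sketch below assumes that "not accessed" corresponds to the file's access time (`st_atime`); the source does not state exactly how Research IT measures access, so treat the result as an estimate. The names `find_stale_files` and `PURGE_AGE_DAYS` are hypothetical.

```python
import os
import sys
import time

PURGE_AGE_DAYS = 120  # threshold stated in the purge policy


def find_stale_files(root, days=PURGE_AGE_DAYS, now=None):
    """Yield (path, idle_days) for files under root whose access time
    is more than `days` days before `now` (default: current time)."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reads metadata only, so it does not update atime
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                yield path, (now - atime) / 86400


# Runs only when a directory is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for path, idle in find_stale_files(sys.argv[1]):
        print(f"{idle:7.1f} days  {path}")
```

Saved under a name of your choosing, the script could be run with your scratch directory as its only argument. Passing a smaller `days` value gives earlier warning of files approaching the threshold.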

If you need to retain access to data on the cluster for more than 120 days between uses, you can consider purchasing storage space through the condo storage program.

Group scratch directories are not provided. Users who would like to share materials in their scratch directory with other users can set UNIX permissions to allow access to their directories/files.
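One way to set such permissions is sketched below using standard POSIX mode bits. It assumes the people you want to share with belong to the UNIX group that owns the file or directory, which the source does not confirm for scratch directories; the function name `share_with_group` is hypothetical.

```python
import os
import stat


def share_with_group(path, writable=False):
    """Add group read access to path, keeping all existing permissions.

    Directories also get group execute (traverse) permission, which is
    required to enter them. Returns the new permission bits.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    mode |= stat.S_IRGRP
    if os.path.isdir(path):
        mode |= stat.S_IXGRP  # needed to enter or list a directory
    if writable:
        mode |= stat.S_IWGRP
    os.chmod(path, mode)
    return mode
```

Calling `share_with_group` on a directory that currently has mode 700 would leave it at 750. Every parent directory on the path must also allow the group to traverse it, and the function changes only the one path it is given, so files inside a shared directory need their own call.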

Savio condo storage

Berkeley Research Computing (BRC) offers a Condo Storage service for researchers who are Savio Condo Cluster contributors and need additional persistent storage to hold their data sets while using the Savio cluster.

More storage options

If you need additional storage during the active phase of your research, such as longer-term storage to augment Savio's temporary Scratch storage, or off-premises storage for backing up data, the Active Research Data Storage Guidance Grid can help you identify suitable options.

Assistance with research data management

The campus's Research Data Management (RDM) service offers consulting on managing your research data, which includes the design or improvement of data transfer workflows, selection of storage solutions, and a great deal more. This service is available at no cost to campus researchers. To get started with a consult, please contact RDM Consulting.

In addition, you can find both high-level guidance on research data management topics and deeper dives into specific subjects on RDM's website. (Visit the links in the left-hand sidebar to further explore each of the site's main topics.)

CGRL storage

The following storage systems are available to CGRL users. When running jobs, the compute nodes within a cluster can directly access only the storage listed for that cluster below. The DTN can be used to transfer data between locations that are accessible to only one cluster or the other, as detailed in the previous section.

Name | Cluster | Location | Quota | Backup | Allocation | Description
Home | Both | /global/home/users/$USER | 10 GB | Yes | Per User | Home directory ($HOME) for permanent data
Scratch | Vector | /clusterfs/vector/scratch/$USER | none | No | Per User | Short-term, large-scale storage for computing
Group | Vector | /clusterfs/vector/instrumentData/ | 300 GB | No | Per Group | Group-shared storage for computing
Scratch | Rosalind (Savio) | /global/scratch/users/$USER | none | No | Per User | Short-term, large-scale Lustre storage for very high-performance computing
Condo User | Rosalind (Savio) | /clusterfs/rosalind/users/$USER | none | No | Per User | Long-term, large-scale user storage
Condo Group | Rosalind (Savio) | /clusterfs/rosalind/groups/ | none | No | Per Group | Long-term, large-scale group-shared storage