Storing Data
Summary
By default, each Savio user is entitled to a 10 GB home directory that receives regular backups. In addition, each research group using a Faculty Computing Allowance (FCA) receives 30 GB of project space, and each Condo-using research group receives 200 GB of project space, to hold research-specific application software shared among the group's users. All users also have access to Savio's large, high-performance scratch filesystem for working with non-persistent data.
Name | Location | Quota | Backup | Allocation | Description |
---|---|---|---|---|---|
HOME | /global/home/users/ | 10 GB | Yes | Per User | HOME directory for permanent data |
GROUP | /global/home/groups/ | 30/200 GB | No | Per Group | GROUP directory for shared data (30 GB for FCA, 200 GB for Condo) |
SCRATCH | /global/scratch/ | none | No | Per User | SCRATCH directory on the Lustre filesystem. Files not accessed in 6 months are subject to deletion under the purge policy |
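These filesystems do not display your usage automatically; one simple way to check how much space you are occupying is `du` (a minimal sketch: the group directory name below is a placeholder, and `du` reports space used rather than the quota limit itself):

```bash
# Space used in your home, group, and scratch directories
# ("mygroup" is an illustrative placeholder for your group's directory name)
du -sh /global/home/users/$USER
du -sh /global/home/groups/mygroup
du -sh /global/scratch/$USER
```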
For information on making your files accessible to other users (in particular members of your group), see these instructions.
Large input/output files and other files used intensively while running jobs should be placed in your scratch directory (rather than your home directory or a project directory) to avoid slowing down access to home directories for other users.
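For example, a common pattern (a sketch, not an official workflow; paths and filenames are placeholders) is to stage large inputs into scratch before submitting a job and read and write them there:

```bash
# Stage a large input file from home into scratch before running a job
mkdir -p /global/scratch/$USER/myproject
cp ~/inputs/large_dataset.dat /global/scratch/$USER/myproject/

# Have the job read and write the scratch copy rather than files in your home directory
cd /global/scratch/$USER/myproject
```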
Scratch storage
Every Savio user has a scratch space located at /global/scratch/<username>. There are no limits on the use of the scratch space, though it is a 1.5 PB resource shared among all users.
Files stored in this space are not backed up; the recommended use cases for scratch storage are:
- Working space for your running jobs (temporary files, checkpoint data, etc.)
- High performance input and output
- Large input/output files
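As a rough illustration of the first use case, a job script might create a per-job working directory in scratch for temporary and checkpoint files (a minimal sketch: the SLURM directives and program name are placeholders, not a prescribed template):

```bash
#!/bin/bash
#SBATCH --job-name=example   # illustrative directives; set account, partition, and time as appropriate
#SBATCH --time=01:00:00

# Per-job working directory in scratch for temporary and checkpoint files
WORKDIR=/global/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$WORKDIR"
cd "$WORKDIR"

# Run your application (placeholder command), writing checkpoints into $WORKDIR
./my_simulation --checkpoint-dir "$WORKDIR"
```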
Important
Please remember that global scratch storage is a shared resource. We strongly urge users to regularly clean up data they no longer need in global scratch. Users with many terabytes of data may be asked to reduce their usage if global scratch becomes full.
BRC has an ongoing inactive data purge policy for the scratch area. Any file that has not been accessed in at least 6 months will be subject to deletion. If you need to retain access to data on the cluster for more than 6 months between uses, you can consider purchasing storage space through the condo storage program.
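To see which of your scratch files might fall under this policy, one approach (a sketch; 6 months is approximated here as 180 days) is to list files by access time with `find`:

```bash
# List files in your scratch space not accessed in roughly 6 months (180 days)
find /global/scratch/$USER -type f -atime +180

# After reviewing the list, remove files you no longer need, e.g.:
# find /global/scratch/$USER -type f -atime +180 -delete
```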
Group scratch directories are not provided. Users who would like to share materials in their scratch directory with other users can set UNIX permissions to allow access to their directories/files.
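A minimal sketch of doing so with standard UNIX permissions (the shared directory name is illustrative; your group may prefer finer-grained ACLs via `setfacl` if enabled on the filesystem):

```bash
# Allow other users to traverse your top-level scratch directory
chmod o+x /global/scratch/$USER

# Grant read and traverse access to the specific subdirectory you want to share
# ("shared_results" is an illustrative name)
chmod -R o+rX /global/scratch/$USER/shared_results
```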
Savio condo storage
Berkeley Research Computing (BRC) offers a Condo Storage service for researchers who are Savio Condo Cluster contributors and need additional persistent storage to hold their data sets while using the Savio cluster.
More storage options
If you need additional storage during the active phase of your research, such as longer-term storage to augment Savio's temporary Scratch storage, or off-premises storage for backing up data, the Active Research Data Storage Guidance Grid can help you identify suitable options.
Assistance with research data management
The campus's Research Data Management (RDM) service offers consulting on managing your research data, which includes the design or improvement of data transfer workflows, selection of storage solutions, and a great deal more. This service is available at no cost to campus researchers. To get started with a consult, please contact RDM Consulting.
In addition, you can find both high-level guidance on research data management topics and deeper dives into specific subjects on RDM's website. (Visit the links in the left-hand sidebar to further explore each of the site's main topics.)
CGRL storage
The following storage systems are available to CGRL users. Compute nodes within a cluster can directly access only the storage listed for that cluster below. The DTN can be used to transfer data between locations that are accessible from only one cluster or the other, as detailed in the previous section; a sketch of such a transfer appears after the table.
Name | Cluster | Location | Quota | Backup | Allocation | Description |
---|---|---|---|---|---|---|
Home | Both | /global/home/users/$USER | 10 GB | Yes | Per User | Home directory ($HOME) for permanent data |
Scratch | Vector | /clusterfs/vector/scratch/$USER | none | No | Per User | Short-term, large-scale storage for computing |
Group | Vector | /clusterfs/vector/instrumentData/ | 300 GB | No | Per Group | Group-shared storage for computing |
Scratch | Rosalind (Savio) | /global/scratch/$USER | none | No | Per User | Short-term, large-scale Lustre storage for very high-performance computing |
Condo User | Rosalind (Savio) | /clusterfs/rosalind/users/$USER | none | No | Per User | Long-term, large-scale user storage |
Condo Group | Rosalind (Savio) | /clusterfs/rosalind/groups/ | none | No | Per Group | Long-term, large-scale group-shared storage |
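As an illustration of such a transfer, data visible only to Vector can be copied to Rosalind (Savio) scratch from the DTN, which can access both locations (a sketch: confirm the DTN hostname in the data-transfer documentation, and the directory name is a placeholder):

```bash
# 1. Log in to the DTN, which can access storage for both clusters
ssh $USER@dtn.brc.berkeley.edu

# 2. On the DTN, copy data from Vector scratch to Rosalind (Savio) scratch
rsync -av /clusterfs/vector/scratch/$USER/mydata/ /global/scratch/$USER/mydata/
```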