Accounts for Sensitive Data
Summary
As of 2019, Savio users can set up projects to work with moderately sensitive data. Moderately sensitive data includes data in the P2 and P3 security classifications, as well as NIH dbGaP data, as defined by UC Policy (IS-3) and documented by the campus Information Security Office. Moderately sensitive data used to be described as “PL1” data. For more information on UC Berkeley data classification protection levels, please see here.
Note that researchers must first have one of the regular Savio accounts. The P2/P3 storage will be associated with a Faculty Computing Allowance (FCA), Instructional Computing Allowance (ICA), or Condo computing allocation. Each project will be provided a separate group directory, and project users will be provided with a scratch directory where sensitive data can be stored.
Support for P2/P3 data in Savio is an important step towards broader support for secure and sensitive data, as part of our Secure Research Data and Compute (SRDC) initiative.
Steps for Sensitive Data
Here are the steps involved in setting up a project for working with sensitive data:
- Researchers can request a P2/P3 project once they have a regular FCA or Condo account. (For information about creating these accounts, see these instructions.)
- If you do not already have a Savio account, you must first request access to a Savio FCA (Faculty Computing Allowance) or Condo allocation through the MyBRC User Portal before you can request a P2/P3 Savio environment. After registering and/or logging in to the portal, review and submit the cluster User Access Agreement form on the Home ("Welcome") page by clicking the "Review" button (if you have not already done so), then click the "Join" button to request to join an existing project. The MyBRC User Portal has replaced the Google forms that were previously used for BRC account and project requests. If a researcher does not already have a BRC cluster account, one will be created for them when they submit a request to join a project for the first time and when a PI adds them to a project within the MyBRC User Portal. PIs can also use the portal to submit account requests for non-UCB users associated with their project who have MyBRC portal accounts. Users who want access to an additional project (FCA or Condo) beyond their current project(s) can also request it through the portal.
- The research group should consult with Research Data Management (RDM) to determine whether Savio is an appropriate service based on the sensitivity of the data and computational needs. Send email to researchdata@berkeley.edu to start this process. This may be initiated by the PI or someone else in the PI's group.
- Researchers can now request the creation of new secure P2/P3 directories via the MyBRC User Portal, which has replaced the Savio P2/P3 Project Request Google form.
- PIs (but not project managers or regular users) of active FCA, ICA, and Condo projects will see a “Request a Secure Directory” button on their MyBRC User Portal project page (unless they already have a secure directory under the project). Clicking this button directs them to a “New Secure Directories” landing page. After carefully reviewing the content of this page, the PI should click “Continue”.
- The subsequent form consists of two steps: a “Secure Directory: Data Description” section and a “Secure Directory: Directory Name” section. In the “Data Description” section, the PI is asked to describe the kind of P2/P3 data the research group plans to work with on Savio. Please include: (1) a description of the dataset(s); (2) the source of the dataset(s); (3) the security and compliance requirements for the dataset(s); (4) the number and sizes of files; and (5) the anticipated duration of usage of the dataset(s) on Savio. In the “Directory Name” section, the PI is asked to provide the name of the secure directory they are requesting. For example, to request the secure directories /global/scratch/p2p3/pl1_example and /global/home/groups/pl1data/pl1_example, the PI would provide “example”. Note that the PI will receive both a groups directory and a scratch directory upon approval of this request. Both directories will have the same name on the cluster.
- If the existing Savio project is a pooled allocation and not all PIs/faculty in the pool are working with the P2/P3 data, a new project will be set up for the P2/P3 data, and a name of 4 to 8 characters must be specified for it.
- A Research IT consultant from RDM or BRC will contact the requestor to acknowledge that the form has been received and to ask any questions needed to determine whether Savio is the appropriate data management and computation platform.
- The PI will be asked to review and sign a Researcher Use Agreement (RUA) that outlines the PI’s responsibilities for using Savio with their sensitive research data. As of February 2024, the RUA is generated automatically after the secure directory request is submitted via the MyBRC portal. Administrators first review, confirm, and (if needed) edit the details of the request. Once this process is completed, the PI is notified and an unsigned RUA is generated from the (possibly edited) request details. The PI can download the RUA by clicking the "Download" button on the MyBRC portal secure directory request page. After reviewing and signing it, the PI uploads the signed RUA for administrator review by clicking the "Upload" button on the same page and selecting the signed form from their computer.
- Research IT staff review these agreements with the PI. Both the PI and a responsible Research IT party sign and submit the RUA as described in the previous step.
- Once approved, RIT/BRC staff will set up the appropriate storage locations:
- For Faculty Computing Allowance accounts, each P2/P3 project will get a group folder in /global/home/groups/pl1data/ on the home directory server, with a 30 GB quota, and each user will be given access to it.
- If the PI is a Condo owner (has contributed compute nodes to Savio), they will be given access to a 200 GB P2/P3 group folder under /global/home/groups/pl1data.
- Each P2/P3 user also gets access to a directory in the P2/P3 scratch space located at /global/scratch/p2p3/. Note that this is separate from the scratch directory set up for their non-sensitive data (located at /global/scratch/users).
- Note that there are restrictions on how these directories must be used. For example, NIH dbGaP data stored in the P2/P3 group/project directories must be encrypted. Sensitive data can only be stored unencrypted in the associated scratch directory. More information about the handling and use of encrypted data, including NIH data, can be found in the Researcher Use Agreement, in the document NIH Active Research workflow for NIH data for Savio, and in our documentation on sensitive data here.
- PIs can now also use the MyBRC User Portal to request that users be added to or removed from P2/P3 projects, in order to specify which users should be members of the group and have access to the resources.
- To add a user to (or remove a user from) an existing P2/P3 project, and so grant (or revoke) access to its P2/P3 group and scratch directories, the PI first selects the existing secure directory: go to the “Projects” tab on the MyBRC portal, select “Allocations”, look for “Group P2/P3 Directory (Cluster Directory)” and “Scratch P2/P3 Directory (Cluster Directory)” in the “Resource Name” column under “Filter”, and select the needed Allocation ID. Then, on the “Allocation Detail” page, the PI can click the “Add Users” or “Remove Users” button next to the “Users in Allocation” section to add or remove users who have access to the P2/P3 group and scratch directories.
- Once a request is submitted on the MyBRC portal, BRC Savio administrators will process it and provide updates. When a user is added, the user account is added to the file permissions for the group directory/folder where the P2/P3 data is stored. BRC Savio administrators will confirm that the PI/group has consulted with RDM and that a signed RUA has been submitted, and will confirm approval of requests via email before provisioning account access to the restricted folder.
- All P2/P3 projects are set up with a special Unix group. P2/P3 users should use this Unix group and set the directory group ownership and permissions appropriately to limit access to P2/P3 datasets and files to relevant users only.
- When a request is submitted via the MyBRC Portal to remove a user's access to P2/P3 group and scratch directories (for cases where the user is no longer a member of the project team using the covered system), the BRC admin team will deactivate the Savio accounts of those who are no longer active users.
- Principal Investigators are responsible for monitoring account access to P2/P3 data within the Savio environment. The Savio team will provide PIs with Linux command line syntax for checking group directory permissions to verify which user accounts have access. If changes are necessary, the PI will submit a request via the MyBRC Portal (see above) to BRC administrators to add or remove account access. These responsibilities are delineated in the Researcher Use Agreement (RUA) for using P2/P3 data in the Savio HPC environment.
- RIT will conduct a semi-annual email-based confirmation of active users.
- The PI and group are notified by RIT/BRC staff that everything is set up.
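The directory naming described in the “Directory Name” step can be sketched in a few lines of Python. This is only an illustration: the two root paths and the `pl1_` prefix are inferred from the single example given above (“example” yielding /global/home/groups/pl1data/pl1_example and /global/scratch/p2p3/pl1_example), and the function name and its basic validation are hypothetical, not part of the MyBRC portal.

```python
from pathlib import PurePosixPath

# Roots and prefix inferred from the example in the steps above (assumption).
GROUPS_ROOT = PurePosixPath("/global/home/groups/pl1data")
SCRATCH_ROOT = PurePosixPath("/global/scratch/p2p3")
PREFIX = "pl1_"


def secure_directory_paths(name):
    """Return the expected group and scratch paths for a requested name.

    Hypothetical helper: it only mirrors the naming pattern shown in the
    documentation and does not contact the cluster or the portal.
    """
    # Illustrative sanity check: reject empty names and path separators.
    if not name or "/" in name:
        raise ValueError(f"invalid directory name: {name!r}")
    leaf = PREFIX + name
    return {
        "groups": str(GROUPS_ROOT / leaf),
        "scratch": str(SCRATCH_ROOT / leaf),
    }


if __name__ == "__main__":
    for kind, path in secure_directory_paths("example").items():
        print(f"{kind}: {path}")
```

Both entries share the same final path component, which matches the note above that the groups and scratch directories have the same name on the cluster.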
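For the access-monitoring responsibility described above, a PI could inspect the group ownership and permission bits of a secure directory along the following lines. This is a minimal sketch using only the Python standard library on Linux. It is not the command-line syntax that the Savio team provides, the function name `describe_access` is hypothetical, and any path passed to it is supplied by the user. Note also that the member list comes from the group database and may omit users whose primary group is the project group.

```python
import grp
import os
import stat
import sys


def describe_access(path):
    """Summarize group ownership and permission bits for one directory."""
    info = os.stat(path)
    mode = info.st_mode
    try:
        group = grp.getgrgid(info.st_gid)
        group_name, members = group.gr_name, sorted(group.gr_mem)
    except KeyError:
        # The numeric group ID has no entry in the group database.
        group_name, members = str(info.st_gid), []
    return {
        "path": path,
        "group": group_name,
        "members": members,  # supplementary members only
        "mode": stat.filemode(mode),  # e.g. "drwxrwx---"
        "group_can_read": bool(mode & stat.S_IRGRP),
        "group_can_write": bool(mode & stat.S_IWGRP),
        # True if users outside the owner and group have any access at all.
        "world_accessible": bool(mode & stat.S_IRWXO),
    }


if __name__ == "__main__":
    # Usage: python check_access.py <directory> [<directory> ...]
    for directory in sys.argv[1:]:
        for key, value in describe_access(directory).items():
            print(f"{key}: {value}")
```

A directory holding P2/P3 data would normally be expected to report `world_accessible` as false. If it does not, or if the member list contains unexpected accounts, the PI would follow the add/remove request process described above.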