Globus at UC Berkeley¶
Globus is a data transfer and storage service that lets you move data quickly and easily between different resources (e.g., a personal computer, the Savio campus cluster, bDrive, and various others) and share data on those resources with others. UC Berkeley has a Globus subscription (the "High Assurance and HIPAA/BAA" subscription) that provides a wide variety of functionality to all campus affiliates.
Data Transfer Between Various Resources¶
With Globus, you can transfer data between various resources. These include:
- Your personal computer, using Globus Connect Personal
- A lab or shared computer, by using Globus Connect Personal, by setting up the computer as a Globus endpoint, or by installing Globus Connect Server 5 (GCSv5) on the computer to share collections with users
- Savio, including home directories, scratch directories, and condo storage
- Research IT’s SRDC secure computing environment
- Departmental computing facilities, including the Statistical Computing Facility and Econometrics Laboratory
- An AEoD virtual machine, using Globus Connect Personal
- bDrive/Google Drive
- Various other cloud platforms, including Wasabi, AWS, and Google Cloud Storage
Currently it is not possible to use Globus to transfer data to/from Box.
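For scripted transfers, one option is the Globus command-line interface. The following is a minimal sketch, assuming the `globus` CLI is installed and you have already logged in. It only builds the argument list for a `globus transfer` call and does not run it. The collection UUIDs, paths, and label are hypothetical placeholders, not real UCB identifiers.

```python
import shlex


def build_transfer_command(src_id, src_path, dst_id, dst_path,
                           recursive=False, label=None):
    """Return the argument list for a `globus transfer` call (not executed)."""
    cmd = ["globus", "transfer"]
    if recursive:
        # Needed when the source path is a directory.
        cmd.append("--recursive")
    if label:
        cmd += ["--label", label]
    # The CLI expects COLLECTION_ID:PATH for both source and destination.
    cmd += [f"{src_id}:{src_path}", f"{dst_id}:{dst_path}"]
    return cmd


# Hypothetical placeholder UUIDs; look up the real ones in the Globus web app.
SAVIO_ID = "00000000-0000-0000-0000-000000000001"
LAPTOP_ID = "00000000-0000-0000-0000-000000000002"

cmd = build_transfer_command(
    SAVIO_ID, "/~/results/", LAPTOP_ID, "/~/results/",
    recursive=True, label="savio-to-laptop",
)
print(shlex.join(cmd))
```

To actually submit the transfer, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where the CLI is authenticated.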
Sharing Your Data with Collaborators¶
You can set up guest collections (sometimes called shared endpoints) on a resource that you have access to in order to share data with collaborators of your choice. A guest collection gives each collaborator access only to the specific directories that you choose. The collaborators with whom you share do not need to have an account on the resource. Sharing options include:
- Sharing directories in your Savio home, scratch, or condo storage directory
- Sharing folders in bDrive / Google Drive
- Sharing directories/folders on various cloud resources, including Wasabi and Google Cloud Storage
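Once a guest collection exists, access for a collaborator can also be granted from the Globus command-line interface. The sketch below assumes the `globus` CLI is installed and authenticated. It builds, but does not run, a `globus endpoint permission create` call. The collection UUID, path, and collaborator identity are hypothetical placeholders.

```python
def build_share_command(collection_id, path, identity, writable=False):
    """Return the argument list that grants `identity` access to `path`."""
    # "r" grants read-only access; "rw" grants read and write access.
    permissions = "rw" if writable else "r"
    return [
        "globus", "endpoint", "permission", "create",
        "--permissions", permissions,
        "--identity", identity,
        f"{collection_id}:{path}",
    ]


# Hypothetical guest collection UUID and collaborator identity.
GUEST_COLLECTION_ID = "00000000-0000-0000-0000-000000000003"

share_cmd = build_share_command(
    GUEST_COLLECTION_ID, "/shared/project/", "collaborator@example.org"
)
print(" ".join(share_cmd))
```

Defaulting to read-only access is a deliberate choice in this sketch: write access has to be requested explicitly with `writable=True`.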
UC Berkeley’s Globus High Assurance and HIPAA/BAA subscription also allows you to set up guest collections (for sharing data) on your personal or lab computer that are discoverable by other Globus users. We can upgrade any user at UC Berkeley to Globus Plus as part of the subscription. Globus Plus users can, for example, create guest collections on their personal computers (using Globus Connect Personal) and transfer files between Globus Connect Personal endpoints (e.g., on their personal computers). Moreover, Globus Plus allows Globus Connect Personal to be used on endpoint devices as identifiable endpoints.
The Globus subscription also allows researchers, for example, to install Globus Connect Server (version 5) on a lab computer so that they can set up endpoints and collections that others can find and access, and so share data with other researchers.
Note that you do not need Globus Plus if you are transferring files to/from a Globus server endpoint (e.g., a campus cluster or supercomputing center such as Savio), or if you want to share files from a Globus server endpoint. If you need to upgrade your Globus account to Globus Plus, please contact us at firstname.lastname@example.org to request a Globus Plus invite. For more details see “What is Globus Plus? Do I need it?”
Getting help with Globus¶
Follow these links for help with:
- Downloading and installing Globus Connect Personal (free)
- Using Globus to transfer data to/from Savio
- Using Globus with your personal computer
- Using Globus with bDrive and other resources
For the following, please contact email@example.com:
- Setting up a lab computer/facility as an endpoint that can manage collections accessible and discoverable by collaborators.
- Setting up Globus Connect Personal on two personal computers to transfer data between them or to share data with collaborators.
UCB Globus Collections/Endpoints¶
UC Berkeley’s Globus High Assurance and HIPAA/BAA subscription provides users with access to premium storage connectors (which support storage systems such as Google Drive, AWS S3, Google Cloud Storage, Wasabi, and Cloudian), and thus mapped collections, which are created by the endpoint administrator. Access to mapped collections requires an account on the endpoint’s host system. Mapped collections created on storage gateways that are flagged for high-assurance data are automatically configured for use with protected data.
The endpoint administrator specifies the identities required for access and an authentication assurance timeout period. If a user attempts access without having authenticated as required within the timeout period, the user will be prompted to authenticate with the required identity.
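The timeout rule can be illustrated with a short sketch. This models only the behavior described above and is not Globus's actual implementation. The function name and the two-hour timeout are hypothetical choices made for the example.

```python
from datetime import datetime, timedelta, timezone


def needs_reauthentication(last_auth, timeout, now=None):
    """Return True if the user must authenticate again before access.

    last_auth: time of the most recent authentication with the required
               identity, or None if the user has never authenticated.
    timeout:   the authentication assurance timeout set by the administrator.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if last_auth is None:
        return True
    return (now - last_auth) > timeout


# Hypothetical values: a two-hour timeout checked at a fixed point in time.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
timeout = timedelta(hours=2)

print(needs_reauthentication(now - timedelta(hours=1), timeout, now))  # False
print(needs_reauthentication(now - timedelta(hours=3), timeout, now))  # True
print(needs_reauthentication(None, timeout, now))                      # True
```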
The steps for discovering and using mapped collections to access data are described in the Globus how-to guide, "Find and use a mapped collection from the Globus web app".
The table below lists the names of the mapped collections that users should use to transfer data and to set up guest collections for the various services/connectors. Users can search for these collections within the Globus web app when they wish to transfer files among collections (including their own Globus Connect Personal endpoint) or to set up guest collections. Users can also bookmark the collections for later reuse without having to search for them again.
|UCB System||UCB Globus Collection/Endpoint Name(s)|
|Savio||UCB BRC Savio Posix Data|
|Google Cloud Storage||UCB Google Cloud Storage Collection|
|Wasabi||UCB S3 Wasabi Data|
|Amazon Web Services (AWS)||UCB AWS S3 Collections|
|Google Drive (bDrive)||UCB Google Drive Collections|
|Cloudian Storage||UCB Cloudian Storage Collections|
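The collection names in the table can also be looked up from the Globus command-line interface instead of the web app. The sketch below assumes the `globus` CLI is installed. It stores the names from the table in a dictionary and builds, without running, a `globus endpoint search` call. The "Savio" key is an assumed label for the collection named "UCB BRC Savio Posix Data".

```python
# Collection names copied from the table; the "Savio" key is an assumed label.
UCB_COLLECTIONS = {
    "Savio": "UCB BRC Savio Posix Data",
    "Google Cloud Storage": "UCB Google Cloud Storage Collection",
    "Wasabi": "UCB S3 Wasabi Data",
    "Amazon Web Services (AWS)": "UCB AWS S3 Collections",
    "Google Drive (bDrive)": "UCB Google Drive Collections",
    "Cloudian Storage": "UCB Cloudian Storage Collections",
}


def build_search_command(system):
    """Return the argument list that searches for a system's collection."""
    try:
        name = UCB_COLLECTIONS[system]
    except KeyError:
        raise ValueError(f"No UCB collection listed for {system!r}") from None
    return ["globus", "endpoint", "search", name]


print(build_search_command("Wasabi"))
```

The search output includes each matching collection's UUID, which is what transfer commands take as the source or destination identifier.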
Also, note that the Globus Timer service, which can be used to automate and schedule data transfers with Globus, is not supported for high assurance collections at this time. It therefore cannot be used with the Savio ucb#brc endpoint or with the other mapped collections listed in the table above. Savio users need to use the ucb#brc-basic (non-High Assurance) endpoint in order to use the Globus Timer service.
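A script that schedules transfers could guard against this restriction before submitting anything. The following is a minimal sketch of such a check. It encodes only the rule stated above, the function names are hypothetical, and it does not call the Globus Timer service itself.

```python
# Endpoints and collections that this page describes as high assurance.
HIGH_ASSURANCE = {
    "ucb#brc",
    "UCB BRC Savio Posix Data",
    "UCB Google Cloud Storage Collection",
    "UCB S3 Wasabi Data",
    "UCB AWS S3 Collections",
    "UCB Google Drive Collections",
    "UCB Cloudian Storage Collections",
}


def timer_supported(collection_name):
    """Return True if the named collection can be used with Globus Timer."""
    return collection_name not in HIGH_ASSURANCE


def pick_savio_endpoint(use_timer):
    """Choose the Savio endpoint name based on whether a timer is needed."""
    return "ucb#brc-basic" if use_timer else "ucb#brc"


print(timer_supported("ucb#brc"))             # False
print(timer_supported("ucb#brc-basic"))       # True
print(pick_savio_endpoint(use_timer=True))    # ucb#brc-basic
```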