Globus at UC Berkeley

Globus is a data transfer and storage service that lets you move data quickly and easily between different resources (e.g., a personal computer, the Savio campus cluster, bDrive, and various others) and share data on those resources with others. UC Berkeley holds a Globus High Assurance and HIPAA/BAA subscription, which provides a wide variety of functionality to all campus affiliates.

Data Transfer Between Various Resources

With Globus, you can transfer data between various resources. These include:

  • Your personal computer, using Globus Connect Personal
  • A lab or shared computer (using Globus Connect Personal, setting up the computer as a Globus endpoint, and/or installing Globus Connect Server 5 (GCSv5) on the computer to share collections with users)
  • Savio, including home directories, scratch directories, and condo storage
  • Research IT’s SRDC secure computing environment
  • Departmental computing facilities, including the Statistical Computing Facility and Econometrics Laboratory
  • An AEoD virtual machine, using Globus Connect Personal
  • bDrive/Google Drive
  • Various other cloud platforms, including Wasabi, AWS, and Google Cloud Storage

Currently it is not possible to use Globus to transfer data to/from Box.

Sharing Your Data with Collaborators

You can set up guest collections (sometimes called shared endpoints) on a resource that you have access to in order to share data with collaborators of your choice. A guest collection gives each collaborator access to the specific directories that you choose. Collaborators do not need an account on the resource itself. This includes:

UC Berkeley’s Globus High Assurance and HIPAA/BAA subscription also allows you to set up guest collections (for sharing data) on your personal or lab computer that are discoverable by other Globus users.

We can upgrade any user at UC Berkeley to Globus Plus as part of the subscription. Globus Plus users can, for example, create guest collections on their personal computers (using Globus Connect Personal) and transfer files between Globus Connect Personal endpoints (e.g., on their personal computers). Globus Plus also allows Globus Connect Personal to be used on endpoint devices as identifiable endpoints.

The subscription also allows researchers, for example, to install Globus Connect Server (version 5) on a lab computer so that they can set up endpoints and collections that other researchers can find and access.

Note that you do not need Globus Plus to transfer files to or from a Globus server endpoint (e.g., a campus cluster or supercomputing center, such as Savio) or to share files from a Globus server endpoint. If you need to upgrade your Globus account to Globus Plus, please contact us at brc-hpc-help@berkeley.edu to request a Globus Plus invite. For more details, see “What is Globus Plus? Do I need it?”
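The rules above on when Globus Plus is needed can be summarized in a small decision helper. This is only an illustrative sketch and not part of any Globus tool: the function name `needs_globus_plus` and the labels "transfer", "share", "personal" (a Globus Connect Personal endpoint), and "server" (a Globus server endpoint) are hypothetical, and the logic covers only the cases described on this page.

```python
def needs_globus_plus(action, source_kind, dest_kind=None):
    """Return True if the described action requires Globus Plus.

    All labels are hypothetical and used for illustration only:
      action:      "transfer" or "share"
      source_kind: "personal" (Globus Connect Personal) or "server"
      dest_kind:   same labels; required only for "transfer"
    """
    kinds = {"personal", "server"}
    if source_kind not in kinds:
        raise ValueError("source_kind must be 'personal' or 'server'")

    if action == "transfer":
        if dest_kind not in kinds:
            raise ValueError("dest_kind must be 'personal' or 'server'")
        # Plus is needed only when both ends are Globus Connect Personal;
        # any transfer involving a server endpoint does not need it.
        return source_kind == "personal" and dest_kind == "personal"

    if action == "share":
        # Guest collections on Globus Connect Personal need Plus;
        # sharing from a server endpoint (e.g., Savio) does not.
        return source_kind == "personal"

    raise ValueError("action must be 'transfer' or 'share'")
```

For example, `needs_globus_plus("transfer", "personal", "server")` covers moving files from a laptop to Savio, which does not require Globus Plus.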

Getting help with Globus

Follow these links for help with:

For the following, please contact research-it-consulting@berkeley.edu:

UCB Globus Collections/Endpoints

UC Berkeley’s Globus High Assurance and HIPAA/BAA subscription provides users with access to premium storage connectors (which support storage systems such as Google Drive, AWS S3, Google Cloud Storage, Wasabi, and Cloudian), and thus mapped collections, which are created by the endpoint administrator. Access to mapped collections requires an account on the endpoint’s host system. Mapped collections created on storage gateways that are flagged for high-assurance data are automatically configured for use with protected data.

The endpoint administrator specifies the identities required for access and an authentication assurance timeout period. If a user attempts access without having authenticated as required within the timeout period, the user will be prompted to authenticate with the required identity.
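The timeout rule described above can be pictured with a short sketch. This is not how Globus implements the check internally; the function name `must_reauthenticate` and its parameters are hypothetical, and treating an authentication exactly at the timeout boundary as still valid is an assumption made for the example.

```python
from datetime import datetime, timedelta


def must_reauthenticate(last_auth, now, timeout):
    """Return True if the user should be prompted to authenticate again.

    last_auth: datetime of the most recent authentication with the
               required identity, or None if there has been none
    now:       datetime of the access attempt
    timeout:   timedelta set by the endpoint administrator
    """
    if last_auth is None:
        # Never authenticated with the required identity.
        return True
    # Assumption: an authentication exactly `timeout` old is still valid.
    return now - last_auth > timeout


# Example with made-up values: a 30-minute timeout.
attempt = datetime(2024, 1, 1, 12, 0)
recent = must_reauthenticate(datetime(2024, 1, 1, 11, 45), attempt, timedelta(minutes=30))
stale = must_reauthenticate(datetime(2024, 1, 1, 11, 0), attempt, timedelta(minutes=30))
```

Here `recent` is False (authenticated 15 minutes ago) and `stale` is True (authenticated 60 minutes ago).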

The steps for discovering and using mapped collections to access data are described in the Globus how-to guide, "Find and use a mapped collection from the Globus web app".

The names of the mapped collections users should use to transfer data and set up guest collections for the various services/connectors are specified in the table below. Users can search for the collections within the Globus web app when they wish to transfer files among collections (and to set up guest collections), including their own Globus Connect Personal endpoint. Users can also bookmark these for later reuse without having to search for them.

UCB System                   UCB Globus Collection/Endpoint Name(s)
---------------------------  --------------------------------------
BRC (Savio)                  ucb#brc
                             ucb#brc-basic
                             UCB BRC Savio Posix Data
Google Cloud Storage         UCB Google Cloud Storage Collection
Wasabi                       UCB S3 Wasabi Data
Amazon Web Services (AWS)    UCB AWS S3 Collections
Google Drive (bDrive)        UCB Google Drive Collections
Cloudian Storage             UCB Cloudian Storage Collections

Note that the ucb#brc and UCB BRC Savio Posix Data endpoints/collections point to the same data on Savio.

Also note that the Globus Timer service, which can be used to automate and schedule data transfers with Globus, is not supported for high assurance collections at this time. It therefore cannot be used with the Savio ucb#brc endpoint or with the other mapped collections listed in the table above. Savio users need to use the ucb#brc-basic (non-High Assurance) endpoint in order to use the Globus Timer service.
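As a quick reference, the table and the Timer restriction above can be captured in a small lookup. This is a minimal sketch for illustration: the names `UCB_COLLECTIONS`, `TIMER_COMPATIBLE`, and `timer_collections` are hypothetical, the data is copied from this page, and nothing here queries Globus itself.

```python
# Collection names per UCB system, copied from the table on this page.
UCB_COLLECTIONS = {
    "BRC (Savio)": ["ucb#brc", "ucb#brc-basic", "UCB BRC Savio Posix Data"],
    "Google Cloud Storage": ["UCB Google Cloud Storage Collection"],
    "Wasabi": ["UCB S3 Wasabi Data"],
    "Amazon Web Services (AWS)": ["UCB AWS S3 Collections"],
    "Google Drive (bDrive)": ["UCB Google Drive Collections"],
    "Cloudian Storage": ["UCB Cloudian Storage Collections"],
}

# Per this page, only the non-High Assurance Savio endpoint
# can be used with the Globus Timer service.
TIMER_COMPATIBLE = {"ucb#brc-basic"}


def timer_collections(system):
    """Return the collections for `system` usable with the Timer service."""
    return [name for name in UCB_COLLECTIONS[system] if name in TIMER_COMPATIBLE]


savio_timer = timer_collections("BRC (Savio)")
wasabi_timer = timer_collections("Wasabi")
```

With this data, `savio_timer` contains only `ucb#brc-basic`, and `wasabi_timer` is empty because the Wasabi collection is among the high assurance mapped collections.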