Research IT Documentation¶
Welcome to the Research IT documentation website. Research IT supports computing and storage services via the Berkeley Research Computing (BRC) group, as well as tools and expertise for managing research data via the Research Data Management (RDM) group.
- Analytics Environments on Demand (AEoD)
- Cloud Computing Support
- High Performance Computing (Savio)
- Research Data Management
- Secure Research Data & Compute (SRDC)
Research IT offers consulting for UC Berkeley researchers. From identifying the best computing solution to working with sensitive data, Research IT consultants work at the intersection of research and technology. We are here to understand your needs, match you to appropriate resources, and help you get started using them. Our diverse team is made up of experts in both data and computing from a wide variety of domains. Visit our office hours or get in touch to get help.
Research IT Facilitated Resources¶
The sections below summarize computational resources that Research IT either provides directly or facilitates access to. They are an overview of the research computing landscape, not an exhaustive list. For more information about any resource listed here, click its linked name to visit its webpage, or get in touch with BRC consulting so that we can discuss the system(s) most appropriate for your workflows.
Resources directly provided by Research IT¶
Research IT facilitates access to three main resources on the UCB campus: the Savio Linux cluster, AEoD virtual machines, and Secure Research Data and Compute (SRDC) virtual machines and Linux cluster.
| ||Savio||Analytics Environments on Demand (AEoD)||Secure Research Data and Compute (SRDC)|
|Best suited for||Traditional HPC, Linux workflows, batch job submission||Interactive work in Windows / Linux environments, scalable memory needs (2-64 GB of RAM or more)||Interactive and traditional HPC workflows for highly sensitive data|
|Max. job time||72 hours (unlimited for Condo users)||Variable based on MOU||Variable based on MOU|
|Storage options||10 GB Home Directory, Condo Storage||5 GB personal storage, optional CIFS NAS mounting||Variable based on MOU|
|Data transfer options||scp & sftp, rclone to Berkeley Box & bDrive, Globus endpoints via dtn.brc.berkeley.edu||bDrive via Google Drive folder sync (recommended), Berkeley Box via Box Sync or CyberDuck||Secure data mover via scp & sftp|
|Sensitive data? 1||P1/P2/P3||P1/P2/P3||P4|
|Cost||Free (optional purchasing)||Free (with MOU)||Free (with MOU), additional purchasing|
If you are interested in using the Savio cluster, please see our extensive documentation for more information, including technical specifications. If you have questions about the cluster and how it might fit into your workflow, please email us at email@example.com with a brief description of your project, a concise overview of the software and data you're using, and a summary of what stage you are at in your research process.
If you are interested in using AEoD with your Windows software, please email us at firstname.lastname@example.org.
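As an illustration of the scp transfer option listed for Savio in the table above, the following minimal Python sketch assembles an scp command line. It makes several assumptions: the data transfer node hostname `dtn.brc.berkeley.edu` appears in the table in connection with Globus, and using it for scp here is an assumption; the username `myuser` and both paths are hypothetical placeholders. Please check the Savio documentation for the recommended host and paths before transferring data.

```python
import shlex

# Hostname taken from the table above. Using it for scp (and not only
# for Globus) is an assumption made for this sketch.
DTN_HOST = "dtn.brc.berkeley.edu"


def build_scp_command(username, local_path, remote_path, recursive=False):
    """Return the argument list for copying a local path to the cluster with scp."""
    cmd = ["scp"]
    if recursive:
        cmd.append("-r")  # copy a directory and its contents
    cmd.append(local_path)
    cmd.append(f"{username}@{DTN_HOST}:{remote_path}")
    return cmd


if __name__ == "__main__":
    # "myuser", "results.csv", and "data/" are hypothetical placeholders.
    cmd = build_scp_command("myuser", "results.csv", "data/")
    # Print the command for review; pass the list to subprocess.run(cmd)
    # to actually start the transfer.
    print(shlex.join(cmd))
```

The sketch only builds and prints the command, so it can be reviewed before any data leaves your machine.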
Resources facilitated (not directly provided) by Research IT¶
ACCESS (formerly XSEDE) is an NSF-funded platform that provides access to a variety of computational and storage resources for US-based university and non-profit researchers at no cost. BRC can provide interested UCB researchers with access to our limited trial allocation to test out the various resources ACCESS provides. Below we list a few of the main computational resources available on ACCESS, along with their associated storage options.
|Best suited for||Rapid access trial accounts, GPU applications, traditional HPC, large (up to 2 TB) memory needs, long-running jobs, burst-to-cloud, composable systems||Extreme (up to 4 TB) memory needs, web server use, Hadoop/Spark use, GPU applications, traditional HPC||Rapid access trial accounts, virtual machines, reproducibility, interactive work||GPU applications, visualization, traditional HPC|
|Max. job time||48 hours default, up to one week with approval||48 hours||N/A||Typically 48 hours, 120 hours with
|Storage options||100 GB home directory, SSD Scratch, Parallel Lustre Scratch, Ceph Object Store, Storage Allocations for medium-term disk storage||25 GB home directory, node-local scratch, memory-local scratch, temporary file storage, Storage Allocations for persistent project space||Storage Allocations for virtual "drives"||10 GB Home directory, 1 TB temporary file storage, Scratch filesystem, Storage Allocations for long-term tape archival storage|
|Data transfer options||rsync, scp, sftp, rclone, Globus||rsync, scp, sftp, rclone, Globus||scp/rsync to/from VMs, Globus Connect Personal to/from VMs||rsync, scp, Globus, rsync/scp to Ranch for long-term tape archival storage|
|Sensitive data? 1||No||No||No||No|
|Cost||Free with project application||Free with project application||Free with project application||Free with project application|
If you are interested in using any of the ACCESS resources, please contact us at email@example.com so that we can work with you directly to obtain access to the appropriate resource(s). For the most up-to-date information on ACCESS resources, see the ACCESS webpage.
Commercial cloud providers¶
Commercial cloud providers offer hundreds of services, and it can be difficult to decide which are right for your research needs and budget. Cloud providers are best suited for on-demand virtual machines, website hosting, networking, cloud storage, mobile development, and other specialized use cases.
For sensitive data needs, please see UCOP's Cloud Services Contracts and Guidance page.
For payment options (including paying with a chartstring through Berkeley IT's consolidated billing for AWS and GCP) and other information about UC agreements with cloud computing providers, see Berkeley IT's Public Cloud Service.
In the table below we list some of the storage and data transfer options available from four popular cloud providers:
| ||Amazon Web Services (AWS)||Microsoft Azure||Google Cloud Platform (GCP)||Digital Ocean|
|Storage options||S3, Elastic File System (EFS), Glacier for long-term storage, and others||Azure Blob Storage, Azure Disk Storage, and more||GCP Cloud Storage, Persistent disk, Filestore, and others||Block Storage Volumes, Spaces Object Storage|
|Data transfer options||AWS DataSync, AWS Transfer Family||Azure Data Box||GCP Storage Transfer Service||web interface, rsync, scp, sftp|
|Cost||AWS Pricing Calculator||Azure Pricing Calculator||GCP Pricing Calculator||Digital Ocean Pricing Calculator|
For questions on data storage/transfer or other services available from commercial cloud providers, please contact us at firstname.lastname@example.org and CC the Cloud Resource Center at email@example.com with a brief description of your project, a concise overview of the software and data you're using, and a summary of what stage you are at in your research process. Additionally, you may be interested in attending the Cloud Working Group meetings.
Sensitive and protected data¶
Research IT also helps researchers working with sensitive data. On these pages you can learn how to determine the sensitivity of your data and secure your computing and storage systems. Also learn about our services for working with minimally to highly sensitive data, classified by campus as P1 - P4.
Highly sensitive compute and storage (P4)¶
The Secure Research Data and Compute (SRDC) platform includes high performance computing, virtual machines, and secure storage for highly sensitive (P4) data. Such data may include medical information protected under HIPAA, biometric data used for authentication, genetic data, or government-issued ID numbers.
Low & moderately sensitive compute and storage (P3/P2)¶
Savio, UC Berkeley's high performance computing cluster, is available for use with moderately sensitive data. Suitable projects may include work with de-identified public health or human genetic data, especially if analyses can run in parallel or benefit from access to a large number of processors.
Analytics Environments on Demand (AEoD) is a Windows- and Linux-friendly virtual machine service. Suitable projects may include data analytics or geospatial analysis with individually identifiable human subjects research data not deemed by the IRB to be highly sensitive. Public health information and data related to animal research are also strong candidates for AEoD.
Minimally sensitive compute and storage (P1)¶
Research IT consultants are happy to meet with you to understand your data and computing needs and help match you to appropriate resources. Visit our office hours or get in touch to schedule an appointment.
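The protection-level guidance on this page can be summarized as a simple lookup. The Python sketch below transcribes the "Sensitive data?" rows of the Research IT resource table (P1/P2/P3 for Savio and AEoD, P4 for SRDC); the function and variable names are hypothetical, and the result is only a starting point for a conversation with a consultant, not an authoritative classification.

```python
# Mapping transcribed from the "Sensitive data?" row of the table of
# resources directly provided by Research IT. The names used here are
# illustrative assumptions, not part of any Research IT tool.
RESOURCES_BY_LEVEL = {
    "P1": ["Savio", "AEoD"],
    "P2": ["Savio", "AEoD"],
    "P3": ["Savio", "AEoD"],
    "P4": ["SRDC"],
}


def candidate_resources(level):
    """Return the Research IT resources listed for a campus protection level."""
    key = level.strip().upper()
    if key not in RESOURCES_BY_LEVEL:
        raise ValueError(f"Unknown protection level: {level!r} (expected P1 to P4)")
    # Return a copy so callers cannot modify the shared table.
    return list(RESOURCES_BY_LEVEL[key])


if __name__ == "__main__":
    for level in ("P1", "P2", "P3", "P4"):
        print(level, "->", ", ".join(candidate_resources(level)))
```

The ACCESS resources are omitted from the mapping because the table above lists them as not suitable for sensitive data, and commercial cloud providers are omitted because their suitability depends on the UCOP guidance referenced earlier.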