In recognition of the increasing importance of research computing across many disciplines, UC Berkeley has made a significant investment in developing the BRC High Performance Computing service, as a way to grow and sustain high performance computing for UC Berkeley.
This service, which offers and supports access to the Savio Institutional/Condo Cluster, is intended to provide Berkeley’s campus researchers with state-of-the-art, professionally administered computing systems and ancillary infrastructure. Beyond its central mission of meeting the campus’s computational research needs, the cluster brings auxiliary benefits: it improves competitiveness on grants that favor or require institutional resources, provides an incentive for recruitment and retention, and achieves significant economies of scale through centralized computing systems and data center facilities.
Our mission is to deliver reliable, sustainable computing resources and services to facilitate the use of high-performance computing that meets the computational research demands of the UC Berkeley community.
Computing continues to be a tool as vital as experimentation and theory in solving the scientific challenges of the twenty-first century. Central to our mission is enabling computational science, in which interdisciplinary teams of researchers address fundamental problems in research and engineering that require computation and have broad research and economic impacts. Examples of these problems include global climate modeling, nanoscience, combustion modeling, carbon sequestration, astrophysics, computational biology, political science, and many more.
Savio is a 470-node, 11,620 processor-core Linux cluster rated at nearly 450 peak teraFLOPS. About 40% of the compute nodes are provided by the institution for general access, with the remaining 60% contributed by researchers in the Condo program. A number of nodes also have NVIDIA GPUs (GTX 1080 Ti, V100). As such, Savio is suitable for a wide range of research applications, including tightly coupled applications that require a low-latency, high-bandwidth interconnect or very fast I/O.
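To make the figures above concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the paragraph (470 nodes, 11,620 cores, an approximate 40%/60% split); the function name and the dictionary keys are hypothetical, chosen for illustration. Because the percentages are approximate and node types differ, the results are estimates, not an inventory.

```python
# Figures quoted in the text above.
TOTAL_NODES = 470
TOTAL_CORES = 11_620
INSTITUTIONAL_SHARE = 0.40  # "about 40%", so the split below is approximate


def savio_estimates():
    """Return rough estimates derived from the quoted cluster figures.

    Hypothetical helper for illustration only; real node counts and
    per-node core counts vary by hardware generation.
    """
    institutional_nodes = round(TOTAL_NODES * INSTITUTIONAL_SHARE)
    condo_nodes = TOTAL_NODES - institutional_nodes
    avg_cores_per_node = TOTAL_CORES / TOTAL_NODES
    return {
        "institutional_nodes": institutional_nodes,
        "condo_nodes": condo_nodes,
        "avg_cores_per_node": avg_cores_per_node,
    }


if __name__ == "__main__":
    for name, value in savio_estimates().items():
        print(f"{name}: {value:.1f}" if isinstance(value, float) else f"{name}: {value}")
```

Under these assumptions the split works out to roughly 188 institutional nodes and 282 condo nodes, with an average of just under 25 cores per node.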
For more information on the Savio cluster’s hardware, software, and more, please see the System Overview.
Savio is sustained through an institutional/condo model: faculty and principal investigators purchase compute nodes (individual servers) from their grants or other available funds, and these nodes are then added to the institution’s compute cluster. This allows researcher-owned nodes to take advantage of the low-latency InfiniBand interconnect and high-performance parallel filesystem storage provided by the institution. Operating costs for managing and housing researcher-owned compute nodes are waived in exchange for letting other users make use of any idle compute cycles on those nodes. Researchers participating in the Condo program have priority access to computing resources equivalent to those purchased with their funds, and can also access additional nodes for their research if needed. This provides much greater flexibility than owning a standalone cluster.
This service is supported by a collaboration with the High Performance Computing Services Group at Lawrence Berkeley National Laboratory.
Berkeley Research Computing - HPC Staff:
Gary Jung - Manager, BRC High Performance Computing
John White - Parallel filesystems and HPC storage
Krishna Muriki - HPC user engagement, Science Gateways
Wei Feinstein - HPC user engagement, Consulting Services
Karen Fernsler - Globus Online support
Tin Ho - HPC Engineer, Consulting Services
Program: Berkeley Research Computing
Partnership: Lawrence Berkeley National Lab