Usage of computing time on the UC Berkeley campus's Savio high-performance computing cluster is tracked via abstract measurement units called "Service Units." Note that this tracking does not apply to computing done using a condo.
A Service Unit (abbreviated as "SU") is equivalent to one “core hour”: that is, the use of one processor core, for one hour of wall-clock time, on one of Savio's standard, current generation compute nodes.
Note: Tracking of usage is only relevant for Faculty Computing Allowance users. Usage tracking does not impact Condo users, who have no Service Unit-based limits on the use of their associated compute pools.
Calculating Service Units Used by a Compute Job
As described in more detail below, the cost (in "Service Units") of running any particular compute job on Savio is calculated via a straightforward formula, which simply multiplies together the following three values:
- How many processor cores are reserved for use by the job. (Important: when using certain pools of compute nodes, your job will be charged with using all the cores on that node, even if it actually uses only some of those.)
- How long the job takes to run (in ordinary "wall-clock" time).
- The scaling factor for the pool of compute nodes ("partition") on which the job is run.
For instance, if you run a computational job on the savio2 pool of nodes via an srun command and reserve one compute node (which means you're effectively reserving all 24 of that node's cores), and your job runs for one hour, that job will use 24 Service Units; i.e., 24 cores x 1 hour x a scaling factor of 1.00 = 24 Service Units.
A charge of 24 Service Units would then be made against your group's Faculty Computing Allowance scheduler account (with an account name like fc_projectname). So if you started out with 300,000 Service Units before running this job, for instance, you would have 299,976 Service Units remaining afterward for running additional jobs under that account.
Similarly, if you run a computational job on the savio2_htc pool of nodes, and reserve just 5 processor cores (since, when using the cluster's High Throughput Computing nodes, you can optionally schedule the use of just individual cores, rather than entire nodes), and your job runs for 10 hours, that job will use 60 Service Units; i.e., 5 cores x 10 hours x a scaling factor of 1.20 = 60 Service Units.
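The two worked examples above can be reproduced with a few lines of Python. This is only an illustrative sketch of the arithmetic: the function name service_units and its parameter names are hypothetical and are not part of any Savio tool, and the scaling factors are the ones quoted in the examples.

```python
def service_units(cores: int, hours: float, scaling_factor: float) -> float:
    """Service Units charged = reserved cores x wall-clock hours x partition scaling factor."""
    return cores * hours * scaling_factor


# Example 1: one full savio2 node (24 cores) for 1 hour at a scaling factor of 1.00.
savio2_cost = service_units(cores=24, hours=1, scaling_factor=1.00)

# Example 2: 5 cores on savio2_htc for 10 hours at a scaling factor of 1.20.
htc_cost = service_units(cores=5, hours=10, scaling_factor=1.20)

# Deduct the first job from a starting allowance of 300,000 Service Units.
remaining = 300_000 - savio2_cost

print(f"savio2 job:     {savio2_cost:.2f} SUs")
print(f"savio2_htc job: {htc_cost:.2f} SUs")
print(f"remaining:      {remaining:,.2f} SUs")
```

Note that the first argument is the number of cores reserved, not the number your program actually keeps busy, which is why the savio2 example passes 24.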
Scheduling Nodes vs. Cores
When you schedule jobs on Savio, depending on the pool of compute nodes (scheduler partition) on which you're running them, you may be automatically provided with exclusive access to entire nodes (including all of their cores), or you may be able to request access just to one or more individual cores on those nodes. When a job you run is provided with exclusive access to an entire node, please note that your account will be charged for using all of that node's cores.
Thus, for example, if you run a job for one hour on a standard 24-core compute node on the savio2 partition, because jobs are given exclusive access to entire nodes on that partition, your job will always use 24 core hours, even if it actually requires just a single core or a few cores. Accordingly, your account will be charged 24 Service Units for one hour of computational time on a savio2 node.
For that reason, if you plan to run single-core jobs - or any other jobs requiring fewer than the total number of cores on a node - you have two recommended options:
- When running on a pool of compute nodes that always gives you exclusive access to entire nodes, you should bundle up multiple, smaller jobs that only require a single core (or a small number of cores) into one, larger job using GNU parallel. This allows you to use many or all of the cores on that node for the duration of that job.
- Alternatively, run your jobs on a pool of compute nodes that offers per-core scheduling of jobs and is appropriately suited for your jobs. When doing so, make sure that your job script file also specifies exactly how many cores your job needs to use.
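To see why these two options matter, the following sketch compares the charge for 24 single-core, one-hour tasks under three approaches: submitting each as its own job on savio2, bundling all of them into one savio2 job, and running them on savio2_htc with per-core scheduling. The function names are hypothetical, the core count and scaling factors are the ones used in the examples earlier on this page, and the bundled case assumes all 24 tasks run concurrently on separate cores and finish within the same hour.

```python
CORES_PER_NODE = 24    # standard savio2 node, as described in the text
SAVIO2_FACTOR = 1.00   # scaling factor used in the savio2 example
HTC_FACTOR = 1.20      # scaling factor used in the savio2_htc example


def exclusive_node_cost(n_jobs: int, hours: float) -> float:
    """Each job on an exclusive-node partition is charged for every core on its node."""
    return n_jobs * CORES_PER_NODE * hours * SAVIO2_FACTOR


def per_core_cost(n_jobs: int, cores_per_job: int, hours: float) -> float:
    """Each job on a per-core partition is charged only for the cores it requests."""
    return n_jobs * cores_per_job * hours * HTC_FACTOR


n_tasks = 24

# 24 separate single-core jobs, each occupying (and paying for) a whole node.
separate = exclusive_node_cost(n_jobs=n_tasks, hours=1)

# One bundled job (assumption: all 24 tasks run at once, one per core).
bundled = exclusive_node_cost(n_jobs=1, hours=1)

# 24 single-core jobs on a partition with per-core scheduling.
htc = per_core_cost(n_jobs=n_tasks, cores_per_job=1, hours=1)

print(f"separate savio2 jobs: {separate:.2f} SUs")
print(f"one bundled job:      {bundled:.2f} SUs")
print(f"savio2_htc per-core:  {htc:.2f} SUs")
```

Under these assumptions, the unbundled approach costs 24 times as much as the bundled one, while per-core scheduling lands slightly above the bundled cost because of the higher scaling factor.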
Scaling of Service Units
When you're using types of compute nodes other than Savio's current generation of standard nodes (at this writing, the nodes in the savio2 partition are "standard nodes"), your account will be charged with using more than - or fewer than - one Service Unit per hour of compute time.
These scaled values primarily reflect the varying costs of acquiring and replacing different types of nodes in the cluster. When using older pools of standard compute nodes, with earlier generations of hardware, your account will use less than one SU per hour, while when using higher-cost nodes, such as Big Memory or Graphics Processing Unit (GPU) nodes, it will use more than one SU per hour.
As of January 24, 2018, here are the rates for using various types of nodes on Savio, in Service Units per hour. (Please see the Savio User Guide for more detailed information about each pool of compute nodes listed below.)
|Pool of Compute Nodes (Partition)||Service Units used per Core Hour|
*Charges for the use of Savio's GPU nodes are based on the number of CPU processor cores used (rather than on GPUs used), as is the case for other types of compute nodes. Because all jobs using GPUs must request at least two CPU cores for each GPU requested, the effective cost of using one GPU on Savio will be a minimum of 5.34 Service Units per hour (2 x 2.67 SUs per CPU core). This also applies to the savio2_1080ti partition, in which one GPU card has a cost of 1.67 SUs per wall-hour.
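The minimum GPU charge described in the note above follows from the same cores x hours x rate arithmetic. Here is a small sketch, assuming the two-cores-per-GPU minimum and the 2.67 rate quoted in the note; the function and constant names are hypothetical.

```python
MIN_CPU_CORES_PER_GPU = 2  # minimum CPU cores that must be requested per GPU, per the note


def min_gpu_job_cost(gpus: int, hours: float, su_per_core_hour: float) -> float:
    """Lowest possible charge for a GPU job: billed on CPU cores, not on GPUs."""
    cores = gpus * MIN_CPU_CORES_PER_GPU
    return cores * hours * su_per_core_hour


# One GPU for one hour at 2.67 SUs per CPU core hour.
one_gpu_hour = min_gpu_job_cost(gpus=1, hours=1, su_per_core_hour=2.67)

# Four GPUs for three hours at the same rate.
four_gpus_three_hours = min_gpu_job_cost(gpus=4, hours=3, su_per_core_hour=2.67)

print(f"1 GPU, 1 hour:   {one_gpu_hour:.2f} SUs")
print(f"4 GPUs, 3 hours: {four_gpus_three_hours:.2f} SUs")
```

A job that requests more than two CPU cores per GPU will be charged more than this minimum, since the charge scales with the cores reserved.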
Viewing Your Service Units
You can view how many Service Units have been used to date under a Faculty Computing Allowance, or by a particular user account on Savio, via the check_usage.sh script.
Getting More Compute Time
Are you running low on Service Units under your Faculty Computing Allowance? Or have you exhausted them entirely? There are a number of options for getting more computing time for your project.