What Resources Can I Use?

Charges for Running Jobs

When you run SLURM batch or interactive jobs on the Savio cluster under a Faculty Computing Allowance account (i.e. a scheduler account whose name begins with fc_), your usage of computational time is tracked (in effect, "charged" for, although no costs are incurred) in abstract measurement units called "Service Units." (Please see Service Units on Savio for a description of how this usage is calculated.) Once all of the Service Units provided under an Allowance have been exhausted, no more jobs can be run under that account. (You can check your own usage, or the total usage, under an FCA or MOU allocation.)

Danger

Please also note that, when running jobs on many of Savio's pools of compute nodes, you are provided with exclusive access to those nodes, and thus are "charged" for using all of the cores on each node. For example, if you run a job for one hour on a standard 24-core compute node on the savio2 partition, your job will always be charged for 24 core hours, even if it requires just a single core or a few cores. (For more details, including information on ways you can most efficiently use your computational time on the cluster, please see the Scheduling Nodes v. Cores section of Service Units on Savio.)
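The arithmetic behind the example above can be sketched in a few lines of shell. This is only an illustration: core_hours_charged is a hypothetical helper, it assumes whole nodes are allocated exclusively, and it counts core hours only (the conversion from core hours to Service Units is described in Service Units on Savio and is not modeled here).

```shell
# Core hours charged for a job on exclusively allocated nodes.
# Arguments: number of nodes, cores per node, wall-clock hours (integers).
# Hypothetical helper for illustration; not a Savio-provided command.
core_hours_charged() {
  nodes=$1
  cores_per_node=$2
  hours=$3
  echo $(( nodes * cores_per_node * hours ))
}

# One hour on a single 24-core savio2 node, however many cores the job uses:
core_hours_charged 1 24 1   # prints 24
```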

Usage tracking does not affect jobs run under a Condo account (i.e. a scheduler account whose name begins with co_), which has no Service Unit-based limits.

What resources (hardware) your jobs have access to

You can view the accounts you have access to, partitions you can use, and the QoS options available to you using the sacctmgr command:

sacctmgr -p show associations user=$USER

This will return output such as the following for a hypothetical example user lee who has access to both the physics condo and to a Faculty Computing Allowance. Each line of this output indicates a specific combination of an account, a partition, and QoSes that you can use in a job script file, when submitting any individual batch job:

Cluster|Account|User|Partition|...|QOS|Def QOS|GrpTRESRunMins|
brc|co_physics|lee|savio2_1080ti|...|savio_lowprio|savio_lowprio||
brc|co_physics|lee|savio2_knl|...|savio_lowprio|savio_lowprio||
brc|co_physics|lee|savio2_bigmem|...|savio_lowprio|savio_lowprio||
brc|co_physics|lee|savio2_gpu|...|savio_lowprio|savio_lowprio||
brc|co_physics|lee|savio2_htc|...|savio_lowprio|savio_lowprio||
brc|co_physics|lee|savio2|...|physics_savio2_normal,savio_lowprio|physics_savio2_normal||
brc|fc_lee|lee|savio2_1080ti|...|savio_debug,savio_normal|savio_normal||
brc|fc_lee|lee|savio2_knl|...|savio_debug,savio_normal|savio_normal||
brc|fc_lee|lee|savio2_bigmem|...|savio_debug,savio_normal|savio_normal||
brc|fc_lee|lee|savio2_gpu|...|savio_debug,savio_normal|savio_normal||
brc|fc_lee|lee|savio2_htc|...|savio_debug,savio_normal|savio_normal||
brc|fc_lee|lee|savio2|...|savio_debug,savio_normal|savio_normal||
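Because this output is pipe-delimited, it can be condensed with standard tools. The following is a minimal sketch, assuming the header row uses the column names shown in the example (Account, Partition, QOS, Def QOS); summarize_associations is a hypothetical helper name, not a Savio or SLURM command. It looks columns up by header name, so the columns elided as "..." above do not need to be known.

```shell
# Summarize `sacctmgr -p` association output read from standard input.
# Columns are located by header name rather than by fixed position.
summarize_associations() {
  awk -F'|' '
    # First line: map each header name to its column index.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Remaining lines: print account, partition, default QoS, allowed QoSes.
    NF > 1 { printf "%s %s default=%s allowed=%s\n", $(col["Account"]), $(col["Partition"]), $(col["Def QOS"]), $(col["QOS"]) }
  '
}

# Usage on the cluster:
#   sacctmgr -p show associations user=$USER | summarize_associations
```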

The Account, Partition, and QOS fields indicate which partitions and QoSes you have access to under each of your accounts. The Def QOS field identifies the default QoS that will be used if you do not explicitly specify a QoS when submitting a job. Thus, per the example above, if the user lee submitted a batch job using their fc_lee account, they could submit it to any of the savio2_1080ti, savio2_knl, savio2_bigmem, savio2_gpu, savio2_htc, or savio2 partitions. (When doing so, they could choose either the savio_debug or savio_normal QoS, with savio_normal used by default if no QoS was specified.)
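As an illustration, one such combination for the example user lee might appear in a job script as follows. This is a sketch only: the job name and time limit are hypothetical placeholders, and the account, partition, and QoS values are taken from the example output above rather than from any real allocation.

```shell
#!/bin/bash
#SBATCH --job-name=example     # hypothetical job name
#SBATCH --account=fc_lee       # Account field from the example output
#SBATCH --partition=savio2     # a Partition listed for that account
#SBATCH --qos=savio_debug      # omit this line to get the Def QOS, savio_normal
#SBATCH --time=00:05:00        # hypothetical wall-clock limit

# Placeholder workload: report which node the job ran on.
echo "Running on $(hostname)"
```

Saved under a hypothetical name such as job.sh, the script would be submitted with sbatch job.sh.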

If you are running your job in a condo, note which lines of output for the condo account (those whose Account begins with "co_") have a lowprio QoS as their Def QOS and which have a normal QoS. Lines with a normal QoS (such as the co_physics line for the savio2 partition in the above example, whose Def QOS is physics_savio2_normal) identify the partitions to which you have priority access, while lines with a lowprio QoS identify partitions to which you have only low-priority access. Thus, in the above example, the user lee should select the co_physics account and the savio2 partition when they want to run jobs with normal priority, using the resources available via their condo membership.
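This check can also be done mechanically. The sketch below assumes, as in the example output, that low-priority QoS names contain the string "lowprio" and that the header row includes Account, Partition, and Def QOS columns; normal_priority_condo is a hypothetical helper name.

```shell
# Print "account partition default_qos" for condo associations whose
# default QoS is not a low-priority one.
# Reads `sacctmgr -p` output on standard input.
normal_priority_condo() {
  awk -F'|' '
    # First line: map each header name to its column index.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Keep condo accounts (co_ prefix) whose Def QOS lacks "lowprio".
    $(col["Account"]) ~ /^co_/ && $(col["Def QOS"]) !~ /lowprio/ {
      print $(col["Account"]), $(col["Partition"]), $(col["Def QOS"])
    }
  '
}

# Usage on the cluster:
#   sacctmgr -p show associations user=$USER | normal_priority_condo
```

For the example user lee, the only matching line would be the co_physics association on the savio2 partition.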

More details on the hardware specifications for the machines in the various partitions are available in the documentation for the Savio and CGRL (Vector/Rosalind) clusters.

More details on each partition, and on the QoS options available in it, are likewise available in the documentation for the Savio and CGRL (Vector/Rosalind) clusters.