Savio scheduler configuration

Savio partitions

Partition Nodes Node Features Nodes shared? SU/core hour ratio
savio2 163 savio2 or savio2_c24 or savio2_c28 exclusive 0.75
savio2_bigmem 36 savio2_bigmem or savio2_m128 exclusive 1.20
savio2_htc 20 savio2_htc shared 1.20
savio2_1080ti 8 savio2_1080ti shared 1.67 (3.34 / GPU)
savio2_knl 28 savio2_knl exclusive 0.40
savio3 192 savio3 or savio3_c40 exclusive 1.00
savio3_bigmem 20 savio3_bigmem or savio3_m384; (savio3_c40 for 40 cores) exclusive 2.67
savio3_htc 24 savio3_htc or savio3_c40 shared 2.67
savio3_xlmem 4 savio3_xlmem or savio3_c52 exclusive 4.67
savio3_gpu 2 savio3_gpu (2x V100) shared 3.67
savio3_gpu 9 4rtx (4x GTX2080TI) shared 3.67
savio3_gpu 6 8rtx (8x TITAN) shared 3.67
savio3_gpu 16 a40 (2x A40) shared 3.67
savio3_gpu 6 a40 (4x A40) shared 3.67
savio4_htc 156 savio4_m256 or savio4_m512 shared 3.67
savio4_gpu 26 a5000 (8x A5000) shared TBD
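
The SU/core hour ratio column can be read as a multiplier on core hours. The sketch below assumes the charge is simply cores x wallclock hours x ratio. The su_cost helper is hypothetical, not a Savio-provided command, and the 24-core node size is inferred from the savio2_c24 feature name. Per-GPU charging on the GPU partitions is not modeled.

```shell
#!/usr/bin/env bash
# Estimate service units (SUs) as: cores x wallclock hours x SU/core-hour ratio.
# su_cost is a hypothetical helper for illustration only.
su_cost() {
  local cores="$1" hours="$2" ratio="$3"
  awk -v c="$cores" -v h="$hours" -v r="$ratio" 'BEGIN { printf "%.2f\n", c * h * r }'
}

# One 24-core savio2 node (ratio 0.75) for 2 hours:
su_cost 24 2 0.75    # prints 36.00
# 8 cores on the shared savio3_htc partition (ratio 2.67) for 10 hours:
su_cost 8 10 2.67    # prints 213.60
```

On exclusive partitions a job occupies whole nodes, so the core count passed in would presumably be the full core count of every node requested.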

Overview of QoS Configurations for Savio

For details on specific Condo QoS configurations, see below.

QoS Accounts allowed QoS Limits Partitions
savio_normal FCA*, ICA 24 nodes max per job, 72 hour (72:00:00) wallclock limit all**
savio_debug FCA*, ICA 4 nodes max per job, 4 nodes in total, 3 hour (03:00:00) wallclock limit all**
savio_long FCA*, ICA 4 cores max per job, 24 cores in total, 10 day (10-00:00:00) wallclock limit savio2_htc
Condo QoS condos specific to each condo, see next section as purchased
savio_lowprio condos 24 nodes max per job, 72 hour (72:00:00) wallclock limit all

(*) Including purchases of additional SUs for an FCA.

(**) Note that savio3 nodes (including the various bigmem, GPU, etc. nodes) are not yet available for use by FCAs or ICAs.
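
The wallclock limits above are written in Slurm's [D-]HH:MM:SS notation. As a rough sketch, a requested time can be compared with a QoS limit before submitting. The to_seconds and fits_limit helpers are hypothetical, and they handle only the [D-]HH:MM:SS form used in the table, not the shorter time formats Slurm also accepts.

```shell
#!/usr/bin/env bash
# Convert a Slurm time string of the form [D-]HH:MM:SS to seconds.
# to_seconds and fits_limit are hypothetical helpers for illustration.
to_seconds() {
  local t="$1" days=0 h m s
  if [[ "$t" == *-* ]]; then
    days="${t%%-*}"   # part before the dash (days)
    t="${t#*-}"       # part after the dash (HH:MM:SS)
  fi
  IFS=: read -r h m s <<< "$t"
  # 10# forces base 10 so that fields such as "08" are not parsed as octal
  echo $(( (10#$days * 24 + 10#$h) * 3600 + 10#$m * 60 + 10#$s ))
}

# Succeeds (exit status 0) when the requested time does not exceed the limit.
fits_limit() {
  (( $(to_seconds "$1") <= $(to_seconds "$2") ))
}

fits_limit 48:00:00 72:00:00 && echo "48:00:00 fits within savio_normal"
fits_limit 5-00:00:00 72:00:00 || echo "5-00:00:00 exceeds savio_normal"
```

A five-day job exceeds the 72-hour savio_normal limit but would fit within the 10-day savio_long limit, subject to that QoS's core limits and partition restriction.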

QoS Configurations for Savio Condos

One can determine the resources available to a Condo by examining the Savio QoS configurations. The following command prints the relevant information:

sacctmgr show qos format=Name%24,Priority%8,GrpTRES%22,MinTRES%26

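
The full QoS listing is long, so it can help to narrow it to a single condo. The sketch below assumes the QoS names follow the <condo>_<partition>_normal pattern seen in the tables on this page. It uses sacctmgr's -P (parsable, pipe-delimited) and -n (no header) options; qos_for_condo is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Keep only the QoS rows whose name starts with a given condo prefix.
# Reads pipe-delimited rows on stdin (the layout produced by sacctmgr -P).
# qos_for_condo is a hypothetical helper for illustration.
qos_for_condo() {
  local prefix="$1"
  awk -F'|' -v p="${prefix}_" 'index($1, p) == 1'
}

# Intended use on a login node, here for the co_rosalind condo:
#   sacctmgr -P -n show qos format=Name,Priority,GrpTRES,MinTRES | qos_for_condo rosalind
```
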
(Retired) Savio Condo QoS Configurations

Account QoS QoS Limit
co_acrb acrb_savio_normal 8 nodes max per group
co_aiolos aiolos_savio_normal 12 nodes max per group; 24:00:00 wallclock limit
co_astro astro_savio_debug 4 nodes max per group; 4 nodes max per job; 00:30:00 wallclock limit
co_astro astro_savio_normal 32 nodes max per group; 16 nodes max per job
co_dlab dlab_savio_normal 4 nodes max per group
co_nuclear nuclear_savio_normal 24 nodes max per group
co_praxis praxis_savio_normal 4 nodes max per group
co_rosalind rosalind_savio_normal 8 nodes max per group; 4 nodes max per job per user

Savio2 Condo QoS Configurations

Account QoS QoS Limit
co_biostat biostat_savio2_normal 20 nodes max per group
co_chemqmc chemqmc_savio2_normal 16 nodes max per group
co_dweisz dweisz_savio2_normal 8 nodes max per group
co_econ econ_savio2_normal 2 nodes max per group
co_hiawatha hiawatha_savio2_normal 40 nodes max per group
co_lihep lihep_savio2_normal 4 nodes max per group
co_mrirlab mrirlab_savio2_normal 4 nodes max per group
co_planets planets_savio2_normal 4 nodes max per group
co_stat stat_savio2_normal 2 nodes max per group
co_bachtrog bachtrog_savio2_normal 4 nodes max per group
co_noneq noneq_savio2_normal 8 nodes max per group
co_kranthi kranthi_savio2_normal 4 nodes max per group

Savio2 Bigmem Condo QoS Configurations

Account QoS QoS Limit
co_laika laika_bigmem2_normal 4 nodes max per group
co_dweisz dweisz_bigmem2_normal 4 nodes max per group
co_aiolos aiolos_bigmem2_normal 4 nodes max per group; 24:00:00 wallclock limit
co_bachtrog bachtrog_bigmem2_normal 4 nodes max per group
co_msedcc msedcc_bigmem2_normal 8 nodes max per group

Savio2 HTC Condo QoS Configurations

Account QoS QoS Limit
co_rosalind rosalind_htc2_normal 8 nodes max per group

(Retired; GPUs no longer accessible) Savio2 GPU Condo QoS Configurations

Account QoS QoS Limit
co_acrb acrb_gpu2_normal 44 GPUs max per group
co_stat stat_gpu2_normal 8 GPUs max per group

Savio2 1080Ti Condo QoS Configurations

Account QoS QoS Limit
co_acrb acrb_1080ti2_normal 12 GPUs max per group
co_mlab mlab_1080ti2_normal 16 GPUs max per group

Savio2 KNL Condo QoS Configurations

Account QoS QoS Limit
co_lsdi lsdi_knl2_normal 28 nodes max per group; 5 running jobs max per user; 20 total jobs max per user

Savio3 Condo QoS Configurations

Account QoS QoS Limit
co_chemqmc chemqmc_savio3_normal 4 nodes max per group
co_laika laika_savio3_normal 4 nodes max per group
co_noneq noneq_savio3_normal 8 nodes max per group
co_aiolos aiolos_savio3_normal 36 nodes max per group; 24:00:00 wallclock limit
co_jupiter jupiter_savio3_normal 12 nodes max per group
co_aqmodel aqmodel_savio3_normal 4 nodes max per group
co_esmath esmath_savio3_normal 4 nodes max per group
co_biostat biostat_savio3_normal 8 nodes max per group
co_fishes fishes_savio3_normal 4 nodes max per group
co_geomaterials geomaterials_savio3_normal 16 nodes max per group
co_kpmol kpmol_savio3_normal 40 nodes max per group
co_eisenlab eisenlab_savio3_normal 4 nodes max per group

Savio3 Bigmem Condo QoS Configurations

Account QoS QoS Limit
co_genomicdata genomicdata_bigmem3_normal 1 node max per group
co_kslab kslab_bigmem3_normal 4 nodes max per group
co_moorjani moorjani_bigmem3_normal 1 node max per group
co_armada2 armada2_bigmem3_normal 14 nodes max per group

Savio3 HTC Condo QoS Configurations

Account QoS QoS Limit
co_genomicdata genomicdata_htc3_normal 120 cores max per group
co_moorjani moorjani_htc3_normal 120 cores max per group
co_armada2 armada2_htc3_normal 80 cores max per group
co_songlab songlab_htc3_normal 160 cores max per group

Savio3 Extra Large Memory Condo QoS Configurations

Account QoS QoS Limit
co_genomicdata genomicdata_xlmem3_normal 1 node max per group
co_rosalind rosalind_xlmem3_normal 2 nodes max per group

Savio3 GPU Condo QoS Configurations

Account QoS QoS Limit Required gres Type (*)
co_nilah nilah_gpu3_normal 2 GPUs max per group V100
co_esmath esmath_gpu3_normal 16 GPUs max per group GTX2080TI
co_rail rail_gpu3_normal 48 GPUs max per group TITAN
co_jksim jksim_gpu3_normal 12 GPUs max per group N/A
co_memprotmd memprotmd_gpu3_normal 2 GPUs max per group A40
co_dweisz dweisz_gpu3_normal 2 GPUs max per group A40
co_condoceder condoceder_gpu3_normal 4 GPUs max per group A40
co_noneq noneq_gpu3_normal 8 GPUs max per group A40
co_biohub biohub_gpu3_normal 4 GPUs max per group A40
co_armada2 armada2_gpu3_normal 20 GPUs max per group A40
co_cph20bnodes cph200bnodes_gpu3_normal 8 GPUs max per group A40

(*) Type required in the Slurm gres specification (of the form gres=gpu:<type>:<number of GPUs>, e.g., --gres=gpu:A40:1) for regular savio3_gpu condo jobs (i.e., those not submitted under low priority).
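
To show how the gres type fits together with the account and QoS columns, here is a minimal sketch that assembles the sbatch options for a regular savio3_gpu condo job. The gpu_job_opts helper and the job.sh script name are hypothetical. The account, QoS, and GPU type values come from the table above, and any other options a real job needs (time, CPUs, and so on) are left out.

```shell
#!/usr/bin/env bash
# Print the sbatch options for a regular (non-low-priority) savio3_gpu condo job.
# Arguments: account, QoS, GPU type, number of GPUs.
# gpu_job_opts is a hypothetical helper for illustration.
gpu_job_opts() {
  local account="$1" qos="$2" gpu_type="$3" ngpus="$4"
  printf '%s\n' \
    "--partition=savio3_gpu" \
    "--account=${account}" \
    "--qos=${qos}" \
    "--gres=gpu:${gpu_type}:${ngpus}"
}

# Example: one A40 GPU under the co_noneq condo.
gpu_job_opts co_noneq noneq_gpu3_normal A40 1

# To submit with these options (job.sh is a hypothetical script name):
#   sbatch $(gpu_job_opts co_noneq noneq_gpu3_normal A40 1) job.sh
```
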

Savio4 HTC Condo QoS Configurations

Account QoS QoS Limit
co_minium minium_htc4_normal 224 cores max per group
co_chrzangroup chrzangroup_htc4_normal 448 cores max per group
co_moorjani moorjani_htc4_normal 616 cores max per group
co_haas haas_htc4_normal 224 cores max per group
co_dweisz dweisz_htc4_normal 672 cores max per group
co_aqmel2 aqmel2_htc4_normal 224 cores max per group
co_condoceder condoceder_htc4_normal 1120 cores max per group
co_genomicdata genomicdata_htc4_normal 224 cores max per group
co_stratflows stratflows_htc4_normal 224 cores max per group
co_kslab kslab_htc4_normal 224 cores max per group
co_chemqmc chemqmc_savio4_normal 448 cores max per group
co_rosalind rosalind_htc4_normal 448 cores max per group
co_armada2 armada2_htc4_normal 672 cores max per group

Savio4 GPU Condo QoS Configurations

Account QoS QoS Limit
co_rail rail_gpu4_normal 208 GPUs max per group

CGRL Scheduler Configuration

The clusters use the SLURM scheduler to manage jobs. When submitting jobs with the sbatch or srun commands, use the following SLURM options:

  • The settings for a job in Vector (Note: you don't need to set the "account"): --partition=vector --qos=vector_batch
  • The settings for a job in Rosalind (Savio2 HTC): --partition=savio2_htc --account=co_rosalind --qos=rosalind_htc2_normal
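
As an illustration, the Rosalind settings above could be placed in a batch script as #SBATCH directives. This is a sketch only: the job name, task count, and time request are assumed values rather than site requirements, and describe_job is a hypothetical helper that prints two environment variables Slurm sets inside a running job.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=cgrl_example
# Partition, account, and QoS for Rosalind (Savio2 HTC), as listed above.
#SBATCH --partition=savio2_htc
#SBATCH --account=co_rosalind
#SBATCH --qos=rosalind_htc2_normal
# Assumed values: a single task and a 30-minute wallclock request.
#SBATCH --ntasks=1
#SBATCH --time=00:30:00

# Report where the job landed. Outside Slurm these variables are unset,
# so the fallbacks after ":-" are printed instead.
describe_job() {
  echo "job ${SLURM_JOB_ID:-local} on partition ${SLURM_JOB_PARTITION:-none}"
}

describe_job
```

For a Vector job, the partition and QoS lines would become --partition=vector and --qos=vector_batch, and the account line would be dropped.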

Alert

To check which QoS you are allowed to use, run "sacctmgr -p show associations user=$USER".

Here are the details for each CGRL partition and associated QoS.

Partition Account Nodes Node List Node Feature QoS QoS Limit
vector N/A 11 n00[00-03].vector0; n0004.vector0; n00[05-08].vector0; n00[09]-n00[10].vector0 vector,vector_c12,vector_m96; vector,vector_c48,vector_m256; vector,vector_c16,vector_m128; vector,vector_c12,vector_m48 vector_batch 48 cores max per job; 96 cores max per user
savio2_htc co_rosalind 8 n0[000-011].savio2, n0[215-222].savio2 savio2_htc rosalind_htc2_normal 8 nodes max per group
savio3_xlmem co_rosalind 2 n0[000-003].savio3 savio3_xlmem rosalind_xlmem3_normal 2 nodes max per group
savio4_htc co_rosalind 8 n0[170-177].savio4 savio4_htc rosalind_htc4_normal 448 cores max per group