Scheduler Examples

Here we show some example job scripts that allow for various kinds of parallelization, jobs that use fewer cores than available on a node, GPU jobs, low-priority condo jobs, and long-running FCA jobs.

1. Threaded/OpenMP job script

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Request one node:
#SBATCH --nodes=1
#
# Specify one task:
#SBATCH --ntasks-per-node=1
#
# Number of processors for single task needed for use case (example):
#SBATCH --cpus-per-task=4
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./a.out

Here --cpus-per-task should be no more than the number of cores on a Savio node in the partition you request. You may want to experiment with the number of threads for your job to determine the optimal number, as computational speed does not always increase with more threads. Note that if --cpus-per-task is fewer than the number of cores on a node, your job will not make full use of the node. Strictly speaking the --nodes and --ntasks-per-node arguments are optional here because they default to 1.
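If the same script is sometimes run outside of Slurm (e.g., for testing on a login or local machine), SLURM_CPUS_PER_TASK will be unset. A minimal sketch of a safe fallback (the default of 1 thread is an assumption for illustration, not a site requirement):

```shell
#!/bin/bash
# Use the Slurm-provided CPU count when present; otherwise default to 1 thread.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Running with $OMP_NUM_THREADS OpenMP thread(s)"
```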

2. Simple multi-core job script (multiple processes on one node)

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Request one node:
#SBATCH --nodes=1
#
# Specify number of tasks for use case (example):
#SBATCH --ntasks-per-node=20
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
./a.out

This job script would be appropriate for multi-core R, Python, or MATLAB jobs. In the commands that launch your code and/or within your code itself, you can reference the SLURM_NTASKS environment variable to dynamically identify how many tasks (i.e., processing units) are available to you.
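As one shell-level illustration of using SLURM_NTASKS, a launcher can cap its concurrency at the number of tasks granted. This sketch uses xargs -P; the input file names are placeholders, and the fallback to 1 task (for runs outside Slurm) is an assumption:

```shell
#!/bin/bash
# Fall back to 1 concurrent process when run outside Slurm.
NTASKS=${SLURM_NTASKS:-1}
# Process the inputs with at most $NTASKS concurrent processes.
printf '%s\n' input1.dat input2.dat input3.dat | \
    xargs -n 1 -P "$NTASKS" echo "processing"
```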

Here the number of CPUs used by your code at any given time should be no more than the number of cores on a Savio node.

For a way to run many individual jobs on one or more nodes (more jobs than cores), see this information on our ht_helper tool.

3. MPI job script

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of MPI tasks needed for use case (example):
#SBATCH --ntasks=40
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
mpirun ./a.out

As noted in the introduction, for all partitions except for savio2_htc and savio2_gpu, you probably want to set the number of tasks to be a multiple of the number of cores per node in that partition, thereby making use of all the cores on the node(s) to which your job is assigned.

This example assumes that each task will use a single core; otherwise there could be resource contention amongst the tasks assigned to a node.
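The arithmetic for choosing --ntasks is straightforward. A sketch, assuming the 20-core nodes of the "savio" partition and a two-node job (both values are hypothetical inputs, not defaults):

```shell
#!/bin/bash
# Hypothetical values: 20 cores per node (the "savio" partition), 2 nodes.
CORES_PER_NODE=20
NODES=2
# One single-core MPI task per core saturates the assigned nodes.
NTASKS=$(( NODES * CORES_PER_NODE ))
echo "#SBATCH --ntasks=$NTASKS"   # prints: #SBATCH --ntasks=40
```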

4. Alternative MPI job script

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of nodes needed for use case:
#SBATCH --nodes=2
#
# Tasks per node based on number of cores per node (example):
#SBATCH --ntasks-per-node=20
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
mpirun ./a.out

This alternative explicitly specifies the number of nodes, tasks per node, and CPUs per task rather than simply specifying the number of tasks and having SLURM determine the resources needed. As before, one would generally want the number of tasks per node to equal a multiple of the number of cores on a node, assuming only one CPU per task.

5. Hybrid OpenMP+MPI job script

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of nodes needed for use case (example):
#SBATCH --nodes=2
#
# Tasks per node based on --cpus-per-task below and number of cores
# per node (example):
#SBATCH --ntasks-per-node=4
#
# Processors per task needed for use case (example):
#SBATCH --cpus-per-task=5
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./a.out

Here we request a total of 8 (=2x4) MPI tasks, with 5 cores per task. This would make use of all the cores on two 20-core nodes in the "savio" partition.
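A quick sanity check for hybrid requests is that nodes x tasks-per-node x cpus-per-task should equal the total cores available on the requested nodes. A sketch using the values from the example above (the 20-core node size is that of the "savio" partition):

```shell
#!/bin/bash
# Values from the hybrid example above.
NODES=2
TASKS_PER_NODE=4
CPUS_PER_TASK=5
CORES_PER_NODE=20   # "savio" partition node size

TOTAL_REQUESTED=$(( NODES * TASKS_PER_NODE * CPUS_PER_TASK ))
TOTAL_AVAILABLE=$(( NODES * CORES_PER_NODE ))
if [ "$TOTAL_REQUESTED" -ne "$TOTAL_AVAILABLE" ]; then
    echo "warning: requesting $TOTAL_REQUESTED cores; nodes provide $TOTAL_AVAILABLE" >&2
fi
```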

6. Jobs scheduled on a per-core basis (jobs that use fewer cores than available on a node)

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=savio2_htc
#
# Number of tasks needed for use case (example):
#SBATCH --ntasks=4
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
./a.out

In the savio2_htc pool you are only charged for the actual number of cores used, so the notion of making best use of resources by saturating a node is not relevant.

7. GPU job script

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=savio2_gpu
#
# Number of nodes:
#SBATCH --nodes=1
#
# Number of tasks (one for each GPU desired for use case) (example):
#SBATCH --ntasks=1
#
# Processors per task (always specify a total number of processors that is twice the number of GPUs):
#SBATCH --cpus-per-task=2
#
# Number of GPUs; this can be in the format "gpu:[1-4]", or "gpu:K80:[1-4]" with the GPU type included:
#SBATCH --gres=gpu:1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
./a.out

Requesting a GPU type in savio3_gpu

savio3_gpu regular condo jobs (those not using the low priority queue) should request the specific type of GPU bought for the condo as detailed here.

savio3_gpu regular FCA jobs (those not using the low priority queue) should request either the GTX2080TI or V100 GPU type, e.g., --gres=gpu:GTX2080TI:1. If requesting a V100 GPU, note that you also need to specifically specify the QoS via -q v100_gpu3_normal.

To help the job scheduler effectively manage the use of GPUs, your job submission script must request multiple CPUs (usually two) for each GPU you use. Jobs that do not request sufficient CPUs for every GPU will be rejected by the scheduler. Generally this ratio should be two CPUs per GPU; in savio3_gpu, however, TITAN and V100 GPUs require four CPUs per GPU, and A40 GPUs require eight CPUs per GPU.

Here's how to request two CPUs for each GPU: the total number of CPUs requested is the product of two settings, the number of tasks ("--ntasks") and the CPUs per task ("--cpus-per-task").

For instance, in the above example, one GPU was requested via "--gres=gpu:1", and the required total of two CPUs was requested via the combination of "--ntasks=1" and "--cpus-per-task=2". Similarly, if your job script requests four GPUs via "--gres=gpu:4" and uses "--ntasks=8", it should also include "--cpus-per-task=1" in order to request the required total of eight CPUs.

Note that the count in "--gres=gpu:[1-4]" must be between 1 and 4. This is because the resource is associated with a node, and each node has 4 GPUs. (The exception is savio3_gpu, where the TITAN RTX nodes have 8 GPUs per node and the V100 and A40 nodes have 2 GPUs per node.) If you wish to use more than 4 GPUs, your "--gres=gpu:[1-4]" specification should give the number of GPUs to use per node requested. For example, if you wish to use eight GPUs, your job script should include options to the effect of "--gres=gpu:4", "--nodes=2", "--ntasks=8", and "--cpus-per-task=2".
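The CPU:GPU bookkeeping above can be sketched as a small shell calculation. The values here (4 GPUs, a ratio of two CPUs per GPU, 8 tasks) are assumptions matching the multi-node example, not defaults:

```shell
#!/bin/bash
# Hypothetical request: 4 GPUs at the usual ratio of 2 CPUs per GPU.
GPUS=4
CPUS_PER_GPU=2      # 4 for TITAN/V100, 8 for A40 in savio3_gpu
NTASKS=8

# Total CPUs required, then split across the chosen number of tasks.
TOTAL_CPUS=$(( GPUS * CPUS_PER_GPU ))
CPUS_PER_TASK=$(( TOTAL_CPUS / NTASKS ))
echo "#SBATCH --ntasks=$NTASKS"
echo "#SBATCH --cpus-per-task=$CPUS_PER_TASK"   # prints: #SBATCH --cpus-per-task=1
```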

8. Long-running jobs (up to 10 days and 4 cores per job)

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# QoS: must be savio_long for jobs > 3 days
#SBATCH --qos=savio_long
#
# Partition:
#SBATCH --partition=savio2_htc
#
# Number of tasks needed for use case (example):
#SBATCH --ntasks=2
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit (7 days in this case):
#SBATCH --time=7-00:00:00
#
## Command(s) to run (example):
./a.out

A given job in the long queue can use no more than 4 cores and can run for a maximum of 10 days. Collectively, across the entire Savio cluster, at most 24 cores are available for long-running jobs, so your job may sit in the queue for a while before it starts.


9. Low-priority jobs

Low-priority jobs can only be run using condo accounts. By default, jobs run under a condo account use that account's default QoS (generally savio_normal). To run in the low-priority queue instead, you need to explicitly specify the low-priority QoS, as follows.

#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Quality of Service:
#SBATCH --qos=savio_lowprio
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run:
echo "hello world"