Run and Review Your Jobs
Overview
To submit, run, and cancel jobs, and to check their status on the Savio cluster, you'll use the Simple Linux Utility for Resource Management (SLURM), an open-source resource manager and job scheduling system. SLURM manages jobs, job steps, nodes, partitions (groups of nodes), and other entities on the cluster.
There are several basic SLURM commands you'll likely use often:
sbatch - Submit a job to the batch queue system, e.g., sbatch myjob.sh, where myjob.sh is a SLURM job script
srun - Submit an interactive job to the batch queue system
scancel - Cancel a job, e.g., scancel 123, where 123 is a job ID
squeue - Check the current jobs in the batch queue system, e.g., squeue -u $USER to view your own jobs
sq - Check why your job is not running, e.g., module load sq; sq
sacctmgr - Check what resources (accounts, partitions, and QoS) you have access to, e.g., sacctmgr -p show associations user=$USER
sinfo - View the status of the cluster's compute nodes, including how many nodes, and of what types, are currently available for running jobs
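The submit, check, and cancel commands listed above are often used together in one session. The following is a minimal sketch of that workflow, not an official recipe. It assumes sbatch prints its usual confirmation line of the form "Submitted batch job <ID>". The helper name job_id_from_sbatch is hypothetical and is not part of SLURM, and myjob.sh stands in for your own job script.

```shell
#!/bin/bash
# Hypothetical helper (not part of SLURM): read sbatch's confirmation line,
# assumed to look like "Submitted batch job 123", and print only the job ID.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Run the workflow only where SLURM is installed and the job script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f myjob.sh ]; then
    jobid=$(sbatch myjob.sh | job_id_from_sbatch)   # submit the job script
    squeue -j "$jobid"                              # check that job's status
    echo "To cancel this job, run: scancel $jobid"
fi
```

Capturing the job ID at submission time means you do not have to look it up in the squeue listing before checking on or cancelling the job.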
Please see the following for detailed information on: