Run and Review Your Jobs
Overview
To submit and run jobs, cancel jobs, and check the status of jobs on the Savio cluster, you'll use the Simple Linux Utility for Resource Management (SLURM), an open-source resource manager and job scheduling system. SLURM manages jobs, job steps, nodes, partitions (groups of nodes), and other entities on the cluster.
There are several basic SLURM commands you'll likely use often:
sbatch - Submit a job to the batch queue system, e.g., sbatch myjob.sh, where myjob.sh is a SLURM job script
srun - Submit an interactive job to the batch queue system
scancel - Cancel a job, e.g., scancel 123, where 123 is a job ID
squeue - Check the current jobs in the batch queue system, e.g., squeue -u $USER to view your own jobs
sq - Check why your job is not running, e.g., module load sq; sq
sacctmgr - Check what resources (accounts, partitions, and QoS) you or your FCA or Condo project have access to, e.g., sacctmgr -p show associations user=$USER or sacctmgr -p show associations account=project_name
sinfo - View the status of the cluster's compute nodes, including how many nodes, and of what types, are currently available for running jobs
sacct - Display accounting data for your submitted jobs and job steps from the SLURM job accounting log or SLURM database. This command lets you inspect jobs that have already completed, although you can also use it to look at queued and running jobs.
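As a rough illustration of how these commands fit together, the sketch below writes a minimal job script named myjob.sh and then lists, as comments, a typical submit, check, and cancel sequence. The job name example_job, the placeholders project_name and partition_name, and the five-minute time limit are assumptions for illustration only. Substitute the account and partition values that sacctmgr reports for you.

```shell
# Write a minimal SLURM job script to myjob.sh.
# Lines beginning with "#SBATCH" are read by sbatch as job options;
# bash itself treats them as ordinary comments.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --account=project_name
#SBATCH --partition=partition_name
#SBATCH --time=00:05:00

# The commands below run on the allocated compute node.
echo "Job running on host: $(hostname)"
EOF

# Typical workflow on the cluster (commented out so this sketch
# can be run safely on a machine without SLURM installed):
#
#   sbatch myjob.sh      # submit; prints "Submitted batch job <jobid>"
#   squeue -u $USER      # check whether the job is pending or running
#   scancel <jobid>      # cancel the job if needed
#   sacct -j <jobid>     # review accounting data for the job
```

Here <jobid> stands for the job ID that sbatch prints at submission. The script body is ordinary bash, so you can also run it directly with bash myjob.sh to check for syntax errors before submitting it.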
Please see the following for detailed information on: