
GNU parallel

GNU Parallel is a shell tool for executing jobs in parallel on one or multiple computers. It's a helpful tool for automating the parallelization of multiple (often serial) jobs, in particular allowing one to group jobs into a single SLURM submission to take advantage of the multiple cores on a given Savio node.

A job can be a single-core serial task, a multi-core job, or an MPI application. A job can also be a command that reads from a pipe. The typical input is a list of parameters, one set per job; GNU parallel splits that list and feeds it to the commands it runs in parallel. GNU parallel ensures that the output from the commands is the same as you would get by running the commands sequentially, and output names can easily be tied to input file names for simple post-processing. This makes it possible to use the output from GNU parallel as input for other programs.

Below we'll show basic usage of GNU parallel and then provide an extended example illustrating submission of a Savio job that uses GNU parallel.

For full documentation see the GNU parallel man page and GNU parallel tutorial.

Basic usage

To motivate usage of GNU parallel, consider how you might automate running multiple individual tasks using a simple bash for loop. In this case, our example command involves copying a file. We will copy file1.in to file1.out, file2.in to file2.out, etc.

for (( i=1; i <= 3; i++ )); do
    cp file${i}.in file${i}.out
done

That's fine, but it won't run the tasks in parallel. Let's use GNU parallel to do it in parallel:

module load parallel
parallel -j 2 cp file{}.in file{}.out ::: 1 2 3
ls file*out
# file1.out  file2.out  file3.out

Based on -j 2, parallel will use two cores to process the three tasks, starting the third task as soon as a core is freed by the completion of the first or second task. The ::: syntax separates the input values (1 2 3) from the command being run. Each input value is substituted for {} and the cp command is run.
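
If you're unsure what commands GNU parallel will generate, the --dry-run flag prints each expanded command without running it:

parallel --dry-run cp file{}.in file{}.out ::: 1 2 3
# cp file1.in file1.out
# cp file2.in file2.out
# cp file3.in file3.out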

Some bells and whistles

We can use multiple inputs per task, distinguishing the inputs by {1}, {2}, etc.:

parallel --link -j 2 cp file{1}.in file{2}.out ::: 1 2 3 ::: 4 5 6
ls file*out
# file4.out  file5.out  file6.out

Note that --link is needed so that 1 is paired with 4, 2 with 5, etc., instead of doing all possible pairs amongst the two sets.
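
For comparison, you can use the --dry-run flag from above to confirm that, without --link, GNU parallel generates every combination of the two input sets (nine tasks rather than three):

parallel --dry-run cp file{1}.in file{2}.out ::: 1 2 3 ::: 4 5 6
# cp file1.in file4.out
# cp file1.in file5.out
# cp file1.in file6.out
# cp file2.in file4.out
# ... (9 commands in total)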

Of course in many contexts we don't want to have to write out all the input numbers. We can instead generate them using seq:

parallel -j 2 cp file{}.in file{}.out ::: `seq 3`

We can use a file containing the list of inputs as a "task list" instead of using the ::: syntax. Here we'll also illustrate the special syntax {.} to remove the filename extension.

parallel -j 2 -a task.lst cp {} {.}.out

task.lst looks like this; it should have the parameter(s) for separate tasks on separate lines:

file1.in
file2.in
file3.in

Next we could use a shell script instead of putting the command inline:

parallel -j 2 -a task.lst bash mycp.sh {} {.}.out
# copying file1.in to file1.out
# copying file2.in to file2.out
# copying file3.in to file3.out

Here's what mycp.sh looks like:

#!/bin/bash 
echo copying ${1} to ${2}
cp ${1} ${2}

We could also parallelize an arbitrary set of commands, rather than running the same command over a set of inputs.

parallel -j 2 < commands.lst
# hello
# willkommen
# hola

Not surprisingly, here's the content of commands.lst:

echo hello
echo willkommen
echo hola

Finally, let's see how we would use GNU parallel within the context of a SLURM batch job.

To parallelize on one node, using all the cores on the node that are available to the SLURM job:

module load parallel
parallel -j $SLURM_CPUS_ON_NODE < commands.lst

In per-node partitions such as savio3, the SLURM_CPUS_ON_NODE variable equals the number of cores on the machine; in per-core partitions such as savio4_htc and the GPU partitions, it equals the product of --cpus-per-task and the number of Slurm tasks on the node.

Note that if the code for an individual task is itself parallelized (e.g., threaded code), you should reduce the value of the -j (--jobs) flag so that GNU parallel runs fewer tasks at once; typically you'd set it to the number of cores divided by the number of threads per task.
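
Putting that together, here's a minimal sketch of a single-node job script for threaded tasks. The account and partition names are placeholders, and we assume (hypothetically) that each command in commands.lst uses 4 threads:

#!/bin/bash
#SBATCH --job-name=job-name
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --nodes=1
#SBATCH --time=1:00:00

module load parallel

# Assumed: each command in commands.lst runs with 4 threads,
# so run (cores available) / 4 tasks at a time.
export THREADS_PER_TASK=4
parallel -j $(( $SLURM_CPUS_ON_NODE / $THREADS_PER_TASK )) < commands.lst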

Multi-node usage

To parallelize across all the cores on multiple nodes, we need to use the --slf flag:

module load parallel/20220522
echo $SLURM_JOB_NODELIST |sed s/\,/\\n/g > hostfile
parallel -j $SLURM_CPUS_ON_NODE --slf hostfile < commands.lst

Explicitly request the same number of cores on each node

When using multiple nodes on partitions with per-core scheduling (e.g., savio4_htc, savio3_htc, savio3_gpu), you should request the same number of cores on each node, because the parallel -j flag specifies a single number of jobs to run on each node. You can do this by setting --ntasks-per-node (and --cpus-per-task if your code is threaded), rather than --ntasks.

When using multiple nodes on partitions with per-node scheduling (e.g., savio3 and savio2), in some cases your job can be given nodes with different numbers of cores. This complicates setting the -j flag, since GNU parallel will run the same number of tasks on every node. Some options are (1) to set --ntasks-per-node (and --cpus-per-task if your code is threaded) as well as --nodes, (2) to hard-code the -j value to the smallest number of cores on machines in the partition (e.g., 24 for savio2 and 32 for savio3), or (3) to use the -C (or --constraint) flag to request a node "feature" that ensures all of your nodes have the same number of cores.
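
For instance, here's a minimal sketch of option (1) on a per-core partition, requesting 8 cores on each of 2 nodes for unthreaded tasks; the account name and the choice of 8 cores per node are placeholders:

#!/bin/bash
#SBATCH --account=account_name
#SBATCH --partition=savio4_htc
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --time=1:00:00

module load parallel
echo $SLURM_JOB_NODELIST |sed s/\,/\\n/g > hostfile
# With 8 single-core tasks per node, SLURM_CPUS_ON_NODE is 8, matching the request
parallel -j $SLURM_CPUS_ON_NODE --slf hostfile < commands.lst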

Working directory when using --slf

When using multiple nodes, the working directory on each node will be your home directory, NOT the directory from which parallel was called, unless you specify otherwise using the --wd flag (see below for example usage).

Warning messages when using --slf and --progress

If you use the --progress flag with the --slf flag, you'll probably see a warning like this:

parallel: Warning: Could not figure out number of cpus on n0021.savio1 (). Using 1.

This occurs because GNU parallel tries to count the cores on each node and this process fails if the parallel module is not loaded on all the nodes available to your job. This should not be a problem for your job, provided you set the -j flag to explicitly tell GNU parallel how many jobs to run in parallel on each node. (Also note that you can silence the warning by adding module load parallel/20220522 to your .bashrc file so that GNU parallel is on your PATH on all the nodes in your Slurm allocation.)
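
For example, adding this single line to your ~/.bashrc (using the module version shown above) is enough:

# in ~/.bashrc: put GNU parallel on the PATH on all nodes of the allocation
module load parallel/20220522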

Extended example

Here we'll put it all together (and include even more useful syntax) to parallelize use of the bioinformatics software BLAST across multiple biological input sequences.

Here's our example task list, task.lst:

../blast/data/protein1.faa
../blast/data/protein2.faa
<snip>

Here's the script we'll use to run BLAST on a single input file, run-blast.sh:

#!/bin/bash
blastp -query $1 -db ../blast/db/img_v400_PROT.00 -out $2  -outfmt 7 -max_target_seqs 10 -num_threads $3

Now let's use GNU parallel in the context of a SLURM job script:

#!/bin/bash
#SBATCH --job-name=job-name
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --nodes=2
#SBATCH --cpus-per-task=2
#SBATCH --time=2:00:00

## Command(s) to run (example):
module load bio/blast-plus/2.14.1-gcc-11.4.0
module load parallel/20220522

export WDIR=/your/desired/path
cd $WDIR

# set number of jobs based on number of cores available and number of threads per job
export JOBS_PER_NODE=$(( $SLURM_CPUS_ON_NODE / $SLURM_CPUS_PER_TASK ))

echo $SLURM_JOB_NODELIST |sed s/\,/\\n/g > hostfile

parallel --jobs $JOBS_PER_NODE --slf hostfile --wd $WDIR --joblog task.log --resume --progress -a task.lst sh run-blast.sh {} output/{/.}.blst $SLURM_CPUS_PER_TASK

Some things to notice:

  • Here BLAST will use multiple threads for each job, via the SLURM_CPUS_PER_TASK variable, which is set by the -c (or --cpus-per-task) SLURM flag.
  • We programmatically determine how many jobs to run on each node, accounting for the threading.
  • Setting the working directory with --wd is optional; without that your home directory will be used (if using multiple nodes via --slf) or the current working directory will be used (if using one node).
  • The --resume and --joblog flags allow you to easily restart interrupted work without redoing already completed tasks.
  • The --progress flag displays information on the progress of the jobs on each node.
  • In this case, only one of the three inputs to run-blast.sh is provided in the task list. The second argument is determined from the first by discarding the path and file extension (see the short illustration below), and the third is constant across tasks.
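
To see concretely what {/.} produces, you can test the substitution with echo; the input below is the first entry in task.lst:

parallel echo {} {/.} ::: ../blast/data/protein1.faa
# ../blast/data/protein1.faa protein1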

Troubleshooting tips

  • We've occasionally seen strange behavior when the --resume flag is included. If you don't need it, you might try omitting it.
  • If you see the warning "sh: /dev/tty: No such device or address", it shouldn't cause problems and your job should run successfully. Omitting the --progress flag should silence the warning, but you won't see updates on progress.
  • If you see a message about "simultaneous logins/connections/MaxStartups", that may occur because the -j (--jobs) flag is being set incorrectly relative to the cores actually available on each node. This can occur with multi-node jobs when one tries to automatically compute the value of the flag as done in the extended example above. Try hard-coding the value and see if the message/error goes away. You can also print the value (e.g., putting echo ${JOBS_PER_NODE} in your job script) to check it.
  • If you need a module loaded for your individual tasks to work, load the module as part of the task (see the sketch after this list), and do not load modules (apart from module load parallel) in your overall Slurm job script.
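
As a sketch of that last tip applied to the extended example, the BLAST module load would move out of the Slurm job script and into run-blast.sh itself:

#!/bin/bash
# Load the module inside the task script so it takes effect on every node
# that runs a task (important when using --slf across multiple nodes).
module load bio/blast-plus/2.14.1-gcc-11.4.0
blastp -query $1 -db ../blast/db/img_v400_PROT.00 -out $2 -outfmt 7 -max_target_seqs 10 -num_threads $3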

If you run into other problems, please let us know, both so that we can try to help and so we can improve this documentation.