Using Singularity Containers

Summary

This document describes how to use Singularity, a software tool that allows you to run Linux containers. Using containers facilitates the movement of software applications and workflows between various computational environments.

Overview

Singularity is a software tool provided on the Savio cluster. It allows you to bring already-built research applications and workflows from other Linux environments onto Savio and run them on the cluster, without any installation or reconfiguration required. Singularity packages those applications and workflows in “containers,” and runs them within the Singularity container's boundaries.

  • Containerization provides "lightweight, standalone, executable packages of software that include everything needed to run an application: code, runtime, system tools, system libraries and settings".
  • A container provides a self-contained (isolated) filesystem.
  • Containers are similar to virtual machines in some ways, but much lighter-weight.
  • Containers are portable, shareable, and reproducible.

Singularity allows you to create containers, or find and obtain containers from others, and then run them on any Linux platform where Singularity is installed. Research software that you or others have packaged up into Singularity containers can be copied to -- and run on -- multiple clusters, cloud environments, workstations, and laptops.

Singularity thus enables “Bring Your Own Environment” computing. It is conceptually similar to Docker, a well-known software containerization platform that isn’t compatible with the security models used on Savio and other traditional High Performance Computing (HPC) environments. Both Singularity and Docker, in turn, have some similarities to virtual machines.

Singularity containers that you use on Savio must be created on a different computer. Root permission is required to create Singularity containers, and users are not allowed to run as root on the cluster. Options for creating image-based Singularity containers, which can then be run on Savio under a user’s normal set of permissions, are described below. One option includes using existing Docker images directly in Singularity on Savio.

In addition to this documentation, more information can be found in our April 2021 training on using Singularity on Savio.

Running Singularity containers on Savio

Trying things out on a login node

Assuming you have a Singularity container image file in a directory on Savio, you can run it as follows.

singularity run mycontainer.sif

We can run a Docker container available from DockerHub (behind the scenes the Docker image will be downloaded and converted to a Singularity image) like this:

singularity run docker://ubuntu:20.04

That will put you into a shell inside a container running Ubuntu Linux 20.04. Note the change in prompt after the container starts. Inside the container you could do things like the following to convince yourself that you are running in the container and not on Savio, although your working directory will generally be a Savio directory.

cat /etc/issue   # not the Savio OS!
which python     # not much here!
pwd

Singularity containers can be used in three ways:

  • shell sub-command: invokes an interactive shell within a container
    singularity shell mycontainer.sif
    
  • run sub-command: executes the container’s runscript (i.e., the primary way the container's builder intends for the container to be used)
    singularity run mycontainer.sif
    
  • exec sub-command: executes an arbitrary command within the container
    singularity exec mycontainer.sif cat /etc/os-release
    

In the example with ubuntu:20.04 above, the container's runscript simply starts a shell inside the container.

Running via Slurm

In most cases you will run containers as part of jobs submitted through Slurm.

To submit a batch job that runs Singularity on Savio, create a Slurm job script file. A simple example follows:

#!/bin/bash
# Job name:
#SBATCH --job-name=test_singularity
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Wall clock limit:
#SBATCH --time=00:30:00
#
## Command(s) to run:
singularity run /path/to/container/mycontainer.sif

Creating Singularity container images

Overview

You have a variety of options for creating Singularity container images that you can run on Savio.

  1. Directly create a Singularity image by importing a Docker or Singularity image from an image registry.
  2. Create a Docker image on your own machine.
    • One option is then to push it to a Docker registry (such as DockerHub), and then import the Docker image as above.
    • A second option is to archive the Docker image, transfer it to Savio, and then convert to a Singularity image.
    • These options rely on installing and running Docker on your own machine.
  3. Create a Singularity image on your own machine and transfer it to Savio.
    • One option is to install Singularity on your own machine.
    • A second option is to install Docker on your own machine and run Singularity within the quay.io Singularity Docker image.
  4. Use a cloud service that allows you to build images, such as Sylabs Remote Builder.

Here we provide more details on some of these options. More details can be found in our training and in various online documentation for Docker and Singularity.

Note that when building an image from either a Dockerfile or Singularity definition file, you generally want to base your image (i.e., bootstrap it) on an existing image that may have key software already installed. Some examples include images with Tensorflow, PyTorch or R/RStudio.

Import an existing Docker image or Singularity image

  • You can simply ask Singularity to run a Docker container as shown above; behind the scenes this will create a Singularity image file.

To create the Singularity image and run in two explicit steps:

singularity pull docker://ubuntu:20.04   # this creates ubuntu_20.04.sif
singularity run ubuntu_20.04.sif

Note that this also creates a copy of the container in your Singularity cache, which can quickly fill up your home directory. Some useful commands for working with the cache are:

singularity cache list
singularity cache clean

You can control where Singularity locates the cache and the temporary directory that Singularity uses by setting SINGULARITY_CACHEDIR and SINGULARITY_TMPDIR, respectively. In particular, you may want to use your scratch directory or /tmp as the location of the cache.
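As a minimal sketch of that setup, the following points both variables at subdirectories of your scratch directory and falls back to /tmp when scratch is not available. The subdirectory names singularity_cache and singularity_tmp are hypothetical choices for illustration, and the scratch path assumes the /global/scratch/users/<username> layout used elsewhere in this document.

```shell
# Assumed scratch layout: /global/scratch/users/<username>
user_name="${USER:-$(id -un)}"
scratch_dir="/global/scratch/users/$user_name"

# Prefer scratch; fall back to /tmp if scratch is missing or not writable
if [ -d "$scratch_dir" ] && [ -w "$scratch_dir" ]; then
    base_dir="$scratch_dir"
else
    base_dir="/tmp/$user_name"
fi

# Hypothetical subdirectory names; any writable location works
export SINGULARITY_CACHEDIR="$base_dir/singularity_cache"
export SINGULARITY_TMPDIR="$base_dir/singularity_tmp"
mkdir -p "$SINGULARITY_CACHEDIR" "$SINGULARITY_TMPDIR"

echo "cache: $SINGULARITY_CACHEDIR"
echo "tmp:   $SINGULARITY_TMPDIR"
```

Placing these lines in a job script or shell startup file makes later singularity pull and singularity run commands in that session use the new locations.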

  • Here's how to run an existing Singularity image from a registry:
singularity pull hello-world.sif shub://singularityhub/hello-world
singularity run hello-world.sif

Create a Docker image on your own machine

For this you will need a Dockerfile that defines what you want in your image. There are lots of examples online.

  • To create an image and push to a Docker image registry (in this case DockerHub), the process looks like this, assuming you have a Dockerfile in your working directory:
docker login --username=paciorek
docker build -t tagname .
docker tag tagname paciorek/name_of_image:0.1  # version number is set to be 0.1
docker push paciorek/name_of_image:0.1  # push the tagged version

This assumes you have created a DockerHub account; here the DockerHub username is paciorek.

Once the image is on the registry you can run it (or pull and then run it) using the commands shown above for pre-existing Docker images.

  • To create an image, archive it locally on the machine you are using, and convert to a Singularity image, the process looks like this:
docker build -t tagname .
docker save tagname > name_of_image.tar

Then transfer the .tar file to Savio and run:

singularity build name_of_image.sif docker-archive://name_of_image.tar

Create a Singularity image on your own machine

For this you will need to create a Singularity definition file. Please see the Singularity documentation for more details about these or our training for an example.

If you are running Singularity directly on a machine where you have root access, you can build from the definition file like this:

singularity build alpine-example.sif alpine-example.def

If you are using the Singularity Docker image, it would look like this:

docker run --privileged -t --rm -v "$PWD":/app quay.io/singularity/singularity:v3.7.1 \
       build /app/alpine-example.sif /app/alpine-example.def

In either case, you then simply transfer the resulting .sif image file to Savio.

Accessing your Savio storage from within Singularity containers

It can be useful for scripts running within Singularity to reference directories outside the Singularity container, i.e., directories on the Savio filesystem. In fact, when using a container you would generally do input and output to/from files on the Savio filesystem rather than files in the container.

  1. A user's home directory and scratch directory (as well as /tmp) are automatically available inside the container via the usual paths, /global/home/users/<username> and /global/scratch/users/<username>.
  2. Your working directory in the container will generally be the working directory on Savio from which you started the container (or in some cases simply your Savio home directory).

To reference other directories from within the container you need to create your own 'bind paths' that indicate which directory on the Savio filesystem to associate with a path in the container file system. The basic syntax is:

-B /path/on/host:/path/on/container

For example here we start a shell inside a container that mounts a subdirectory of a user's Savio scratch directory to /data in the container and creates a new file called erase-me that is accessible at /global/scratch/users/paciorek/some_dir/erase-me outside the container.

singularity shell -B /global/scratch/users/paciorek/some_dir:/data hello-world.sif
ls /data
echo "hello from inside the container" >> /data/erase-me
exit
ls -l /global/scratch/users/paciorek/some_dir/erase-me  
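If you need several bind paths, one approach is to list them as comma-separated host:container pairs. The sketch below sets them through the SINGULARITY_BIND environment variable rather than repeating -B on every command. The project/input and project/output directories and the /input and /output mount points are hypothetical names chosen for illustration.

```shell
# Hypothetical host directories under the user's scratch directory
user_name="${USER:-$(id -un)}"
input_dir="/global/scratch/users/$user_name/project/input"
output_dir="/global/scratch/users/$user_name/project/output"

# Comma-separated host:container pairs, same syntax as the -B flag
export SINGULARITY_BIND="$input_dir:/input,$output_dir:/output"

echo "bind paths: $SINGULARITY_BIND"
```

With the variable exported, a later singularity shell, run, or exec command in the same session should mount both directories, which is equivalent to passing the same string to -B.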

Using MPI with Singularity

You can run Singularity containers via MPI. You'll need to have MPI installed within the container.

  • If you are working on a single node, you can run MPI within a container.
  • However, more commonly you would use the MPI executable on Savio to execute Singularity containers.

To use the system MPI to run Singularity containers, the MPI installed inside the container must be compatible with the MPI installed on Savio. The easiest way to ensure this is to have the version inside the container be the same version as the MPI module you plan to use on Savio. You can see these modules with:

module load gcc # load the gcc version of interest
module avail openmpi  # see the MPI versions available for that gcc

Here is an example of running a Singularity container via MPI:

module load gcc openmpi
mpirun singularity exec my_singularity_container_with_mpi.sif \
       /path/to/my/mpi/executable

That will launch /path/to/my/mpi/executable (which should be on Savio, not in the container) on as many processes as the number of tasks specified in your Slurm job.
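One way to check the version match before launching a job is to compare the output of mpirun --version on Savio and inside the container. This is only a sketch: it assumes Open MPI on both sides (whose version banner begins with "mpirun (Open MPI)"), and openmpi_version is a hypothetical helper name, not part of Singularity or Open MPI.

```shell
# Hypothetical helper: read `mpirun --version` output on stdin and
# print just the Open MPI version number (e.g. 4.0.1)
openmpi_version() {
    sed -n 's/^mpirun (Open MPI) \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only attempt the comparison where both tools are available
if command -v mpirun >/dev/null 2>&1 && command -v singularity >/dev/null 2>&1; then
    host_version=$(mpirun --version | openmpi_version)
    container_version=$(singularity exec my_singularity_container_with_mpi.sif \
        mpirun --version | openmpi_version)
    if [ -z "$host_version" ] || [ -z "$container_version" ]; then
        echo "could not determine one of the Open MPI versions"
    elif [ "$host_version" = "$container_version" ]; then
        echo "versions match: $host_version"
    else
        echo "mismatch: host $host_version, container $container_version"
    fi
else
    echo "mpirun or singularity not found; skipping the comparison"
fi
```

Run this after loading the gcc and openmpi modules you plan to use, so that the host mpirun is the one your job will call.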

Using Singularity with GPUs

You can easily use a Singularity container that does computation on a GPU.

Singularity supports both NVIDIA’s CUDA GPU compute framework and AMD’s ROCm solution.

By using the --nv flag when running Singularity, the NVIDIA drivers on Savio are dynamically mounted into the container at run time. The container should provide the CUDA toolkit, using a version of the toolkit that is compatible with the NVIDIA driver version on Savio.

The minimal driver requirement for a specific version of the CUDA runtime/toolkit can be found in Table 1 here. E.g., CUDA 11.2 requires NVIDIA driver version >= 450.80.02.

Savio's NVIDIA driver version can be found by running nvidia-smi on a GPU node. Currently Savio has version 460.84, which supports at least up through CUDA 11.4. CUDA versions released later may require a newer driver than Savio provides; in that case, avoid using or creating containers built on the newer CUDA version and use a supported one such as CUDA 11.4 instead.
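The comparison between the installed driver and a required minimum can be scripted. The sketch below uses the CUDA 11.2 minimum quoted above (450.80.02) as an example; driver_at_least is a hypothetical helper name, and the version comparison relies on the -V (version sort) option of GNU sort.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2
driver_at_least() {
    current=$1
    minimum=$2
    lowest=$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

minimum_driver="450.80.02"   # minimum for CUDA 11.2, as quoted in the text

if command -v nvidia-smi >/dev/null 2>&1; then
    # Query only the driver version of the first GPU
    current_driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ -z "$current_driver" ]; then
        echo "could not read the driver version"
    elif driver_at_least "$current_driver" "$minimum_driver"; then
        echo "driver $current_driver meets the minimum $minimum_driver"
    else
        echo "driver $current_driver is older than the minimum $minimum_driver"
    fi
else
    echo "nvidia-smi not found; run this on a GPU node"
fi
```

Substitute the minimum driver version for whichever CUDA toolkit version your container provides.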

Here's an example of running a Singularity container based on a Docker container that provides GPU-using software. The example uses an older version of PyTorch because newer versions depend on CUDA versions not supported by Savio's NVIDIA driver version.

singularity run --nv docker://pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime

Of course it only makes sense to do this after using sbatch or srun to get access to a node with a GPU.
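As a sketch of the batch route, the following writes a job script that requests one GPU and checks whether PyTorch can see it from inside the container. The file name gpu_singularity_job.sh is hypothetical, account_name and partition_name are placeholders as in the earlier job script, and --gres=gpu:1 is the generic Slurm way to request a GPU; Savio's GPU partitions may require additional options not shown here.

```shell
# Write a hypothetical GPU job script; the quoted 'EOF' keeps the
# contents literal so nothing is expanded while writing the file
cat > gpu_singularity_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_singularity_gpu
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --gres=gpu:1
#SBATCH --time=00:30:00

# --nv mounts the host NVIDIA driver into the container at run time
singularity exec --nv docker://pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime \
    python -c "import torch; print(torch.cuda.is_available())"
EOF

# Check the script for shell syntax errors without running it
bash -n gpu_singularity_job.sh && echo "syntax ok"
```

After filling in the account and partition, submit the script with sbatch gpu_singularity_job.sh.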

Persistent overlays for improved I/O performance and portability

Important

TL;DR: Use containers + persistent overlays to reduce I/O operations and improve performance when working with many small files (in particular this arises with Python packages) or to make your containerized analysis even more portable.

The ability to read and write large amounts of input and output (I/O) in rapid succession is important for many scientific workflows, such as image processing. Unfortunately, if you are working with many, many files (for example, Python packages tend to contain many small files), this can slow down your workflow and even reduce filesystem performance for other users.

A persistent overlay is an empty writeable filesystem that will be mounted in the Singularity container at runtime and will retain any changes made to its filesystem while running the container. From the container's perspective, the overlay is just another directory, but from the Savio filesystem perspective the overlay is a single file to which all I/O operations are applied. This is the key feature of the overlay: the Savio filesystem only has to manage metadata for the single overlay file rather than for all the files in the overlay.

Another reason to use overlays is to increase portability of your work, on top of what containers alone already provide. Using an overlay, you can package up an analysis and all of its data, including results, into a single file, making it easier to share the full analysis with collaborators or when it is time to publish. Moreover, the overlay can optionally be embedded directly into the container or it can be kept separate and used across different containers.

Creating an overlay image in your scratch directory

Important

You must put the overlay image in your scratch directory or it will not function properly. Please run the commands below on scratch.

To create an overlay image in the current working directory named overlay.img with a storage capacity of 1 GB (the --size value is given in MiB) and a directory /data owned by you, run the following command:

singularity overlay create --size 1024 --create-dir /data overlay.img

The right --size will depend on your application and its I/O requirements.

Important

Don't use a directory with --create-dir that is already mounted on the Singularity container, or one that is a subdirectory of an already-mounted directory. By default, your home directory, scratch directory, and the /tmp directory are mounted in the container. If need be, you can unmount default directories using the --no-mount flag when running the container, though this shouldn't normally be needed.

As an alternative, you can add the overlay directly to your Singularity container with the following command:

singularity overlay create --size 1024 --create-dir /data my_container.sif

For more information on the singularity overlay create command, see the documentation.

If you ran the first command, then at this point, you have a file overlay.img in the current directory which is ready to use with your container. For example, you can start a shell on your Singularity image using the following command:

# omit --overlay if you added it directly to my_container.sif
singularity shell --overlay=overlay.img my_container.sif

Now, any writes to the directory you created via --create-dir will be written to the overlay. For example, if you created the directory /data with the overlay, you can write various intermediate outputs to that directory while running the container:

Singularity> bash ~/my_containerized_compute.sh # writes outputs to /data
Singularity> ls /data
interim1.csv interim2.csv interim3.csv final_result.csv
Singularity> exit
ls /data # /data only exists on the overlay
ls: cannot access /data: No such file or directory

Next time you run the container with the same overlay, you will find that the files you wrote to the directory you created are persistent. You can also move or copy your final results from the overlay to the Savio filesystem while running the container:

singularity shell --overlay=overlay.img my_container.sif
Singularity> ls /data
interim1.csv interim2.csv interim3.csv final_result.csv
Singularity> mv /data/final_result.csv /global/scratch/users/$USER/

Manual overlay image creation for more control (advanced)

Note

Most users should use the singularity overlay create command as described above.

Run the following in your scratch directory, modifying the of, bs, and count flags to suit your workflow:

dd if=/dev/zero of=overlay.img bs=1M count=50

To take a deeper dive into the dd utility, run man dd. Below are the relevant flags for this use case.

dd flag   Purpose
of        The name of the overlay image file to create.
bs        The number of bytes (or other unit, e.g. 1M for megabytes or 1G for gigabytes) to write to each block.
count     The number of blocks to add to the overlay image.

The example above creates a file called overlay.img. The total storage capacity of the overlay image is determined by multiplying bs and count. In this example, we created an overlay image with a capacity of 50 MB.
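The arithmetic can be checked directly. The sketch below multiplies the block size by the block count, creates a throwaway file with the same dd settings, and compares the expected and actual sizes in bytes. It assumes GNU dd, where bs=1M means 1048576 bytes, and it writes to a temporary directory from mktemp only for illustration; on Savio you would create the real file in your scratch directory.

```shell
# With GNU dd, bs=1M means 1 MiB = 1024 * 1024 bytes
block_bytes=$((1024 * 1024))
block_count=50
expected_bytes=$((block_bytes * block_count))

# Throwaway location for this demonstration only
work_dir=$(mktemp -d)
dd if=/dev/zero of="$work_dir/overlay.img" bs=1M count="$block_count" 2>/dev/null

# wc -c reports the file size in bytes
actual_bytes=$(wc -c < "$work_dir/overlay.img" | tr -d ' ')
echo "expected $expected_bytes bytes, got $actual_bytes bytes"

rm -rf "$work_dir"
```

To size a larger overlay, keep bs=1M and set count to the desired capacity in megabytes.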

Note

The dd command will take longer to run with larger filesystem sizes.

Next, you'll create a writeable ext3 filesystem via the following two steps (run on scratch in the same directory where you created overlay.img):

  1. Create a directory called overlay, and inside of it two directories: upper and work. Optionally, change permissions on the upper and work directories as needed for your workflow.

  2. Create an ext3 filesystem in overlay.img, copying upper and work into the root of the filesystem.

Here is one way to carry out those steps:

mkdir -p overlay/{upper,work}
mkfs.ext3 -d overlay overlay.img

The directories upper and work are required by Singularity so that the overlay is writeable according to the permissions set on them. You won't actually need to interact with these directories in any way (on scratch or in the container). The -d flag of mkfs.ext3 simply copies the contents of overlay into the root directory of the new filesystem in the file overlay.img, so you can safely delete the overlay directory from scratch after running the mkfs.ext3 command.

Now you can run your container with the overlay. Note that, unlike with singularity overlay create, any directories you create in unmounted locations will be written to the overlay.

singularity shell --no-mount tmp --overlay=overlay.img ubuntu_20.04.sif
Singularity> mkdir /data # persists