Using Apptainer (Singularity) Containers

Summary

This document describes how to use Apptainer (https://apptainer.org/docs/user/latest/), a software tool that allows you to run Linux containers. Using containers facilitates the movement of software applications and workflows between various computational environments.

Singularity is now Apptainer

Singularity is now known as Apptainer. You can run it by invoking either `apptainer` or `singularity`; there is no difference.

Overview

Apptainer is a software tool provided on the Savio cluster. It allows you to bring already-built research applications and workflows from other Linux environments onto Savio and run them on the cluster, without any installation or reconfiguration required. Apptainer packages those applications and workflows in “containers,” and runs them within the container's boundaries.

Containerization provides "lightweight, standalone, executable packages of software that include everything needed to run an application: code, runtime, system tools, system libraries and settings".
A container provides a self-contained (isolated) filesystem.
Containers are similar to virtual machines in some ways, but much lighter-weight.
Containers are portable, shareable, and reproducible.

Apptainer allows you to create containers, or find and obtain containers from others, and then run them on any Linux platform where Apptainer is installed. Research software that you or others have packaged up into Apptainer containers can be copied to -- and run on -- multiple clusters, cloud environments, workstations, and laptops.

Apptainer thus enables “Bring Your Own Environment” computing. It is conceptually similar to Docker, a well-known software containerization platform that isn’t compatible with the security models used on Savio and other traditional High Performance Computing (HPC) environments. Both Apptainer and Docker, in turn, have some similarities to virtual machines.

Apptainer containers that you use on Savio must be created on a different computer. Root permission is required to create Apptainer containers, and users are not allowed to run as root on the cluster. Options for creating image-based Apptainer containers, which can then be run on Savio under a user’s normal set of permissions, are described below. One option includes using existing Docker images directly in Apptainer on Savio.

In addition to this documentation, more information can be found in our April 2021 training on using Singularity (now Apptainer) on Savio.

Running Apptainer containers on Savio

Assuming you have an Apptainer container in a directory on Savio, you can run it as follows.

apptainer run mycontainer.sif

We can run a Docker container available from DockerHub (behind the scenes the Docker image will be downloaded and converted to an Apptainer image) like this:

apptainer run docker://ubuntu:20.04

That will put you into a shell inside a container running Ubuntu Linux 20.04. Note the change in prompt after the container starts. Inside the container you could do things like the following to convince yourself that you are running in the container and not on Savio, although your working directory will generally be a Savio directory.

cat /etc/issue   # not the Savio OS!
which python     # not much here!
pwd

Apptainer containers can be used in three ways:

shell sub-command: invokes an interactive shell within a container
```
apptainer shell mycontainer.sif
```
run sub-command: executes the container’s runscript (i.e., the primary way the container's builder intends for the container to be used)
```
apptainer run mycontainer.sif
```
exec sub-command: executes an arbitrary command within the container
```
apptainer exec mycontainer.sif cat /etc/os-release
```

In the example with ubuntu:20.04 above, the container's runscript simply starts a shell inside the container.

Running via Slurm

Of course in most cases you will be running jobs under Slurm.

To submit a batch job that runs Apptainer on Savio, create a Slurm job script file. A simple example follows:

#!/bin/bash
# Job name:
#SBATCH --job-name=test_Apptainer
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Wall clock limit:
#SBATCH --time=00:30:00
#
## Command(s) to run:
apptainer run /path/to/container/mycontainer.sif

Creating Apptainer container images

Overview

You have a variety of options for creating Apptainer container images that you can run on Savio.

Directly create an Apptainer image by importing a Docker or Apptainer image from an image registry.
- Docker image registries include DockerHub and the GitHub container registry.
- Singularity image registries include the Sylabs container registry.
- This does not require root access in any form but restricts you to only the images already available.
Create a Docker image on your own machine.
- One option is then to push it to a Docker registry (such as DockerHub), and then import the Docker image as above.
- A second option is to archive the Docker image, transfer it to Savio, and then convert it to an Apptainer image.
- These options rely on installing and running Docker on your own machine.
Create an Apptainer image on your own machine and transfer it to Savio.
- One option is to install Apptainer on your own machine.
- A second option is to install Docker on your own machine and run Singularity within the quay.io Singularity Docker image.
Use a cloud service that allows you to build images, such as Sylabs Remote Builder.

Here we provide more details on some of these options. More details can be found in our training and in various online documentation for Docker and Apptainer/Singularity.

Note that when building an image from either a Dockerfile or Apptainer definition file, you generally want to base your image (i.e., bootstrap it) on an existing image that may have key software already installed. Some examples include images with Tensorflow, PyTorch or R/RStudio.

Import an existing Docker image or Apptainer image

You can simply ask Apptainer to run a Docker container as shown above; behind the scenes this will create an Apptainer image file.

To create the Apptainer image and run in two explicit steps:

apptainer pull docker://ubuntu:20.04   # this creates ubuntu_20.04.sif
apptainer run ubuntu_20.04.sif

Note that this also creates a copy of the container in your Apptainer cache, which can quickly fill up your home directory. Some useful commands for working with the cache are:

apptainer cache list
apptainer cache clean

You can control where Apptainer locates the cache and the temporary directory that Apptainer uses by setting APPTAINER_CACHEDIR and APPTAINER_TMPDIR, respectively. In particular, you may want to use your scratch directory or /tmp as the location of the cache.
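As a minimal sketch of the setting described above, the following points both variables at scratch. The scratch path pattern is the one given elsewhere on this page; the `apptainer/cache` and `apptainer/tmp` subdirectory names are hypothetical choices, not required names.

```shell
# Assumption: your scratch directory follows the path pattern used on this page.
scratch_dir="/global/scratch/users/${USER:-$(id -un)}"

# Hypothetical subdirectory names; any writable location would do.
export APPTAINER_CACHEDIR="$scratch_dir/apptainer/cache"
export APPTAINER_TMPDIR="$scratch_dir/apptainer/tmp"

# Create the directories; report rather than abort if scratch is not
# available (for example, when trying this on a machine other than Savio).
mkdir -p "$APPTAINER_CACHEDIR" "$APPTAINER_TMPDIR" 2>/dev/null \
    || echo "note: could not create directories under $scratch_dir" >&2

echo "cache: $APPTAINER_CACHEDIR"
echo "tmp:   $APPTAINER_TMPDIR"
```

To make the setting persistent you could place the two `export` lines in your shell startup file or at the top of your job scripts.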

Create a Docker image on your own machine

For this you will need a Dockerfile that defines what you want in your image. There are lots of examples online.
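For illustration only, here is one way to write a very small Dockerfile from the shell. The base image and the installed package are arbitrary choices made for this sketch, and the command overwrites any file named Dockerfile in the current directory.

```shell
# Write a minimal Dockerfile into the current directory.
# The base image and package below are illustrative, not requirements.
cat > Dockerfile <<'EOF'
# Start from an existing base image.
FROM ubuntu:20.04

# Install software in a single layer and remove the apt package lists afterwards.
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/*

# Default command when the container is run.
CMD ["python3", "--version"]
EOF

echo "wrote $(pwd)/Dockerfile"
```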

To create an image and push to a Docker image registry (in this case DockerHub), the process looks like this, assuming you have a Dockerfile in your working directory:

docker login --username=paciorek
docker build -t tagname .
docker tag tagname paciorek/name_of_image:0.1  # version number is set to be 0.1
docker push paciorek/name_of_image:0.1         # push the tag created above

This assumes you have created a DockerHub account; here the DockerHub username is paciorek.

Once the image is on the registry you can run it (or pull and then run it) using the commands shown above for pre-existing Docker images.

To create an image, archive it locally on the machine you are using, and convert it to an Apptainer image, the process looks like this:

docker build -t tagname .
docker save tagname > name_of_image.tar

Then transfer the .tar file to Savio and run:

apptainer build name_of_image.sif docker-archive://name_of_image.tar

Create an Apptainer image on your own machine

For this you will need to create an Apptainer definition file. Please see the Apptainer documentation for more details about these or our training for an example.
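As a rough sketch of what a definition file can look like, the following writes a small one named alpine-example.def (the filename used in the build commands in this section). The base image tag, the installed package, and the label value are illustrative assumptions, not taken from our training materials.

```shell
# Write a minimal definition file; contents are illustrative.
cat > alpine-example.def <<'EOF'
Bootstrap: docker
From: alpine:3.18

%post
    # Commands run once, at build time, inside the image.
    apk add --no-cache python3

%runscript
    # What "apptainer run" executes; arguments are passed through.
    exec python3 "$@"

%labels
    Author your_name
EOF

echo "wrote $(pwd)/alpine-example.def"
```

The header bootstraps from an existing Docker image, `%post` installs software at build time, and `%runscript` defines what `apptainer run` does.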

If you are running Apptainer directly on a machine where you have root access, you can build from the definition file like this:

apptainer build alpine-example.sif alpine-example.def

If you are using the Singularity Docker image from quay.io, it would look like this:

docker run --privileged -t --rm -v $PWD:/app quay.io/singularity/singularity:v3.7.1 \
       build /app/alpine-example.sif /app/alpine-example.def

In either case, you then simply transfer the resulting .sif image file to Savio.

Accessing your Savio storage from within Apptainer containers

It can be useful for scripts running within Apptainer to reference directories outside the Apptainer container, i.e., directories on the Savio filesystem. In fact, when using a container you would generally do input and output to/from files on the Savio filesystem rather than files in the container.

A user's home directory and scratch directory (as well as /tmp) are automatically available inside the container via the usual paths, /global/home/users/<username> and /global/scratch/users/<username>.
Your working directory in the container will generally be the working directory on Savio from which you started the container (or in some cases simply your Savio home directory).

To reference other directories from within the container you need to create your own 'bind paths' that indicate which directory on the Savio filesystem to associate with a path in the container file system. The basic syntax is:

-B /path/on/host:/path/on/container

For example, here we start a shell inside a container, mounting a subdirectory of a user's Savio scratch directory at /data in the container, and create a new file called erase-me. Outside the container, that file is accessible at /global/scratch/users/paciorek/some_dir/erase-me.

apptainer shell -B /global/scratch/users/paciorek/some_dir:/data hello-world.sif
ls /data
echo "hello from inside the container" >> /data/erase-me
exit
ls -l /global/scratch/users/paciorek/some_dir/erase-me
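The same idea can be used non-interactively, which is what you would typically want in a job script. The sketch below assumes the image contains `sh`; `some_dir` and `reference_data` are hypothetical directory names. It binds two directories at once (pairs are separated by a comma) and marks the second as read-only with the `:ro` suffix. If Apptainer or the image is not available, it only prints the command it would run.

```shell
# Paths follow the patterns used on this page; adjust them to your own directories.
user_name="${USER:-$(id -un)}"
scratch_dir="/global/scratch/users/$user_name"
image="hello-world.sif"

# Two bind pairs separated by a comma; ":ro" makes the second one read-only.
bind_spec="$scratch_dir/some_dir:/data,$scratch_dir/reference_data:/ref:ro"

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # Non-interactive equivalent of writing to /data from a container shell.
    apptainer exec -B "$bind_spec" "$image" \
        sh -c 'echo "hello from inside the container" >> /data/erase-me'
else
    echo "would run: apptainer exec -B $bind_spec $image sh -c '...'"
fi
```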

Using MPI with Apptainer

You can run Apptainer containers via MPI. You'll need to have MPI installed within the container.

If you are working on a single node, you can run MPI within a container.
However, more commonly you would use the MPI executable on Savio to execute Apptainer containers.

To use the system MPI to run Apptainer containers, the MPI installed inside the container must be compatible with the MPI installed on Savio. The easiest way to ensure this is to have the version inside the container be the same version as the MPI module you plan to use on Savio. You can see these modules with:

module load gcc # load the gcc version of interest
module avail openmpi  # see the MPI versions available for that gcc

Here is an example of running an Apptainer container via MPI:

module load gcc openmpi
mpirun apptainer exec my_apptainer_container_with_mpi.sif \
       /path/to/my/mpi/executable

That will launch /path/to/my/mpi/executable (which should be in the container, not on Savio) on as many processes as the number of tasks specified in your Slurm job.
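Putting this together with Slurm, a job script might look like the sketch below, written out here with a heredoc. The account and partition names are placeholders, and the node and task counts are arbitrary values chosen for illustration; choose values appropriate to the partition you use.

```shell
# Write an example MPI job script; all #SBATCH values are placeholders.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_apptainer_mpi
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:30:00

# Load the host MPI; its version should match the MPI inside the container.
module load gcc openmpi

# mpirun starts one container process per Slurm task (2 x 4 = 8 here).
mpirun apptainer exec my_apptainer_container_with_mpi.sif \
       /path/to/my/mpi/executable
EOF

echo "submit with: sbatch mpi_job.sh"
```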

Using Apptainer with GPUs

You can easily use an Apptainer container that does computation on a GPU.

Apptainer supports both NVIDIA’s CUDA GPU compute framework and AMD’s ROCm solution.

By using the --nv flag when running Apptainer, the NVIDIA drivers on Savio are dynamically mounted into the container at run time. The container should provide the CUDA toolkit, using a version of the toolkit that is compatible with the NVIDIA driver version on Savio.

The minimal driver requirement for a specific version of the CUDA runtime/toolkit can be found in Table 1 here. E.g., CUDA 11.2 requires NVIDIA driver version >= 450.80.02.

Savio's NVIDIA driver version can be found by running nvidia-smi on a GPU node. Currently Savio has version 460.84, which supports at least up through CUDA 11.4. Newer CUDA versions will eventually require a newer driver than this one. When that happens, avoid using or creating containers based on those newer CUDA versions and use a supported version, such as CUDA 11.4, instead.
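The comparison between the installed driver and a required minimum can be scripted. The following is a sketch only: `version_ge` is a helper defined here (not an Apptainer or NVIDIA tool), it relies on GNU `sort -V`, and when `nvidia-smi` is not available it falls back to the driver version quoted above purely for illustration.

```shell
# True when version $1 is greater than or equal to version $2 (GNU sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

min_driver="450.80.02"   # minimum driver for CUDA 11.2, as stated above

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the driver for its version; take the first GPU's answer.
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
else
    driver="460.84"      # fallback for illustration when not on a GPU node
    echo "nvidia-smi not found; using $driver for illustration"
fi

if version_ge "$driver" "$min_driver"; then
    echo "driver $driver meets the minimum $min_driver"
else
    echo "driver $driver is older than the minimum $min_driver"
fi
```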

Here's an example of running an Apptainer container based on a Docker container that provides GPU-using software. I am using an older version of PyTorch because newer versions depend on CUDA versions not supported by Savio's NVIDIA driver version.

apptainer run --nv docker://pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime

Of course it only makes sense to do this after using sbatch or srun to get access to a node with a GPU.
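A batch version might look like the following sketch. The account and partition names are placeholders, and the `--gres` and `--cpus-per-task` values are assumptions made for illustration; check the requirements of the GPU partition you use. The Python one-liner simply reports whether PyTorch can see a GPU.

```shell
# Write an example GPU job script; all #SBATCH values are placeholders.
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_apptainer_gpu
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=2
#SBATCH --time=00:30:00

# --nv makes the host NVIDIA driver available inside the container.
apptainer exec --nv docker://pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime \
    python -c 'import torch; print(torch.cuda.is_available())'
EOF

echo "submit with: sbatch gpu_job.sh"
```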

Persistent overlays for improved I/O performance and portability

Important

TL;DR: Use containers + persistent overlays to reduce I/O operations and improve performance when working with many small files (in particular this arises with Python packages) or to make your containerized analysis even more portable.

The ability to read and write large amounts of input and output (I/O) in rapid succession is important for many scientific workflows, such as image processing. Unfortunately, if you are working with many, many files (for example, Python packages tend to contain many small files), this can slow down your workflow and even reduce filesystem performance for other users.

A persistent overlay is an empty writeable filesystem that will be mounted in the Apptainer container at runtime and will retain any changes made to its filesystem while running the container. From the container's perspective, the overlay is just another directory, but from the Savio filesystem perspective the overlay is a single file to which all I/O operations are applied. This is the key feature of the overlay: the Savio filesystem only has to manage metadata for the single overlay file rather than for all the files in the overlay.

Another reason to use overlays is to increase portability of your work, on top of what containers alone already provide. Using an overlay, you can package up an analysis and all of its data, including results, into a single file, making it easier to share the full analysis with collaborators or when it is time to publish. Moreover, the overlay can optionally be embedded directly into the container or it can be kept separate and used across different containers.

Creating an overlay image in your scratch directory

Important

You must put the overlay image in your scratch directory or it will not function properly. Please run the commands below on scratch.

To create an overlay image in the current working directory named overlay.img with a storage capacity of 1GB and a directory /data owned by you, run the following command:

apptainer overlay create --size 1024 --create-dir /data overlay.img

The right --size will depend on your application and its I/O requirements.

Important

Don't use a directory with --create-dir that is already mounted on the Apptainer container, or one that is a subdirectory of an already-mounted directory. By default, your home directory, scratch directory, and the /tmp directory are mounted in the container. If need be, you can unmount default directories using the --no-mount flag when running the container, though this shouldn't normally be needed.

As an alternative, you can add the overlay directly to your Apptainer container with the following command:

apptainer overlay create --size 1024 --create-dir /data my_container.sif

For more information on the apptainer overlay create command, see the documentation.

If you ran the first command, then at this point, you have a file overlay.img in the current directory which is ready to use with your container. For example, you can start a shell on your Apptainer image using the following command:

# omit --overlay if you added it directly to my_container.sif
apptainer shell --overlay=overlay.img my_container.sif

Now, any writes to the directory you created via --create-dir will be written to the overlay. For example, if you created the directory /data with the overlay, you can write various intermediate outputs to that directory while running the container:

Apptainer> bash ~/my_containerized_compute.sh # writes outputs to /data
Apptainer> ls /data
interim1.csv interim2.csv interim3.csv final_result.csv
Apptainer> exit
ls /data # /data only exists on the overlay
ls: cannot access /data: No such file or directory

Next time you run the container with the same overlay, you will find that the files you wrote to the directory you created are persistent. You can also move or copy your final results from the overlay to the Savio filesystem while running the container:

apptainer shell --overlay=overlay.img my_container.sif
Apptainer> ls /data
interim1.csv interim2.csv interim3.csv final_result.csv
Apptainer> mv /data/final_result.csv /global/scratch/users/$USER/
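In a job script you would usually do this without an interactive shell. The sketch below uses `apptainer exec` with the same overlay to copy a result out to scratch; the file and image names are the ones used in the examples above, and the command is skipped if Apptainer or the files are not present.

```shell
overlay="overlay.img"
image="my_container.sif"
dest="/global/scratch/users/${USER:-$(id -un)}"

if command -v apptainer >/dev/null 2>&1 && [ -f "$overlay" ] && [ -f "$image" ]; then
    # Copy a file from the overlay's /data directory to the Savio filesystem.
    apptainer exec --overlay "$overlay" "$image" \
        cp /data/final_result.csv "$dest/"
else
    echo "skipping: apptainer, $overlay, or $image not available here"
fi
```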

Manual overlay image creation for more control (advanced)

Note

Most users should use the apptainer overlay create command as described above.

Run the following in your scratch directory, modifying the of, bs, and count flags to suit your workflow:

dd if=/dev/zero of=overlay.img bs=1M count=50

To take a deeper dive into the dd utility, run man dd. Below are the relevant flags for this use case.

| `dd` flag | Purpose |
| --- | --- |
| `of` | The name of the overlay image file to create. |
| `bs` | The number of bytes (or other unit, e.g. `1M` for megabytes or `1G` for gigabytes) to write to each block. |
| `count` | The number of blocks to add to the overlay image. |

The example above creates a file called overlay.img. The total storage capacity of the overlay image is determined by multiplying bs and count. In this example, we created an overlay image with a capacity of 50 MB.
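To go the other way, from a desired capacity to the flags, divide the capacity by the block size. The sketch below uses illustrative values (a 2048 MB target and 4M blocks) and only prints the resulting dd command rather than running it.

```shell
target_mib=2048   # desired overlay capacity (illustrative)
bs_mib=4          # block size passed to dd as bs=4M (illustrative)

# capacity = bs x count, so count = capacity / bs
count=$(( target_mib / bs_mib ))

echo "dd if=/dev/zero of=overlay.img bs=${bs_mib}M count=${count}"
# prints: dd if=/dev/zero of=overlay.img bs=4M count=512
```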

Note

The dd command will take longer to run with larger filesystem sizes.

Next, you'll create a writeable ext3 filesystem via the following two steps (run on scratch in the same directory where you created overlay.img):

Create a directory called overlay, and inside of it two directories: upper and work. Optionally, change permissions on the upper and work directories as needed for your workflow.
Create an ext3 filesystem in overlay.img, copying upper and work into the root of the filesystem.

Here is one way to carry out those steps:

mkdir -p overlay/{upper,work}
mkfs.ext3 -d overlay overlay.img

The directories upper and work are required by Apptainer so that the overlay is writeable according to the permissions set on them. You won't actually need to interact with these directories in any way (on scratch or in the container). The -d flag of mkfs.ext3 simply copies the contents of overlay into the root directory of the new filesystem in the file overlay.img, so you can safely delete the overlay directory from scratch after running the mkfs.ext3 command.

Now you can run your container with the overlay. Note that unlike with apptainer overlay create, any directories you create in unmounted locations will be written to the overlay.

apptainer shell --no-mount tmp --overlay=overlay.img ubuntu_20.04.sif
Apptainer> mkdir /data # persists
