Installing Software
This document describes how to install software to Savio for personal or group
use. Before installing software yourself, first check if it is already provided
on the cluster by running module avail
as described here.
Please note that some modules are listed hierarchically and will only appear on
the list after their parent module has been loaded (e.g. C libraries will only
appear after you’ve loaded their parent compiler.)
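The hierarchy described above can be seen by listing modules before and after loading a parent module. The following is a minimal sketch, assuming `gcc` is the parent compiler of interest. The `module_check` variable and the guard are illustrative only; they keep the snippet harmless on machines without the `module` command.

```shell
# Run on a Savio login node; elsewhere the `module` command may not exist.
if command -v module >/dev/null 2>&1; then
    module avail       # modules visible right now
    module load gcc    # load a parent compiler (gcc is only an example)
    module avail       # libraries that depend on that compiler are now listed too
    module_check="ran"
else
    echo "module command not found; try this on the cluster" >&2
    module_check="skipped"
fi
```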
To instead request software to be installed or updated for use by all Savio users see our instructions here.
We also provide for general use certain large public datasets that are needed for widely-used workflows/software.
Requirements for software on Savio¶
Software you install on the cluster will need to:
- Be runnable (executable) on Rocky Linux 8. Choose the x86_64 tarball where available.
- Run in command-line mode. If graphical access is required, you will need to use the OOD Desktop app.
- Be capable of installation without root/sudo privileges. This
may involve adding command line options to installation commands or
changing values in installation-related configuration files; see
your vendor- or community-provided documentation for instructions.
- This means Docker cannot be installed. Instead, use Apptainer (Singularity).
- Be capable of installation in the storage space you have available. (For instance, the source code, intermediate products of installation scripts, and installed binaries must fit within your 30 GB space provided for your HOME directory, or within a group directory if the software is to be shared with other members of your group.)
- If compiled from source, be capable of being built using the compiler suites on Savio (GCC and Intel), or via user-installed compilers.
- Be capable of running without a persistent database running on Savio. An externally hosted database, to which your software on Savio connects, is OK. So is a database that is run on Savio only during execution of your job(s), which is populated by reading files from disk and whose state is saved (if necessary) by exporting the database state to files on disk.
- If commercial or otherwise license-restricted, come with a license that permits cluster usage (multi-core, and potentially multi-node), as well as a license enforcement mechanism (if any) that's compatible with the Savio environment.
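For the storage requirement in the list above, a rough check of how much space a source or build tree occupies can help before you commit to an install location. This is a sketch using the standard `du` utility. The function names and `SRC_DIR` are hypothetical, and the 30 GB figure is the HOME limit mentioned above; substitute the limit that applies to your directory.

```shell
# Print the size of a directory tree in kilobytes.
dir_size_kb() {
    du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# Succeed if the directory tree is smaller than the given limit in GB.
under_limit_gb() {
    [ "$(dir_size_kb "$1")" -lt $(( $2 * 1024 * 1024 )) ]
}

# SRC_DIR is a hypothetical path; it defaults to the current directory.
SRC_DIR="${SRC_DIR:-.}"
if under_limit_gb "$SRC_DIR" 30; then
    echo "$SRC_DIR uses $(dir_size_kb "$SRC_DIR") KB, under the limit"
else
    echo "$SRC_DIR may not fit; consider a group directory" >&2
fi
```

This only measures what is already on disk, so leave headroom for intermediate build products.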
If your software has installation dependencies – such as libraries, interfaces, or modules – first check whether they are already provided on the cluster before installing them yourself. Make sure that you've first loaded the relevant compiler, interpreted language, or application before examining the list of provided software, because that list is dynamically adjusted based on your current environment.
Installation location¶
The most important part of installing software on Savio is identifying where you should install it and how you should modify the installation script to point to the right location.
If you are installing software exclusively for your use, you can install it in your HOME directory (/global/home/users/YOUR-USER-NAME).
To install software for your whole group to use, you should install it in your group directory (/global/home/groups/YOUR-GROUP-NAME). If your group does not have a shared directory defined yet, please email brc-hpc-help@berkeley.edu.
In any case, be cognizant of space limitations in your HOME directory (30 GB) or group directory (see documentation on storage limits for different types of groups).
If you will be doing a lot of software installation, you may want to add sub-directories for sources (source files downloaded), modules (the installed software), scripts (if you want to document and routinize your installation process using a script, which is recommended), and modfiles (to create module files that will make software installed in a group directory visible to your group members via the modules command).
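The sub-directory layout described above can be created in one step. This is a minimal sketch: `INSTALL_BASE` is a hypothetical variable name, and it defaults to a temporary directory so the snippet can be tried anywhere. On Savio you would set it to your HOME or group directory.

```shell
# Hypothetical base directory; on Savio this would be, for example,
# /global/home/groups/YOUR-GROUP-NAME. A temporary directory is the default.
INSTALL_BASE="${INSTALL_BASE:-$(mktemp -d)}"

# Create the four sub-directories: downloaded sources, installed software,
# installation scripts, and modulefiles.
for sub in sources modules scripts modfiles; do
    mkdir -p "$INSTALL_BASE/$sub"
done

ls "$INSTALL_BASE"
```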
Installation using package managers¶
Packages for the most popular scripting languages, as well as other applications, can be installed using the relevant package manager, such as pip or conda. Generally, an option must be set to install to your HOME directory or the path to install to must be provided. See the instructions specific to each software under Using Software.
Conda can install more than just Python packages
Conda is a general package manager and can be used for more than just installing Python packages. (For example, you could install Julia, R, Ruby, and Java and their associated software packages and tools using Conda.) Before installing a piece of software from source code, check whether a Conda package exists for it. Executables installed with a Conda package are placed in the bin subdirectory of the active Conda environment.
Using containerized software with Apptainer (Singularity)¶
Apptainer (formerly Singularity) can be used to package software and environments together to create self-contained workflows. This allows you to run software that might otherwise be incompatible or difficult on Savio, and if an image has already been created, requires minimal installation. See our documentation on Apptainer for more info.
Adding software to the module system¶
Software you install for yourself or your group can be added to the module system in a limited scope, rather than for all users. To do this you must create a modulefile containing runtime environment information such as PATH and LD_LIBRARY_PATH. The path to user or group created modulefiles must be appended to the users' MODULEPATH.
Creating the modulefile¶
Name the module file as the version number of your installed software, then place the module file within a directory bearing the name of the software. For example, after installing gdal version 2.2.1 you would create a module file gdal/2.2.1.
The easiest way to write your modulefile is to use the example below as a template, but details on writing module files are provided in the Environment Module documentation.
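The naming convention above (a directory named after the software, a file named after the version) can be sketched as follows. `MODFILES_DIR` is a hypothetical variable that defaults to a temporary directory, and the modulefile body mirrors the gdal example in this document; adjust the paths for your own installation.

```shell
# Hypothetical modulefile directory; on Savio this would be, for example,
# /global/home/groups/YOUR-GROUP-NAME/modfiles.
MODFILES_DIR="${MODFILES_DIR:-$(mktemp -d)}"

# Directory named after the software, file named after the version.
mkdir -p "$MODFILES_DIR/gdal"

# The quoted 'EOF' keeps the shell from expanding $GDAL_DIR while writing.
cat > "$MODFILES_DIR/gdal/2.2.1" <<'EOF'
#%Module1.0
## gdal 2.2.1
proc ModulesHelp { } {
    puts stderr "loads the environment for gdal 2.2.1"
}
module-whatis "loads the environment for gdal 2.2.1"
set GDAL_DIR /global/home/groups/my_group/modules/gdal/2.2.1
setenv GDAL_DIR $GDAL_DIR
prepend-path PATH $GDAL_DIR/bin
prepend-path LD_LIBRARY_PATH $GDAL_DIR/lib
EOF

cat "$MODFILES_DIR/gdal/2.2.1"
```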
Adding modulefile/s to your MODULEPATH¶
The module commands will find any modules in subdirectories of directories listed in your MODULEPATH environment variable. We highly recommend keeping all your modulefiles in a single directory such as /global/home/users/YOUR-USER-NAME/modfiles or /global/home/groups/YOUR-GROUP-NAME/modfiles.
To automatically append this directory to your MODULEPATH when you log in you will need to add the appending command to your login profile.
- For bash users (most users), add
export MODULEPATH=$MODULEPATH:/location/to/my/modulefiles
to your ~/.bashrc file.
- For csh/tcsh users, add
setenv MODULEPATH "$MODULEPATH":/location/to/my/modulefiles
to your ~/.cshrc file.
To do this for all users in your group you can add these lines to a .bashrc or .cshrc file in your group directory (/global/home/groups/YOUR-GROUP-NAME) instead of your HOME directory.
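Because a login profile can be sourced more than once (for example in nested shells), it may help to append the directory only when it is not already listed. This is a sketch for bash or any POSIX shell; `append_modulepath` is a hypothetical helper name, not part of the module system.

```shell
# Append a directory to MODULEPATH unless it is already present.
append_modulepath() {
    case ":${MODULEPATH:-}:" in
        *":$1:"*) ;;                                      # already listed
        *) MODULEPATH="${MODULEPATH:+$MODULEPATH:}$1" ;;  # append
    esac
    export MODULEPATH
}

# Placeholder path; substitute your own modulefile directory.
append_modulepath "/global/home/users/YOUR-USER-NAME/modfiles"
```

Calling the helper twice with the same directory leaves MODULEPATH unchanged the second time.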
Example installation process¶
The following example illustrates how to install the GDAL geospatial library for your group. It assumes that you have set up sub-directories as discussed above under Installation location.
- Find the URL for the source tarball of your software.
- For this example, we can find the tarball linked to from the GDAL website, under Download and then Sources: http://download.osgeo.org/gdal/2.2.1/gdal-2.2.1.tar.gz
-
Change to the directory where you want to install the software (e.g. your HOME or group directory). If you have created a sources sub-directory to help keep things tidy, move there; otherwise, you can simply download the source within your installation directory.
cd /global/home/groups/my_group/sources
-
Download the source tarball.
wget http://download.osgeo.org/gdal/2.2.1/gdal-2.2.1.tar.gz
-
Untar the file you downloaded.
tar -zxvf gdal-2.2.1.tar.gz
-
Change to the new directory that was created with the contents of the tarball.
cd gdal-2.2.1
-
Load your requisite compiler and other dependencies.
- In most cases, we recommend using the default compiler; in Rocky Linux 8, this is GCC 11.4.0. If you have a particular need for a different compiler, such as the Intel compiler, or version, substitute it below.
module load gcc
-
Check the documentation for your software to determine where and how you can set the parameters for where the software will be installed. This varies from package to package, and may require modifying the configuration files in the source code itself. Make those changes as needed.
- For the gdal example (and this is the case for lots of other
software), the documentation indicates that we can specify the
installation location by adding
--prefix=/path/to/your/location
when running the config file.
-
Run the config file, adding in any required parameters for specifying location. If you’ve created a modules subfolder in your target directory, you may want to additionally create a directory for the software package, and a subdirectory for each version. If your software doesn’t have a config file, you will have to modify the Makefile itself to build it. Building the software can be done from any directory where you have the correct permissions. Once you have a binary, you can copy it to the correct location.
mkdir -p /global/home/groups/my_group/modules/gdal/2.2.1
./configure --prefix=/global/home/groups/my_group/modules/gdal/2.2.1
-
Debug the configuration process as needed.
- If the configuration fails due to insufficient permissions, then something in the process is probably trying to use a default path. Double-check that you’ve overridden the default paths for every aspect of the configuration process, to ensure that files are written to directories for which you have write permission.
-
One way to log everything from the configuration process for later debugging is as follows:
script /global/home/groups/my_group/sources/logfile-software-version
-
When you want to stop logging, run exit. All the output will be logged to the file logfile-software-version (e.g., logfile-gdal-2.2.1) in the sources sub-directory.
-
Build the software.
make
-
Install the software.
make install
-
Change the permissions.
-
To allow the group to use and modify the software, change its UNIX group.
cd /global/home/groups/my_group/modules
chgrp -R my_group gdal/
-
Make the software executable by group members.
chmod -R g+rwX gdal
-
Alternatively, to make the software executable only for yourself use:
chmod -R u+x gdal
-
OPTIONAL: Create a modulefile.
-
The following file should be named by the version number, 2.2.1.
#%Module1.0
## gdal 2.2.1
## by Lizzy Borden
proc ModulesHelp { } {
    puts stderr "loads the environment for gdal 2.2.1"
}
module-whatis "loads the environment for gdal 2.2.1"
set GDAL_DIR /global/home/groups/my_group/modules/gdal/2.2.1/
setenv GDAL_DIR $GDAL_DIR
prepend-path PATH $GDAL_DIR/bin
prepend-path LD_LIBRARY_PATH $GDAL_DIR/lib
prepend-path MANPATH $GDAL_DIR/man
-
Place the modulefile within your modulefile directory structure and alter its permissions accordingly.
mkdir -p /global/home/groups/my_group/modfiles/gdal
mv 2.2.1 /global/home/groups/my_group/modfiles/gdal
cd /global/home/groups/my_group/modfiles
chgrp -R my_group gdal
chmod -R g+rwX gdal
-
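Finally, add the modfiles directory to your group's MODULEPATH environment variable. Use single quotes so that $MODULEPATH is expanded when the file is sourced, not when the line is written.
echo 'export MODULEPATH=$MODULEPATH:/global/home/groups/my_group/modfiles' >> /global/home/groups/my_group/.bashrc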
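Once the modulefile is in place, a quick check that the module system can find and load it might look like the sketch below. It assumes the paths used in this example. `gdalinfo` is one of the command-line utilities GDAL installs, and the `module_verify` variable and the guard are illustrative only.

```shell
# Run on the cluster; elsewhere the `module` command may not exist.
if command -v module >/dev/null 2>&1; then
    module use /global/home/groups/my_group/modfiles   # add the directory for this session
    module avail gdal                                  # gdal/2.2.1 should be listed
    module load gdal/2.2.1
    command -v gdalinfo                                # should resolve under $GDAL_DIR/bin
    module_verify="ran"
else
    echo "module command not found; try this on the cluster" >&2
    module_verify="skipped"
fi
```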
Example installation scripts¶
These examples use tee instead of script to create log files. They also
compile the software in parallel with the -j8
flag.
gnuplot¶
#!/bin/sh
make distclean
./configure --prefix=/global/home/groups/my_group/modules/gnuplot/4.6.0 --with-readline=gnu --with-gd=/usr 2>&1 | tee gnuplot-4.6.0.configure.log
make -j8 2>&1 | tee gnuplot-4.6.0.make.log
make check 2>&1 | tee gnuplot-4.6.0.check.log
make install 2>&1 | tee gnuplot-4.6.0.install.log
make distclean
cgal¶
#!/bin/sh
module load gcc/4.4.7 openmpi/1.6.5-gcc qt/4.8.0 cmake/2.8.11.2 boost/1.54.0-gcc
make distclean
cmake -DCMAKE_INSTALL_PREFIX=/global/home/groups/my_group/modules/cgal/4.4-gcc . 2>&1 | tee cgal-4.4-gcc.cmake.log
make -j8 2>&1 | tee cgal-4.4-make.log
make install 2>&1 | tee cgal-4.4-install.log
make distclean
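One caveat with piping through tee: in a plain sh pipeline the exit status is that of tee, so a failed configure or make step does not stop the script on its own. The sketch below preserves the command's exit status while still logging. `run_logged` and `LOG_DIR` are hypothetical names, and the `echo` commands stand in for the real build steps.

```shell
# Run a command, copy its output to a log file, and return the command's
# own exit status rather than tee's.
run_logged() {
    rl_log=$1
    shift
    rl_status_file=$(mktemp)
    {
        rl_rc=0
        "$@" 2>&1 || rl_rc=$?
        echo "$rl_rc" > "$rl_status_file"
    } | tee "$rl_log"
    rl_status=$(cat "$rl_status_file")
    rm -f "$rl_status_file"
    return "$rl_status"
}

# Usage: each step runs only if the previous one succeeded.
# Replace the echo commands with ./configure, make -j8, and make install.
LOG_DIR="${LOG_DIR:-$(mktemp -d)}"
run_logged "$LOG_DIR/configure.log" echo "configuring" &&
run_logged "$LOG_DIR/make.log" echo "building" &&
run_logged "$LOG_DIR/install.log" echo "installing"
```

In bash, `set -o pipefail` is an alternative way to make a pipeline report the failure of an earlier command.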
What if the software doesn’t come with a configure script?¶
If the software doesn’t come with a configure script, you will have to modify the Makefile itself to build it. Building the software can be done from any directory where you have the correct permissions. Once you have a binary, you can copy it to the correct location.
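The final copy step might look like the following sketch. Everything named here is hypothetical: `mytool` stands in for whatever binary your build produces, `PREFIX` for your versioned install directory, and the `printf` line replaces the real `make` step so the snippet can be tried anywhere.

```shell
# Hypothetical install prefix; on Savio this would be, for example,
# /global/home/groups/my_group/modules/mytool/1.0.
PREFIX="${PREFIX:-$(mktemp -d)/modules/mytool/1.0}"
BUILD_DIR="${BUILD_DIR:-$(mktemp -d)}"

# Stand-in for the build; in practice you would run `make` here after
# editing the paths and compiler settings in the Makefile.
printf '#!/bin/sh\necho "mytool 1.0"\n' > "$BUILD_DIR/mytool"

# Copy the binary into a bin/ sub-directory of the prefix and make it
# executable for you and readable/executable for your group.
mkdir -p "$PREFIX/bin"
cp "$BUILD_DIR/mytool" "$PREFIX/bin/mytool"
chmod u+x,g+rx "$PREFIX/bin/mytool"

ls -l "$PREFIX/bin"
```

Placing the binary under a bin/ sub-directory keeps the layout consistent with what a modulefile's PATH entry would expect.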