Installing Software

This document describes how to install software on Savio for personal or group use. Before installing software yourself, first check whether it is already provided on the cluster by running module avail as described here. Please note that some modules are listed hierarchically and will only appear in the list after their parent module has been loaded (e.g. C libraries will only appear after you’ve loaded their parent compiler).

To request instead that software be installed or updated for all Savio users, see our instructions here.

We also provide certain large public datasets for general use; these are needed by widely used workflows and software.

Requirements for software on Savio

Software you install on the cluster will need to:

  • Be runnable (executable) on Scientific Linux 7 (essentially Red Hat Enterprise Linux 7). Choose the x86_64 tarball where available.
  • Run in command-line mode. If graphical access is required, you will need to use the OOD Desktop app.
  • Be capable of installation without root/sudo privileges. This may involve adding command line options to installation commands or changing values in installation-related configuration files; see your vendor- or community-provided documentation for instructions.
  • Be capable of installation in the storage space you have available. (For instance, the source code, intermediate products of installation scripts, and installed binaries must fit within the 10 GB of space provided for your HOME directory, or within a group directory if the software is to be shared with other members of your group.)
  • If compiled from source, be capable of being built using the compiler suites on Savio (GCC and Intel), or via user-installed compilers.
  • Be capable of running without a persistent database running on Savio. An externally hosted database, to which your software on Savio connects, is OK. So is a database that is run on Savio only during execution of your job(s), which is populated by reading files from disk and whose state is saved (if necessary) by exporting the database state to files on disk.
  • If commercial or otherwise license-restricted, come with a license that permits cluster usage (multi-core, and potentially multi-node), as well as a license enforcement mechanism (if any) that's compatible with the Savio environment.

If your software has installation dependencies – such as libraries, interfaces, or modules – first check whether they are already provided on the cluster before installing them yourself. Make sure that you've first loaded the relevant compiler, interpreted language, or application before examining the list of provided software, because that list is dynamically adjusted based on your current environment.
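As a rough sketch of that check, the snippet below loads a compiler and then searches the module list for a dependency. The search term gdal is only a placeholder, and the snippet assumes that module avail may write its listing to standard error, which is why the two streams are merged before filtering.

```shell
# Placeholder search term (an assumption); replace with the dependency you need.
pkg="gdal"

if command -v module >/dev/null 2>&1; then
    # Load the parent compiler first so that hierarchical modules become visible.
    module load gcc
    # Merge stderr into stdout in case the listing is written to stderr.
    module avail 2>&1 | grep -i "$pkg" || echo "No module matching '$pkg' found"
    module_check="done"
else
    echo "The module command is not available in this shell"
    module_check="skipped"
fi
```

If nothing matches, try loading a different compiler or language module and search again, since the list depends on what is currently loaded.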

Installation location

The most important part of installing software on Savio is identifying where you should install it and how you should modify the installation script to point to the right location.

If you are installing software exclusively for your use, you can install it in your HOME directory (/global/home/users/YOUR-USER-NAME).

To install software for your whole group to use, you should install it in your group directory (/global/home/groups/YOUR-GROUP-NAME). If your group does not have a shared directory defined yet, please email brc-hpc-help@berkeley.edu.

In either case, be mindful of the space limits on your HOME directory (10 GB) or group directory (see documentation on storage limits for different types of groups).
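One way to get a rough idea of how much space you are using is sketched below. The helper name over_limit is hypothetical, and du only approximates what the storage system counts against your quota, so treat the result as an estimate.

```shell
# over_limit DIR LIMIT_KB: succeed if DIR uses more than LIMIT_KB kilobytes.
over_limit() {
    used_kb=$(du -sk "$1" 2>/dev/null | awk '{print $1}')
    [ "${used_kb:-0}" -gt "$2" ]
}

ten_gb_kb=$((10 * 1024 * 1024))   # 10 GB expressed in kilobytes

if over_limit "${HOME:-.}" "$ten_gb_kb"; then
    echo "HOME is above 10 GB; consider installing in a group directory"
else
    echo "HOME is within the 10 GB limit"
fi
```

Running the same check on a source tree before building can help you judge whether the build products are likely to fit.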

If you will be doing a lot of software installation, you may want to add sub-directories for sources (source files downloaded), modules (the installed software), scripts (if you want to document and routinize your installation process using a script -- which is recommended), and modfiles (to create module files that will make software installed in a group directory visible to your group members via the modules command).
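A minimal sketch of setting up that layout follows. The function name make_layout is hypothetical, and the group path in the example call is a placeholder.

```shell
# make_layout BASE: create the four suggested sub-directories under BASE.
make_layout() {
    base=$1
    for sub in sources modules scripts modfiles; do
        mkdir -p "$base/$sub" || return 1
    done
}

# Example call (my_group is a placeholder for your group name):
#   make_layout /global/home/groups/my_group
```

Because mkdir -p does not complain about existing directories, the function can be rerun safely.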

Installation using package managers

Packages for the most popular scripting languages, as well as other applications, can be installed using the relevant package manager, such as pip or conda. Generally, an option must be set to install to your HOME directory or the path to install to must be provided.
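For instance, the sketch below asks Python where pip install --user would place packages, and lists two common per-user install commands as comments. PACKAGE_NAME and my_env are placeholders, and the exact options recommended on Savio may differ, so check the language-specific instructions.

```shell
if command -v python3 >/dev/null 2>&1; then
    # Ask Python where "pip install --user" places packages.
    user_base=$(python3 -m site --user-base)
    echo "pip install --user installs under: $user_base"
else
    user_base=""
    echo "python3 not found; load a Python module first"
fi

# Typical per-user install commands (PACKAGE_NAME and my_env are placeholders):
#   pip install --user PACKAGE_NAME
#   conda create --prefix "$HOME/my_env" PACKAGE_NAME
```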

See the instructions specific to each software under Using Software.

Conda can install more than just Python packages

Conda is a general package manager and can be used for more than just installing Python packages. (For example, you could actually install Julia or R using Conda.) A good option for installing a piece of software is to check if there is a Conda package for it, before you try to install from source code. Executables installed when you install a Conda package will be placed in the bin subdirectory of the active Conda environment.
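To see where those executables end up, you can inspect the bin subdirectory of the active environment. The sketch below assumes that Conda has set the CONDA_PREFIX environment variable, which it does when an environment is activated.

```shell
if [ -n "${CONDA_PREFIX:-}" ] && [ -d "$CONDA_PREFIX/bin" ]; then
    bin_dir="$CONDA_PREFIX/bin"
    echo "Executables from Conda packages are in: $bin_dir"
    # Show a few entries as a quick sanity check.
    ls "$bin_dir" | head -n 5
else
    bin_dir=""
    echo "No active Conda environment detected"
fi
```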

Using containerized software with Apptainer (Singularity)

Apptainer (formerly Singularity) can be used to package software and environments together to create self-contained workflows. This allows you to run software that might otherwise be incompatible or difficult on Savio, and if an image has already been created, requires minimal installation. See our documentation on Apptainer for more info.

Adding software to the module system

Software you install for yourself or your group can be added to the module system in a limited scope, rather than for all users. To do this you must create a modulefile containing runtime environment information such as PATH and LD_LIBRARY_PATH. The directory containing user- or group-created modulefiles must then be appended to each user's MODULEPATH.

Creating the modulefile

Name the module file as the version number of your installed software, then place the module file within a directory bearing the name of the software. For example, after installing gdal version 2.2.1 you would create a module file gdal/2.2.1.

The easiest way to write your modulefile is to use the example below as a template, but details on writing module files are provided in the Environment Module documentation.
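If you create modulefiles often, you could generate them from the shell. The following is a minimal sketch modeled on the gdal modulefile shown in the example installation process later in this document. The function name write_modulefile and the variable INSTALL_DIR are illustrative, not part of the module system.

```shell
# write_modulefile NAME VERSION PREFIX MODDIR
# Writes MODDIR/NAME/VERSION, pointing at software installed under PREFIX.
write_modulefile() {
    name=$1; version=$2; prefix=$3; moddir=$4
    mkdir -p "$moddir/$name" || return 1
    # Unquoted heredoc: $name etc. expand now, \$INSTALL_DIR stays literal.
    cat > "$moddir/$name/$version" <<EOF
#%Module1.0
## $name $version

proc ModulesHelp { } { puts stderr "loads the environment for $name $version" }

module-whatis "loads the environment for $name $version"

set INSTALL_DIR $prefix
prepend-path PATH \$INSTALL_DIR/bin
prepend-path LD_LIBRARY_PATH \$INSTALL_DIR/lib
EOF
}

# Example call (all paths are placeholders):
#   write_modulefile gdal 2.2.1 /global/home/groups/my_group/modules/gdal/2.2.1 \
#       /global/home/groups/my_group/modfiles
```

The resulting file follows the naming convention described above: a directory named after the software containing a file named after the version.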

Adding modulefile/s to your MODULEPATH

The module commands will find any modules in subdirectories of directories listed in your MODULEPATH environment variable. We highly recommend keeping all your modulefiles in a single directory such as /global/home/users/YOUR-USER-NAME/modfiles or /global/home/groups/YOUR-GROUP-NAME/modfiles.

To append this directory to your MODULEPATH automatically when you log in, add the appropriate command to your shell startup file.

  • For bash users (most users), add
    export MODULEPATH=$MODULEPATH:/location/to/my/modulefiles
    
    to your ~/.bashrc file.
  • For csh/tcsh users, add
    setenv MODULEPATH "$MODULEPATH":/location/to/my/modulefiles
    
    to your ~/.cshrc file.

To do this for all users in your group you can add these lines to a .bashrc or .cshrc file in your group directory (/global/home/groups/YOUR-GROUP-NAME) instead of your HOME directory.
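If the startup file is read more than once in a session, a plain append adds the same directory repeatedly. A hedged sketch of an append that skips directories already present is shown below for bash and other POSIX shells; the function name add_modulepath is hypothetical.

```shell
# add_modulepath DIR: append DIR to MODULEPATH unless it is already listed.
add_modulepath() {
    dir=$1
    case ":${MODULEPATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) MODULEPATH="${MODULEPATH:+$MODULEPATH:}$dir" ;;
    esac
    export MODULEPATH
}

# Example call (the path is a placeholder):
#   add_modulepath /global/home/groups/my_group/modfiles
```

Wrapping the current value in colons lets the pattern match a directory at the start, middle, or end of the list.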

Example installation process

The following example illustrates how to install the GDAL geospatial library for your group. It assumes that you have set up sub-directories as discussed above under Installation location.

  • Find the URL for the source tarball of the software you want to install.
  • Change to the directory where you want to download and build the software (e.g. your HOME or group directory). If you have created a sources sub-directory to help keep things tidy, move there; otherwise, you can simply download the source within your installation directory.

    cd /global/home/groups/my_group/sources
    
  • Download the source tarball.

    wget http://download.osgeo.org/gdal/2.2.1/gdal-2.2.1.tar.gz
    
  • Untar the file you downloaded.

    tar -zxvf gdal-2.2.1.tar.gz
    
  • Change to the new directory that was created with the contents of the tarball.

    cd gdal-2.2.1
    
  • Load your requisite compiler and other dependencies.

    • In most cases, we recommend using the default compiler; in SL7, this is GCC 6.3.0. If you have a particular need for a different compiler (such as the Intel compiler) or a different version, substitute it below.
    module load gcc
    
  • Check the documentation for your software to determine where and how you can set the parameters for where the software will be installed. This varies from package to package, and may require modifying the configuration files in the source code itself. Make those changes as needed.

    • For the gdal example (and this is the case for lots of other software), the documentation indicates that we can specify the installation location by adding --prefix=/path/to/your/location when running the configure script.
  • Run the configure script, adding any parameters required to specify the installation location. If you’ve created a modules sub-directory in your target directory, you may also want to create a directory for the software package, with a sub-directory for each version. If your software doesn’t come with a configure script, see "What if the software doesn’t come with a configure script?" below.

    mkdir -p /global/home/groups/my_group/modules/gdal/2.2.1
    ./configure --prefix=/global/home/groups/my_group/modules/gdal/2.2.1
    
  • Debug the configuration process as needed.

    • If the configuration fails due to insufficient permissions, then something in the process is probably trying to use a default path. Double-check that you’ve overridden the default paths for every aspect of the configuration process, to ensure that files are written to directories for which you have write permission.
    • One way to log everything from the configuration process for later debugging is as follows:

      script /global/home/groups/my_group/sources/logfile-software-version
      
    • When you want to stop logging, run exit. All the output will be logged to the file logfile-software-version (e.g., logfile-gdal-2.2.1) in the sources sub-directory.

  • Build the software.

    make
    
  • Install the software.

    make install
    
  • Change the permissions.

    • To allow the group to use and modify the software, change its UNIX group.

      cd /global/home/groups/my_group/modules
      chgrp -R my_group gdal/
      
    • Make the software executable by group members.

      chmod -R g+rwX gdal
      
    • Alternatively, to make the software executable only for yourself use:

      chmod -R u+x gdal
      
  • OPTIONAL: Create a modulefile.

    • The following file should be named by the version number, 2.2.1.

      #%Module1.0
      ## gdal 2.2.1
      ## by Lizzy Borden
      
      proc ModulesHelp { } { puts stderr "loads the environment for gdal 2.2.1" }
      
      module-whatis "loads the environment for gdal 2.2.1"
      
      set GDAL_DIR /global/home/groups/my_group/modules/gdal/2.2.1/
      setenv GDAL_DIR $GDAL_DIR
      prepend-path PATH $GDAL_DIR/bin
      prepend-path LD_LIBRARY_PATH $GDAL_DIR/lib
      prepend-path MANPATH $GDAL_DIR/man
      

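    • Place the modulefile within your modulefile directory structure and alter its permissions accordingly. The commands below assume that you saved the file 2.2.1 in your group directory.

      cd /global/home/groups/my_group
      mkdir -p modfiles/gdal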
      mv 2.2.1 modfiles/gdal
      cd modfiles
      chgrp -R my_group gdal
      chmod -R g+rwX gdal
      
    • Finally, add the modfiles directory to your group's MODULEPATH environment variable. The single quotes keep $MODULEPATH from being expanded until the .bashrc file is actually read.

      echo 'export MODULEPATH=$MODULEPATH:/global/home/groups/my_group/modfiles' >> /global/home/groups/my_group/.bashrc
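The permission steps in the example above can be wrapped in a small helper, sketched here. The function name share_with_group is hypothetical; the commands inside it are the same chgrp and chmod calls used in the example.

```shell
# share_with_group DIR GROUP: hand DIR to GROUP and open it to group members.
share_with_group() {
    chgrp -R "$2" "$1" || return 1
    # Capital X adds execute permission only to directories and to files
    # that are already executable by someone, so data files stay non-executable.
    chmod -R g+rwX "$1"
}

# Example call (run from the modules directory; my_group is a placeholder):
#   share_with_group gdal my_group
```

Using X rather than x is what lets group members enter directories and run binaries without marking every header or data file as executable.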
      

Example installation scripts

These examples use tee instead of script to create log files. They also compile the software in parallel with the -j8 flag.

gnuplot

#!/bin/sh
make distclean
./configure --prefix=/global/home/groups/my_group/modules/gnuplot/4.6.0 --with-readline=gnu --with-gd=/usr 2>&1 | tee gnuplot-4.6.0.configure.log
make -j8 2>&1 | tee gnuplot-4.6.0.make.log
make check 2>&1 | tee gnuplot-4.6.0.check.log
make install 2>&1 | tee gnuplot-4.6.0.install.log
make distclean

cgal

#!/bin/sh
module load gcc/4.4.7 openmpi/1.6.5-gcc qt/4.8.0 cmake/2.8.11.2 boost/1.54.0-gcc
make distclean
cmake -DCMAKE_INSTALL_PREFIX=/global/home/groups/my_group/modules/cgal/4.4-gcc . 2>&1 | tee cgal-4.4-gcc.cmake.log
make -j8 2>&1 | tee cgal-4.4-gcc.make.log
make install 2>&1 | tee cgal-4.4-gcc.install.log
make distclean
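
One caveat with piping through tee is that a pipeline normally reports the exit status of its last command (here tee), so a failed configure or make can go unnoticed and the script carries on. The sketch below is one portable way to keep both the log and the real exit status; the function name run_logged and the temporary .status file are illustrative choices, not part of the scripts above.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, copies its output to LOGFILE and the terminal,
# and returns COMMAND's own exit status rather than tee's.
run_logged() {
    log=$1; shift
    { "$@" 2>&1; echo "$?" > "$log.status"; } | tee "$log"
    exit_code=$(cat "$log.status")
    rm -f "$log.status"
    return "$exit_code"
}

# Example use inside an installation script (file names are placeholders):
#   run_logged gnuplot-4.6.0.make.log make -j8 || exit 1
#   run_logged gnuplot-4.6.0.install.log make install || exit 1
```

Stopping at the first failing step avoids installing a partially built package.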

What if the software doesn’t come with a configure script?

If the software doesn’t come with a configure script, you will have to modify the Makefile itself to build it. Building the software can be done from any directory where you have the correct permissions. Once you have a binary, you can copy it to the correct location.
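
The final copy step might look like the sketch below. The function name install_binary and the bin sub-directory layout are assumptions chosen to match the --prefix layout used elsewhere in this document; the build step itself depends entirely on the package's Makefile.

```shell
# install_binary SRC PREFIX: copy a built executable into PREFIX/bin.
install_binary() {
    mkdir -p "$2/bin" || return 1
    cp "$1" "$2/bin/" || return 1
    # Make sure the copy is executable by its owner.
    chmod u+x "$2/bin/$(basename "$1")"
}

# Example call (tool name and paths are placeholders):
#   install_binary ./mytool /global/home/groups/my_group/modules/mytool/1.0
```

Placing the binary in a bin sub-directory means a modulefile can expose it by prepending that directory to PATH.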