Modules

Graham Gower edited this page Jun 16, 2017 · 52 revisions

To give users access to a range of software, the ACAD servers provide a number of software modules via lmod. Typically, the user will need to load one or more modules prior to running an analysis. Please read the lmod documentation for the complete details on how the module system works. Below is a basic introduction, including a more specific description of how modules are used on the ACAD servers.

Quick start

Avoid modules with foss-2016a in the name; favour modules with foss-2016b instead.

Add the following to your ~/.bash_profile

export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH

Delete the module cache, which may be required the first time you update your MODULEPATH.

$ rm -r $HOME/.lmod.d

List all available modules

$ module avail

Search for modules named 'python' (case insensitive)

$ module spider python

Load a module named 'Python' (case sensitive)

$ module load Python

Load a specific version of the Python module

$ module load Python/2.7.13-foss-2016b

Unload the Python module

$ module unload Python

List currently loaded modules

$ module list

How commands are executed

When you type a command in your terminal, a program known as the shell interprets the command. The default shell on Mac and many Linux systems is bash, but other shells behave similarly. A command typed into the shell may refer to a shell alias or function, a shell builtin (such as cd and echo), or a standalone program (such as vim or gzip). Bash first checks whether your command is an alias, then whether it is a function, then whether it is a builtin, and finally looks for a standalone program in the directories listed in your PATH. If the command exists in multiple places, the first one found takes precedence (alias, function, builtin, then order in PATH); if no corresponding command is found, an error is reported.

PATH is a special environment variable containing a colon-separated list of directories. You can look at the value of an environment variable using the echo builtin. Note that when referencing an environment variable, the variable name takes a dollar symbol prefix (the prefix is not used when setting an environment variable).

$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin

The above is a standard set of locations for binary files on Unix-like systems. The type builtin can be used to determine whether a command is a builtin, alias, function, or a standalone program.

$ type type
type is a shell builtin
$ type ls
ls is aliased to `ls --color=auto'
$ type vim
vim is /usr/bin/vim
$ type fixworldpoverty
-bash: type: fixworldpoverty: not found

Note that when we asked for the type of command vim, we were told its location. For the ls command, we were told it is an alias and what the alias is. An alias is a simple translation of one command into another command, and in this case ls is an alias to ls --color=auto. In fact, ls is a standalone program and the alias gives it a default parameter without it having to be typed in full each time. We can find the location of ls in the PATH too, using the which command.

$ which ls
alias ls='ls --color=auto'
        /bin/ls
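
The precedence rules can be checked without touching any real configuration. The sketch below is illustrative only. It temporarily defines a function named uname (a name chosen for this example because the uname program exists on essentially every Unix-like system; it is not used elsewhere on this page), shows that the function is found before the PATH is searched, and then uses the command builtin to bypass the function.

```shell
# Define a function with the same name as a standalone program.
uname() { echo "uname function"; }

# The plain name now runs the function, because functions are found
# before the PATH is searched.
from_function=$(uname)

# 'command' skips function lookup, so the standalone program runs instead.
from_path=$(command uname)

# Remove the function again so the rest of the session is unaffected.
unset -f uname

echo "plain call:   $from_function"
echo "command call: $from_path"
```

The second line of output depends on the operating system, so it is not shown here.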

The module system modifies your PATH

The module command can be used to load or unload software. Loading software causes new directories to be prepended to your PATH.

$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
$ module load Python
$ echo $PATH
/apps/software/Python/2.7.13-foss-2016b/bin:/apps/software/SQLite/3.13.0-foss-2016b/bin:/apps/software/Tcl/8.6.5-foss-2016b/bin:/apps/software/libreadline/6.3-foss-2016b/bin:/apps/software/ncurses/6.0-foss-2016b/bin:/apps/software/bzip2/1.0.6-foss-2016b/bin:/apps/software/FFTW/3.3.4-gompi-2016b/bin:/apps/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/bin:/apps/software/OpenMPI/1.10.3-GCC-5.4.0-2.26/bin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/sbin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/bin:/apps/software/numactl/2.0.11-GCC-5.4.0-2.26/bin:/apps/software/binutils/2.26-GCCcore-5.4.0/bin:/apps/software/GCCcore/5.4.0/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
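
The effect of prepending a directory to PATH can be reproduced by hand, without lmod. The following is a minimal sketch under stated assumptions: acad-demo-tool is a made-up program name, and the temporary directory stands in for a software install directory such as those listed above.

```shell
# Create a throwaway directory containing one tiny executable script.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo"\n' > "$demo_dir/acad-demo-tool"
chmod +x "$demo_dir/acad-demo-tool"

# Prepend the directory to PATH, as a module script would.
old_path=$PATH
PATH="$demo_dir:$PATH"

# The shell can now find the program by name.
found=$(command -v acad-demo-tool)
output=$(acad-demo-tool)
echo "found at: $found"
echo "output:   $output"

# Undo the change, which is roughly what unloading a module does.
PATH=$old_path
rm -r "$demo_dir"
```

Because the new directory is placed at the front of PATH, a program in it would also take precedence over a program of the same name in /usr/bin or /bin.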

module is a shell function

$ type module
module is a function
module () 
{ 
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
$ echo $LMOD_CMD
/usr/share/lmod/lmod/libexec/lmod
$ file /usr/share/lmod/lmod/libexec/lmod
/usr/share/lmod/lmod/libexec/lmod: a /usr/bin/lua script text executable

The module command is a shell function. The shell is a Turing-complete programming language, and as such functions can be written in it. type shows us the function definition. This function runs the program named by the LMOD_CMD environment variable, passing it the 'bash' parameter followed by any additional parameters given to the module command ($@), and then evaluates that program's output as shell code. file shows the type of file for the lmod executable: Lmod is written in the Lua programming language.

This appears to be a very awkward and opaque design for calling a program. Please do not follow its example when implementing your own programs.
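
The mechanism can be reproduced in a few lines, which may help when reading the function definition. A program started by the shell runs as a child process and cannot change the environment of the shell that started it, so one workaround is for the program to print shell code and for a function in the current shell to evaluate that code. The sketch below is not Lmod's implementation; fake_lmod_cmd, demo_module and DEMO_LOADED are hypothetical names used only for illustration.

```shell
# Hypothetical stand-in for the program in $LMOD_CMD. It changes nothing
# itself; it only prints shell code on standard output.
fake_lmod_cmd() {
    # $1 is the shell name, $2 the subcommand, $3 the module name.
    echo "export DEMO_LOADED='$3'"
}

# Hypothetical wrapper with the same shape as the module function:
# capture the printed code and evaluate it in the current shell.
demo_module() {
    eval "$(fake_lmod_cmd bash "$@")"
}

demo_module load Python
echo "DEMO_LOADED is now: $DEMO_LOADED"
```

Calling fake_lmod_cmd on its own would only print the export line; the variable is set in the current shell only because demo_module passes that output to eval.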

Location of modules

According to the lmod documentation, lmod looks for modules in the directories listed in the MODULEPATH environment variable. More precisely, when we load a module, lmod looks in the MODULEPATH directories for a script (written in either the Tcl or Lua programming language) with the corresponding name, and then runs that script. The module script for the software is responsible for modifying PATH and other environment variables, and may load additional modules which are dependencies. So the software might really be installed anywhere, as long as the module script for that software modifies the PATH accordingly.

$ echo $MODULEPATH
/apps/modules/all:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
$ ls /apps/modules/all
AdapterRemoval  BLAST      FastQC         HDF5           metafast   QIIME
AdmixTools      BLAST+     FASTX-Toolkit  HTSlib         NASM       R
Anaconda2       Boost      FFTW           hwloc          ncurses    RAxML
Anaconda3       Bowtie     flex           Java           netCDF     SAMtools
angsd           Bowtie2    foss           Kaiju          numactl    ScaLAPACK
ART             BWA        GATK           libcerf        OpenBLAS   seqtk
Autoconf        bwa-meth   GCC            libevent       OpenMPI    SQLite
Automake        bwa-pssm   GCCcore        libgtextutils  paleomix   Szip
Autotools       bzip2      GDAL           libjpeg-turbo  Pango      tabix
BCFtools        cairo      GDB            libpng         parallel   taxator-tk
bcl2fastq       capnproto  gdsl           libreadline    PCRE       Tcl
bedops          CMake      gettext        LibTIFF        Perl       Tk
BEDOPS          cURL       GHC            libtool        picard     tmux
BEDTools        cutadapt   ghostscript    libxml2        PileOMeth  toolshed
binutils        Doxygen    GMAP-GSNAP     libxslt        PLINK      treemix
bioawk          EasyBuild  GMP            M4             preseq     VCFtools
Biopython       EIGENSOFT  gnuplot        MAFFT          PROJ       XZ
Bismark         ExaML      gompi          mapDamage      Pysam      zlib
Bison           expat      GSL            mash           Python
$ ls /apps/modules/all/Python/
2.7.11-foss-2016a
2.7.13-foss-2016b.lua
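
The directory search described above can be approximated with a short shell function. This is a rough sketch, not how Lmod is implemented (Lmod also keeps a cache and applies its own rules for choosing a default version). find_modulefiles is a hypothetical helper, and the temporary directory tree below merely imitates the Python listing shown above.

```shell
# Hypothetical helper: for each directory in a colon-separated search path
# (default: $MODULEPATH), print every file found under <dir>/<name>/.
find_modulefiles() {
    name=$1
    search_path=${2:-${MODULEPATH:-}}
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -d "$dir/$name" ]; then
            for f in "$dir/$name"/*; do
                if [ -e "$f" ]; then
                    printf '%s\n' "$f"
                fi
            done
        fi
    done
    IFS=$old_ifs
}

# Build a small imitation of two module directories and search them.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/a/Python" "$demo_root/b/Python"
touch "$demo_root/a/Python/2.7.13-foss-2016b.lua"
touch "$demo_root/b/Python/2.7.11-foss-2016a"

found_modulefiles=$(find_modulefiles Python "$demo_root/a:$demo_root/b")
printf '%s\n' "$found_modulefiles"

rm -r "$demo_root"
```

On the ACAD servers, running find_modulefiles Python with no second argument would search the real MODULEPATH instead of the imitation tree.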

ACAD modules

By modifying the MODULEPATH, we can tell lmod to search additional locations for module scripts.

$ ls /data/acad/apps/modules/all
AdapterRemoval  Boost      HTSlib  mapDamage        SAMtools
ART             BWA        HUMAnN  paleomix-meta    SCons
BCFtools        freetype   Kaiju   Python-packages  texlive
BEDTools        grg-utils  LMAT    R-packages
$ ls /data/acad/apps/modules/all/LMAT
1.2.6-foss-2016b.lua
$ export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH
$ module load LMAT/1.2.6-foss-2016b

You can save your changes to the MODULEPATH by adding the export line to the file $HOME/.bash_profile, a script that is run by your shell when you log in.
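
If the profile is sourced more than once in a session, the plain export line adds the same directory to MODULEPATH each time. One possible variant, offered as a sketch rather than a requirement, adds the directory only when it is not already present. The variable name acad_modules is an assumption introduced for this example.

```shell
# Directory holding the ACAD module scripts (path taken from this page).
acad_modules=/data/acad/apps/modules/all

# Prepend the directory only if MODULEPATH does not already contain it.
case ":${MODULEPATH:-}:" in
    *":$acad_modules:"*)
        ;;  # already present, nothing to do
    *)
        # ${MODULEPATH:+:$MODULEPATH} avoids a trailing colon when
        # MODULEPATH is empty or unset.
        export MODULEPATH="$acad_modules${MODULEPATH:+:$MODULEPATH}"
        ;;
esac
```

The plain export line from the quick start works equally well for a profile that is read once at login.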

foss/2016b is a toolchain

Building software from source typically requires a suite of programs, such as a compiler, linker and libc. This suite of programs is often referred to as a toolchain. The compiler turns a human-readable source code file into machine-readable object files, and the linker links the object files, together with any additional libraries of functions, into an executable which can be run (executed) from the command line. The C library (libc) is a standardised collection of useful functions that may be used by programs and can be included directly into an executable by the linker. Most software on Linux is either written in the C programming language or uses components that are written in C (e.g. Python and R are both implemented in C), and thus depends upon libc.

The libc also provides a second linker, the dynamic linker, which enables executables to find library functions at run time in dynamic libraries instead of requiring them to be included directly into the executable by the (static) linker. Most libraries are built in both dynamic and static forms, and can therefore be linked to an executable statically (included at build time) or dynamically (finding the library is deferred until the program is executed). Dynamic linking reduces disk usage by allowing many programs that are linked to the library to share the same library file, whereas static linking requires each program to contain its own copy of the library functions. For this reason, dynamic linking is very common. Also note that an executable may be linked statically to one library and dynamically to another.

The ACAD servers have a toolchain installed already, but additional toolchains are provided as modules. One reason for this is that different software can depend upon different toolchain versions. The recommended toolchain module on the ACAD servers is foss/2016b, and most other modules have been built using this toolchain for consistency. There is also a foss/2016a toolchain; however, it has been found to produce executables that are unable to run on both ACAD1 and ACAD2, and it should thus be avoided.

Most modules have a version that indicates the toolchain from which the software was built. Because of the problem with the foss/2016a toolchain, some programs that specify foss-2016a as their version will be unable to run on either ACAD1 or ACAD2. A symptom of this problem is seeing the error: illegal instruction (core dumped).
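
A quick way to spot affected modules is to filter a list of module names for the foss-2016a suffix. The sketch below is illustrative only: list_2016a_modules is a hypothetical helper, and the sample list is built from module names that appear on this page.

```shell
# Hypothetical helper: print the entries of a colon-separated module list
# whose names contain foss-2016a. Prints nothing if there are none.
list_2016a_modules() {
    printf '%s\n' "$1" | tr ':' '\n' | grep 'foss-2016a' || true
}

# Sample list built from module names that appear on this page.
sample="Python/2.7.11-foss-2016a:LMAT/1.2.6-foss-2016b"
flagged=$(list_2016a_modules "$sample")
echo "modules to avoid: $flagged"
```

In a live session the list could come from the LOADEDMODULES environment variable, which Lmod is understood to maintain as a colon-separated list of loaded modules. Treat that as an assumption and check it with echo $LOADEDMODULES before relying on it.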
