Modules
In order to provide users with access to a range of software, the ACAD servers provide a number of software modules via lmod. Typically, the user will need to load a module or modules prior to running an analysis. Please read the lmod documentation for the complete details on how the module system works. Below is a basic introduction, including a more specific description of how modules are used on the ACAD servers.
Avoid modules with foss-2016a in the name, instead favour modules with foss-2016b.
Add the following to your ~/.bash_profile
export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH
Delete the module cache, which may be required the first time you update your MODULEPATH.
$ rm -r $HOME/.lmod.d
List all available modules
$ module avail
Search for modules named 'python' (case insensitive)
$ module spider python
Load a module named 'Python' (case sensitive)
$ module load Python
Load a specific version of the Python module
$ module load Python/2.7.13-foss-2016b
Unload the Python module
$ module unload Python
Unload all modules
$ module purge
List currently loaded modules
$ module list
When you type a command in your Terminal, there is some software known as the shell which interprets the command. The default shell on Mac and many Linux systems is bash, but other shells behave similarly. A command typed into the shell may refer to a shell alias or function, a shell builtin (such as cd and echo), or a standalone program (such as vim or gzip). The shell first checks your command against its aliases, then its functions, then its builtins, and if still not found it looks for standalone programs in the directories listed in your PATH. If the command exists in multiple places, the first one takes precedence (alias, function, builtin, order in PATH), and if no corresponding command is found, an error is reported.
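The last step of that lookup, the order of directories in PATH, can be tried out safely with throwaway files. The following is a minimal sketch; the program name greet and the directory names first and second are invented for the demonstration.

```shell
# Two throwaway directories, each holding a different program named "greet".
# All names here (greet, first, second) are made up for the demonstration.
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"
printf '#!/bin/sh\necho "hello from first"\n' > "$demo/first/greet"
printf '#!/bin/sh\necho "hello from second"\n' > "$demo/second/greet"
chmod +x "$demo/first/greet" "$demo/second/greet"

# Both directories are on the PATH; "first" is listed before "second".
PATH="$demo/first:$demo/second:$PATH"

# command -v reports what the shell would run for this name.
resolved=$(command -v greet)
output=$(greet)
echo "$resolved"
echo "$output"
```

Swapping the two directories in the PATH assignment should make the copy in second win instead.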
PATH is a special environment variable containing a colon-separated list of directories. You can look at the value of an environment variable using the echo builtin. Note that when referencing an environment variable, the variable has a dollar symbol prefix (the prefix is not used when setting an environment variable).
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
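A long PATH is easier to read with one directory per line. This small sketch uses a fixed example value (DEMO_PATH is a made-up variable) so that the result does not depend on the machine it is run on.

```shell
# A fixed example value so the result does not depend on the machine;
# replace DEMO_PATH with "$PATH" to inspect your real search path.
DEMO_PATH=/usr/local/bin:/bin:/usr/bin

# Setting uses the bare name; reading uses the dollar prefix.
entries=$(printf '%s\n' "$DEMO_PATH" | tr ':' '\n')
first=$(printf '%s\n' "$entries" | head -n 1)
printf '%s\n' "$entries"
```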
The above is a standard set of locations for binary files on Unix-like systems. The type builtin can be used to determine whether a command is a builtin, alias, function, or a standalone program.
$ type type
type is a shell builtin
$ type ls
ls is aliased to `ls --color=auto'
$ type vim
vim is /usr/bin/vim
$ type fixworldpoverty
-bash: type: fixworldpoverty: not found
Note that when we asked for the type of command vim, we were told its location. For the ls command, we were told it is an alias and what the alias is. An alias is a simple translation of one command into another command, and in this case ls is an alias to ls --color=auto. In fact, ls is a standalone program and the alias gives it a default parameter without it having to be typed in full each time. We can find the location of ls in the PATH too, using the which command.
$ which ls
alias ls='ls --color=auto'
        /bin/ls
The module command can be used to load or unload software. Loading software causes new directories to be prepended to your PATH.
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
$ module load Python
$ echo $PATH
/apps/software/Python/2.7.13-foss-2016b/bin:/apps/software/SQLite/3.13.0-foss-2016b/bin:/apps/software/Tcl/8.6.5-foss-2016b/bin:/apps/software/libreadline/6.3-foss-2016b/bin:/apps/software/ncurses/6.0-foss-2016b/bin:/apps/software/bzip2/1.0.6-foss-2016b/bin:/apps/software/FFTW/3.3.4-gompi-2016b/bin:/apps/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/bin:/apps/software/OpenMPI/1.10.3-GCC-5.4.0-2.26/bin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/sbin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/bin:/apps/software/numactl/2.0.11-GCC-5.4.0-2.26/bin:/apps/software/binutils/2.26-GCCcore-5.4.0/bin:/apps/software/GCCcore/5.4.0/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
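To see exactly what a load has prepended, the old value can be stripped from the end of the new one. The sketch below stands in for module load by prepending a single directory by hand; /opt/demo/bin is a made-up location, not a real ACAD path.

```shell
# Remember the search path before the change.
before=$PATH

# Stand-in for "module load": prepend one directory by hand.
# /opt/demo/bin is a made-up location, not a real ACAD path.
PATH="/opt/demo/bin:$PATH"

# Strip the old value from the end to leave only what was prepended.
added=${PATH%"$before"}
echo "prepended: $added"
```

The same two lines around a real module load (saving PATH first, then stripping it afterwards) should list every directory that the module and its dependencies added.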
$ type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
$ echo $LMOD_CMD
/usr/share/lmod/lmod/libexec/lmod
$ file /usr/share/lmod/lmod/libexec/lmod
/usr/share/lmod/lmod/libexec/lmod: a /usr/bin/lua script text executable
The module command is a shell function. The shell is a fully fledged programming language, which provides the ability to write functions. The type builtin shows us the function definition too. This function runs the program named in the LMOD_CMD environment variable, passing it the 'bash' parameter and any additional parameters given to the module command ($@), and then evaluates the shell code that the program prints. The file command shows the type of file for the lmod executable. Lmod is written in the Lua programming language.
This appears to be a very awkward and opaque design for calling a program. Please do not follow its example when implementing your own programs.
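The lmod documentation is the place to look for the reasoning behind this design, but one likely factor is that a program run as a child process cannot change the environment of the shell that started it. The sketch below illustrates the pattern with a stand-in; fake_lmod, demo_module and DEMO_TOOL are hypothetical names, and none of this is lmod's actual code.

```shell
# A child process gets a copy of the environment; its changes are lost on exit.
unset DEMO_TOOL
sh -c 'DEMO_TOOL=from_child; export DEMO_TOOL'
after_child=${DEMO_TOOL:-unset}

# fake_lmod is a hypothetical stand-in for the real lmod program: it only
# prints shell code describing the change it wants.
fake_lmod() {
    printf 'DEMO_TOOL=%s; export DEMO_TOOL\n' "$1"
}

# demo_module plays the role of the module function: it evaluates that
# printed code inside the current shell, so the change sticks.
demo_module() {
    eval "$(fake_lmod "$@")"
}

demo_module from_function
after_function=${DEMO_TOOL:-unset}
echo "$after_child $after_function"
```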
Looking at the lmod documentation, we find that lmod looks for modules in directories listed in the MODULEPATH environment variable. Actually, when we load a module, lmod looks in the MODULEPATH directories for a script (written in either the Tcl or Lua programming languages) with the corresponding name, and then runs that script. The module script for the software is responsible for modifying PATH and other environment variables, and may load additional modules which are dependencies. So the software might really be installed anywhere, as long as the module script for that software modifies the PATH accordingly.
$ echo $MODULEPATH
/apps/modules/all:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
$ ls /apps/modules/all
AdapterRemoval  BLAST      FastQC         HDF5           metafast   QIIME
AdmixTools      BLAST+     FASTX-Toolkit  HTSlib         NASM       R
Anaconda2       Boost      FFTW           hwloc          ncurses    RAxML
Anaconda3       Bowtie     flex           Java           netCDF     SAMtools
angsd           Bowtie2    foss           Kaiju          numactl    ScaLAPACK
ART             BWA        GATK           libcerf        OpenBLAS   seqtk
Autoconf        bwa-meth   GCC            libevent       OpenMPI    SQLite
Automake        bwa-pssm   GCCcore        libgtextutils  paleomix   Szip
Autotools       bzip2      GDAL           libjpeg-turbo  Pango      tabix
BCFtools        cairo      GDB            libpng         parallel   taxator-tk
bcl2fastq       capnproto  gdsl           libreadline    PCRE       Tcl
bedops          CMake      gettext        LibTIFF        Perl       Tk
BEDOPS          cURL       GHC            libtool        picard     tmux
BEDTools        cutadapt   ghostscript    libxml2        PileOMeth  toolshed
binutils        Doxygen    GMAP-GSNAP     libxslt        PLINK      treemix
bioawk          EasyBuild  GMP            M4             preseq     VCFtools
Biopython       EIGENSOFT  gnuplot        MAFFT          PROJ       XZ
Bismark         ExaML      gompi          mapDamage      Pysam      zlib
Bison           expat      GSL            mash           Python
$ ls /apps/modules/all/Python/
2.7.11-foss-2016a  2.7.13-foss-2016b.lua
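The directory search described above can be imitated in a few lines of shell. This is only a toy model under stated assumptions: providers and DEMO_MODULEPATH are hypothetical names, the directories are throwaway ones, and real lmod additionally handles versions, defaults and caching.

```shell
# providers NAME SEARCHPATH
# Print each directory in the colon-separated SEARCHPATH that contains an
# entry called NAME. A toy model only: real lmod also handles versions,
# defaults and caching.
providers() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
        fi
    done
    IFS=$old_ifs
}

# Throwaway directory tree standing in for two module locations.
demo=$(mktemp -d)
mkdir -p "$demo/site/LMAT" "$demo/site/Python" "$demo/system/Python"
DEMO_MODULEPATH="$demo/site:$demo/system"

python_dirs=$(providers Python "$DEMO_MODULEPATH")
lmat_dirs=$(providers LMAT "$DEMO_MODULEPATH")
printf '%s\n' "$python_dirs"
```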
By modifying the MODULEPATH, we can tell lmod to search additional locations for module scripts.
$ ls /data/acad/apps/modules/all
AdapterRemoval  Boost      HTSlib  mapDamage        SAMtools
ART             BWA        HUMAnN  paleomix-meta    SCons
BCFtools        freetype   Kaiju   Python-packages  texlive
BEDTools        grg-utils  LMAT    R-packages
$ ls /data/acad/apps/modules/all/LMAT
1.2.6-foss-2016b.lua
$ export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH
$ module load LMAT/1.2.6-foss-2016b
You can save your changes to the MODULEPATH by adding the export line to the file $HOME/.bash_profile, a script that is run by your shell when you log on.
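If the profile is sourced more than once, a plain export line prepends the same directory each time. A guarded variant, sketched below, adds the directory only when it is missing; add_acad_modules is a hypothetical helper name, while the directory is the one used on this page.

```shell
# Directory taken from the export line on this page.
acad_modules=/data/acad/apps/modules/all

# add_acad_modules is a hypothetical helper name.
add_acad_modules() {
    case ":${MODULEPATH:-}:" in
        *":$acad_modules:"*)
            ;;  # already present, leave MODULEPATH untouched
        *)
            MODULEPATH="$acad_modules${MODULEPATH:+:$MODULEPATH}"
            ;;
    esac
    export MODULEPATH
}

add_acad_modules
once=$MODULEPATH
add_acad_modules   # a second call changes nothing
twice=$MODULEPATH
echo "$twice"
```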
Building software from source typically requires a suite of programs, such as a compiler, linker and libc. This suite of programs is often referred to as a toolchain. The compiler turns a human readable source file into a machine readable object file and the linker links one or more object files, together with any libraries of functions, into an executable which can be run (executed) from the command line. The C library (libc) is a standardised collection of useful functions that may be used by programs and can be included directly into an executable by the linker. Most software on Linux is either written in the C programming language or uses components that are written in C (e.g. Python and R are both implemented in C), and thus depends upon libc.
The ACAD servers have a toolchain installed already, but additional toolchains are provided as modules. One reason for this is that different software can depend upon different toolchain versions. The recommended toolchain module on the ACAD servers is foss/2016b, and most other modules have been built using this toolchain for consistency. There is also a foss/2016a toolchain; however, it has been found to produce executables that are unable to run on both ACAD1 and ACAD2, and should thus be avoided. Most modules have a version that indicates the toolchain from which the software was built. Because of the problem with the foss/2016a toolchain, some programs that specify foss-2016a as their version will be unable to run on either ACAD1 or ACAD2. A symptom of this problem is seeing the error: Illegal instruction (core dumped).
[a1158147@acad1 ~]$ module purge
[a1158147@acad1 ~]$ module load R/3.3.1-foss-2016a
[a1158147@acad1 ~]$ R
Illegal instruction (core dumped)
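Because the toolchain is part of the version string, affected modules can be spotted by name. The sketch below filters a list of module names for foss-2016a; flag_2016a is a hypothetical helper name and the sample names are taken from this page.

```shell
# flag_2016a reads module names, one per line, and prints those built with
# the foss-2016a toolchain. flag_2016a is a hypothetical helper name.
flag_2016a() {
    grep 'foss-2016a' || true
}

# Sample names taken from this page. On the servers the input could come from
# the module list output instead (lmod writes listings to stderr by default,
# so a redirect such as 2>&1 is likely needed; treat that as an assumption).
flagged=$(printf '%s\n' 'R/3.3.1-foss-2016a' 'Python/2.7.13-foss-2016b' | flag_2016a)
echo "avoid: $flagged"
```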
The libc also provides a linker, the dynamic linker, which enables executables to find library functions at run time in dynamic libraries instead of requiring them to be included directly into the executable by the (static) linker. Most libraries are built in both dynamic and static forms, and can therefore be linked to an executable statically (included at build time) or dynamically (finding the library is deferred until the program is executed). Dynamic linking reduces disk space by allowing many programs that are linked to the library to share the same library file, whereas statically linking requires each program to contain its own copy of the library functions. For this reason, dynamic linking is very common. Also note that an executable may be linked statically to one library and dynamically to another.
The dynamic linker, ld.so, is called implicitly when a dynamically linked program is executed, and attempts to resolve the location of dynamically linked libraries and the location of required functions within those libraries. See the ld.so manual page for more details. We can trace which libraries are resolved by the dynamic linker using the ldd command.
$ module load SAMtools
$ which samtools
/data/acad/apps/software/SAMtools/1.4.1-foss-2016b/bin/samtools
$ ldd `which samtools`
    linux-vdso.so.1 => (0x00007fff54dff000)
    libz.so.1 => /apps/software/zlib/1.2.8-foss-2016b/lib/libz.so.1 (0x00007f70c78da000)
    libm.so.6 => /lib64/libm.so.6 (0x0000003fb0800000)
    libbz2.so.1.0 => /apps/software/bzip2/1.0.6-foss-2016b/lib/libbz2.so.1.0 (0x00007f70c78b8000)
    liblzma.so.5 => /apps/software/XZ/5.2.2-foss-2016b/lib/liblzma.so.5 (0x00007f70c7892000)
    libcurl.so.4 => /apps/software/cURL/7.49.1-foss-2016b/lib/libcurl.so.4 (0x00007f70c782c000)
    libcrypto.so.10 => /usr/lib64/libcrypto.so.10 (0x000000312bc00000)
    libncursesw.so.6 => /apps/software/ncurses/6.0-foss-2016b/lib/libncursesw.so.6 (0x00007f70c77c0000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003fb0000000)
    libc.so.6 => /lib64/libc.so.6 (0x0000003fafc00000)
    librt.so.1 => /lib64/librt.so.1 (0x0000003e02200000)
    libssl.so.10 => /usr/lib64/libssl.so.10 (0x0000003e2b600000)
    libdl.so.2 => /lib64/libdl.so.2 (0x0000003faf800000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003faf400000)
    libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003e2ba00000)
    libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003e2be00000)
    libcom_err.so.2 => /lib64/libcom_err.so.2 (0x000000312ac00000)
    libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003e2c200000)
    libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003e2c600000)
    libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003fb6400000)
    libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003fb1c00000)
    libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003e2a200000)
We can see that samtools is dynamically linked to a range of libraries. Actually, many of these are indirect: one library can be linked to other libraries, and ldd shows all dependencies. The dynamic linker, ld.so, has a list of directories that it searches in order to resolve libraries. The dynamic linker first tries directories built in to the executable when it was built (known as the rpath), then it tries directories specified in the environment variable LD_LIBRARY_PATH (colon separated entries, like PATH), then directories that are configured from ldconfig (usually via a configuration file /etc/ld.so.conf) and finally the dynamic linker searches built in paths such as /lib64. Many libraries are provided as modules, and these modules modify the LD_LIBRARY_PATH in order for the dynamic linker to resolve the libraries when a dependent program is run.
$ module purge
$ echo $LD_LIBRARY_PATH

$ module load SAMtools
$ echo $LD_LIBRARY_PATH
/data/acad/apps/software/SAMtools/1.4.1-foss-2016b/lib:/apps/software/ncurses/6.0-foss-2016b/lib:/data/acad/apps/software/HTSlib/1.4.1-foss-2016b/lib:/apps/software/cURL/7.49.1-foss-2016b/lib:/apps/software/XZ/5.2.2-foss-2016b/lib:/apps/software/bzip2/1.0.6-foss-2016b/lib:/apps/software/zlib/1.2.8-foss-2016b/lib:/apps/software/ScaLAPACK/2.0.2-gompi-2016b-OpenBLAS-0.2.18-LAPACK-3.6.1/lib:/apps/software/FFTW/3.3.4-gompi-2016b/lib:/apps/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/lib:/apps/software/OpenMPI/1.10.3-GCC-5.4.0-2.26/lib:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/lib:/apps/software/numactl/2.0.11-GCC-5.4.0-2.26/lib:/apps/software/binutils/2.26-GCCcore-5.4.0/lib:/apps/software/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0:/apps/software/GCCcore/5.4.0/lib64:/apps/software/GCCcore/5.4.0/lib
Comparing the LD_LIBRARY_PATH above with the ldd output for samtools, we can see that some libraries are found outside the default /lib64 directory, presumably having been resolved using entries from the LD_LIBRARY_PATH. Let's modify the LD_LIBRARY_PATH and see what happens.
$ export LD_LIBRARY_PATH=
$ echo $LD_LIBRARY_PATH

$ ldd `which samtools`
    linux-vdso.so.1 => (0x00007fffbe7ff000)
    libz.so.1 => /lib64/libz.so.1 (0x0000003f51c00000)
    libm.so.6 => /lib64/libm.so.6 (0x0000003fb0800000)
    libbz2.so.1.0 => not found
    liblzma.so.5 => not found
    libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x0000003e28a00000)
    libcrypto.so.10 => /usr/lib64/libcrypto.so.10 (0x000000312bc00000)
    libncursesw.so.6 => not found
    libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003fb0000000)
    libc.so.6 => /lib64/libc.so.6 (0x0000003fafc00000)
    libidn.so.11 => /lib64/libidn.so.11 (0x0000003fb7400000)
    libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00000034d3000000)
    librt.so.1 => /lib64/librt.so.1 (0x0000003e02200000)
    libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003e2ba00000)
    libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003e2be00000)
    libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003e2c200000)
    libcom_err.so.2 => /lib64/libcom_err.so.2 (0x000000312ac00000)
    libssl3.so => /usr/lib64/libssl3.so (0x00000034d2400000)
    libsmime3.so => /usr/lib64/libsmime3.so (0x0000003e74400000)
    libnss3.so => /usr/lib64/libnss3.so (0x0000003e73800000)
    libnssutil3.so => /usr/lib64/libnssutil3.so (0x0000003e73c00000)
    libplds4.so => /lib64/libplds4.so (0x0000003e73000000)
    libplc4.so => /lib64/libplc4.so (0x0000003e73400000)
    libnspr4.so => /lib64/libnspr4.so (0x0000003e74000000)
    libdl.so.2 => /lib64/libdl.so.2 (0x0000003faf800000)
    libssh2.so.1 => /usr/lib64/libssh2.so.1 (0x0000003e28e00000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003faf400000)
    liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00000034d2800000)
    libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003fb1c00000)
    libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x0000003fb4c00000)
    libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003e2c600000)
    libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003fb6400000)
    libssl.so.10 => /usr/lib64/libssl.so.10 (0x0000003e2b600000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003fb3400000)
    libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003e2a200000)
    libfreebl3.so => /lib64/libfreebl3.so (0x0000003fb2800000)
Some of the libraries cannot be resolved. Note also that libz.so.1 is now found in the builtin /lib64 directory, instead of in /apps/software/zlib/1.2.8-foss-2016b/lib/. If we try to run samtools now, it fails.
$ samtools
samtools: error while loading shared libraries: libbz2.so.1.0: cannot open shared object file: No such file or directory
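With output this long, unresolved libraries are easy to miss. The following sketch picks out every library that ldd reports as not found; missing_libs is a hypothetical helper name, and the sample lines are copied from the ldd output on this page so that the example runs anywhere.

```shell
# missing_libs reads ldd-style text on standard input and prints the name of
# every library reported as "not found". The helper name is hypothetical.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# A few lines copied from the ldd output on this page. In practice the input
# would be the live output, e.g. ldd "$(command -v samtools)" | missing_libs
sample='libz.so.1 => /lib64/libz.so.1 (0x0000003f51c00000)
libbz2.so.1.0 => not found
liblzma.so.5 => not found
libncursesw.so.6 => not found'

missing=$(printf '%s\n' "$sample" | missing_libs)
printf '%s\n' "$missing"
```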
In order to provide an additional module, you must write a script and place it in one of the directories specified in the MODULEPATH. The filesystem mounted at /data/acad is accessible from both ACAD1 and ACAD2, and is writeable for all ACAD users. It is recommended that new module scripts be placed in /data/acad/apps/modules/all and that this be prepended to your MODULEPATH.
Here is an example of a custom module, with the software (a git checkout) placed under /data/acad/apps/software/grg-utils.
$ cat /data/acad/apps/modules/all/grg-utils/git.lua
help([[Utilities written by Graham. -grg]])
whatis([[Misc bits and pieces, mostly python and some c.]])

if not isloaded("Python/2.7.13-foss-2016b") then
    load("Python/2.7.13-foss-2016b")
end

if not isloaded("Python-packages/Python-2.7.13-foss-2016b") then
    load("Python-packages/Python-2.7.13-foss-2016b")
end

local root = "/data/acad/apps/software/grg-utils"
prepend_path("PATH", root)
This module has name grg-utils, from the directory in which the script is located, and has version git, from the filename. This script has a .lua extension, hence it is written in the Lua programming language, and uses functions such as isloaded and prepend_path that are provided by lmod. Scripts without an extension are instead written in Tcl. Most of the programs provided by grg-utils are written in Python and thus this module script loads a specific Python module that is known to work with these programs. Using module commands, we can now inspect and load the grg-utils module.
$ echo $MODULEPATH
/data/acad/apps/modules/all:/apps/modules/all:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
$ module spider grg-utils

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  grg-utils: grg-utils/git
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    This module can be loaded directly: module load grg-utils/git

    Help:
      Utilities written by Graham. -grg

$ module load grg-utils/git
$ echo $PATH | cut -d: -f1
/data/acad/apps/software/grg-utils
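The naming rule for module scripts (name from the directory, version from the file name, language from the extension) can be written out as a short sketch. It only illustrates the convention described here and is not how lmod itself parses paths.

```shell
# Path of the grg-utils module script from this page.
script=/data/acad/apps/modules/all/grg-utils/git.lua

# The module name is the directory that holds the script.
name=$(basename "$(dirname "$script")")

# The version is the file name; a .lua suffix marks a Lua script,
# and no suffix means Tcl.
file=$(basename "$script")
case $file in
    *.lua) version=${file%.lua}; language=Lua ;;
    *)     version=$file;        language=Tcl ;;
esac

echo "module load $name/$version  # $language script"
```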
Please see the lmod documentation for more details about writing module scripts, and other scripts in your MODULEPATH for more examples.
Manually building and then writing a module file for a program is not always necessary. Most of the modules on ACAD servers have been automatically downloaded and built using EasyBuild, which also generates a Lua module script. See the EasyBuild documentation for details and /data/acad/apps/eb for example EasyBuild scripts. The general process is as follows.
$ module purge
$ module load foss/2016b EasyBuild
$ vim foo.eb # write the easybuild script
$ eb --prefix=/data/acad/apps foo.eb
...
R allows local package installation into a user's home directory. I provide the R-packages module for a shared location of additional R packages by modifying the R_LIBS_USER environment variable. E.g. to install package 'foo':
$ module load R-packages
$ echo $R_LIBS_USER
/data/acad/apps/software/R-packages/R-3.3.1-foss-2016b/lib
$ R
...
> install.packages('foo')
There are many ways to install Python packages, including native package managers, pip, conda, and virtualenv. I have provided the Python-packages module as a shared place for additional Python packages which can be installed by anyone. This module exports the PYTHONUSERBASE environment variable, which is used to specify the location of user Python packages and has the advantage that new packages can be installed with pip. E.g. to install package foo:
$ module load Python-packages
$ echo $PYTHONUSERBASE
/data/acad/apps/software/Python-packages/Python-2.7.13-foss-2016b
$ pip install --user foo
After you have installed a new module, Python, or R package, the new files will be owned by your user. They will be accessible by other acad_users, but only you will be able to modify or delete these files, which may be problematic in the future. Please run chmod g+w on any new files, or use the following script to fix permissions for any files under /data/acad/apps (errors can be safely ignored).
$ sh /data/acad/apps/fix_perms.sh 2>/dev/null
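For the manual chmod g+w route, find can limit the change to files that you own and that still lack group write permission. The sketch below runs against a throwaway directory rather than /data/acad/apps, and it is not the content of fix_perms.sh, only an illustration of the same idea.

```shell
# Throwaway directory standing in for /data/acad/apps.
demo=$(mktemp -d)
touch "$demo/new_file"
chmod 644 "$demo/new_file"   # group can read but not write

# Find entries owned by the current user that lack group write, and add it.
# This is NOT the content of fix_perms.sh, only a sketch of the same idea.
find "$demo" -user "$(id -un)" ! -perm -020 -exec chmod g+w {} +

# -perm -020 matches only when the group write bit is set.
fixed=$(find "$demo/new_file" -perm -020)
echo "$fixed"
```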