Modules
To give users access to a range of software, the ACAD servers provide a number of software modules via lmod. Typically, the user will need to load one or more modules prior to running an analysis. Please read the lmod documentation for the complete details on how the module system works. Below is a basic introduction, including a more specific description of how modules are used on the ACAD servers.
Avoid modules with foss-2016a in the name; favour modules with foss-2016b instead.
Add the following to your ~/.bash_profile
export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH
Delete module cache, which may be required the first time you update your MODULEPATH.
$ rm -r $HOME/.lmod.d
List all available modules
$ module avail
Search for modules named 'python' (case insensitive)
$ module spider python
Load a module named 'Python' (case sensitive)
$ module load Python
Load a specific version of the Python module
$ module load Python/2.7.13-foss-2016b
Unload the Python module
$ module unload Python
List currently loaded modules
$ module list
When you type a command in your Terminal, there is some software known as the shell which interprets the command. The default shell on many Linux systems, and on older versions of macOS, is bash, but other shells behave similarly. A command typed into the shell may refer to a shell alias, a shell function, a shell builtin (such as cd and echo), or a standalone program (such as vim or gzip). Bash first compares your command with its aliases, then with its functions, then with its builtins, and if still not found it looks for standalone programs in the directories listed in your PATH. If the command exists in multiple places, the first one takes precedence (alias, function, builtin, order in PATH), and if no corresponding command is found, an error is reported.
PATH is a special environment variable containing a colon-separated list of directories. You can look at the value of an environment variable using the echo builtin. Note that when referencing an environment variable, the variable has a dollar symbol prefix (the prefix is not used when setting an environment variable).
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
The above is a standard set of locations for binary files on Unix-like systems. The type builtin can be used to determine whether a command is a builtin, alias, function, or a standalone program.
$ type type
type is a shell builtin
$ type ls
ls is aliased to `ls --color=auto'
$ type vim
vim is /usr/bin/vim
$ type fixworldpoverty
-bash: type: fixworldpoverty: not found
Note that when we asked for the type of command vim, we were told its location. For the ls command, we were told it is an alias and what the alias is. An alias is a simple translation of one command into another command, and in this case ls is an alias to ls --color=auto. In fact, ls is a standalone program and the alias gives it a default parameter without it having to be typed in full each time. We can find the location of ls in the PATH too, using the which command.
$ which ls
alias ls='ls --color=auto'
/bin/ls
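The lookup described above can be checked from a script with bash's type -t, which prints a single word for the kind of command a name resolves to. The following is a minimal sketch; the name hello_demo is made up for illustration, and it assumes sh is available somewhere in your PATH.

```shell
#!/usr/bin/env bash
# Sketch: ask bash how it would resolve a few names.
# "hello_demo" is a hypothetical name used only for this demonstration.
shopt -s expand_aliases            # scripts ignore aliases unless this is set

kind_builtin=$(type -t cd)         # cd is a shell builtin
kind_file=$(type -t sh)            # sh is a standalone program found via PATH

hello_demo() { echo "from function"; }
kind_function=$(type -t hello_demo)

alias hello_demo='echo from alias'
kind_alias=$(type -t hello_demo)   # the alias now shadows the function

unalias hello_demo
unset -f hello_demo
kind_missing=$(type -t hello_demo || echo "not found")

printf '%s\n' \
    "cd:                      $kind_builtin" \
    "sh:                      $kind_file" \
    "hello_demo (function):   $kind_function" \
    "hello_demo (with alias): $kind_alias" \
    "hello_demo (removed):    $kind_missing"
```

Defining the function before the alias matters here: once the alias exists, bash would expand it while reading a later function definition of the same name.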
The module command can be used to load or unload software. Loading software causes new directories to be prepended to your PATH.
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
$ module load Python
$ echo $PATH
/apps/software/Python/2.7.13-foss-2016b/bin:/apps/software/SQLite/3.13.0-foss-2016b/bin:/apps/software/Tcl/8.6.5-foss-2016b/bin:/apps/software/libreadline/6.3-foss-2016b/bin:/apps/software/ncurses/6.0-foss-2016b/bin:/apps/software/bzip2/1.0.6-foss-2016b/bin:/apps/software/FFTW/3.3.4-gompi-2016b/bin:/apps/software/OpenBLAS/0.2.18-GCC-5.4.0-2.26-LAPACK-3.6.1/bin:/apps/software/OpenMPI/1.10.3-GCC-5.4.0-2.26/bin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/sbin:/apps/software/hwloc/1.11.3-GCC-5.4.0-2.26/bin:/apps/software/numactl/2.0.11-GCC-5.4.0-2.26/bin:/apps/software/binutils/2.26-GCCcore-5.4.0/bin:/apps/software/GCCcore/5.4.0/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
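The effect of prepending a directory to PATH can be reproduced without the module system. The sketch below is an illustration only: demo_tool and the temporary directory are made up, and the "unload" step simply restores the saved PATH rather than doing what lmod does.

```shell
#!/usr/bin/env bash
# Sketch: prepending a directory to PATH makes the programs in it visible.
# "demo_tool" and the temporary directory are hypothetical.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$demo_dir/demo_tool"
chmod +x "$demo_dir/demo_tool"

old_path=$PATH
found_before=$(command -v demo_tool || echo "not found")

# "Load": put the new directory at the front of PATH.
export PATH="$demo_dir:$PATH"
found_after=$(command -v demo_tool)
tool_output=$(demo_tool)

# "Unload": restore the previous PATH and forget any remembered locations.
PATH=$old_path
hash -r
found_unloaded=$(command -v demo_tool || echo "not found")

rm -rf "$demo_dir"
printf '%s\n' "before: $found_before" "after: $found_after" \
    "output: $tool_output" "unloaded: $found_unloaded"
```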
$ type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
$ echo $LMOD_CMD
/usr/share/lmod/lmod/libexec/lmod
$ file /usr/share/lmod/lmod/libexec/lmod
/usr/share/lmod/lmod/libexec/lmod: a /usr/bin/lua script text executable
The module command is a shell function. The shell is a Turing-complete programming language, and as such functions can be written in this language. type shows us the function definition. This function runs the program named in the LMOD_CMD environment variable, passing it the 'bash' parameter and any additional parameters given to the module command ($@), and then evaluates the output of that program as shell code. file shows the type of file for the lmod executable. Lmod is written in the Lua programming language.
This appears to be a very awkward and opaque design for calling a program. Please do not follow its example when implementing your own programs.
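One likely reason for the indirection is that a child process cannot change the environment of the shell that started it, so a helper program can only print shell code for the calling shell to evaluate. The following sketch illustrates that general pattern; print_settings, demo_module and DEMO_TOOL_HOME are hypothetical names, and nothing here reproduces lmod's actual output.

```shell
#!/usr/bin/env bash
# Sketch: why a wrapper function evaluates a helper's output.
# print_settings, demo_module and DEMO_TOOL_HOME are hypothetical names.
unset DEMO_TOOL_HOME

print_settings() {
    # Stand-in for an external helper program: all it can do is print text.
    echo 'export DEMO_TOOL_HOME=/opt/demo_tool;'
}

# Attempt 1: set the variable in a child process (a subshell).
# The change is lost as soon as the child exits.
( export DEMO_TOOL_HOME=/opt/demo_tool )
after_child=${DEMO_TOOL_HOME:-unset}

# Attempt 2: a function runs in the current shell, so evaluating the
# helper's printed output changes the current environment.
demo_module() {
    eval "$(print_settings "$@")"
}
demo_module load demo_tool
after_eval=${DEMO_TOOL_HOME:-unset}

printf '%s\n' "after child process: $after_child" "after eval: $after_eval"
```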
Looking at the lmod documentation, we find that lmod looks for modules in directories listed in the MODULEPATH environment variable. More precisely, when we load a module, lmod looks in the MODULEPATH directories for a script (written in either the Tcl or Lua programming language) with the corresponding name, and then runs that script. The module script for the software is responsible for modifying PATH and other environment variables, and may load additional modules which are dependencies. So the software might really be installed anywhere, as long as the module script for that software modifies the PATH accordingly.
$ echo $MODULEPATH
/apps/modules/all:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
$ ls /apps/modules/all
AdapterRemoval  BLAST      FastQC         HDF5           metafast   QIIME
AdmixTools      BLAST+     FASTX-Toolkit  HTSlib         NASM       R
Anaconda2       Boost      FFTW           hwloc          ncurses    RAxML
Anaconda3       Bowtie     flex           Java           netCDF     SAMtools
angsd           Bowtie2    foss           Kaiju          numactl    ScaLAPACK
ART             BWA        GATK           libcerf        OpenBLAS   seqtk
Autoconf        bwa-meth   GCC            libevent       OpenMPI    SQLite
Automake        bwa-pssm   GCCcore        libgtextutils  paleomix   Szip
Autotools       bzip2      GDAL           libjpeg-turbo  Pango      tabix
BCFtools        cairo      GDB            libpng         parallel   taxator-tk
bcl2fastq       capnproto  gdsl           libreadline    PCRE       Tcl
bedops          CMake      gettext        LibTIFF        Perl       Tk
BEDOPS          cURL       GHC            libtool        picard     tmux
BEDTools        cutadapt   ghostscript    libxml2        PileOMeth  toolshed
binutils        Doxygen    GMAP-GSNAP     libxslt        PLINK      treemix
bioawk          EasyBuild  GMP            M4             preseq     VCFtools
Biopython       EIGENSOFT  gnuplot        MAFFT          PROJ       XZ
Bismark         ExaML      gompi          mapDamage      Pysam      zlib
Bison           expat      GSL            mash           Python
$ ls /apps/modules/all/Python/
2.7.11-foss-2016a  2.7.13-foss-2016b.lua
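To make the directory search concrete, here is a deliberately simplified lookup over a colon-separated search path. It is a sketch under stated assumptions, not lmod's real algorithm: find_modulefile is a hypothetical helper, it only considers .lua files with an exact name/version match, and it returns the first match in search-path order.

```shell
#!/usr/bin/env bash
# Sketch: a simplified module-file lookup (NOT lmod's actual algorithm).
# find_modulefile is a hypothetical helper; it returns the first
# <dir>/<name>/<version>.lua found in a colon-separated search path.
find_modulefile() {
    local wanted=$1 search_path=$2 dir
    local -a dirs
    IFS=':' read -r -a dirs <<< "$search_path"
    for dir in "${dirs[@]}"; do
        if [ -f "$dir/$wanted.lua" ]; then
            printf '%s\n' "$dir/$wanted.lua"
            return 0
        fi
    done
    return 1
}

# Build two throwaway module trees so the sketch runs anywhere.
tree_a=$(mktemp -d)
tree_b=$(mktemp -d)
mkdir -p "$tree_a/LMAT" "$tree_b/Python"
touch "$tree_a/LMAT/1.2.6-foss-2016b.lua" "$tree_b/Python/2.7.13-foss-2016b.lua"
demo_modulepath="$tree_a:$tree_b"

hit_lmat=$(find_modulefile "LMAT/1.2.6-foss-2016b" "$demo_modulepath")
hit_python=$(find_modulefile "Python/2.7.13-foss-2016b" "$demo_modulepath")
hit_missing=$(find_modulefile "NoSuchTool/1.0" "$demo_modulepath" || echo "not found")

rm -rf "$tree_a" "$tree_b"
printf '%s\n' "$hit_lmat" "$hit_python" "$hit_missing"
```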
By modifying the MODULEPATH, we can tell lmod to search additional locations for module scripts.
$ ls /data/acad/apps/modules/all
AdapterRemoval  Boost      HTSlib  mapDamage        SAMtools
ART             BWA        HUMAnN  paleomix-meta    SCons
BCFtools        freetype   Kaiju   Python-packages  texlive
BEDTools        grg-utils  LMAT    R-packages
$ ls /data/acad/apps/modules/all/LMAT
1.2.6-foss-2016b.lua
$ export MODULEPATH=/data/acad/apps/modules/all:$MODULEPATH
$ module load LMAT/1.2.6-foss-2016b
You can save your changes to the MODULEPATH by adding the export line to the file $HOME/.bash_profile, a script that is run by your shell when you log in.
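If the profile is sourced more than once, the plain export line adds the same directory to MODULEPATH repeatedly. A guarded version avoids that. The sketch below uses a hypothetical helper named prepend_modulepath and runs its demonstration in a subshell, so it does not touch your real MODULEPATH.

```shell
#!/usr/bin/env bash
# Sketch: prepend a directory to MODULEPATH only if it is not already listed.
# prepend_modulepath is a hypothetical helper, not part of lmod.
prepend_modulepath() {
    local dir=$1
    case ":${MODULEPATH:-}:" in
        *":$dir:"*) ;;                                          # already present
        *) export MODULEPATH="$dir${MODULEPATH:+:$MODULEPATH}" ;;
    esac
}

# Demonstration in a subshell with a made-up starting value.
demo_result=$(
    MODULEPATH=/apps/modules/all
    prepend_modulepath /data/acad/apps/modules/all
    prepend_modulepath /data/acad/apps/modules/all              # second call is a no-op
    printf '%s' "$MODULEPATH"
)
printf '%s\n' "$demo_result"
```

In a profile, the function definition would be followed by a single call such as prepend_modulepath /data/acad/apps/modules/all.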
Building software from source typically requires a suite of programs, such as a compiler, linker and libc. This suite of programs is often referred to as a toolchain. The compiler turns a human-readable source code file into machine-readable object files, and the linker links the object files, together with any additional libraries of functions, into an executable which can be run (executed) from the command line. The C library (libc) is a standardised collection of useful functions that may be used by programs and can be included directly into an executable by the linker. Most software on Linux is either written in the C programming language or uses components that are written in C (e.g. Python and R are both implemented in C), and thus depends upon libc.
The libc also provides a linker, the dynamic linker, which enables executables to find library functions at run time in dynamic libraries instead of requiring them to be included directly into the executable by the (static) linker. Most libraries are built in both dynamic and static forms, and can therefore be linked to an executable statically (included at compile time) or dynamically (finding the library is deferred until the program is executed). Dynamic linking reduces disk usage by allowing many programs that are linked to the library to share the same library file, whereas static linking requires each program to contain its own copy of the library functions. For this reason, dynamic linking is very common. Also note that an executable may be linked statically to one library and dynamically to another.
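On most Linux systems, the ldd utility lists the dynamic libraries an executable will look for at run time, which is one way to see dynamic linking in practice. The sketch below inspects ls as an arbitrary example and assumes nothing about the result: ldd may be missing on some systems, and it reports an error for statically linked programs.

```shell
#!/usr/bin/env bash
# Sketch: list the dynamic libraries needed by the ls program.
# ldd is common on Linux but is not guaranteed to be installed.
target=$(type -P ls)               # full path of the standalone ls program

if command -v ldd >/dev/null 2>&1; then
    ldd_report=$(ldd "$target" 2>&1) \
        || ldd_report="ldd could not inspect $target (it may be statically linked)"
else
    ldd_report="ldd is not installed on this system"
fi

printf '%s\n' "$target" "$ldd_report"
```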
The ACAD servers have a toolchain installed already, but additional toolchains are provided as modules. One reason for this is that different software can depend upon different toolchain versions. The recommended toolchain module on the ACAD servers is foss/2016b, and most other modules have been built using this toolchain for consistency. There is also a foss/2016a toolchain; however, it has been found to produce executables that are unable to run on both ACAD1 and ACAD2, and should thus be avoided.
Most modules have a version that indicates the toolchain from which the software was built. Because of the problem with the foss/2016a toolchain, some programs that specify foss-2016a as their version will be unable to run on either ACAD1 or ACAD2. A symptom of this problem is seeing the error: illegal instruction (core dumped).
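A quick way to spot affected modules is to filter a list of module names for the foss-2016a suffix. The sketch below uses a hypothetical helper, flag_foss_2016a, and two sample names taken from the Python listing above. Feeding it your real loaded modules is left as a comment, because it assumes lmod's terse listing option and that lmod writes the list to standard error.

```shell
#!/usr/bin/env bash
# Sketch: pick out module names built with the foss-2016a toolchain.
# flag_foss_2016a is a hypothetical helper, not part of lmod.
flag_foss_2016a() {
    # Read names on stdin, print those containing "-foss-2016a".
    # "|| true" keeps the exit status zero when nothing matches.
    grep -F -- '-foss-2016a' || true
}

sample_modules='Python/2.7.11-foss-2016a
Python/2.7.13-foss-2016b'

flagged=$(printf '%s\n' "$sample_modules" | flag_foss_2016a)
printf 'modules to avoid:\n%s\n' "$flagged"

# Assumption: with lmod, the loaded modules could be checked with something like
#   module -t list 2>&1 | flag_foss_2016a
```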