Modules: Difference between revisions

From HPCwiki
Revision as of 10:43, 13 December 2024

Modules in Anunna

Anunna uses modules via Lmod to provide software to its users. A module configures the environment of the user and/or their jobs so that the desired application can run. These modules are organized in "buckets" for each year. We intend to keep three yearly buckets of software: one for the current year, one for the previous year and one for legacy software. The legacy bucket should contain software that is older than two years and is still used or relevant.

The point is to have a conveyor belt of software, so that up-to-date software is built with modern build tools (GCC, MPI, Intel), which makes it more maintainable. The conveyor belt also enables the use of more modern toolchains (foss, intel), which will enable software to run more efficiently.

For each bucket, we intend to keep one version of each software package. Currently the following buckets are available:

  • legacy - old unmaintained packages
  • groups - bucket containing modules for individual groups.
  • 2023 - packages built with the 2023 toolchain
  • 2024 - packages built with the 2024 toolchain
  • GPU - mainly CUDA and related packages that are independent of toolchains.

The new modules have been built with the aid of EasyBuild.


Module Migration

  • Current modules are crowded, disorganized and inefficient
  • Modules are being organized in buckets for each year, with a legacy bucket for old software
  • Legacy software is no longer maintained
  • Modules in the main location are going to be removed after the 10 December 2024 downtime
  • The 2023 and legacy buckets are already available
  • Users are encouraged to adapt their code to the new bucket system as soon as possible to avoid disruption


Module Organization

Modules are organized into buckets by year or by additional categories. The current buckets are:

  • legacy - old software that is no longer maintained or updated, but is still used in active research.
  • 2023 - software built using the 2023 compilers and toolchain. It is meant to contain a single version of each software package.
  • 2024 - software built using the 2024 compilers and toolchain. It is meant to contain a single version of each software package.
  • groups - subfolders containing module files for groups inside and outside WUR.
  • GPU - CUDA, cuDNN and related packages that are independent of toolchains.

To access the modules of the 2023 bucket, run the following command:

module load 2023

Afterwards, the list of available modules is expanded, which can be verified by running:

module avail

Why Buckets?

As time goes by, software is developed with newer compilers and tools. The buckets are snapshots of the compilers and tools that were used to develop and build these pieces of software. The compiler determines which processor operations the software supports, so if a job runs software built with two different compilers, then conflicts, errors or unwanted behaviour may occur.

Therefore, it is best for a job to use only software built with the same compiler. This is the purpose of the buckets: all the software in a bucket should be built with the same compiler.
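As an illustration of the point above, the following minimal sketch reports whether the currently loaded modules come from more than one compiler version. It assumes that Lmod exports the loaded module names in the colon-separated LOADEDMODULES variable and that module names carry an EasyBuild-style GCCcore suffix, as in the listings on this page. The function names list_gcccore_versions and check_single_compiler are hypothetical helpers, not Lmod commands, and modules built with other compilers are not detected by this check.

```shell
# list_gcccore_versions: hypothetical helper (not part of Lmod).
# Prints each distinct GCCcore version found among the loaded modules.
list_gcccore_versions() {
    # LOADEDMODULES is assumed to hold the loaded module names separated
    # by colons, for example "GCCcore/12.3.0:zlib/1.2.13-GCCcore-12.3.0".
    printf '%s\n' "${LOADEDMODULES:-}" \
        | tr ':' '\n' \
        | grep -o 'GCCcore[-/][0-9][0-9.]*' \
        | sed 's|^GCCcore[-/]||' \
        | sort -u
}

# check_single_compiler: returns 0 when at most one GCCcore version is loaded.
check_single_compiler() {
    gcccore_count=$(list_gcccore_versions | wc -l | tr -d ' ')
    if [ "$gcccore_count" -gt 1 ]; then
        echo "Warning: $gcccore_count different GCCcore versions are loaded" >&2
        return 1
    fi
    return 0
}
```

Calling check_single_compiler near the top of a job script, after the modules are loaded, would give an early warning before the actual work starts.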

Usage

Listing Modules

The following commands list the modules available to the user in increasing detail. The overview command provides a top-level view of the available software without going into detail about versions: it only lists each software package and the number of versions. The avail command lists the different versions of each package. Finally, spider provides a verbose list with all the versions and a description of each.

module overview
module avail
module spider

Searching For Modules

The same commands used for listing modules can be used for searching; the only difference is that the name of the module is passed as an argument. As with the listing in the section above, the commands provide different levels of verbosity.

module overview <nameOfModule>
module avail <nameOfModule>
module spider <nameOfModule>
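When the result of a search needs to be filtered in a script, a terse listing is easier to process. The sketch below assumes Lmod's default behaviour of printing listings to standard error and uses its --terse option. The name find_versions is a hypothetical helper, not an Lmod command.

```shell
# find_versions: hypothetical helper (not part of Lmod).
# Prints the available versions of one module, one per line.
find_versions() {
    # --terse prints one module per line; Lmod writes listings to stderr
    # by default, so merge stderr into stdout before filtering.
    module --terse avail "$1" 2>&1 | grep -i "^$1/"
}

# Example call (requires a loaded bucket, for instance 2023):
#   find_versions Python
```

The grep filter keeps only exact name matches, so modules whose names merely start with the search term are left out.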

Searching For Keywords

As a more advanced search feature, one can search for keywords inside of modules. This is useful when searching for which modules contain a specific Python or R extension. There are bundle modules for both languages that contain a list of their extensions. Lmod will also search inside the description of the modules, which can be useful for discoverability.

This feature can be used with the following command template:


module key <keyword>


To illustrate, say that one needs to find a module with the R package terra installed. The first step would be to load one of the buckets, for instance 2023.

module load 2023

The next step is to apply the key template above:

module key terra

which yields the following results:

The following modules match your search criteria: "terra"
--------------------------------------------------------------------------------------------------------

  R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
    Bundle of R packages from CRAN

--------------------------------------------------------------------------------------------------------


Hence, one would need to load the module R-bundle-CRAN/2023.12-foss-2023a to have access to the terra package.
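The keyword search can also be scripted. The sketch below extracts the module names from the search result. It assumes the output layout shown above, in which result lines are indented by two spaces and have the form "name: name/version". The name key_matches is a hypothetical helper, not an Lmod command.

```shell
# key_matches: hypothetical helper (not part of Lmod).
# Prints the module name(s) reported by "module key" for a keyword.
key_matches() {
    # Result lines look like "  <name>: <name>/<version>"; description lines
    # are indented further and the header line is not indented at all.
    module key "$1" 2>&1 | awk -F': ' '/^  [^ ]+: / { print $2 }'
}

# Example call (requires a loaded bucket, for instance 2023):
#   key_matches terra
```

The printed name can then be passed to module load.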

Loading Modules

Modules are loaded through the following command template:

module load <moduleName>


The example below shows how to load the Python module from the 2023 bucket:

module load 2023
module load Python/3.11.3

It is good practice to specify the version of the module being loaded for consistency and reproducibility.

If the version of the module is not specified, Lmod will choose the default version available at the time, and that default may change.

Specifying the version in your submit scripts also turns the scripts into additional documentation of the software that was used.

When loading modules, the dependencies of that module will also be loaded with it.
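To make version pinning harder to forget, a small wrapper can refuse module names that lack a version. This is only a sketch: load_pinned is a hypothetical function name, not part of Lmod, and it simply checks that the name contains a slash followed by a version string.

```shell
# load_pinned: hypothetical wrapper (not part of Lmod) that refuses to load
# a module unless the name includes an explicit version after a slash.
load_pinned() {
    case "$1" in
        */?*) module load "$1" ;;
        *)
            echo "Refusing to load '$1': no version specified" >&2
            return 1
            ;;
    esac
}

# Intended use in a submit script (buckets such as 2023 have no version,
# so they are loaded directly):
#   module load 2023
#   load_pinned Python/3.11.3
```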

List Loaded Modules

Loaded modules can be listed with the following command:

module list

Following the example in the previous section, after loading the 2023 and Python/3.11.3 modules (and the dependencies of the latter), one can list the loaded modules:

user001@login201:~$ module list

Currently Loaded Modules:
  1) slurm/24.05.1              (S)   5) binutils/2.40-GCCcore-12.3.0     9) Tcl/8.6.13-GCCcore-12.3.0     13) OpenSSL/1.1
  2) 2023                             6) bzip2/1.0.8-GCCcore-12.3.0      10) SQLite/3.42.0-GCCcore-12.3.0  14) Python/3.11.3-GCCcore-12.3.0
  3) GCCcore/12.3.0                   7) ncurses/6.4-GCCcore-12.3.0      11) XZ/5.4.2-GCCcore-12.3.0
  4) zlib/1.2.13-GCCcore-12.3.0       8) libreadline/8.2-GCCcore-12.3.0  12) libffi/3.4.4-GCCcore-12.3.0

  Where:
   S:  Module is Sticky, requires --force to unload or purge

We can see that, aside from the slurm module (which is loaded by default), the 2023 module and Python/3.11.3 itself, 11 dependencies are loaded with the Python/3.11.3 module.

Removing Modules

Modules can be removed with the following command template:

module unload <moduleName>

Following the example of the Python module above, the module can be removed with the following command:

module unload Python/3.11.3

This command will only unload the Python/3.11.3 module and not its dependencies.

We can see this if we list the loaded modules again:

user001@login201:~$ module unload Python/3.11.3
user001@login201:~$ module list

Currently Loaded Modules:
  1) slurm/24.05.1              (S)   5) binutils/2.40-GCCcore-12.3.0     9) Tcl/8.6.13-GCCcore-12.3.0     13) OpenSSL/1.1
  2) 2023                             6) bzip2/1.0.8-GCCcore-12.3.0      10) SQLite/3.42.0-GCCcore-12.3.0
  3) GCCcore/12.3.0                   7) ncurses/6.4-GCCcore-12.3.0      11) XZ/5.4.2-GCCcore-12.3.0
  4) zlib/1.2.13-GCCcore-12.3.0       8) libreadline/8.2-GCCcore-12.3.0  12) libffi/3.4.4-GCCcore-12.3.0


We can completely clean the environment by using the purge command:

module purge

From our example,

user001@login201:~$ module purge
The following modules were not unloaded:
  (Use "module --force purge" to unload all):

  1) slurm/24.05.1


The only remaining module is the slurm module, which we have set as sticky.

This command is useful in job scripts, since it clears the environment of unwanted software that may have been loaded by mistake.
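A job script that follows this advice could look like the sketch below. The job name and time limit are hypothetical values, and prepare_modules is a hypothetical helper name. The guard relies on SLURM_JOB_ID, which Slurm sets inside a job, so that the script does nothing when it is run outside the scheduler.

```shell
#!/bin/bash
# Hypothetical job name and time limit; adjust to the actual job.
#SBATCH --job-name=python-example
#SBATCH --time=00:10:00

# prepare_modules: hypothetical helper that rebuilds the module environment.
prepare_modules() {
    # Start clean; sticky modules such as slurm stay loaded.
    module purge
    # Select the bucket first, then load the pinned software version.
    module load 2023
    module load Python/3.11.3
}

# SLURM_JOB_ID is set by Slurm inside a job, so nothing runs elsewhere.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    prepare_modules
    python --version
fi
```

Purging first and then loading the bucket and the pinned version means the job does not depend on whatever happened to be loaded in the shell that submitted it.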