<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.anunna.wur.nl/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dawes0011</id>
	<title>HPCwiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.anunna.wur.nl/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dawes0011"/>
	<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php/Special:Contributions/Dawes0011"/>
	<updated>2026-04-18T01:38:53Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.1</generator>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Old_binaries&amp;diff=2244</id>
		<title>Old binaries</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Old_binaries&amp;diff=2244"/>
		<updated>2023-06-19T09:27:57Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;On Linux, most binaries rely on shared libraries that are loaded by the OS before the binary is executed.&lt;br /&gt;
&lt;br /&gt;
This allows them to share common runtimes, and avoids redoing the same work multiple times. This is useful for optimisation and security, as fixes and debugging only have to happen in one place for multiple applications.&lt;br /&gt;
&lt;br /&gt;
However, a significant number of these libraries are provided automatically by the OS, and as the cluster is upgraded and changed, some of them will no longer be available by default. This makes older installs fail to execute, giving, for instance, errors about glibc versions or missing symbols.&lt;br /&gt;
&lt;br /&gt;
Thankfully, the full set of libraries from previous OS builds is made available on the shared filesystem, so older binaries can still access them while running on a newer OS.&lt;br /&gt;
&lt;br /&gt;
There is a special environment variable, &amp;lt;code&amp;gt;LD_LIBRARY_PATH&amp;lt;/code&amp;gt;, which tells the dynamic linker (ld.so) where else to look for libraries. We use this extensively in the module system underlying Anunna to control which libraries are used. One such module, &amp;lt;code&amp;gt;sl7-libs&amp;lt;/code&amp;gt;, automatically makes the older libraries available to older binaries, and has been added as a prerequisite for most older installs.&lt;br /&gt;
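As a quick illustration of how the search path works, here is a minimal sketch; the directory is the sl7-libs location used below, and the dynamic linker tries path entries left to right:&lt;br /&gt;

```shell
# Sketch: prepend the legacy library directory (the sl7-libs path on Anunna)
# to the search path; the dynamic linker tries entries left to right.
export LD_LIBRARY_PATH="/shared/legacyapps/sl7-libs/lib64:${LD_LIBRARY_PATH}"
echo "${LD_LIBRARY_PATH%%:*}"   # the first entry is searched first
```

The module below does exactly this kind of prepend for you, so you normally never set the variable by hand.&lt;br /&gt;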
&lt;br /&gt;
&lt;br /&gt;
If you&#039;ve compiled something yourself that isn&#039;t a module, you may need this module yourself. Simply run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
module load sl7-libs/main&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to add this path.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Jupyter ==&lt;br /&gt;
&lt;br /&gt;
This may also extend to custom-built kernels on Jupyter. Here, however, you can&#039;t simply load this module, as instantiating a kernel is done through &amp;lt;code&amp;gt;kernel.json&amp;lt;/code&amp;gt; instead of bash.&lt;br /&gt;
&lt;br /&gt;
To add the needed libraries, you need to modify your kernel&#039;s &amp;lt;code&amp;gt;kernel.json&amp;lt;/code&amp;gt; to look like the following:&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
 &amp;quot;language&amp;quot;: &amp;quot;python&amp;quot;,&lt;br /&gt;
 &amp;quot;argv&amp;quot;: [&lt;br /&gt;
  &amp;quot;/path/to/my/venv/bin/python&amp;quot;,&lt;br /&gt;
  &amp;quot;-m&amp;quot;,&lt;br /&gt;
  &amp;quot;ipykernel&amp;quot;,&lt;br /&gt;
  &amp;quot;-f&amp;quot;,&lt;br /&gt;
  &amp;quot;{connection_file}&amp;quot;&lt;br /&gt;
 ],&lt;br /&gt;
 &amp;quot;display_name&amp;quot;: &amp;quot;myvenv&amp;quot;,&lt;br /&gt;
 &amp;quot;env&amp;quot; : {&amp;quot;LD_LIBRARY_PATH&amp;quot;: &amp;quot;/usr/lib:/usr/lib64:/usr/lib/x86_64-linux-gnu:/shared/legacyapps/sl7-libs/lib:/shared/legacyapps/sl7-libs/lib64&amp;quot; }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When editing JSON, please be mindful that all elements in an object must be separated by commas, and that only double quotes may be used.&lt;br /&gt;
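If &amp;lt;code&amp;gt;jq&amp;lt;/code&amp;gt; is not available, Python&#039;s standard-library &amp;lt;code&amp;gt;json.tool&amp;lt;/code&amp;gt; module gives the same check. A minimal sketch, with an inline snippet standing in for your kernel.json:&lt;br /&gt;

```shell
# Sketch: json.tool pretty-prints valid JSON and exits non-zero on invalid
# JSON (for example single quotes or a missing comma).
echo '{"display_name": "myvenv", "language": "python"}' | python3 -m json.tool
```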
&lt;br /&gt;
&lt;br /&gt;
To test your JSON, try the following:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
cat ~/.local/share/jupyter/kernels/mykernel/kernel.json | jq&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;code&amp;gt;jq&amp;lt;/code&amp;gt; echoes your JSON back, it parsed successfully and you are good to go.&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Main_Page&amp;diff=2241</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Main_Page&amp;diff=2241"/>
		<updated>2023-06-19T09:11:41Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: /* Miscellaneous */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Anunna is a [http://en.wikipedia.org/wiki/High-performance_computing High Performance Computer] (HPC) infrastructure hosted by [http://www.wageningenur.nl/nl/activiteit/Opening-High-Performance-Computing-cluster-HPC.htm Wageningen University &amp;amp; Research Centre]. It is open for use for all WUR research groups as well as other organizations, including companies, that have collaborative projects with WUR. &lt;br /&gt;
&lt;br /&gt;
= Using Anunna =&lt;br /&gt;
* [[Tariffs | Costs associated with resource usage]]&lt;br /&gt;
&lt;br /&gt;
== Gaining access to Anunna==&lt;br /&gt;
Access to the cluster and file transfer are traditionally done via [http://en.wikipedia.org/wiki/Secure_Shell SSH and SFTP].&lt;br /&gt;
* [[log_in_to_B4F_cluster | Logging into cluster using ssh]]&lt;br /&gt;
* [[file_transfer | File transfer options]]&lt;br /&gt;
* [[Services | Alternative access methods, and extra features and services on Anunna]]&lt;br /&gt;
* [[Filesystems | Data storage methods on Anunna]]&lt;br /&gt;
&lt;br /&gt;
== Access Policy ==&lt;br /&gt;
[[Access_Policy | Main Article: Access Policy]]&lt;br /&gt;
&lt;br /&gt;
Access needs to be granted actively (by creation of an account on the cluster by FB-IT). Use of resources is limited by the scheduler. Depending on availability of queues (&#039;partitions&#039;) granted to a user, priority to the system&#039;s resources is regulated. Note that the use of Anunna is not free of charge. List price of CPU time and storage, and possible discounts on that list price for your organisation, can be retrieved from Shared Research Facilities or FB-IT.&lt;br /&gt;
&lt;br /&gt;
= Events =&lt;br /&gt;
&lt;br /&gt;
* [[Courses]] that have happened and are happening&lt;br /&gt;
* [[Downtime]] that will affect all users&lt;br /&gt;
* [[Meetings]] that may affect the policies of Anunna&lt;br /&gt;
&lt;br /&gt;
= Other Software =&lt;br /&gt;
&lt;br /&gt;
== Cluster Management Software and Scheduler ==&lt;br /&gt;
Anunna uses Bright Cluster Manager software for overall cluster management, and Slurm as job scheduler.&lt;br /&gt;
* [[BCM_on_B4F_cluster | Monitor cluster status with BCM]]&lt;br /&gt;
* [[Using_Slurm | Submit jobs with Slurm]]&lt;br /&gt;
* [[node_usage_graph | Be aware of how much work the cluster is under right now with &#039;node_usage_graph&#039;]]&lt;br /&gt;
* [[SLURM_Compare | Rosetta Stone of Workload Managers]]&lt;br /&gt;
&lt;br /&gt;
== Installation of software by users ==&lt;br /&gt;
&lt;br /&gt;
* [[Domain_specific_software_on_B4Fcluster_installation_by_users | Installing domain specific software: installation by users]]&lt;br /&gt;
* [[Setting local variables]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]&lt;br /&gt;
* [[Virtual_environment_Python_3.4_or_higher | Setting up and using a virtual environment for Python3.4 or higher ]]&lt;br /&gt;
* [[Installing WRF and WPS]]&lt;br /&gt;
* [[Running scripts on a fixed timeschedule (cron)]]&lt;br /&gt;
&lt;br /&gt;
== Installed software ==&lt;br /&gt;
&lt;br /&gt;
* [[Globally_installed_software | Globally installed software]]&lt;br /&gt;
* [[ABGC_modules | ABGC specific modules]]&lt;br /&gt;
&lt;br /&gt;
= Useful Notes = &lt;br /&gt;
&lt;br /&gt;
== Being in control of Environment parameters ==&lt;br /&gt;
&lt;br /&gt;
* [[Using_environment_modules | Using environment modules]]&lt;br /&gt;
* [[Setting local variables]]&lt;br /&gt;
* [[Setting_TMPDIR | Set a custom temporary directory location]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]&lt;br /&gt;
&lt;br /&gt;
== Controlling costs ==&lt;br /&gt;
&lt;br /&gt;
* [[SACCT | using SACCT to see your costs]]&lt;br /&gt;
* [[get_my_bill | using the &amp;quot;get_my_bill&amp;quot; script to estimate costs]]&lt;br /&gt;
&lt;br /&gt;
== Management ==&lt;br /&gt;
Product Owner of Anunna is Alexander van Ittersum (Wageningen UR, FB-IT, C&amp;amp;PS). [[User:dawes001 | Gwen Dawes (Wageningen UR, FB-IT, C&amp;amp;PS)]] and [[User:haars001 | Jan van Haarst (Wageningen UR, FB-IT, C&amp;amp;PS)]] are responsible for [[Maintenance_and_Management | Maintenance and Management]] of the cluster.&lt;br /&gt;
&lt;br /&gt;
* [[Roadmap | Ambitions regarding innovation, support and administration of Anunna ]]&lt;br /&gt;
&lt;br /&gt;
= Miscellaneous =&lt;br /&gt;
* [[Mailinglist | Electronic mail discussion lists]]&lt;br /&gt;
* [[History_of_the_Cluster | Historical information on the startup of Anunna]]&lt;br /&gt;
* [[Bioinformatics_tips_tricks_workflows | Bioinformatics tips, tricks, and workflows]]&lt;br /&gt;
* [[Parallel_R_code_on_SLURM | Running parallel R code on SLURM]]&lt;br /&gt;
* [[Convert_between_MediaWiki_and_other_formats | Convert between MediaWiki format and other formats]]&lt;br /&gt;
* [[Manual GitLab | GitLab: Create projects and add scripts]]&lt;br /&gt;
* [[Monitoring_executions | Monitoring job execution]]&lt;br /&gt;
* [[Shared_folders | Working with shared folders in the Lustre file system]]&lt;br /&gt;
* [[Old_binaries | Running older binaries on the updated OS]]&lt;br /&gt;
&lt;br /&gt;
= See also =&lt;br /&gt;
* [[Maintenance_and_Management | Maintenance and Management]]&lt;br /&gt;
* [[BCData | BCData]]&lt;br /&gt;
* [[Mailinglist | Electronic mail discussion lists]]&lt;br /&gt;
* [[About_ABGC | About ABGC]]&lt;br /&gt;
* [[Computer_cluster | High Performance Computing @ABGC]]&lt;br /&gt;
* [[Lustre_PFS_layout | Lustre Parallel File System layout]]&lt;br /&gt;
&lt;br /&gt;
= External links =&lt;br /&gt;
{| width=&amp;quot;90%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [https://www.wur.nl/en/Value-Creation-Cooperation/Facilities/Wageningen-Shared-Research-Facilities/Our-facilities/Show/High-Performance-Computing-Cluster-HPC-Anunna.htm SRF offers an HPC facility]&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Scientific_Linux Scientific Linux]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Help:Cheatsheet Help with editing Wiki pages]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Filesystems&amp;diff=2239</id>
		<title>Filesystems</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Filesystems&amp;diff=2239"/>
		<updated>2023-06-16T09:04:11Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Anunna currently has multiple filesystem mounts that are available cluster-wide:&lt;br /&gt;
&lt;br /&gt;
== Global ==&lt;br /&gt;
* /home - This mount uses NFS to mount the home directories directly from nfs01. Each user has a 200G quota for this filesystem, as it is regularly backed up to tape, and can reliably be restored from up to a week&#039;s history.&lt;br /&gt;
&lt;br /&gt;
* /shared - This mount provides a consistent set of binaries for the entire cluster.&lt;br /&gt;
&lt;br /&gt;
* /lustre - This large mount uses the Lustre filesystem to provide files from multiple redundant servers. Access is provided per group, thus:&lt;br /&gt;
 /lustre/[level]/[partner]/[unit]&lt;br /&gt;
e.g.&lt;br /&gt;
 /lustre/backup/WUR/ABGC/&lt;br /&gt;
It comprises two major parts (and some minor ones):&lt;br /&gt;
* /lustre/backup - In case of disaster, this data is stored a second time on a separate machine. Whilst this backup is purely in case of complete tragedy (such as some immense filesystem error, or multiple component failure), it can potentially be used to revert mistakes if you are very fast about reporting them. There is however no guarantee of this service.&lt;br /&gt;
* /lustre/nobackup - This is the &#039;normal&#039; filesystem for Lustre - no backups, just data stored on the filesystem. Since no backup is kept, data here costs less than under /lustre/backup, but in case of disaster it cannot be recovered.&lt;br /&gt;
* /lustre/shared - Same as /lustre/backup, except publicly available. This is where truly shared data lives that isn&#039;t assigned to a specific group.&lt;br /&gt;
&lt;br /&gt;
And additionally:&lt;br /&gt;
* /lustre/scratch - A separate, low-resilience filesystem. Files here may be removed after some time if the filesystem gets too full (typically 30 days). You should tidy up this data yourself once work is complete.&lt;br /&gt;
&lt;br /&gt;
=== Private shared directories ===&lt;br /&gt;
If you are working with a group of users on a similar project, you might consider making a [[Shared_folders|Shared directory]] to coordinate. Information on how to do so is in the linked article.&lt;br /&gt;
&lt;br /&gt;
== Local ==&lt;br /&gt;
Specific to certain machines are some other filesystems that are available to you:&lt;br /&gt;
* /archive - an archive mount only accessible from the login nodes. Files here are sent to the Isilon for deeper storage. The cost of storing data here is much lower than on the Lustre, but it cannot be used for compute work. This location is only available to WUR users. Files can be restored via snapshot, and there is a separate backup, though this only runs at fortnightly (14-day) intervals.&lt;br /&gt;
&lt;br /&gt;
* /tmp - On each worker node there is a /tmp mount that can be used for temporary local caching. Be advised that you should clean this up, lest your files become a hindrance to other users. You can request a node with free space in your sbatch script like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH --tmp=&amp;lt;required space&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* /dev/shm - On each worker you may also create files in a virtual filesystem backed directly by memory, for extremely fast data access. Be advised that this counts against the memory used by your job, but it is the fastest filesystem available if you need it.&lt;br /&gt;
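A minimal sketch of staging data in the memory-backed filesystem; the filename is illustrative, and &amp;lt;code&amp;gt;$$&amp;lt;/code&amp;gt; (the shell&#039;s process ID) keeps it unique:&lt;br /&gt;

```shell
# Sketch: stage a file in the memory-backed filesystem, then clean up;
# the filename is illustrative, $$ is the shell's process ID.
tmpfile="/dev/shm/scratch.$$"
echo "intermediate data" > "$tmpfile"
cat "$tmpfile"
rm "$tmpfile"   # removing the file frees the memory
```

Remember that, like /tmp, anything you leave behind here occupies resources on the node, so always remove your files when the job finishes.&lt;br /&gt;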
&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Tariffs | Costs associated with resource usage]]&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [http://wiki.lustre.org/index.php/Main_Page Lustre website]&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Setting_up_Python_virtualenv&amp;diff=2238</id>
		<title>Setting up Python virtualenv</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Setting_up_Python_virtualenv&amp;diff=2238"/>
		<updated>2023-06-16T09:02:11Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;With many Python packages available, often conflicting or requiring different versions depending on the application, installing and controlling packages and versions is not always easy. In addition, many packages are used only occasionally, so it is questionable whether a system administrator of a centralized server system or a High Performance Compute (HPC) infrastructure can be expected to resolve all issues posed by users of the infrastructure. Even on a local system with full administrative rights, managing versions, dependencies, and package collisions is often very difficult. The solution is to use a virtual environment, in which a specific set of packages can be installed. As many virtual environments as necessary can be created and used side by side. &lt;br /&gt;
&lt;br /&gt;
NOTE: as of Python 3.3 virtual environment support is built-in. See this page for an [[virtual_environment_Python_3.4_or_higher | alternative set-up of your virtual environment if using Python 3.4 or higher]].&lt;br /&gt;
&lt;br /&gt;
== Creating a new virtual environment ==&lt;br /&gt;
It is assumed that the appropriate &amp;lt;code&amp;gt;virtualenv&amp;lt;/code&amp;gt; executable for the Python version of choice is installed. A new virtual environment, in this case called &amp;lt;code&amp;gt;newenv&amp;lt;/code&amp;gt; is created like so:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load python/my-favourite-version (e.g. 2.7.12)&lt;br /&gt;
virtualenv newenv&lt;br /&gt;
OR&lt;br /&gt;
pyvenv newenv (For versions &amp;gt;3.4)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When the new environment is created, one will see a message similar to this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;  New python executable in newenv/bin/python3&lt;br /&gt;
  Also creating executable in newenv/bin/python&lt;br /&gt;
  Installing Setuptools.........................................................................done.&lt;br /&gt;
  Installing Pip................................................................................done.&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Activating a virtual environment ==&lt;br /&gt;
Once the environment is created, it must be activated each time before use by issuing the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source newenv/bin/activate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This assumes that the folder containing the virtual environment files (in this case called &amp;lt;code&amp;gt;newenv&amp;lt;/code&amp;gt;) is in the present working directory.&lt;br /&gt;
While the virtual environment is active, its name is shown in parentheses in front of the &amp;lt;code&amp;gt;user-host-prompt&amp;lt;/code&amp;gt; string.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;  (newenv)user@host:~$&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
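The activation mechanism can be sketched with the built-in venv module (the successor to pyvenv); the paths below are illustrative only:

```shell
# Sketch (illustrative paths): activating a venv simply puts its bin/
# directory first on PATH, so "python" resolves to the venv copy.
python3 -m venv /tmp/newenv        # create (built-in since Python 3.3)
. /tmp/newenv/bin/activate         # same as: source /tmp/newenv/bin/activate
command -v python                  # prints /tmp/newenv/bin/python
deactivate
```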
== Installing modules on the virtual environment ==&lt;br /&gt;
Installing modules works as usual. The difference is that they end up in &amp;lt;code&amp;gt;/path/to/virtenv/lib&amp;lt;/code&amp;gt;, which may live somewhere in your home directory. When working from the virtual environment, the default &amp;lt;code&amp;gt;pip&amp;lt;/code&amp;gt; belongs to the Python version that is currently active, because the executables in &amp;lt;code&amp;gt;/path/to/virtenv/bin&amp;lt;/code&amp;gt; are first in the &amp;lt;code&amp;gt;$PATH&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pip install numpy&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Similarly, installing packages from source works exactly the same as usual.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
python setup.py install&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
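As a quick sanity check that installs target the venv and not the system, the active interpreter can report where it installs packages; a small illustrative snippet, run inside the activated environment:

```shell
# Sketch: print the active interpreter's site-packages directory; inside
# an activated venv this path points into the venv, not the system.
python -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])'
```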
&lt;br /&gt;
== Deactivating a virtual environment ==&lt;br /&gt;
Quitting a virtual environment is done with the command &amp;lt;code&amp;gt;deactivate&amp;lt;/code&amp;gt;, a shell function that was defined when the environment was activated with the &amp;lt;code&amp;gt;source&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Virtualenv kernels in Jupyter ==&lt;br /&gt;
Want your own virtualenv kernel in a notebook? This can be done by writing your own kernel specification:&lt;br /&gt;
&lt;br /&gt;
(an alternative to this manual method, using conda, is described [[Using conda to install a new kernel into your notebook|here]])&lt;br /&gt;
&lt;br /&gt;
* Make sure the ipykernel module is available in your venv: activate the venv, then pip install it:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;source ~/path/to/my/virtualenv/bin/activate &amp;amp;&amp;amp; pip install ipykernel&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Create the following directory path in your homedir if it doesn&#039;t already exist:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;mkdir -p ~/.local/share/jupyter/kernels/&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Think of a nice descriptive name that doesn&#039;t clash with one of the already present kernels. I&#039;ll use &#039;testing&#039;. Create this folder:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;mkdir ~/.local/share/jupyter/kernels/testing/&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Add this file to this folder:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;vi ~/.local/share/jupyter/kernels/testing/kernel.json &lt;br /&gt;
{&lt;br /&gt;
 &amp;quot;language&amp;quot;: &amp;quot;python&amp;quot;,&lt;br /&gt;
 &amp;quot;argv&amp;quot;: [&lt;br /&gt;
  &amp;quot;/home/myhome/path/to/my/virtualenv/bin/python&amp;quot;,&lt;br /&gt;
  &amp;quot;-m&amp;quot;,&lt;br /&gt;
  &amp;quot;ipykernel&amp;quot;,&lt;br /&gt;
  &amp;quot;-f&amp;quot;,&lt;br /&gt;
  &amp;quot;{connection_file}&amp;quot;&lt;br /&gt;
 ],&lt;br /&gt;
 &amp;quot;display_name&amp;quot;: &amp;quot;testing&amp;quot;&lt;br /&gt;
}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Reload the JupyterHub page; &#039;testing&#039; should now appear in your kernel list.&lt;br /&gt;
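A malformed kernel.json is the usual reason a kernel fails to appear, so it can help to round-trip the file through Python's json module before reloading. The sketch below writes a minimal spec to a temporary location (the interpreter path is a placeholder) and validates it:

```shell
# Sketch: write a minimal kernel spec (placeholder interpreter path)
# and check that it parses as valid JSON before reloading JupyterHub.
python3 -c 'import json, os
os.makedirs("/tmp/kernels/testing", exist_ok=True)
spec = {"language": "python",
        "argv": ["/usr/bin/python3", "-m", "ipykernel",
                 "-f", "{connection_file}"],
        "display_name": "testing"}
json.dump(spec, open("/tmp/kernels/testing/kernel.json", "w"), indent=1)'
python3 -m json.tool /tmp/kernels/testing/kernel.json
```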
&lt;br /&gt;
You can do more complex things with this, such as constructing your own Spark environment. This relies on having the findspark module installed:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt; vi ~/.local/share/jupyter/kernels/mysparkkernel/kernel.json &lt;br /&gt;
{&lt;br /&gt;
 &amp;quot;language&amp;quot;: &amp;quot;python&amp;quot;,&lt;br /&gt;
 &amp;quot;env&amp;quot;: {&lt;br /&gt;
   &amp;quot;SPARK_HOME&amp;quot;:&lt;br /&gt;
     &amp;quot;/shared/apps/spark/my-spark-version&amp;quot;&lt;br /&gt;
 },&lt;br /&gt;
 &amp;quot;argv&amp;quot;: [&lt;br /&gt;
  &amp;quot;/home/myhome/my/spark/venv/bin/python&amp;quot;,&lt;br /&gt;
  &amp;quot;-m&amp;quot;,&lt;br /&gt;
  &amp;quot;ipykernel&amp;quot;,&lt;br /&gt;
  &amp;quot;-c&amp;quot;, &amp;quot;import findspark; findspark.init()&amp;quot;,&lt;br /&gt;
  &amp;quot;-f&amp;quot;,&lt;br /&gt;
  &amp;quot;{connection_file}&amp;quot;&lt;br /&gt;
 ],&lt;br /&gt;
 &amp;quot;display_name&amp;quot;: &amp;quot;My Spark kernel&amp;quot;&lt;br /&gt;
}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
(You&#039;ll want to make sure your Spark cluster has the same environment: start it after activating this venv inside your sbatch script.)&lt;br /&gt;
&lt;br /&gt;
== Make IPython work under virtualenv ==&lt;br /&gt;
IPython may not work initially under a virtual environment, producing an error message like the one below:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;    File &amp;quot;/usr/bin/ipython&amp;quot;, line 11&lt;br /&gt;
    print &amp;quot;Could not start qtconsole. Please install ipython-qtconsole&amp;quot;&lt;br /&gt;
                                                                      ^&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This can be resolved by adding a soft link with the name &amp;lt;code&amp;gt;ipython&amp;lt;/code&amp;gt; to the &amp;lt;code&amp;gt;bin&amp;lt;/code&amp;gt; directory in the virtual environment folder.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ln -s /path/to/virtenv/bin/ipython3 /path/to/virtenv/bin/ipython&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [https://pypi.python.org/pypi/virtualenv virtualenv page on PyPI]&lt;br /&gt;
* [http://cemcfarland.wordpress.com/2013/03/09/getting-ipython3-working-inside-your-virtualenv/ Solving the IPython hiccup under a virtual environment]&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=MPI_on_B4F_cluster&amp;diff=2237</id>
		<title>MPI on B4F cluster</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=MPI_on_B4F_cluster&amp;diff=2237"/>
		<updated>2023-06-16T09:01:59Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== A simple &#039;Hello World&#039; example ==&lt;br /&gt;
Consider the following simple MPI version, in C, of the &#039;Hello World&#039; example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&#039;cpp&#039;&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
int main(int argc, char ** argv) {&lt;br /&gt;
  int size,rank,namelen;&lt;br /&gt;
  char processor_name[MPI_MAX_PROCESSOR_NAME];&lt;br /&gt;
  MPI_Init(&amp;amp;argc, &amp;amp;argv);&lt;br /&gt;
  MPI_Comm_rank(MPI_COMM_WORLD,&amp;amp;rank);&lt;br /&gt;
  MPI_Comm_size(MPI_COMM_WORLD,&amp;amp;size);&lt;br /&gt;
  MPI_Get_processor_name(processor_name, &amp;amp;namelen);&lt;br /&gt;
  printf(&amp;quot;Hello MPI! Process %d of %d on %s\n&amp;quot;, rank, size, processor_name);&lt;br /&gt;
  MPI_Finalize();&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before compiling, make sure that the required compilers are available.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid conflicts between libraries, the safest approach is to purge all modules:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module purge&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then load both the gcc and openmpi modules. If modules were purged, slurm needs to be reloaded too.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 openmpi/gcc/64/1.6.5 slurm/2.5.7&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Compile the &amp;lt;code&amp;gt;hello_mpi.c&amp;lt;/code&amp;gt; code.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
mpicc hello_mpi.c -o test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If desired, the shared libraries the executable is linked against can be listed:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
ldd test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  linux-vdso.so.1 =&amp;gt;  (0x00002aaaaaacb000)&lt;br /&gt;
  libmpi.so.1 =&amp;gt; /shared/apps/openmpi/gcc/64/1.6.5/lib64/libmpi.so.1 (0x00002aaaaaccd000)&lt;br /&gt;
  libdl.so.2 =&amp;gt; /lib64/libdl.so.2 (0x00002aaaab080000)&lt;br /&gt;
  libm.so.6 =&amp;gt; /lib64/libm.so.6 (0x00002aaaab284000)&lt;br /&gt;
  libnuma.so.1 =&amp;gt; /usr/lib64/libnuma.so.1 (0x0000003e29400000)&lt;br /&gt;
  librt.so.1 =&amp;gt; /lib64/librt.so.1 (0x00002aaaab509000)&lt;br /&gt;
  libnsl.so.1 =&amp;gt; /lib64/libnsl.so.1 (0x00002aaaab711000)&lt;br /&gt;
  libutil.so.1 =&amp;gt; /lib64/libutil.so.1 (0x00002aaaab92a000)&lt;br /&gt;
  libpthread.so.0 =&amp;gt; /lib64/libpthread.so.0 (0x00002aaaabb2e000)&lt;br /&gt;
  libc.so.6 =&amp;gt; /lib64/libc.so.6 (0x00002aaaabd4b000)&lt;br /&gt;
  /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)&lt;br /&gt;
&lt;br /&gt;
Running the executable on two nodes, with four tasks per node, can be done like this:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
srun --nodes=2 --ntasks-per-node=4 --mpi=openmpi ./test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce output similar to the following (the order of the lines varies between runs):&lt;br /&gt;
  Hello MPI! Process 4 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 1 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 7 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 6 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 5 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 2 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 0 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 3 of 8 on node010&lt;br /&gt;
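The ranks print in whatever order they finish, so the output order differs between runs. If a stable order helps when inspecting results, the lines can be sorted on the rank field (field 4 of the message); a small self-contained illustration:

```shell
# Sketch: MPI ranks print in arbitrary order; sorting numerically on
# field 4 (the rank) restores order, e.g. pipe srun's output into sort.
printf 'Hello MPI! Process 2 of 4 on node010\nHello MPI! Process 0 of 4 on node011\n' | sort -t ' ' -k4 -n
```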
&lt;br /&gt;
== A mvapich2 sbatch example ==&lt;br /&gt;
An MPI job using mvapich2 on 32 cores, running on the normal compute nodes and using the fast InfiniBand interconnect for RDMA traffic.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
$ module load mvapich2/gcc&lt;br /&gt;
$ vim batch.sh&lt;br /&gt;
 #!/bin/sh&lt;br /&gt;
 #SBATCH --comment=projectx&lt;br /&gt;
 #SBATCH --time=30-0&lt;br /&gt;
 #SBATCH  -n 32&lt;br /&gt;
 #SBATCH --constraint=4gpercpu&lt;br /&gt;
 #SBATCH --output=output_%j.txt&lt;br /&gt;
 #SBATCH --error=error_output_%j.txt&lt;br /&gt;
 #SBATCH --job-name=MPItest&lt;br /&gt;
 #SBATCH --mail-type=ALL&lt;br /&gt;
 #SBATCH --mail-user=user@wur.nl&lt;br /&gt;
 &lt;br /&gt;
 echo &amp;quot;Starting at `date`&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on hosts: $SLURM_NODELIST&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on $SLURM_NNODES nodes.&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on $SLURM_NPROCS processors.&amp;quot;&lt;br /&gt;
 echo &amp;quot;Current working directory is `pwd`&amp;quot;&lt;br /&gt;
 # echo &amp;quot;Env var MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE is $MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE&amp;quot;&lt;br /&gt;
 # export MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE=ib0&lt;br /&gt;
&lt;br /&gt;
 mpirun -iface ib0 -np 32 ./tmf_par.out -NX 480 -NY 240 -alpha  11 -chi 1.3 -psi_b 5e-2  -beta  0.0 -zeta 3.5 -kT 0.10 &lt;br /&gt;
&lt;br /&gt;
 echo &amp;quot;Program finished with exit code $? at: `date`&amp;quot;&lt;br /&gt;
&lt;br /&gt;
$ sbatch batch.sh&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=MPI_on_B4F_cluster&amp;diff=2236</id>
		<title>MPI on B4F cluster</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=MPI_on_B4F_cluster&amp;diff=2236"/>
		<updated>2023-06-16T09:01:42Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== A simple &#039;Hello World&#039; example ==&lt;br /&gt;
Consider the following simple MPI version, in C, of the &#039;Hello World&#039; example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&#039;cpp&#039;&amp;gt;&lt;br /&gt;
#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
#include &amp;lt;mpi.h&amp;gt;&lt;br /&gt;
int main(int argc, char ** argv) {&lt;br /&gt;
  int size,rank,namelen;&lt;br /&gt;
  char processor_name[MPI_MAX_PROCESSOR_NAME];&lt;br /&gt;
  MPI_Init(&amp;amp;argc, &amp;amp;argv);&lt;br /&gt;
  MPI_Comm_rank(MPI_COMM_WORLD,&amp;amp;rank);&lt;br /&gt;
  MPI_Comm_size(MPI_COMM_WORLD,&amp;amp;size);&lt;br /&gt;
  MPI_Get_processor_name(processor_name, &amp;amp;namelen);&lt;br /&gt;
  printf(&amp;quot;Hello MPI! Process %d of %d on %s\n&amp;quot;, rank, size, processor_name);&lt;br /&gt;
  MPI_Finalize();&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Before compiling, make sure that the required compilers are available.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module list&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid conflicts between libraries, the safest approach is to purge all modules:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module purge&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then load both the gcc and openmpi modules. If modules were purged, slurm needs to be reloaded too.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module load gcc/4.8.1 openmpi/gcc/64/1.6.5 slurm/2.5.7&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Compile the &amp;lt;code&amp;gt;hello_mpi.c&amp;lt;/code&amp;gt; code.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
mpicc hello_mpi.c -o test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If desired, the shared libraries the executable is linked against can be listed:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
ldd test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  linux-vdso.so.1 =&amp;gt;  (0x00002aaaaaacb000)&lt;br /&gt;
  libmpi.so.1 =&amp;gt; /shared/apps/openmpi/gcc/64/1.6.5/lib64/libmpi.so.1 (0x00002aaaaaccd000)&lt;br /&gt;
  libdl.so.2 =&amp;gt; /lib64/libdl.so.2 (0x00002aaaab080000)&lt;br /&gt;
  libm.so.6 =&amp;gt; /lib64/libm.so.6 (0x00002aaaab284000)&lt;br /&gt;
  libnuma.so.1 =&amp;gt; /usr/lib64/libnuma.so.1 (0x0000003e29400000)&lt;br /&gt;
  librt.so.1 =&amp;gt; /lib64/librt.so.1 (0x00002aaaab509000)&lt;br /&gt;
  libnsl.so.1 =&amp;gt; /lib64/libnsl.so.1 (0x00002aaaab711000)&lt;br /&gt;
  libutil.so.1 =&amp;gt; /lib64/libutil.so.1 (0x00002aaaab92a000)&lt;br /&gt;
  libpthread.so.0 =&amp;gt; /lib64/libpthread.so.0 (0x00002aaaabb2e000)&lt;br /&gt;
  libc.so.6 =&amp;gt; /lib64/libc.so.6 (0x00002aaaabd4b000)&lt;br /&gt;
  /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)&lt;br /&gt;
&lt;br /&gt;
Running the executable on two nodes, with four tasks per node, can be done like this:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
srun --nodes=2 --ntasks-per-node=4 --mpi=openmpi ./test_hello_world&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce output similar to the following (the order of the lines varies between runs):&lt;br /&gt;
  Hello MPI! Process 4 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 1 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 7 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 6 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 5 of 8 on node011&lt;br /&gt;
  Hello MPI! Process 2 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 0 of 8 on node010&lt;br /&gt;
  Hello MPI! Process 3 of 8 on node010&lt;br /&gt;
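The ranks print in whatever order they finish, so the output order differs between runs. If a stable order helps when inspecting results, the lines can be sorted on the rank field (field 4 of the message); a small self-contained illustration:

```shell
# Sketch: MPI ranks print in arbitrary order; sorting numerically on
# field 4 (the rank) restores order, e.g. pipe srun's output into sort.
printf 'Hello MPI! Process 2 of 4 on node010\nHello MPI! Process 0 of 4 on node011\n' | sort -t ' ' -k4 -n
```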
&lt;br /&gt;
== A mvapich2 sbatch example ==&lt;br /&gt;
An MPI job using mvapich2 on 32 cores, running on the normal compute nodes and using the fast InfiniBand interconnect for RDMA traffic.&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
$ module load mvapich2/gcc&lt;br /&gt;
$ vim batch.sh&lt;br /&gt;
 #!/bin/sh&lt;br /&gt;
 #SBATCH --comment=projectx&lt;br /&gt;
 #SBATCH --time=30-0&lt;br /&gt;
 #SBATCH  -n 32&lt;br /&gt;
 #SBATCH --constraint=4gpercpu&lt;br /&gt;
 #SBATCH --output=output_%j.txt&lt;br /&gt;
 #SBATCH --error=error_output_%j.txt&lt;br /&gt;
 #SBATCH --job-name=MPItest&lt;br /&gt;
 #SBATCH --mail-type=ALL&lt;br /&gt;
 #SBATCH --mail-user=user@wur.nl&lt;br /&gt;
 &lt;br /&gt;
 echo &amp;quot;Starting at `date`&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on hosts: $SLURM_NODELIST&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on $SLURM_NNODES nodes.&amp;quot;&lt;br /&gt;
 echo &amp;quot;Running on $SLURM_NPROCS processors.&amp;quot;&lt;br /&gt;
 echo &amp;quot;Current working directory is `pwd`&amp;quot;&lt;br /&gt;
 # echo &amp;quot;Env var MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE is $MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE&amp;quot;&lt;br /&gt;
 # export MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE=ib0&lt;br /&gt;
&lt;br /&gt;
 mpirun -iface ib0 -np 32 ./tmf_par.out -NX 480 -NY 240 -alpha  11 -chi 1.3 -psi_b 5e-2  -beta  0.0 -zeta 3.5 -kT 0.10 &lt;br /&gt;
&lt;br /&gt;
 echo &amp;quot;Program finished with exit code $? at: `date`&amp;quot;&lt;br /&gt;
&lt;br /&gt;
$ sbatch batch.sh&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=JBrowse&amp;diff=2235</id>
		<title>JBrowse</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=JBrowse&amp;diff=2235"/>
		<updated>2023-06-16T09:01:30Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Typical commands used to set up a JBrowse === &lt;br /&gt;
&lt;br /&gt;
Author: Martijn Derks&lt;br /&gt;
&lt;br /&gt;
* JBrowse is available for multiple species:&lt;br /&gt;
** https://jbrowse.hpcagrogenomics.wur.nl/pig/&lt;br /&gt;
** https://jbrowse.hpcagrogenomics.wur.nl/chicken/&lt;br /&gt;
** https://jbrowse.hpcagrogenomics.wur.nl/cattle/&lt;br /&gt;
** https://jbrowse.hpcagrogenomics.wur.nl/turkey/&lt;br /&gt;
** https://jbrowse.hpcagrogenomics.wur.nl/Cyprinus_carpio/&lt;br /&gt;
* Users are free to add useful commands to this tutorial&lt;br /&gt;
&lt;br /&gt;
=== Install JBrowse ===&lt;br /&gt;
&lt;br /&gt;
Download the latest JBrowse here: http://jbrowse.org/&lt;br /&gt;
&lt;br /&gt;
Make a directory in &amp;lt;code&amp;gt;/shared/apps/jbrowse/&amp;lt;/code&amp;gt; for your species of interest (e.g. &amp;lt;code&amp;gt;mkdir Cyprinus_carpio&amp;lt;/code&amp;gt;). Move the downloaded JBrowse source files there. All further procedures detailed in this wiki page assume working from that directory (NOTE: if your species of interest is already there, contact the maintainer of that JBrowse instance).&lt;br /&gt;
Run the setup script to install the Perl dependencies and required modules:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
unzip JBrowse-1.12.0.zip&lt;br /&gt;
mv JBrowse-1.12.0/* $PWD&lt;br /&gt;
./setup.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Add reference sequence ===&lt;br /&gt;
&lt;br /&gt;
Example code for chicken genome&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/prepare-refseqs.pl --fasta /lustre/nobackup/WUR/ABGC/shared/public_data_store/genomes/chicken/Ensembl74/Gallus_gallus.Galgal4.74.dna.toplevel.fa&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To remove tracks, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/remove-track.pl -D --trackLabel &#039;trackname&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Add annotation files (GFF/BED)===&lt;br /&gt;
&lt;br /&gt;
Data can be downloaded from the Ensembl FTP site: http://www.ensembl.org/info/data/ftp/index.html&lt;br /&gt;
&lt;br /&gt;
Add gene features:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/flatfile-to-json.pl --key &amp;quot;Genes&amp;quot; --type gene --config &#039;{ &amp;quot;category&amp;quot;: &amp;quot;GalGal4.83 Annotation&amp;quot; }&#039; --trackLabel Genes --gff ../ensembl_data/Gallus_gallus.Galgal4.83.gff3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Add corresponding transcripts:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/flatfile-to-json.pl --key &amp;quot;Transcripts&amp;quot; --className transcript --subfeatureClasses &#039;{&amp;quot;exon&amp;quot;: &amp;quot;exon&amp;quot;, &amp;quot;CDS&amp;quot;: &amp;quot;CDS&amp;quot;, &amp;quot;five_prime_UTR&amp;quot;: &amp;quot;five_prime_UTR&amp;quot;, &amp;quot;three_prime_UTR&amp;quot;: &amp;quot;three_prime_UTR&amp;quot;}&#039; --config &#039;{ &amp;quot;category&amp;quot;: &amp;quot;GalGal4.83 Annotation&amp;quot; }&#039; --type transcript --trackLabel Transcripts --gff ../ensembl_data/Gallus_gallus.Galgal4.83.gff3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Alignment tracks (BAM)===&lt;br /&gt;
&lt;br /&gt;
You can load a single BAM file with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/add-bam-track --label &amp;lt;label&amp;gt; --bam_url &amp;lt;url&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load multiple BAM files from a given directory, use:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
for bam in /&amp;lt;dir&amp;gt;*.bam; do&lt;br /&gt;
        ln -s $bam track_symlinks/ ## Make symlinks to the BAM files&lt;br /&gt;
        ln -s $bam.bai track_symlinks/ ## Make symlinks to the BAM index files&lt;br /&gt;
        tissue=`echo $bam | rev | cut -c 5- | cut -d&#039;/&#039; -f1 | rev` ## Use the file name without .bam as trackLabel&lt;br /&gt;
        &lt;br /&gt;
        ## Add BAM in alignment mode (Alignments2)&lt;br /&gt;
        echo &#039;{&lt;br /&gt;
                &amp;quot;label&amp;quot; : &amp;quot;&#039;${tissue}&#039;_alignment&amp;quot;,&lt;br /&gt;
                &amp;quot;key&amp;quot; : &amp;quot;&#039;${tissue}&#039;_alignment&amp;quot;,&lt;br /&gt;
                &amp;quot;storeClass&amp;quot; : &amp;quot;JBrowse/Store/SeqFeature/BAM&amp;quot;,&lt;br /&gt;
                &amp;quot;urlTemplate&amp;quot; : &amp;quot;../track_symlinks/&#039;${tissue}&#039;.bam&amp;quot;,&lt;br /&gt;
                &amp;quot;category&amp;quot; : &amp;quot;3. RNA-seq alignments&amp;quot;,&lt;br /&gt;
                &amp;quot;type&amp;quot; : &amp;quot;Alignments2&amp;quot;&lt;br /&gt;
        }&#039; | bin/add-track-json.pl data/trackList.json&lt;br /&gt;
&lt;br /&gt;
        ## Add BAM in coverage mode (SNPCoverage)&lt;br /&gt;
        echo &#039;{&lt;br /&gt;
                &amp;quot;label&amp;quot; : &amp;quot;&#039;${tissue}&#039;_coverage&amp;quot;,&lt;br /&gt;
                &amp;quot;key&amp;quot; : &amp;quot;&#039;${tissue}&#039;_coverage&amp;quot;,&lt;br /&gt;
                &amp;quot;storeClass&amp;quot; : &amp;quot;JBrowse/Store/SeqFeature/BAM&amp;quot;,&lt;br /&gt;
                &amp;quot;urlTemplate&amp;quot; : &amp;quot;../track_symlinks/&#039;${tissue}&#039;.bam&amp;quot;,&lt;br /&gt;
                &amp;quot;category&amp;quot; : &amp;quot;3. RNA-seq alignments&amp;quot;,&lt;br /&gt;
                &amp;quot;type&amp;quot; : &amp;quot;SNPCoverage&amp;quot;&lt;br /&gt;
        }&#039; | bin/add-track-json.pl data/trackList.json&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Make sure the BAM file can be read by everybody; if not, use:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
chmod +r &amp;lt;BAM_file&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Make sure that all directories in the full path of the BAM file are executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
chmod +x &amp;lt;dir&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
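When a whole tree of BAM files needs fixing, the two chmod calls above can be applied recursively with find (the directory path below is a placeholder):

```shell
# Sketch (placeholder path): make every directory traversable and every
# BAM/BAI file world-readable so the web server can serve them.
find /path/to/bam_dir -type d -exec chmod a+x {} +
find /path/to/bam_dir -type f -name '*.bam*' -exec chmod a+r {} +
```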
&lt;br /&gt;
===Variant tracks (VCF)===&lt;br /&gt;
&lt;br /&gt;
To load a VCF file in JBrowse, make sure the file is compressed with bgzip and indexed with tabix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tabix -p vcf Gallus_gallus_incl_consequences.vcf.gz&lt;br /&gt;
&lt;br /&gt;
echo &#039; {&lt;br /&gt;
       &amp;quot;label&amp;quot; : &amp;quot;Gallus_gallus_incl_consequences&amp;quot;,&lt;br /&gt;
       &amp;quot;key&amp;quot; : &amp;quot;Gallus_gallus_incl_consequences&amp;quot;,&lt;br /&gt;
       &amp;quot;storeClass&amp;quot; : &amp;quot;JBrowse/Store/SeqFeature/VCFTabix&amp;quot;,&lt;br /&gt;
       &amp;quot;urlTemplate&amp;quot; : &amp;quot;../../ensembl_data/VCF/Gallus_gallus_incl_consequences.vcf.gz&amp;quot;,&lt;br /&gt;
       &amp;quot;category&amp;quot; : &amp;quot;2. Variants&amp;quot;,&lt;br /&gt;
       &amp;quot;type&amp;quot; : &amp;quot;HTMLVariants&amp;quot;&lt;br /&gt;
     } &#039; | bin/add-track-json.pl data/trackList.json&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Wiggle/BigWig tracks (WIG)===&lt;br /&gt;
&lt;br /&gt;
You can load a single BigWig file with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/add-bw-track --label &amp;lt;label&amp;gt; --bw_url &amp;lt;url&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Evidence tracks===&lt;br /&gt;
&lt;br /&gt;
Evidence tracks can be loaded in BED, GFF, and GBK format using&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bin/flatfile-to-json.pl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Examples are given above.&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Whole_genome_alignment_pipeline&amp;diff=2234</id>
		<title>Whole genome alignment pipeline</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Whole_genome_alignment_pipeline&amp;diff=2234"/>
		<updated>2023-06-16T09:01:16Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Author: Carolina Pita Barros &amp;lt;br /&amp;gt; &lt;br /&gt;
Contact: carolina.pitabarros@wur.nl  &amp;lt;br /&amp;gt;&lt;br /&gt;
ABG&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For up-to-date documentation see [https://github.com/CarolinaPB/whole-genome-alignment here]&lt;br /&gt;
&lt;br /&gt;
= Whole genome alignment pipeline =&lt;br /&gt;
&lt;br /&gt;
Path to pipeline: /lustre/nobackup/WUR/ABGC/shared/PIPELINES/whole-genome-alignment&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;first-follow-the-instructions-here&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== First follow the instructions here: ==&lt;br /&gt;
&lt;br /&gt;
[https://carolinapb.github.io/2021-06-23-how-to-run-my-pipelines/ Step by step guide on how to use my pipelines]&amp;lt;br /&amp;gt;&lt;br /&gt;
Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;about&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== ABOUT ==&lt;br /&gt;
&lt;br /&gt;
This pipeline aligns one or more genomes to a specified genome and plots the alignment.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;tools-used&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
==== Tools used: ====&lt;br /&gt;
&lt;br /&gt;
* [https://github.com/lh3/minimap2 minimap2]&lt;br /&gt;
* R&lt;br /&gt;
** [https://github.com/tpoorten/dotPlotly/blob/master/pafCoordsDotPlotly.R pafCoordsDotPlotly.R]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| [[File:whole-genome-alignment-workflow.png]]&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| &#039;&#039;Pipeline workflow&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;edit-config.yaml-with-the-paths-to-your-files&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Edit config.yaml with the paths to your files ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;yaml&amp;quot;&amp;gt;# genome alignment parameters:&lt;br /&gt;
GENOME: /path/to/genome #genome fasta to be compared&lt;br /&gt;
COMPARISON_GENOME: &lt;br /&gt;
  &amp;lt;genome1&amp;gt;: path/to/genome1.fasta&lt;br /&gt;
  &amp;lt;genome2&amp;gt;: path/to/genome2.fasta&lt;br /&gt;
  &amp;lt;genome3&amp;gt;: path/to/genome3.fasta&lt;br /&gt;
&lt;br /&gt;
# filter alignments less than cutoff X bp&lt;br /&gt;
MIN_ALIGNMENT_LENGTH: 10000&lt;br /&gt;
MIN_QUERY_LENGTH: 50000&lt;br /&gt;
&lt;br /&gt;
PREFIX: &amp;lt;prefix&amp;gt;&lt;br /&gt;
&lt;br /&gt;
OUTDIR: /path/to/outdir&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
* GENOME: path to the genome fasta file (can be compressed). This is the genome that you want to be compared to all the others&lt;br /&gt;
* COMPARISON_GENOME: genome fasta (can be compressed) for whole genome comparison. Add your species name and the path to the fasta file. ex: chicken: /path/to/chicken.fna.gz. You can add several genomes, one on each line.&lt;br /&gt;
* MIN_ALIGNMENT_LENGTH and MIN_QUERY_LENGTH: plotting parameters. If your plot comes out blank, or the plotting step fails with an error, the alignments are probably not long enough; try lowering these thresholds.&lt;br /&gt;
* PREFIX: name of your species (ex: turkey)&lt;br /&gt;
* OUTDIR: directory where snakemake will run and where the results will be written to&lt;br /&gt;
&lt;br /&gt;
If you want the results to be written to this directory (not to a new directory), comment out or remove&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;OUTDIR: /path/to/outdir&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;span id=&amp;quot;additional-set-up&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== ADDITIONAL SET UP ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;installing-r-packages&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Installing R packages ===&lt;br /&gt;
&lt;br /&gt;
First load R: &amp;lt;code&amp;gt;module load R/4.0.2&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Enter the R environment by typing &amp;lt;code&amp;gt;R&amp;lt;/code&amp;gt; and pressing enter. Install the packages:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;list.of.packages &amp;amp;lt;- c(&amp;amp;quot;optparse&amp;amp;quot;, &amp;amp;quot;data.table&amp;amp;quot;, &amp;amp;quot;ggplot2&amp;amp;quot;)&lt;br /&gt;
new.packages &amp;amp;lt;- list.of.packages[!(list.of.packages %in% installed.packages()[,&amp;amp;quot;Package&amp;amp;quot;])]&lt;br /&gt;
if(length(new.packages)) install.packages(new.packages)&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you get an error like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;Warning in install.packages(new.packages) :&lt;br /&gt;
&#039;lib = &amp;amp;quot;/shared/apps/R/3.6.2/lib64/R/library&amp;amp;quot;&#039; is not writable&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the instructions on how to install R packages locally [https://wiki.anunna.wur.nl/index.php/Installing_R_packages_locally here] and try to install the packages again.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;results&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== RESULTS ==&lt;br /&gt;
&lt;br /&gt;
The most important files and directories are:&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;&amp;lt;run_date&amp;gt;_files.txt&#039;&#039;&#039; dated file with an overview of the files used to run the pipeline (for documentation purposes)&lt;br /&gt;
* &#039;&#039;&#039;genome_alignment/{prefix}&#039;&#039;vs&#039;&#039;{species}.paf&#039;&#039;&#039; paf format file w&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Get_my_bill&amp;diff=2233</id>
		<title>Get my bill</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Get_my_bill&amp;diff=2233"/>
		<updated>2023-06-16T09:01:06Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To estimate costs over a certain time-period the following script can be invoked:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load anunna&lt;br /&gt;
get_my_bill&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By default, the script reports the cost for the current month, up to the present. The output will look similar to this:&lt;br /&gt;
 &lt;br /&gt;
  !!! These results are advisory only !!!&lt;br /&gt;
  User: user001&lt;br /&gt;
  Currently run jobs this month: 1038&lt;br /&gt;
  Total cost so far: 25.19 EUR&lt;br /&gt;
  For account: 12345&lt;br /&gt;
  Jobs: 0 Cost: 0.00 EUR&lt;br /&gt;
  For account: project2&lt;br /&gt;
  Jobs: 104 Cost: 4.61 EUR&lt;br /&gt;
  For account: 56789&lt;br /&gt;
  Jobs: 74 Cost: 6.59 EUR&lt;br /&gt;
  For account: project 4&lt;br /&gt;
  Jobs: 19 Cost: 2.09 EUR&lt;br /&gt;
  For account: project5&lt;br /&gt;
  Jobs: 80 Cost: 0.04 EUR&lt;br /&gt;
  For account: project6&lt;br /&gt;
  Jobs: 738 Cost: 11.86 EUR&lt;br /&gt;
  For account: project7&lt;br /&gt;
  Jobs: 1 Cost: 0.01 EUR&lt;br /&gt;
  For account: project8&lt;br /&gt;
  Jobs: 22 Cost: 0.00 EUR&lt;br /&gt;
  Type            Time                                Current Use Current Cost EUR                &lt;br /&gt;
  home            2015-02-13 23:54:53                    7.131 GB 0.00                            &lt;br /&gt;
  backup          2015-02-06 06:02:43                    4.000 kB 0.00                            &lt;br /&gt;
  nobackup        2015-02-10 10:45:29                    5.348 TB 29.48                           &lt;br /&gt;
  scratch         2015-02-08 13:48:40                    0.233 TB 0.00                            &lt;br /&gt;
  Total this month: 54.67 EUR&lt;br /&gt;
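If you want to post-process output like the sample above yourself, here is a small sketch on inlined sample data (the line format is assumed from the example above; the tool&#039;s output may change):&lt;br /&gt;

```shell
# Sum the per-account 'Cost:' values from get_my_bill-style output.
# Sample lines are inlined here; in practice you would pipe in the real output.
printf '%s\n' \
  '  For account: project2' \
  '  Jobs: 104 Cost: 4.61 EUR' \
  '  For account: 56789' \
  '  Jobs: 74 Cost: 6.59 EUR' \
| awk '/Jobs:/ { total += $4 } END { printf "%.2f EUR\n", total }'   # prints: 11.20 EUR
```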
&lt;br /&gt;
The script supports several options in addition to the default behaviour:&lt;br /&gt;
Options:&lt;br /&gt;
  -h - Show this help message&lt;br /&gt;
  -g - Show results for your entire group&lt;br /&gt;
  -d [disk|compute] - get extra detail on disk/compute usage&lt;br /&gt;
  -t YYYY-MM - Show results for specific month&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Main_Page | Main page AgHPC Wiki]]&lt;br /&gt;
* [[Main_Page#Controlling_costs | Controlling costs @ AgHPC Wiki]]&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Hadoop&amp;diff=2232</id>
		<title>Hadoop</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Hadoop&amp;diff=2232"/>
		<updated>2023-06-16T09:00:40Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hadoop is an architecture that promotes writing code which can be executed in parallel. These days it is more an ecosystem than a single piece of software, as it typically contains:&lt;br /&gt;
&lt;br /&gt;
* HBase, a column-oriented table store&lt;br /&gt;
* Hive, a form of data warehousing&lt;br /&gt;
* MapReduce, a style of coding that keeps workflows largely non-interacting, in order to tolerate slow interconnects between workers&lt;br /&gt;
* HDFS, a distributed filesystem between workers&lt;br /&gt;
And many other components.&lt;br /&gt;
&lt;br /&gt;
This install of Hadoop has been optimised for executing a standalone HDFS filesystem, primarily for teaching purposes.&lt;br /&gt;
&lt;br /&gt;
== WARNING ==&lt;br /&gt;
HDFS is incredibly insecure. Anyone who knows the location of your nameserver can spoof any user and access your data. This can be prevented by using Kerberos tickets; however, that is currently beyond the complexity of this setup. Do not store anything you care about here.&lt;br /&gt;
&lt;br /&gt;
=== Usage ===&lt;br /&gt;
Load the module, along with Java:&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
module load hadoop&lt;br /&gt;
module load java&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will set your environment variables correctly.&lt;br /&gt;
&lt;br /&gt;
There is an example submission script at:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;/shared/apps/hadoop/current/wur/example_sbatch.sh&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Copy this to your preferred location and submit it.&lt;br /&gt;
&lt;br /&gt;
Edit this script and adjust the settings accordingly. You can override the location that HDFS will store its blocks by altering the environment variable HADOOP_TMP_DIR, which will default to /lustre/scratch.&lt;br /&gt;
&lt;br /&gt;
Once it&#039;s run, the output log will tell you the location. I&#039;d recommend setting a static location in your job script so that it&#039;s reproducibly located, but the HADOOP_CONF_DIR will point to the correct fs.defaultFS once you&#039;ve started it.&lt;br /&gt;
&lt;br /&gt;
=== Mechanism details ===&lt;br /&gt;
The example script will run an executable called start-hdfs.sh. This starts by creating ~/.hadoop/$SLURM_JOB_ID, then assigning that as HADOOP_LOG_DIR. Next, it creates a subfolder conf, assigns that to HADOOP_CONF_DIR, and writes the core-site.xml and hdfs-site.xml files into it. Lastly, it creates a symlink from ~/.hadoop/current to your currently running HDFS instance. On the client side, HADOOP_CONF_DIR always points to ~/.hadoop/current/conf, so whichever job is submitted, this symlink resolves to the currently running config for you.&lt;br /&gt;
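The steps above can be sketched in bash. This is an illustration only, not the actual start-hdfs.sh; a temporary directory stands in for your home directory, and the conf files are created empty rather than with real contents:&lt;br /&gt;

```shell
# Illustrative sketch of the start-hdfs.sh mechanism described above.
BASE=$(mktemp -d)                          # stand-in for $HOME in this demo
JOB_ID=${SLURM_JOB_ID:-12345}              # SLURM sets this inside a real job
export HADOOP_LOG_DIR="$BASE/.hadoop/$JOB_ID"
export HADOOP_CONF_DIR="$HADOOP_LOG_DIR/conf"
mkdir -p "$HADOOP_CONF_DIR"
# start-hdfs.sh writes core-site.xml (holding fs.defaultFS) and hdfs-site.xml here;
# empty placeholders stand in for them in this sketch
touch "$HADOOP_CONF_DIR/core-site.xml" "$HADOOP_CONF_DIR/hdfs-site.xml"
# the 'current' symlink always tracks the most recently started instance
ln -sfn "$HADOOP_LOG_DIR" "$BASE/.hadoop/current"
ls "$BASE/.hadoop/current"                 # prints: conf
```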
&lt;br /&gt;
If you&#039;re trying to share your HDFS instance with other users, you should:&lt;br /&gt;
* Copy the ~/.hadoop/current/conf/* to somewhere shared&lt;br /&gt;
* Tell them to set the environment variable HADOOP_CONF_DIR to that location&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Population_structural_variant_calling_pipeline&amp;diff=2231</id>
		<title>Population structural variant calling pipeline</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Population_structural_variant_calling_pipeline&amp;diff=2231"/>
		<updated>2023-06-16T09:00:29Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Author: Carolina Pita Barros &amp;lt;br /&amp;gt; &lt;br /&gt;
Contact: carolina.pitabarros@wur.nl  &amp;lt;br /&amp;gt;&lt;br /&gt;
ABG&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/CarolinaPB/population-structural-var-calling-smoove/tree/single_run Link to the repository]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== First follow the instructions here: ==&lt;br /&gt;
&lt;br /&gt;
[https://carolinapb.github.io/2021-06-23-how-to-run-my-pipelines/ Step by step guide on how to use my pipelines]&amp;lt;br /&amp;gt;&lt;br /&gt;
Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake&lt;br /&gt;
&lt;br /&gt;
== ABOUT ==&lt;br /&gt;
&lt;br /&gt;
This is a pipeline to perform structural variant calling in a population using Smoove. It also runs VEP and performs PCA. In addition to the VCF with the SVs, you also get a .tsv file with some summarized information on the SVs: it includes allele frequency per population, as well as VEP annotation and depth fold change as described in [https://github.com/brentp/duphold duphold]:  &amp;lt;br /&amp;gt; &lt;br /&gt;
&amp;amp;gt; DHBFC: fold-change for the variant depth relative to bins in the genome with similar GC-content.&amp;lt;br /&amp;gt;  &lt;br /&gt;
&amp;amp;gt; DHFFC: fold-change for the variant depth relative to Flanking regions.&lt;br /&gt;
&lt;br /&gt;
==== Tools used: ====&lt;br /&gt;
&lt;br /&gt;
* Smoove - SV calling&lt;br /&gt;
* VEP - determines the effect of the variants&lt;br /&gt;
* Plink - perform PCA&lt;br /&gt;
* R - plot PCA&lt;br /&gt;
* SURVIVOR - basic SV stats&lt;br /&gt;
* Python  &lt;br /&gt;
** PyVcf - add depth to vcf and create final table&lt;br /&gt;
** bamgroupreads.py + samblaster - create bam files with split and discordant reads&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| [[File:Pop-sv-calling-workflow.png]]&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| &#039;&#039;Pipeline workflow&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Edit config.yaml with the paths to your files ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;OUTDIR: /path/to/output &lt;br /&gt;
READS_DIR: /path/to/reads/ # don&#039;t add the reads files, just the directory where they are&lt;br /&gt;
SAMPLE_LIST: /path/to/file&lt;br /&gt;
REFERENCE: /path/to/assembly&lt;br /&gt;
CONTIGS_IGNORE: /path/to/file&lt;br /&gt;
SPECIES: &amp;amp;lt;species_name&amp;amp;gt;&lt;br /&gt;
PREFIX: &amp;amp;lt;output name&amp;amp;gt;&lt;br /&gt;
NUM_CHRS: &amp;amp;lt;number of chromosomes&amp;amp;gt;&lt;br /&gt;
BWA_MEM_M: &amp;amp;lt;Y/N&amp;amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* OUTDIR - directory where snakemake will run and where the results will be written&lt;br /&gt;
* READS_DIR - path to the directory that contains the reads&lt;br /&gt;
* SAMPLE_LIST - three-column csv with the sample name in the first column, the name of the bam file to use in the second column, and the name of the corresponding population in the third column. These bams should all be in the same directory (READS_DIR)&lt;br /&gt;
* Example: &lt;br /&gt;
&amp;lt;blockquote&amp;gt;sample1,sample1.bam,Pop1&amp;lt;br /&amp;gt;&lt;br /&gt;
sample2,sample2.bam,Pop1&amp;lt;br /&amp;gt;&lt;br /&gt;
sample3,sample3.bam,Pop2&amp;lt;br /&amp;gt;&lt;br /&gt;
sample4,sample4.bam,Pop2&amp;lt;br /&amp;gt;&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
Tip: use the name of the bam file without the .bam extension as the sample name. Ex: from sample1.bam to sample1&lt;br /&gt;
* REFERENCE - path to the assembly file&lt;br /&gt;
* CONTIGS_IGNORE - contigs to be excluded from SV calling (usually the small contigs)&lt;br /&gt;
** If you don&#039;t want to exclude contigs you&#039;ll need to edit the Snakefile to remove this line &amp;lt;code&amp;gt;--excludechroms {params.contigs} \&amp;lt;/code&amp;gt;&lt;br /&gt;
* SPECIES - species name to be used for VEP&lt;br /&gt;
* PREFIX - prefix for the created files&lt;br /&gt;
* NUM_CHRS - number of chromosomes for your species (necessary for plink). ex: 38&lt;br /&gt;
* BWA_MEM_M - if you mapped your reads with `bwa mem` using the `-M` parameter and want split-read support in your VCF, an extra step needs to be run. To enable it, write `Y`.  &lt;br /&gt;
For a more detailed explanation see here [https://carolinapb.github.io/2021-10-28-smoove-SR-support/ Smoove SR support]  &lt;br /&gt;
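The sample-naming tip above (use the bam filename without its extension) can be done directly in the shell:&lt;br /&gt;

```shell
# Strip the .bam suffix to derive the sample name (filename is illustrative)
basename sample1.bam .bam    # prints: sample1
```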
&lt;br /&gt;
&lt;br /&gt;
If you want the results written to the directory where the pipeline is run (rather than to a new directory), comment out or remove&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;OUTDIR: /path/to/outdir&amp;lt;/pre&amp;gt;&lt;br /&gt;
== ADDITIONAL SET UP ==&lt;br /&gt;
=== Configuring VEP ===&lt;br /&gt;
This pipeline uses VEP in offline mode, which increases performance. In order to use it in this mode, the cache for the species used needs to be installed: &lt;br /&gt;
Check if the cache file for your species already exists in &amp;lt;code&amp;gt;/lustre/nobackup/SHARED/cache/&amp;lt;/code&amp;gt;. If it doesn’t, create it with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;/usr/bin/perl /shared/apps/SHARED/ensembl-vep/INSTALL.pl --CACHEDIR /lustre/nobackup/SHARED/cache/ --AUTO c -n --SPECIES &amp;amp;lt;species&amp;amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
When multiple assemblies are found you need to run it again with &amp;lt;code&amp;gt;--ASSEMBLY &amp;amp;lt;assembly name&amp;amp;gt;&amp;lt;/code&amp;gt;, where “assembly name” is the name of the assembly you want to use.&lt;br /&gt;
&lt;br /&gt;
==== Other option: ====&lt;br /&gt;
&lt;br /&gt;
You can install VEP with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;conda install -c bioconda ensembl-vep&amp;lt;/pre&amp;gt;&lt;br /&gt;
and install the cache with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;vep_install --CACHEDIR &amp;amp;lt;where/to/install/cache&amp;amp;gt; --AUTO c -n --SPECIES &amp;amp;lt;species&amp;amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
When multiple assemblies are found you need to run it again with &amp;lt;code&amp;gt;--ASSEMBLY &amp;amp;lt;assembly name&amp;amp;gt;&amp;lt;/code&amp;gt;, where “assembly name” is the name of the assembly you want to use.&lt;br /&gt;
&lt;br /&gt;
In the Snakefile, in rule &amp;lt;code&amp;gt;run_vep&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;/shared/apps/SHARED/ensembl-vep/vep&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;vep&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Installing R packages ===&lt;br /&gt;
First load R:&lt;br /&gt;
&amp;lt;pre&amp;gt;module load R/3.6.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
Enter the R environment by typing &amp;lt;code&amp;gt;R&amp;lt;/code&amp;gt; and pressing Enter.  &lt;br /&gt;
Install the packages:&lt;br /&gt;
&amp;lt;pre&amp;gt;list.of.packages &amp;lt;- c(&amp;quot;optparse&amp;quot;, &amp;quot;data.table&amp;quot;, &amp;quot;ggplot2&amp;quot;)&lt;br /&gt;
new.packages &amp;lt;- list.of.packages[!(list.of.packages %in% installed.packages()[,&amp;quot;Package&amp;quot;])]&lt;br /&gt;
if(length(new.packages)) install.packages(new.packages)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you get an error like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;Warning in install.packages(new.packages) :&lt;br /&gt;
  &#039;lib = &amp;quot;/shared/apps/R/3.6.2/lib64/R/library&amp;quot;&#039; is not writable&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the instructions on how to install R packages locally [https://wiki.anunna.wur.nl/index.php/Installing_R_packages_locally here]  and try to install the packages again. &lt;br /&gt;
&lt;br /&gt;
== RESULTS ==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;&amp;lt;run_date&amp;gt;_files.txt&#039;&#039;&#039; Dated file with an overview of the files used to run the pipeline (for documentation purposes)&lt;br /&gt;
* &#039;&#039;&#039;2_merged&#039;&#039;&#039;&lt;br /&gt;
** {prefix}.smoove-counts.html - shows a summary of the number of reads before and after filtering&lt;br /&gt;
* &#039;&#039;&#039;5_postprocessing&#039;&#039;&#039; directory that contains the final VCF file containing the structural variants found. This file has been annotated with VEP&lt;br /&gt;
** {prefix}.smoove.square.vep.vcf.gz - Final VCF - with VEP annotation, not filtered for quality&lt;br /&gt;
** {prefix}.smoove.square.vep.vcf.gz_summary.html - statistics from VEP&lt;br /&gt;
** {prefix}.nosex, {prefix}.log, {prefix}.eigenvec, {prefix}.eigenval - output files from the PCA&lt;br /&gt;
** {prefix}_DUP_DEL_INV_table.tsv - table with the most important information extracted from the VCF. Contains information about the SV, allele frequency for each population, VEP annotation and depth information. The variants have been filtered with Minimum Quality score = 30&lt;br /&gt;
** {prefix}_DUP_DEL_INV.vcf - vcf file with annotated duplications, deletions and inversions. It has been filtered with Minimum Quality score = 30 and the DEPTH* field was added&lt;br /&gt;
** {prefix}_BND.vcf - vcf file with variants annotated with BND&lt;br /&gt;
* &#039;&#039;&#039;6_metrics&#039;&#039;&#039; directory that contains general stats about the number of SVs found&lt;br /&gt;
* &#039;&#039;&#039;FIGURES&#039;&#039;&#039; directory that contains the PCA plot&lt;br /&gt;
&lt;br /&gt;
What you do with the results from this structural variant calling pipeline depends on your research question: a possible next step would be to explore the &#039;&#039;&#039;{prefix}_DUP_DEL_INV_table.tsv&#039;&#039;&#039; file and look at the largest SVs found (sort by &#039;&#039;SVLEN&#039;&#039;) or at a specific effect in the ANNOTATION column, such as “frameshift_variant”.&lt;br /&gt;
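Sorting the table by &#039;&#039;SVLEN&#039;&#039; can be sketched from the shell on made-up data; the column position used here is illustrative, so check the header of your own {prefix}_DUP_DEL_INV_table.tsv for the actual SVLEN column:&lt;br /&gt;

```shell
# Sketch: list SVs largest-first by sorting on an assumed SVLEN column.
tsv=$(mktemp)
printf 'CHROM\tPOS\tSVTYPE\tSVLEN\n'  &gt; "$tsv"   # demo header
printf 'chr1\t100\tDEL\t5000\n'      &gt;&gt; "$tsv"
printf 'chr2\t200\tDUP\t120000\n'    &gt;&gt; "$tsv"
# skip the header, then sort numerically descending on column 4 (SVLEN here)
tail -n +2 "$tsv" | sort -k4,4nr | head
```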
&lt;br /&gt;
See [https://m.ensembl.org/info/genome/variation/prediction/predicted_data.html VEP effect descriptions] for a short description of the effects annotated by VEP.&lt;br /&gt;
&lt;br /&gt;
-----&lt;br /&gt;
&lt;br /&gt;
*The &#039;&#039;&#039;DEPTH&#039;&#039;&#039; field in the vcf contains six values, corresponding to the average depth across all samples.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;DEPTH=(DHBFC_1/1, DHBFC_0/1, DHBFC_0/0, DHFFC_1/1, DHFFC_0/1, DHFFC_0/0)&amp;lt;/pre&amp;gt;&lt;br /&gt;
Depth fold change as described in [https://github.com/brentp/duphold duphold]: &lt;br /&gt;
&amp;lt;blockquote&amp;gt;DHBFC: fold-change for the variant depth relative to bins in the genome with similar GC-content.&amp;lt;br /&amp;gt;&lt;br /&gt;
DHFFC: fold-change for the variant depth relative to Flanking regions.&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
These fields are also in the &amp;lt;code&amp;gt;{prefix}_DUP_DEL_INV_table.tsv&amp;lt;/code&amp;gt; file&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Maker_protocols_Pmajor&amp;diff=2230</id>
		<title>Maker protocols Pmajor</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Maker_protocols_Pmajor&amp;diff=2230"/>
		<updated>2023-06-16T09:00:11Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page describes the various rounds of [http://www.yandell-lab.org/software/maker.html Maker]-based annotations for the [http://en.wikipedia.org/wiki/Parus_major &#039;&#039;Parus major&#039;&#039; (Great Tit)] genome.&lt;br /&gt;
&lt;br /&gt;
== Round 1 == &lt;br /&gt;
=== Rationale ===&lt;br /&gt;
For this round no P. major-based ESTs were available. Zebrafinch (T. guttata) is the closest relative for which a reasonably complete gene-model set is available. As a first pass, it was decided to let gene predictions be driven by ab-initio predictions rather than by Zebrafinch ESTs. &lt;br /&gt;
&lt;br /&gt;
=== Invoking maker script ===&lt;br /&gt;
Do not forget to load the &amp;lt;code&amp;gt;maker&amp;lt;/code&amp;gt; module:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
module load maker/2.28 &lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Script submitted via SLURM (&amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command):&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=48000&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks=16&lt;br /&gt;
#SBATCH --output=output_%j.txt&lt;br /&gt;
#SBATCH --error=error_output_%j.txt&lt;br /&gt;
#SBATCH --job-name=test_maker&lt;br /&gt;
#SBATCH --mail-type=ALL&lt;br /&gt;
#SBATCH --mail-user=hendrik-jan.megens@wur.nl&lt;br /&gt;
maker&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Maker settings ===&lt;br /&gt;
==== content of &amp;lt;code&amp;gt;maker_opts.ctl&amp;lt;/code&amp;gt; ====&lt;br /&gt;
 #-----Genome (these are always required)&lt;br /&gt;
  genome=Pam.fa #genome sequence (fasta file or fasta embeded in GFF3 file)&lt;br /&gt;
  organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic&lt;br /&gt;
  &lt;br /&gt;
  #-----Re-annotation Using MAKER Derived GFF3&lt;br /&gt;
  maker_gff= #MAKER derived GFF3 file&lt;br /&gt;
  est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no&lt;br /&gt;
  &lt;br /&gt;
  #-----EST Evidence (for best results provide a file for at least one)&lt;br /&gt;
  est= #set of ESTs or assembled mRNA-seq in fasta format&lt;br /&gt;
  altest= Taeniopygia_guttata.taeGut3.2.4.74.cdna.all.fa #EST/cDNA sequence file in fasta format from an alternate organism&lt;br /&gt;
  est_gff= #aligned ESTs or mRNA-seq from an external GFF3 file&lt;br /&gt;
  altest_gff= #aligned ESTs from a closly relate species in GFF3 format&lt;br /&gt;
  &lt;br /&gt;
  #-----Protein Homology Evidence (for best results provide a file for at least one)&lt;br /&gt;
  protein= Taeniopygia_guttata.taeGut3.2.4.74.pep.all.fa #protein sequence file in fasta format (i.e. from mutiple oransisms)&lt;br /&gt;
  protein_gff=  #aligned protein homology evidence from an external GFF3 file&lt;br /&gt;
  &lt;br /&gt;
  #-----Repeat Masking (leave values blank to skip repeat masking)&lt;br /&gt;
  model_org=Metazoa #select a model organism for RepBase masking in RepeatMasker&lt;br /&gt;
  rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker&lt;br /&gt;
  repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner&lt;br /&gt;
  rm_gff= #pre-identified repeat elements from an external GFF3 file&lt;br /&gt;
  prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no&lt;br /&gt;
  softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)&lt;br /&gt;
  &lt;br /&gt;
  #-----Gene Prediction&lt;br /&gt;
  snaphmm= /shared/apps/WUR/ABGC/snap/snap-2013-11-29/HMM/mam54.hmm #SNAP HMM file&lt;br /&gt;
  gmhmm= #GeneMark HMM file&lt;br /&gt;
  augustus_species= chicken #Augustus gene prediction species model&lt;br /&gt;
  fgenesh_par_file= #FGENESH parameter file&lt;br /&gt;
  pred_gff= #ab-initio predictions from an external GFF3 file&lt;br /&gt;
  model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)&lt;br /&gt;
  est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no&lt;br /&gt;
  protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no&lt;br /&gt;
  unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no&lt;br /&gt;
  &lt;br /&gt;
  #-----Other Annotation Feature Types (features MAKER doesn&#039;t recognize)&lt;br /&gt;
  other_gff= #extra features to pass-through to final MAKER generated GFF3 file&lt;br /&gt;
  &lt;br /&gt;
  #-----External Application Behavior Options&lt;br /&gt;
  alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases&lt;br /&gt;
  cpus=16 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)&lt;br /&gt;
  &lt;br /&gt;
  #-----MAKER Behavior Options&lt;br /&gt;
  max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)&lt;br /&gt;
  min_contig=1 #skip genome contigs below this length (under 10kb are often useless)&lt;br /&gt;
  &lt;br /&gt;
  pred_flank=200 #flank for extending evidence clusters sent to gene predictors&lt;br /&gt;
  pred_stats=0 #report AED and QI statistics for all predictions as well as models&lt;br /&gt;
  AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)&lt;br /&gt;
  min_protein=0 #require at least this many amino acids in predicted proteins&lt;br /&gt;
  alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no&lt;br /&gt;
  always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no&lt;br /&gt;
  map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no&lt;br /&gt;
  keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)&lt;br /&gt;
  &lt;br /&gt;
  split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)&lt;br /&gt;
  single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no&lt;br /&gt;
  single_length=250 #min length required for single exon ESTs if &#039;single_exon is enabled&#039;&lt;br /&gt;
  correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes&lt;br /&gt;
  &lt;br /&gt;
  tries=2 #number of times to try a contig if there is a failure for some reason&lt;br /&gt;
  clean_try=1 #remove all data from previous run before retrying, 1 = yes, 0 = no&lt;br /&gt;
  clean_up=1 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no&lt;br /&gt;
  TMP= #specify a directory other than the system default temporary directory for temporary files&lt;br /&gt;
&lt;br /&gt;
==== content of &amp;lt;code&amp;gt;maker_exe.ctl&amp;lt;/code&amp;gt; ====&lt;br /&gt;
  #-----Location of Executables Used by MAKER/EVALUATOR&lt;br /&gt;
  makeblastdb=/shared/apps/WUR/ABGC/blast/ncbi-blast-2.2.28+/bin/makeblastdb #location of NCBI+ makeblastdb executable&lt;br /&gt;
  blastn=/shared/apps/WUR/ABGC/blast/ncbi-blast-2.2.28+/bin/blastn #location of NCBI+ blastn executable&lt;br /&gt;
  blastx=/shared/apps/WUR/ABGC/blast/ncbi-blast-2.2.28+/bin/blastx #location of NCBI+ blastx executable&lt;br /&gt;
  tblastx=/shared/apps/WUR/ABGC/blast/ncbi-blast-2.2.28+/bin/tblastx #location of NCBI+ tblastx executable&lt;br /&gt;
  formatdb= #location of NCBI formatdb executable&lt;br /&gt;
  blastall= #location of NCBI blastall executable&lt;br /&gt;
  xdformat= #location of WUBLAST xdformat executable&lt;br /&gt;
  blasta= #location of WUBLAST blasta executable&lt;br /&gt;
  RepeatMasker=/shared/apps/WUR/ABGC/RepeatMasker/RepeatMasker-4-0-3/RepeatMasker #location of RepeatMasker executable&lt;br /&gt;
  exonerate=/shared/apps/WUR/ABGC/exonerate/exonerate-2.2.0-x86_64/bin/exonerate #location of exonerate executable&lt;br /&gt;
  &lt;br /&gt;
  #-----Ab-initio Gene Prediction Algorithms&lt;br /&gt;
  snap=/shared/apps/WUR/ABGC/snap/snap-2013-11-29/snap #location of snap executable&lt;br /&gt;
  gmhmme3= #location of eukaryotic genemark executable&lt;br /&gt;
  gmhmmp= #location of prokaryotic genemark executable&lt;br /&gt;
  augustus=/shared/apps/WUR/ABGC/augustus/augustus.2.7/src/augustus #location of augustus executable&lt;br /&gt;
  fgenesh= #location of fgenesh executable&lt;br /&gt;
  &lt;br /&gt;
  #-----Other Algorithms&lt;br /&gt;
  probuild= #location of probuild executable (required for genemark)&lt;br /&gt;
&lt;br /&gt;
==== contents of &amp;lt;code&amp;gt;maker_bopts.ctl&amp;lt;/code&amp;gt;==== &lt;br /&gt;
  #-----BLAST and Exonerate Statistics Thresholds&lt;br /&gt;
  blast_type=ncbi+ #set to &#039;ncbi+&#039;, &#039;ncbi&#039; or &#039;wublast&#039;&lt;br /&gt;
  &lt;br /&gt;
  pcov_blastn=0.8 #Blastn Percent Coverage Threhold EST-Genome Alignments&lt;br /&gt;
  pid_blastn=0.85 #Blastn Percent Identity Threshold EST-Genome Aligments&lt;br /&gt;
  eval_blastn=1e-10 #Blastn eval cutoff&lt;br /&gt;
  bit_blastn=40 #Blastn bit cutoff&lt;br /&gt;
  depth_blastn=0 #Blastn depth cutoff (0 to disable cutoff)&lt;br /&gt;
  &lt;br /&gt;
  pcov_blastx=0.5 #Blastx Percent Coverage Threhold Protein-Genome Alignments&lt;br /&gt;
  pid_blastx=0.4 #Blastx Percent Identity Threshold Protein-Genome Aligments&lt;br /&gt;
  eval_blastx=1e-06 #Blastx eval cutoff&lt;br /&gt;
  bit_blastx=30 #Blastx bit cutoff&lt;br /&gt;
  depth_blastx=0 #Blastx depth cutoff (0 to disable cutoff)&lt;br /&gt;
  &lt;br /&gt;
  pcov_tblastx=0.8 #tBlastx Percent Coverage Threhold alt-EST-Genome Alignments&lt;br /&gt;
  pid_tblastx=0.85 #tBlastx Percent Identity Threshold alt-EST-Genome Aligments&lt;br /&gt;
  eval_tblastx=1e-10 #tBlastx eval cutoff&lt;br /&gt;
  bit_tblastx=40 #tBlastx bit cutoff&lt;br /&gt;
  depth_tblastx=0 #tBlastx depth cutoff (0 to disable cutoff)&lt;br /&gt;
  &lt;br /&gt;
  pcov_rm_blastx=0.5 #Blastx Percent Coverage Threhold For Transposable Element Masking&lt;br /&gt;
  pid_rm_blastx=0.4 #Blastx Percent Identity Threshold For Transposbale Element Masking&lt;br /&gt;
  eval_rm_blastx=1e-06 #Blastx eval cutoff for transposable element masking&lt;br /&gt;
  bit_rm_blastx=30 #Blastx bit cutoff for transposable element masking&lt;br /&gt;
  &lt;br /&gt;
  ep_score_limit=20 #Exonerate protein percent of maximal score threshold&lt;br /&gt;
  en_score_limit=20 #Exonerate nucleotide percent of maximal score threshold&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
[[Maker_2.2.8 | Maker pipeline as installed on Anunna]]&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [http://www.yandell-lab.org/software/maker.html Maker homepage]&lt;br /&gt;
* [http://gmod.org/wiki/MAKER_Tutorial_2013 Maker tutorial]&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Filesystems&amp;diff=2229</id>
		<title>Filesystems</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Filesystems&amp;diff=2229"/>
		<updated>2023-06-16T08:59:44Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Anunna currently has multiple filesystem mounts that are available cluster-wide:&lt;br /&gt;
&lt;br /&gt;
== Global ==&lt;br /&gt;
* /home - This mount uses NFS to mount the home directories directly from nfs01. Each user has a 200G quota for this filesystem, as it is regularly backed up to tape, and can reliably be restored from up to a week&#039;s history.&lt;br /&gt;
&lt;br /&gt;
* /shared - This mount provides a consistent set of binaries for the entire cluster.&lt;br /&gt;
&lt;br /&gt;
* /lustre - This large mount uses the Lustre filesystem to provide files from multiple redundant servers. Access is provided per group, thus:&lt;br /&gt;
 /lustre/[level]/[partner]/[unit]&lt;br /&gt;
e.g.&lt;br /&gt;
 /lustre/backup/WUR/ABGC/&lt;br /&gt;
It comprises three major parts (and some minor ones):&lt;br /&gt;
* /lustre/backup - In case of disaster, this data is stored a second time on a separate machine. Whilst this backup is purely in case of complete tragedy (such as some immense filesystem error, or multiple component failure), it can potentially be used to revert mistakes if you are very fast about reporting them. There is however no guarantee of this service.&lt;br /&gt;
* /lustre/nobackup - This is the &#039;normal&#039; filesystem for Lustre - no backups, just data stored on the filesystem. Because no backup is kept, data here costs less than under /lustre/backup, but in case of disaster it cannot be recovered.&lt;br /&gt;
* /lustre/scratch - Files here may be removed after some time if the filesystem gets too full (Typically 30 days). You should tidy up this data yourself once work is complete.&lt;br /&gt;
* /lustre/shared - Same as /lustre/backup, except publicly available. This is where truly shared data lives that isn&#039;t assigned to a specific group.&lt;br /&gt;
&lt;br /&gt;
=== Private shared directories ===&lt;br /&gt;
If you are working with a group of users on a similar project, you might consider making a [[Shared_folders|Shared directory]] to coordinate. Information on how to do so is in the linked article.&lt;br /&gt;
&lt;br /&gt;
== Local ==&lt;br /&gt;
Specific to certain machines are some other filesystems that are available to you:&lt;br /&gt;
* /archive - an archive mount only accessible from the login nodes. Files here are sent to the Isilon for deeper storage. The cost of storing data here is much less than on the Lustre, but it cannot be used for compute work. This location is only available to WUR users. Files can be reverted via snapshot, and there is a separate backup; however, this only runs at fortnightly (14-day) intervals.&lt;br /&gt;
&lt;br /&gt;
* /tmp - Each worker node has a /tmp mount that can be used for temporary local caching. Be sure to clean it up afterwards, lest your files become a hindrance to other users. You can request a node with sufficient free space in your sbatch script like so:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
#SBATCH --tmp=&amp;lt;required space&amp;gt;&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
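As an illustrative sketch (the directory name and compute step are hypothetical), a job can stage its work into node-local /tmp and guarantee cleanup with a trap:

```shell
#!/bin/bash
#SBATCH --tmp=4G
# Hypothetical sketch: stage I/O-heavy work into node-local /tmp.
workdir=$(mktemp -d /tmp/myjob.XXXXXX)
trap 'rm -rf "$workdir"' EXIT   # clean up even if the job fails mid-way
echo "working in $workdir"
# ... copy inputs into "$workdir", run the compute step there,
# ... then copy results back to /lustre before the job ends
```

The trap ensures /tmp is tidied up whether the job succeeds or not, so your files never linger to hinder other users.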
&lt;br /&gt;
&lt;br /&gt;
* /dev/shm - On each worker you can also write to /dev/shm, a virtual filesystem held directly in memory, for extremely fast data access. Be advised that data stored here counts against your job&#039;s memory allocation, but it is the fastest filesystem available.&lt;br /&gt;
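For example (file names illustrative), a RAM-backed scratch area can be created and released like this; everything written under /dev/shm counts against the job's memory:

```shell
# /dev/shm behaves like an ordinary directory but lives in RAM.
shmdir=$(mktemp -d /dev/shm/myjob.XXXXXX)
dd if=/dev/zero of="$shmdir/block.bin" bs=1M count=4 status=none
ls -lh "$shmdir/block.bin"
rm -rf "$shmdir"   # free the memory as soon as the data is no longer needed
```

Removing the directory promptly matters more here than on disk: the space it occupies is your job's RAM.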
&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Tariffs | Costs associated with resource usage]]&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [http://wiki.lustre.org/index.php/Main_Page Lustre website]&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Environment_Modules&amp;diff=2228</id>
		<title>Environment Modules</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Environment_Modules&amp;diff=2228"/>
		<updated>2023-06-16T08:59:32Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Preface ==&lt;br /&gt;
Environment modules [http://modules.sourceforge.net] are a smart way to provide interchangeable blocks of executables and reproducible environments for use in an HPC. It&#039;s also the only way to provide simultaneous versions of the same software without collisions, as each module is housed entirely in its own subfolder structure.&lt;br /&gt;
&lt;br /&gt;
== Using modules ==&lt;br /&gt;
The module executable is automatically provided to you upon login. Most users have some modules automatically loaded as well; to see these, use&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module list&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This shows which modules are currently loaded in your session.&lt;br /&gt;
&lt;br /&gt;
One of the most important modules to load is &#039;shared&#039; - this is Anunna-specific: it extends the MODULEPATH environment variable to include modules under /shared as well as /cm/local/. Without it, many modules will not be available to you.&lt;br /&gt;
&lt;br /&gt;
== Loading modules ==&lt;br /&gt;
Availability of (different versions of) software can be checked by the following command:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module avail&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For example, you should be able to find the basic module slurm. This provides the path to the sbatch, srun, etc. executables for job submission. To load this, simply:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module load slurm&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This automatically loads the latest version - there is no need to spell out the full version string.&lt;br /&gt;
&lt;br /&gt;
Many of the hand-installed programs have a path such as:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;hdf5/gcc/64/1.8.14&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
which translates to:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SOFTWARE/COMPILER/BITS/VERSION&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can elect to load this to various levels:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module load hdf5               # loads the latest version, not caring for compiler&lt;br /&gt;
module load hdf5/gcc           # loads the latest gcc-compiled version, not caring for 32/64 bits (default 64)&lt;br /&gt;
module load hdf5/gcc/64        # loads the latest 64-bit gcc-compiled version&lt;br /&gt;
module load hdf5/gcc/64/1.8.14 # loads this specific version of hdf5&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This allows your job scripts to either automatically be upgraded when the latest executables are installed, or elect to use only one specific version of a piece of code.&lt;br /&gt;
&lt;br /&gt;
== Switching modules ==&lt;br /&gt;
&lt;br /&gt;
If you want to remove a module, simply&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module unload module/1&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This removes the executable path from your environment. It follows the same logic as above: you can unload all loaded slurm modules, independent of version, by unloading just the base module name. You can then load a new one, or do both in one command:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module switch module/1 module/2&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some modules will not allow themselves to be loaded when another one is loaded, for instance, for sanity reasons it&#039;s not possible to load two java modules at the same time. Trying to do this will give:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Module &#039;module/2&#039; conflicts with the currently loaded module(s) &#039;module/1&#039;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
If you&#039;re seeing this, you must unload or switch your modules rather than overloading them:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;module switch module/2&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This works if both modules have the same base path.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* http://modules.sourceforge.net&lt;br /&gt;
* http://www.admin-magazine.com/HPC/Articles/Environment-Modules&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Population_variant_calling_pipeline&amp;diff=2227</id>
		<title>Population variant calling pipeline</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Population_variant_calling_pipeline&amp;diff=2227"/>
		<updated>2023-06-16T08:59:21Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Author: Carolina Pita Barros &amp;lt;br /&amp;gt; &lt;br /&gt;
Contact: carolina.pitabarros@wur.nl  &amp;lt;br /&amp;gt;&lt;br /&gt;
ABG&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For up-to-date documentation see [https://github.com/CarolinaPB/population-variant-calling here]&lt;br /&gt;
&lt;br /&gt;
= Population level variant calling =&lt;br /&gt;
&lt;br /&gt;
Path to pipeline: /lustre/nobackup/WUR/ABGC/shared/PIPELINES/population-variant-calling&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;first-follow-the-instructions-here&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== First follow the instructions here ==&lt;br /&gt;
&lt;br /&gt;
[https://carolinapb.github.io/2021-06-23-how-to-run-my-pipelines/ Step by step guide on how to use my pipelines]&amp;lt;br /&amp;gt;&lt;br /&gt;
Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;about&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== ABOUT ==&lt;br /&gt;
&lt;br /&gt;
This is a pipeline that takes short reads aligned to a genome (in &amp;lt;code&amp;gt;.bam&amp;lt;/code&amp;gt; format) and performs population-level variant calling with &amp;lt;code&amp;gt;Freebayes&amp;lt;/code&amp;gt;. It uses VEP to annotate the resulting VCF, calculates statistics, and computes and plots a PCA.&lt;br /&gt;
&lt;br /&gt;
It was developed to work with the results of [https://github.com/CarolinaPB/population-mapping this population mapping pipeline]. There are a few &amp;lt;code&amp;gt;Freebayes&amp;lt;/code&amp;gt; requirements that you need to take into account if you don&#039;t use the mapping pipeline mentioned above to map your reads. You should make sure that:&lt;br /&gt;
&lt;br /&gt;
* Alignments have read groups&lt;br /&gt;
* Alignments are sorted&lt;br /&gt;
* Duplicates are marked&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/freebayes/freebayes#calling-variants-from-fastq-to-vcf here] for more details.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;tools-used&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
==== Tools used ====&lt;br /&gt;
&lt;br /&gt;
* [https://github.com/freebayes/freebayes Freebayes] - variant calling using short reads&lt;br /&gt;
* [https://samtools.github.io/bcftools/bcftools.html bcftools] - vcf statistics&lt;br /&gt;
* [https://www.cog-genomics.org/plink/ Plink] - compute PCA&lt;br /&gt;
* R - Plot PCA&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
!align=&amp;quot;center&amp;quot;| [[File:population-var-calling-workflow.png]]&lt;br /&gt;
|-&lt;br /&gt;
|align=&amp;quot;center&amp;quot;| &#039;&#039;Pipeline workflow&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;edit-configyaml-with-the-paths-to-your-files&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Edit config.yaml with the paths to your files ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;yaml&amp;quot;&amp;gt;ASSEMBLY: /path/to/fasta&lt;br /&gt;
MAPPING_DIR: /path/to/bams/dir&lt;br /&gt;
PREFIX: &amp;lt;prefix&amp;gt;&lt;br /&gt;
OUTDIR: /path/to/outdir&lt;br /&gt;
SPECIES: &amp;lt;species&amp;gt;&lt;br /&gt;
NUM_CHRS: &amp;lt;number of chromosomes&amp;gt;&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
* ASSEMBLY - path to genome fasta file&lt;br /&gt;
* MAPPING_DIR - path to directory with bam files to be used&lt;br /&gt;
** the pipeline will use all bam files in the directory; if you want to use only a subset, create a file named &amp;lt;code&amp;gt;bam_list.txt&amp;lt;/code&amp;gt; containing the paths to the bam files you want to use, one path per line.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;text&amp;quot;&amp;gt;/path/to/file.bam&lt;br /&gt;
/path/to/file2.bam&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
* PREFIX - prefix for the created files&lt;br /&gt;
* OUTDIR - directory where snakemake will run and where the results will be written to&amp;lt;br /&amp;gt;&lt;br /&gt;
If you want the results to be written to this directory (not to a new directory), open config.yaml and comment out &amp;lt;code&amp;gt;OUTDIR: /path/to/outdir&amp;lt;/code&amp;gt;&lt;br /&gt;
* SPECIES - species name to be used for VEP&lt;br /&gt;
* NUM_CHRS - number of chromosomes for your species (needed by plink), e.g. 38&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;additional-set-up&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== ADDITIONAL SET UP ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;configuring-vep&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Configuring VEP ===&lt;br /&gt;
&lt;br /&gt;
This pipeline uses VEP in offline mode, which increases performance. In order to use it in this mode, the cache for the species used needs to be installed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;for-people-using-wurs-anunna&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
==== For people using WUR&#039;s Anunna: ====&lt;br /&gt;
&lt;br /&gt;
Check if the cache file for your species already exists in &amp;lt;code&amp;gt;/lustre/nobackup/SHARED/cache/&amp;lt;/code&amp;gt;. If it doesn&#039;t, create it with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;/usr/bin/perl /shared/apps/SHARED/ensembl-vep/INSTALL.pl --CACHEDIR /lustre/nobackup/SHARED/cache/ --AUTO c -n --SPECIES &amp;amp;lt;species&amp;amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
When multiple assemblies are found you need to run it again with &amp;lt;code&amp;gt;--ASSEMBLY &amp;amp;lt;assembly name&amp;amp;gt;&amp;lt;/code&amp;gt;, where &amp;amp;quot;assembly name&amp;amp;quot; is the name of the assembly you want to use.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;for-those-not-from-wur&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
==== For those not from WUR: ====&lt;br /&gt;
&lt;br /&gt;
You can install VEP with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;conda install -c bioconda ensembl-vep&amp;lt;/pre&amp;gt;&lt;br /&gt;
and install the cache with&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;vep_install --CACHEDIR &amp;amp;lt;where/to/install/cache&amp;amp;gt; --AUTO c -n --SPECIES &amp;amp;lt;species&amp;amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
When multiple assemblies are found you need to run it again with &amp;lt;code&amp;gt;--ASSEMBLY &amp;amp;lt;assembly name&amp;amp;gt;&amp;lt;/code&amp;gt;, where &amp;amp;quot;assembly name&amp;amp;quot; is the name of the assembly you want to use.&lt;br /&gt;
&lt;br /&gt;
In the Snakefile, in rule &amp;lt;code&amp;gt;run_vep&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;/shared/apps/SHARED/ensembl-vep/vep&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;vep&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;installing-r-packages&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Installing R packages ===&lt;br /&gt;
&lt;br /&gt;
First load R: &amp;lt;code&amp;gt;module load R/3.6.2&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Enter the R environment by typing &amp;lt;code&amp;gt;R&amp;lt;/code&amp;gt; and pressing Enter. Install the packages:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;list.of.packages &amp;amp;lt;- c(&amp;amp;quot;optparse&amp;amp;quot;, &amp;amp;quot;data.table&amp;amp;quot;, &amp;amp;quot;ggplot2&amp;amp;quot;)&lt;br /&gt;
&lt;br /&gt;
new.packages &amp;amp;lt;- list.of.packages[!(list.of.packages %in% installed.packages()[,&amp;amp;quot;Package&amp;amp;quot;])]&lt;br /&gt;
if(length(new.packages)) install.packages(new.packages)&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you get an error like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;Warning in install.packages(new.packages) :&lt;br /&gt;
&#039;lib = &amp;amp;quot;/shared/apps/R/3.6.2/lib64/R/library&amp;amp;quot;&#039; is not writable&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the instructions on how to install R packages locally [https://wiki.anunna.wur.nl/index.php/Installing_R_packages_locally here] and try to install the packages again.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;results&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
== RESULTS ==&lt;br /&gt;
&lt;br /&gt;
The most important files and directories are:&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;&amp;amp;lt;run_date&amp;amp;gt;_files.txt&#039;&#039;&#039; dated file with an overview of the files used to run the pipeline (for documentation purposes)&lt;br /&gt;
* &#039;&#039;&#039;results&#039;&#039;&#039; directory that contains&lt;br /&gt;
** &#039;&#039;&#039;final_VCF&#039;&#039;&#039; directory with variant calling VCF files, as well as VCF stats&lt;br /&gt;
*** {prefix}.vep.vcf.gz - final VCF file&lt;br /&gt;
*** {prefix}.vep.vcf.gz.stats&lt;br /&gt;
** &#039;&#039;&#039;PCA&#039;&#039;&#039; PCA results and plot&lt;br /&gt;
*** {prefix}.eigenvec and {prefix}.eigenval - file with PCA eigenvectors and eigenvalues, respectively&lt;br /&gt;
*** {prefix}.pdf - PCA plot&lt;br /&gt;
&lt;br /&gt;
The VCF file has been filtered for &amp;lt;code&amp;gt;QUAL &amp;amp;gt; 20&amp;lt;/code&amp;gt;. Freebayes is run with parameters &amp;lt;code&amp;gt;--use-best-n-alleles 4 --min-base-quality 10 --min-alternate-fraction 0.2 --haplotype-length 0 --ploidy 2 --min-alternate-count 2&amp;lt;/code&amp;gt;. These parameters can be changed in the Snakefile.&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Using_environment_modules&amp;diff=2226</id>
		<title>Using environment modules</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Using_environment_modules&amp;diff=2226"/>
		<updated>2023-06-16T08:59:02Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Environment Modules ===&lt;br /&gt;
[http://modules.sourceforge.net/ Environment modules] are a simple way to allow multiple potentially clashing programs to coexist on a large shared machine such as an HPC. They let a user specify exactly which programs - and even which version of each - are loaded, while also letting the administrator automatically configure the appropriate environment variables for the system.&lt;br /&gt;
&lt;br /&gt;
== Viewing Modules ==&lt;br /&gt;
Upon logging in to Anunna, run:&lt;br /&gt;
  module list&lt;br /&gt;
&lt;br /&gt;
You will see something like this:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
-bash-4.1$ module list&lt;br /&gt;
Currently Loaded Modulefiles:&lt;br /&gt;
  1) shared        2) slurm/2.5.7&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is a list of all loaded modules in your shell session. To get a list of all available modules, simply&lt;br /&gt;
   module avail&lt;br /&gt;
&lt;br /&gt;
And this will show you the (very exhaustive) list of modules on Anunna:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source  lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
-bash-4.1$ module avail&lt;br /&gt;
&lt;br /&gt;
---------------------------- /shared/modulefiles ----------------------------&lt;br /&gt;
acml/gcc/64/5.3.1                     netcdf/gcc/64/4.1.3&lt;br /&gt;
acml/gcc/fma4/5.3.1                   netcdf/gcc/64/4.3.0&lt;br /&gt;
acml/gcc/mp/64/5.3.1                  netcdf/gcc/64/4.3.2&lt;br /&gt;
acml/gcc/mp/fma4/5.3.1                netcdf/gcc/64/4.3.3&lt;br /&gt;
acml/gcc-int64/64/5.3.1               netcdf/gcc/64/4.3.3.1&lt;br /&gt;
acml/gcc-int64/fma4/5.3.1             netcdf/intel/64/4.1.3&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Let&#039;s look at these module names. Each module is named for the application it provides, followed by the compiler used to build it (if compiled), the number of address bits or build options (if compiled), and the version.&lt;br /&gt;
&lt;br /&gt;
If you want to see a list for a specific module, you can&lt;br /&gt;
  module avail netcdf&lt;br /&gt;
&lt;br /&gt;
And the complete list of versions will be shown.&lt;br /&gt;
&lt;br /&gt;
== Loading Modules ==&lt;br /&gt;
To load a module, simply&lt;br /&gt;
  module load foo&lt;br /&gt;
&lt;br /&gt;
And the most recent version of module foo will automatically be loaded. If foo is compiled, it will automatically select the gcc version. If you want to specify a certain version, then&lt;br /&gt;
  module load foo/gcc/64/1.0.0&lt;br /&gt;
&lt;br /&gt;
Will load foo version 1.0.0, compiled with gcc for 64 bits. Be advised that this may not always work, as some modules are incompatible with each other; a message will be shown if this is the case. Additionally, some modules automatically load the other modules they need in order to operate.&lt;br /&gt;
&lt;br /&gt;
== Unloading Modules ==&lt;br /&gt;
If you want to remove a module that you&#039;ve loaded, then&lt;br /&gt;
  module unload foo&lt;br /&gt;
&lt;br /&gt;
Will remove all loaded versions of module foo.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
Consider this simple Python 3 script that calculates Pi to 1,000 digits using the Bailey-Borwein-Plouffe formula:&lt;br /&gt;
&amp;lt;source lang=&#039;python&#039;&amp;gt;&lt;br /&gt;
from decimal import *&lt;br /&gt;
D=Decimal&lt;br /&gt;
getcontext().prec=1010  # a few guard digits beyond the 1,000 printed&lt;br /&gt;
p=sum(D(1)/16**k*(D(4)/(8*k+1)-D(2)/(8*k+4)-D(1)/(8*k+5)-D(1)/(8*k+6))for k in range(840))&lt;br /&gt;
print(str(p)[:1002])&lt;br /&gt;
&amp;lt;/source&amp;gt; &lt;br /&gt;
&lt;br /&gt;
This script will not run at all in the default 2.4 version of Python on the cluster. In order for this script to run you must use Python3. To do this, first list all versions of Python:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
-bash-4.1$ module avail python&lt;br /&gt;
&lt;br /&gt;
---------------------------- /shared/modulefiles ----------------------------&lt;br /&gt;
python/2.7.6 python/3.3.3 python/3.4.2&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you can load the specific version you need:&lt;br /&gt;
  module load python/3.3.3&lt;br /&gt;
&lt;br /&gt;
Now you have access to the executable python3.&lt;br /&gt;
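A quick sanity check that the loaded module's interpreter is the one found first on your PATH (the exact path printed depends on the module):

```shell
# Confirm python3 is on PATH and is indeed a Python 3 interpreter.
command -v python3
python3 -c 'import sys; print(sys.version_info.major)'
```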
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Environment_Modules | Environment Modules]]&lt;br /&gt;
* [[Control_R_environment_using_modules | Control R environment using modules]]&lt;br /&gt;
* [[Create_shortcut_log-in_command | Create a shortcut for the ssh log-in command]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* http://modules.sourceforge.net &lt;br /&gt;
* https://modules.readthedocs.io/en/latest/ (documentation)&lt;br /&gt;
* http://www.admin-magazine.com/HPC/Articles/Environment-Modules&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Assemble_mitochondrial_genomes_from_short_read_data&amp;diff=2225</id>
		<title>Assemble mitochondrial genomes from short read data</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Assemble_mitochondrial_genomes_from_short_read_data&amp;diff=2225"/>
		<updated>2023-06-16T08:58:46Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A simple procedure for assembling mitochondrial genomes from whole-genome re-sequencing data. The first step is to extract reads from the sequence library based on a closely related, fully assembled genome (e.g., for pig, the MT genome present in the genome build, though it could also be from a related species). The genome is then assembled using SOAPdenovo. You will need:&lt;br /&gt;
&lt;br /&gt;
* a reference genome of a closely related population or species.&lt;br /&gt;
* a bowtie2 index (made with bowtie2-build)&lt;br /&gt;
* a blastable db of the reference mitochondrial genome&lt;br /&gt;
* a SOAPdenovo configuration file:&lt;br /&gt;
&lt;br /&gt;
soapdenovo.config&lt;br /&gt;
  [LIB]&lt;br /&gt;
  avg_ins=450&lt;br /&gt;
  reverse_seq=0&lt;br /&gt;
  asm_flags=1&lt;br /&gt;
  rank=3&lt;br /&gt;
  q1=fq1.fq&lt;br /&gt;
  q2=fq2.fq&lt;br /&gt;
Note that the avg_ins value may vary between libraries, which can affect assembly efficiency.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --time=1000&lt;br /&gt;
#SBATCH --mem=16000&lt;br /&gt;
#SBATCH --ntasks=8&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --constraint=4gpercpu&lt;br /&gt;
#SBATCH --output=output_%j.txt&lt;br /&gt;
#SBATCH --error=error_output_%j.txt&lt;br /&gt;
#SBATCH --job-name=assemble_mito&lt;br /&gt;
#SBATCH --mail-type=ALL&lt;br /&gt;
#SBATCH --mail-user=hendrik-jan.megens@wur.nl&lt;br /&gt;
module load bowtie/2-2.2.1 SOAPdenovo2/r240 BLAST+/2.2.28 MUMmer/3.23 &lt;br /&gt;
&lt;br /&gt;
bowtie2 --phred$2 --local -p 8 -x mt_pig.fa -1 $3 -2 $4 | head -2 &amp;gt;$1_mito_align.sam&lt;br /&gt;
bowtie2 --phred$2 --local -p 8 -x mt_pig.fa -1 $3 -2 $4 | awk &#039;$5&amp;gt;0&#039; | head -10000 &amp;gt;&amp;gt;$1_mito_align.sam&lt;br /&gt;
&lt;br /&gt;
java7 -jar /shared/apps/SHARED/picard-tools/picard-tools-1.109/SamToFastq.jar I=$1_mito_align.sam F=fq1.fq F2=fq2.fq INCLUDE_NON_PF_READS=True&lt;br /&gt;
&lt;br /&gt;
SOAPdenovo-63mer all -K 63 -p 4 -s soapdenovo.config -o $1_mito_assembly.fa&lt;br /&gt;
&lt;br /&gt;
blastn -query $1_mito_assembly.fa.scafSeq -db mt_pig.fa -outfmt 6&lt;br /&gt;
&lt;br /&gt;
mummer -mum -b -c mt_pig.fa $1_mito_assembly.fa.scafSeq &amp;gt; mummer.mums&lt;br /&gt;
mummerplot -postscript -p mummer mummer.mums&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Invoke like this:&lt;br /&gt;
&amp;lt;source lang=&#039;bash&#039;&amp;gt;&lt;br /&gt;
sbatch do_mtalign_bowtie_pig.sh MA01F18 33\&lt;br /&gt;
 /lustre/nobackup/WUR/ABGC/shared/Pig/ABGSA/ABGSA0071/ABGSA0071_MA01F18_R1.PF.fastq.gz\&lt;br /&gt;
 /lustre/nobackup/WUR/ABGC/shared/Pig/ABGSA/ABGSA0071/ABGSA0071_MA01F18_R2.PF.fastq.gz&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Globally_installed_software&amp;diff=2224</id>
		<title>Globally installed software</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Globally_installed_software&amp;diff=2224"/>
		<updated>2023-06-16T08:58:33Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Available as modules ==&lt;br /&gt;
* acml&lt;br /&gt;
* alphaimpute&lt;br /&gt;
* bamtools&lt;br /&gt;
* bcftools&lt;br /&gt;
* beagle&lt;br /&gt;
* blacs&lt;br /&gt;
* blas&lt;br /&gt;
* [[BLAST | BLAST+]]&lt;br /&gt;
* bonnie++&lt;br /&gt;
* boost&lt;br /&gt;
* [[bowtie2 | bowtie]]&lt;br /&gt;
* [[canu | canu]]&lt;br /&gt;
* cdo&lt;br /&gt;
* cmgui&lt;br /&gt;
* [[cuda | cuda]]&lt;br /&gt;
* diamond&lt;br /&gt;
* [[dmtcp | dmtcp]]&lt;br /&gt;
* emos&lt;br /&gt;
* fasttree&lt;br /&gt;
* ferret&lt;br /&gt;
* ffmpeg&lt;br /&gt;
* fftw2&lt;br /&gt;
* fftw3&lt;br /&gt;
* flex&lt;br /&gt;
* freebayes&lt;br /&gt;
* gcc&lt;br /&gt;
* gcta&lt;br /&gt;
* gdal&lt;br /&gt;
* gdb&lt;br /&gt;
* geos&lt;br /&gt;
* glibc&lt;br /&gt;
* glimmer&lt;br /&gt;
* glimmerHMM&lt;br /&gt;
* globalarrays&lt;br /&gt;
* grads&lt;br /&gt;
* grib&lt;br /&gt;
* gsl&lt;br /&gt;
* [[hadoop | hadoop ]]&lt;br /&gt;
* hdf4&lt;br /&gt;
* hdf5&lt;br /&gt;
* [[hmmer | hmmer]]&lt;br /&gt;
* hpl&lt;br /&gt;
* htslib&lt;br /&gt;
* hwloc&lt;br /&gt;
* ima2p&lt;br /&gt;
* intel&lt;br /&gt;
* intel-cluster-checker&lt;br /&gt;
* intel-cluster-runtime&lt;br /&gt;
* intel-tbb-oss&lt;br /&gt;
* iozone&lt;br /&gt;
* jasper&lt;br /&gt;
* java&lt;br /&gt;
* julia&lt;br /&gt;
* lapack&lt;br /&gt;
* [[matlab | MATLAB]]&lt;br /&gt;
* mixblup&lt;br /&gt;
* mpich&lt;br /&gt;
* mpiexec&lt;br /&gt;
* MRO&lt;br /&gt;
* muscle&lt;br /&gt;
* mvapich&lt;br /&gt;
* mvapich2&lt;br /&gt;
* ncbi-blast&lt;br /&gt;
* ncl&lt;br /&gt;
* nco&lt;br /&gt;
* ncview&lt;br /&gt;
* netcdf&lt;br /&gt;
* netcdf3-c++&lt;br /&gt;
* netperf&lt;br /&gt;
* octave&lt;br /&gt;
* open64&lt;br /&gt;
* openblas&lt;br /&gt;
* openlava&lt;br /&gt;
* openmpi&lt;br /&gt;
* oracle-instantclient&lt;br /&gt;
* phdf5&lt;br /&gt;
* picard&lt;br /&gt;
* plink&lt;br /&gt;
* prodigal&lt;br /&gt;
* python&lt;br /&gt;
* R&lt;br /&gt;
* ragel&lt;br /&gt;
* [[RAxML | RAxML]]&lt;br /&gt;
* RRO&lt;br /&gt;
* rstudio&lt;br /&gt;
* [[samtools | samtools]]&lt;br /&gt;
* scalapack&lt;br /&gt;
* slurm&lt;br /&gt;
* slurm-drmaa&lt;br /&gt;
* snpEff&lt;br /&gt;
* [[spark | SPARK]]&lt;br /&gt;
* szip&lt;br /&gt;
* torque&lt;br /&gt;
* [[Trinity | trinity]]&lt;br /&gt;
* udunits&lt;br /&gt;
* vcftools&lt;br /&gt;
* vim&lt;br /&gt;
* zlib&lt;br /&gt;
&lt;br /&gt;
== Globally installed on all nodes ==&lt;br /&gt;
&lt;br /&gt;
* Perl5.10&lt;br /&gt;
* pigz&lt;br /&gt;
* Python2.6&lt;br /&gt;
* BioPerl v1.61&lt;br /&gt;
* [http://samtools.sourceforge.net/tabix.shtml bgzip]&lt;br /&gt;
* [http://samtools.sourceforge.net/tabix.shtml tabix v0.2.5]&lt;br /&gt;
&lt;br /&gt;
== Available as global SHARED modules ==&lt;br /&gt;
Software can be deposited in:&lt;br /&gt;
  /shared/apps/SHARED/&lt;br /&gt;
&lt;br /&gt;
Modules can be found in:&lt;br /&gt;
  /shared/modulefiles/SHARED/&lt;br /&gt;
&lt;br /&gt;
{| width=&amp;quot;90%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [[allpathslg_48961 | ALLPATHS-LG/48961]]&lt;br /&gt;
* [[allpathslg 51910 | ALLPATHS-LG/51910]]&lt;br /&gt;
* [[ANGSD_0.614 | angsd/0.614]]&lt;br /&gt;
* [[augustus_2.7 | augustus/2.7]]&lt;br /&gt;
* [[bedtools2.18 | bedtools/2.18.0]]&lt;br /&gt;
* [[BLAST | BLAST+/2.2.28]]   &lt;br /&gt;
* [[blat_v35 |blat/v35]]&lt;br /&gt;
* [[bowtie2_v2.2.1 | bowtie/2-2.2.1]]&lt;br /&gt;
* [[bowtie1_v1.0.0 | bowtie/1-1.0.0]]&lt;br /&gt;
* [[bwa_5.9 | bwa/0.5.9]]   &lt;br /&gt;
* [[bwa_7.5 | bwa/0.7.5a]]&lt;br /&gt;
* [[cegma_2.4 | cegma/2.4]]   &lt;br /&gt;
* [[Cufflinks | cufflinks/2.1.1]]&lt;br /&gt;
&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [[exonerate_2.2.0 | exonerate/2.2.0-x86_64]] &lt;br /&gt;
* [[geneid_1.4.4 | geneid/1.4.4]]&lt;br /&gt;
* [[genewise_2.2.3 | genewise/2.2.3-rc7]]     &lt;br /&gt;
* [[gmap_2014-01-21 | gmap/2014-01-21]]&lt;br /&gt;
* [[hmmer_3.1 | hmmer/3.1b1]]&lt;br /&gt;
* [[jellyfish_2.1.1 | jellyfish/2.1.1]]&lt;br /&gt;
* [[MAFFT_7.130 | MAFFT/7.130]]&lt;br /&gt;
* [[maker_2.2.8 | maker/2.28]]&lt;br /&gt;
* [[Muscle_3.8.31 | muscle/3.8.31]]     &lt;br /&gt;
* [[Plink_1.07 | Plink/1.07]]&lt;br /&gt;
* [[Plink_1.9 | Plink/1.9]]&lt;br /&gt;
* [[Provean_1.1.3 | provean/1.1.3]]  &lt;br /&gt;
* [[RepeatMasker_4.0.3 | RepeatMasker/4.0.3]]&lt;br /&gt;
&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [[RepeatModeler_1.0.7 | RepeatModeler/1.0.7]]&lt;br /&gt;
* [[RAxML8.0.0 | RAxML/8.0.0]]&lt;br /&gt;
* [[samtools v0.1.12a | samtools/0.1.12a]]&lt;br /&gt;
* [[samtools v0.1.19 | samtools/0.1.19]]&lt;br /&gt;
* [[snap | snap/2013-11-29]]&lt;br /&gt;
* [[soapdenovo2_r240 | SOAPdenovo2/r240]]&lt;br /&gt;
* [[sra_toolkit_2.3.4 | sra-toolkit/2.3.4]]&lt;br /&gt;
* [[TopHat_2.0.11 | tophat/2.0.11]]&lt;br /&gt;
* [[Trinity_r20131110 | Trinity/r20131110]]&lt;br /&gt;
* [[wgs_assembler_8.1 | wgs-assembler/8.1]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Adding a module to your current session ==&lt;br /&gt;
Use &amp;lt;code&amp;gt;module apropos&amp;lt;/code&amp;gt; to find the module you wish to use, then &amp;lt;code&amp;gt;module load&amp;lt;/code&amp;gt; to enable it.&lt;br /&gt;
&lt;br /&gt;
== Adding a custom module directory to your environment ==&lt;br /&gt;
To allow the &amp;lt;code&amp;gt;module&amp;lt;/code&amp;gt; program to find a custom module directory, the location of that directory has to be added to the &amp;lt;code&amp;gt;MODULEPATH&amp;lt;/code&amp;gt; variable:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export MODULEPATH=$MODULEPATH:/shared/apps/WUR/ABGC/modulefiles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This can be made permanent by adding this line to the &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; file in your home directory. To then pick up the modified &amp;lt;code&amp;gt;MODULEPATH&amp;lt;/code&amp;gt; variable you have to load &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; again:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
source .bash_profile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This needs to be done only for terminals that are already open. Next time you login, &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; will be loaded automatically.&lt;br /&gt;
&lt;br /&gt;
You can check whether the modules are found:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module avail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This should give output that includes something similar to this:&lt;br /&gt;
&lt;br /&gt;
  ---------------------------------------- /shared/modulefiles/ -----------------------------------------&lt;br /&gt;
  ALLPATHS-LG/48961      bwa/0.7.5a             jellyfish/2.1.1        RepeatMasker/4.0.3&lt;br /&gt;
  augustus/2.7           cegma/2.4              MAFFT/7.130            RepeatModeler/1.0.7&lt;br /&gt;
  bedtools/2.18.0        cufflinks/2.1.1        maker/2.28             samtools/0.1.12a&lt;br /&gt;
  BLAST+/2.2.28          exonerate/2.2.0-x86_64 muscle/3.8.31          samtools/0.1.19&lt;br /&gt;
  blat/v35               geneid/1.4.4           plink/1.07             snap/2013-11-29&lt;br /&gt;
  bowtie/2-2.2.1         genewise/2.2.3-rc7     provean/1.1.3          SOAPdenovo2/r240&lt;br /&gt;
  bwa/0.5.9              hmmer/3.1b1            RAxML/8.0.0            tophat/2.0.11&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[installation_by_users | Installing domain specific software: installation by users]]&lt;br /&gt;
* [[Setting local variables]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]&lt;br /&gt;
* [[ABGC_modules | modules specific for ABGC ]]&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Node_usage_graph&amp;diff=2223</id>
		<title>Node usage graph</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Node_usage_graph&amp;diff=2223"/>
		<updated>2023-06-16T08:55:36Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There is a graphing tool, node_usage_graph (provided by the anunna module), that uses output from sacct to display the current cluster usage.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@login0 ~]# module load anunna&lt;br /&gt;
[user@login0 ~]# usage_graph&lt;br /&gt;
node:   |0%                                                                             100%|&lt;br /&gt;
fat001: DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
fat002: CCCCCCCCC                                                                            &lt;br /&gt;
        MMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm                                       &lt;br /&gt;
node001:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node002:cccccccccc                                                                           &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmm                                           &lt;br /&gt;
node003:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC           &lt;br /&gt;
        MM                                                                                   &lt;br /&gt;
node004:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                      &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node005:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node006:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node007:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node008:CCCCCCCCCCccccc                                                                      &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMM                                                                &lt;br /&gt;
node009:cccccccccc                                                                           &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM                                           &lt;br /&gt;
node010:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node011:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node012:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node013:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node014:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node015:CCCCC                                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node016:CCCCCCCCCCCCCCCCCCCCC                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node017:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node018:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node019:CCCCC                                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node020:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node021:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node022:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node023:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node024:CCCCCCCCCCCCCCC                                                                      &lt;br /&gt;
                                                                                             &lt;br /&gt;
node025:CCCCCCCCCCCCCCCCCCCCC                                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node026:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node027:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node028:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC&lt;br /&gt;
        MMM                                                                                  &lt;br /&gt;
node029:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node030:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node031:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node032:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node033:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node034:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node035:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node036:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node037:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node038:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node039:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node040:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
node041:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCcccccc                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMM                                                               &lt;br /&gt;
node042:RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR&lt;br /&gt;
        RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR&lt;br /&gt;
node049:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
node050:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                      &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node051:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node052:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
        MMMMMMmmmmmmmmmmmmmmm                                                                &lt;br /&gt;
node053:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node054:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This gives an overview of the current per-node resource usage. The letters mean the following:&lt;br /&gt;
* M: Memory reserved and in use&lt;br /&gt;
* m: Memory reserved and not in use&lt;br /&gt;
* C: CPU reserved and in use&lt;br /&gt;
* c: CPU reserved and not in use&lt;br /&gt;
* D: Drained node (not available for submission for some administrative reason)&lt;br /&gt;
* R: Reserved node&lt;br /&gt;
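To make the legend concrete: each row is a fixed-width bar running from 0% to 100%, so counting bar characters gives a rough utilisation figure. A hypothetical Python helper (the 89-column width is an assumption read off the example output; the sample row is node008's CPU row from above):

```python
# Hypothetical helper: estimate reserved-CPU percentage from one usage_graph
# CPU row. 'C' = CPU reserved and in use, 'c' = reserved but idle; the bar
# width (assumed to be 89 columns here) spans 0-100%.
def cpu_reserved_pct(bar: str, width: int = 89) -> float:
    reserved = sum(ch in "Cc" for ch in bar)
    return 100.0 * reserved / width

# node008's CPU row from the example: 10 'C' plus 5 'c'
row = "CCCCCCCCCCccccc"
print(round(cpu_reserved_pct(row)))  # about 17
```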
&lt;br /&gt;
It cannot, however, tell you how long the queue for any given node currently is; for that, squeue is a better resource.&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Node_usage_graph&amp;diff=2222</id>
		<title>Node usage graph</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Node_usage_graph&amp;diff=2222"/>
		<updated>2023-06-16T08:55:20Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There is a graphing tool, node_usage_graph (located at /cm/shared/apps/accounting/node_usage_graph), that uses output from sacct to display the current cluster usage.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@login0 ~]# module load anunna&lt;br /&gt;
[user@login0 ~]# usage_graph&lt;br /&gt;
node:   |0%                                                                             100%|&lt;br /&gt;
fat001: DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
fat002: CCCCCCCCC                                                                            &lt;br /&gt;
        MMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm                                       &lt;br /&gt;
node001:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node002:cccccccccc                                                                           &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmm                                           &lt;br /&gt;
node003:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC           &lt;br /&gt;
        MM                                                                                   &lt;br /&gt;
node004:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                      &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node005:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node006:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node007:CCCCCCCCCC                                                                           &lt;br /&gt;
                                                                                             &lt;br /&gt;
node008:CCCCCCCCCCccccc                                                                      &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMM                                                                &lt;br /&gt;
node009:cccccccccc                                                                           &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM                                           &lt;br /&gt;
node010:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node011:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node012:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node013:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node014:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node015:CCCCC                                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node016:CCCCCCCCCCCCCCCCCCCCC                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node017:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node018:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node019:CCCCC                                                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm &lt;br /&gt;
node020:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node021:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node022:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node023:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node024:CCCCCCCCCCCCCCC                                                                      &lt;br /&gt;
                                                                                             &lt;br /&gt;
node025:CCCCCCCCCCCCCCCCCCCCC                                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node026:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node027:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
                                                                                             &lt;br /&gt;
node028:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC&lt;br /&gt;
        MMM                                                                                  &lt;br /&gt;
node029:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node030:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node031:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node032:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node033:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node034:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node035:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node036:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node037:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node038:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node039:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node040:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
node041:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCcccccc                                                &lt;br /&gt;
        MMMMMMMMMMMMMMMMMMMMMM                                                               &lt;br /&gt;
node042:RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR&lt;br /&gt;
        RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR&lt;br /&gt;
node049:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
node050:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                      &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node051:                                                                                     &lt;br /&gt;
                                                                                             &lt;br /&gt;
node052:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
        MMMMMMmmmmmmmmmmmmmmm                                                                &lt;br /&gt;
node053:CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC                                                &lt;br /&gt;
        M                                                                                    &lt;br /&gt;
node054:DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This gives an overview of the current per-node resource usage. The letters mean the following:&lt;br /&gt;
* M: Memory reserved and in use&lt;br /&gt;
* m: Memory reserved and not in use&lt;br /&gt;
* C: CPU reserved and in use&lt;br /&gt;
* c: CPU reserved and not in use&lt;br /&gt;
* D: Drained node (not available for submission for some administrative reason)&lt;br /&gt;
* R: Reserved node&lt;br /&gt;
&lt;br /&gt;
It cannot, however, tell you how long the queue for any given node currently is; for that, squeue is a better resource.&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
	<entry>
		<id>https://wiki.anunna.wur.nl/index.php?title=Main_Page&amp;diff=2183</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://wiki.anunna.wur.nl/index.php?title=Main_Page&amp;diff=2183"/>
		<updated>2023-01-27T14:40:32Z</updated>

		<summary type="html">&lt;p&gt;Dawes0011: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Anunna is a [http://en.wikipedia.org/wiki/High-performance_computing High Performance Computer] (HPC) infrastructure hosted by [http://www.wageningenur.nl/nl/activiteit/Opening-High-Performance-Computing-cluster-HPC.htm Wageningen University &amp;amp; Research Centre]. It is open for use for all WUR research groups as well as other organizations, including companies, that have collaborative projects with WUR. &lt;br /&gt;
&lt;br /&gt;
= Using Anunna =&lt;br /&gt;
* [[Tariffs | Costs associated with resource usage]]&lt;br /&gt;
&lt;br /&gt;
== Gaining access to Anunna==&lt;br /&gt;
Access to the cluster and file transfer are traditionally done via [http://en.wikipedia.org/wiki/Secure_Shell SSH and SFTP].&lt;br /&gt;
* [[log_in_to_B4F_cluster | Logging into cluster using ssh]]&lt;br /&gt;
* [[file_transfer | File transfer options]]&lt;br /&gt;
* [[Services | Alternative access methods, and extra features and services on Anunna]]&lt;br /&gt;
* [[Filesystems | Data storage methods on Anunna]]&lt;br /&gt;
&lt;br /&gt;
== Access Policy ==&lt;br /&gt;
[[Access_Policy | Main Article: Access Policy]]&lt;br /&gt;
&lt;br /&gt;
Access needs to be granted actively (by creation of an account on the cluster by FB-IT). Use of resources is limited by the scheduler: priority to the system&#039;s resources is regulated by which queues (&#039;partitions&#039;) a user has been granted. Note that the use of Anunna is not free of charge. The list price of CPU time and storage, and possible discounts on that list price for your organisation, can be retrieved from Shared Research Facilities or FB-IT.&lt;br /&gt;
&lt;br /&gt;
= Events =&lt;br /&gt;
&lt;br /&gt;
* [[Courses]] that have taken place or are ongoing&lt;br /&gt;
* [[Downtime]] that will affect all users&lt;br /&gt;
* [[Meetings]] that may affect the policies of Anunna&lt;br /&gt;
&lt;br /&gt;
= Other Software =&lt;br /&gt;
&lt;br /&gt;
== Cluster Management Software and Scheduler ==&lt;br /&gt;
Anunna uses Bright Cluster Manager software for overall cluster management, and Slurm as job scheduler.&lt;br /&gt;
* [[BCM_on_B4F_cluster | Monitor cluster status with BCM]]&lt;br /&gt;
* [[Using_Slurm | Submit jobs with Slurm]]&lt;br /&gt;
* [[node_usage_graph | See how busy the cluster is right now with &#039;node_usage_graph&#039;]]&lt;br /&gt;
* [[SLURM_Compare | Rosetta Stone of Workload Managers]]&lt;br /&gt;
&lt;br /&gt;
== Installation of software by users ==&lt;br /&gt;
&lt;br /&gt;
* [[Domain_specific_software_on_B4Fcluster_installation_by_users | Installing domain specific software: installation by users]]&lt;br /&gt;
* [[Setting local variables]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]&lt;br /&gt;
* [[Virtual_environment_Python_3.4_or_higher | Setting up and using a virtual environment for Python3.4 or higher ]]&lt;br /&gt;
* [[Installing WRF and WPS]]&lt;br /&gt;
* [[Running scripts on a fixed timeschedule (cron)]]&lt;br /&gt;
&lt;br /&gt;
== Installed software ==&lt;br /&gt;
&lt;br /&gt;
* [[Globally_installed_software | Globally installed software]]&lt;br /&gt;
* [[ABGC_modules | ABGC specific modules]]&lt;br /&gt;
&lt;br /&gt;
= Useful Notes = &lt;br /&gt;
&lt;br /&gt;
== Being in control of Environment parameters ==&lt;br /&gt;
&lt;br /&gt;
* [[Using_environment_modules | Using environment modules]]&lt;br /&gt;
* [[Setting local variables]]&lt;br /&gt;
* [[Setting_TMPDIR | Set a custom temporary directory location]]&lt;br /&gt;
* [[Installing_R_packages_locally | Installing R packages locally]]&lt;br /&gt;
* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]&lt;br /&gt;
&lt;br /&gt;
== Controlling costs ==&lt;br /&gt;
&lt;br /&gt;
* [[SACCT | using SACCT to see your costs]]&lt;br /&gt;
* [[get_my_bill | using the &amp;quot;get_my_bill&amp;quot; script to estimate costs]]&lt;br /&gt;
&lt;br /&gt;
== Management ==&lt;br /&gt;
Product Owner of Anunna is Alexander van Ittersum (Wageningen UR,FB-IT, C&amp;amp;PS). [[User:dawes001 | Gwen Dawes (Wageningen UR, FB-IT, C&amp;amp;PS)]] and [[User:haars001 | Jan van Haarst (Wageningen UR,FB-IT, C&amp;amp;PS)]] are responsible for [[Maintenance_and_Management | Maintenance and Management]] of the cluster.&lt;br /&gt;
&lt;br /&gt;
* [[Roadmap | Ambitions regarding innovation, support and administration of Anunna ]]&lt;br /&gt;
&lt;br /&gt;
= Miscellaneous =&lt;br /&gt;
* [[Mailinglist | Electronic mail discussion lists]]&lt;br /&gt;
* [[History_of_the_Cluster | Historical information on the startup of Anunna]]&lt;br /&gt;
* [[Bioinformatics_tips_tricks_workflows | Bioinformatics tips, tricks, and workflows]]&lt;br /&gt;
* [[Parallel_R_code_on_SLURM | Running parallel R code on SLURM]]&lt;br /&gt;
* [[Convert_between_MediaWiki_and_other_formats | Convert between MediaWiki format and other formats]]&lt;br /&gt;
* [[Manual GitLab | GitLab: Create projects and add scripts]]&lt;br /&gt;
* [[Monitoring_executions | Monitoring job execution]]&lt;br /&gt;
* [[Shared_folders | Working with shared folders in the Lustre file system]]&lt;br /&gt;
&lt;br /&gt;
= See also =&lt;br /&gt;
* [[Maintenance_and_Management | Maintenance and Management]]&lt;br /&gt;
* [[BCData | BCData]]&lt;br /&gt;
* [[Mailinglist | Electronic mail discussion lists]]&lt;br /&gt;
* [[About_ABGC | About ABGC]]&lt;br /&gt;
* [[Computer_cluster | High Performance Computing @ABGC]]&lt;br /&gt;
* [[Lustre_PFS_layout | Lustre Parallel File System layout]]&lt;br /&gt;
&lt;br /&gt;
= External links =&lt;br /&gt;
{| width=&amp;quot;90%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [https://www.wur.nl/en/Value-Creation-Cooperation/Facilities/Wageningen-Shared-Research-Facilities/Our-facilities/Show/High-Performance-Computing-Cluster-HPC-Anunna.htm SRF offers an HPC facility]&lt;br /&gt;
| width=&amp;quot;30%&amp;quot; |&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Scientific_Linux Scientific Linux]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Help:Cheatsheet Help with editing Wiki pages]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Dawes0011</name></author>
	</entry>
</feed>