Cluster Architecture Overview


Anunna has a classic, heterogeneous HPC architecture: a set of login nodes that users connect to, a larger pool of compute nodes where the actual work runs, a fast parallel filesystem shared across the whole cluster, and a high-speed network tying it all together. The design is built to grow: nodes and storage can be added over time.

Login nodes

You reach the cluster by connecting to login.anunna.wur.nl over SSH (see Log in to Anunna). The login nodes are the entry point: you use them to edit files, manage your data, and submit jobs. They are shared by everyone, so they are not meant for heavy computation. Anything beyond light, interactive work belongs in a job (see Batch Jobs and Interactive Jobs).

Compute nodes

The compute nodes do the actual work. They are heterogeneous: ordinary CPU nodes for most jobs, plus nodes equipped with NVIDIA or AMD GPUs for accelerated workloads. Jobs reach them through the SLURM scheduler, which places each job on a node that matches its requested resources.

  • CPU work runs in the main partition.
  • GPU work runs in the gpu (NVIDIA) and gpu_amd (AMD) partitions.

For the hardware details of each node type, see Compute Hardware Overview; for how to select a partition or constrain a job to particular hardware, see Partitions / Queues and Choosing a node (constraints).
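
The mapping from hardware type to partition described above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the partition names (main, gpu, gpu_amd) come from this page, while the function name render_job_script, the default resource values, and the use of --gres=gpu:1 to request a GPU are assumptions. Check Partitions / Queues for the options that actually apply on Anunna.

```python
# Partition names are taken from this page; everything else is illustrative.
PARTITIONS = {
    "cpu": "main",       # ordinary CPU nodes
    "nvidia": "gpu",     # nodes with NVIDIA GPUs
    "amd": "gpu_amd",    # nodes with AMD GPUs
}


def render_job_script(command, hardware="cpu", cpus=1, time_limit="00:10:00"):
    """Return the text of a SLURM batch script for the matching partition."""
    if hardware not in PARTITIONS:
        raise ValueError(f"unknown hardware type: {hardware!r}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={PARTITIONS[hardware]}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={time_limit}",
    ]
    if hardware != "cpu":
        # Assumption: GPUs are requested as a generic resource.
        lines.append("#SBATCH --gres=gpu:1")
    lines += ["", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script that runs `hostname` on one NVIDIA GPU node.
    print(render_job_script("hostname", hardware="nvidia", cpus=4))
```

Saving the printed text to a file and passing it to sbatch on a login node would submit it; the scheduler then places the job on a node in the requested partition.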

Storage

A Lustre parallel filesystem (mounted at /lustre) is shared across all nodes and is the cluster's main working storage; your home directory lives on a separate, smaller NFS filesystem. Longer-term and archival storage is also available. For the full picture (quotas, backup behaviour, and which tier to use for what), see Storage Systems Overview.

Network

The nodes and the Lustre filesystem are connected by a high-speed, low-latency interconnect, which is what allows the parallel filesystem and multi-node jobs to perform well.

Operating system

The cluster runs Ubuntu 24.04 across its nodes. Software is provided through environment modules rather than being installed system-wide, so several versions of the same program can coexist.
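
Because software comes from modules, each job can pick the version it needs. The sketch below builds the shell lines a job script might start with. It relies only on the standard module purge and module load commands; the helper name module_lines and the module names and versions (mytool, otherlib) are hypothetical placeholders, not modules known to exist on Anunna.

```python
def module_lines(modules):
    """Build shell lines that load the given {name: version} modules.

    A version of None loads whatever default the site has configured.
    """
    lines = ["module purge"]  # start from a clean environment
    for name, version in modules.items():
        target = name if version is None else f"{name}/{version}"
        lines.append(f"module load {target}")
    return lines


if __name__ == "__main__":
    # Two jobs can use different versions of the same (hypothetical) tool.
    print("\n".join(module_lines({"mytool": "1.4"})))
    print("\n".join(module_lines({"mytool": "2.1", "otherlib": None})))
```

Running module avail on the cluster lists the modules and versions that are actually installed.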

See also