Cluster Architecture Overview

 

Latest revision as of 12:09, 18 June 2026


Anunna has a classic, heterogeneous HPC architecture: a set of login nodes that users connect to, a larger pool of compute nodes where the actual work runs, a fast parallel filesystem shared across the whole cluster, and a high-speed network tying it all together. The design is built to grow — nodes and storage can be added over time.

Login nodes

You reach the cluster by connecting to login.anunna.wur.nl over SSH (see Log in to Anunna). The login nodes are the entry point: you use them to edit files, manage your data, and submit jobs. They are shared by everyone, so they are not for heavy computation — anything beyond light, interactive work belongs in a job (see Batch Jobs and Interactive Jobs).

Compute nodes

The compute nodes do the actual work. They are heterogeneous: ordinary CPU nodes for most jobs, plus nodes equipped with NVIDIA or AMD GPUs for accelerated workloads. Jobs reach them through the Slurm scheduler, which places each job on a node that matches its requested resources.

  • CPU work runs in the main partition.
  • GPU work runs in the gpu (NVIDIA) and gpu_amd (AMD) partitions.

For the hardware details of each node type, see Compute Hardware Overview; for how to select a partition or constrain a job to particular hardware, see Partitions / Queues and Choosing a node (constraints).
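
As a rough illustration of how the partition choice ends up in a job, the sketch below renders a Slurm batch script for a given hardware type. The partition names (main, gpu, gpu_amd) come from the list above; the helper name render_job_script, the default time limit, and the use of --gres=gpu:N to request GPUs are assumptions made for this example, so check Partitions / Queues for what the cluster actually expects.

```python
# Partition names are taken from this page; everything else is illustrative.
PARTITIONS = {"cpu": "main", "nvidia": "gpu", "amd": "gpu_amd"}


def render_job_script(command, hardware="cpu", gpus=1, time_limit="00:10:00"):
    """Return the text of a Slurm batch script for the matching partition.

    `hardware` is a hypothetical label ("cpu", "nvidia" or "amd") that this
    sketch maps to a partition name; it is not a Slurm concept.
    """
    if hardware not in PARTITIONS:
        raise ValueError(f"unknown hardware type: {hardware!r}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={PARTITIONS[hardware]}",
        f"#SBATCH --time={time_limit}",
    ]
    if hardware != "cpu":
        # Assumption: GPUs are requested as generic resources (gres).
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script that would target the AMD GPU partition.
    print(render_job_script("python train.py", hardware="amd"))
```

The printed text could be saved to a file and submitted with sbatch from a login node; the command python train.py is only a placeholder.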

Storage

A Lustre parallel filesystem (mounted at /lustre) is shared across all nodes and is the cluster's main working storage; your home directory lives on a separate, smaller NFS filesystem. Longer-term and archival storage is available too. For the full picture — quotas, backup behaviour, and which tier to use for what — see Storage Systems Overview.

Network

The nodes and the Lustre filesystem are connected by a high-speed, low-latency interconnect, which is what allows the parallel filesystem and multi-node jobs to perform well.

Operating system

The cluster runs Ubuntu 24.04 across its nodes. Software is provided through environment modules rather than being installed system-wide, so several versions of the same program can coexist.
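
Because several versions of a program can coexist, a job usually names the exact version it wants. The sketch below renders the corresponding module load lines using the common name/version form; the helper name module_load_lines and the module names and versions shown are hypothetical, not a list of what is installed on the cluster.

```python
def module_load_lines(modules):
    """Render one `module load name/version` line per (name, version) pair."""
    return [f"module load {name}/{version}" for name, version in modules]


if __name__ == "__main__":
    # Hypothetical module names and versions, used only for illustration.
    wanted = [("gcc", "13.2.0"), ("python", "3.12")]
    for line in module_load_lines(wanted):
        print(line)
```

Pinning the version in a job script keeps the job reproducible even if the default version of a module changes later.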

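See also

  • Compute Hardware Overview
  • Storage Systems Overview
  • Scheduler Overview (Slurm)
  • Log in to Anunna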