Wiki TODO

From HPCwiki
Revision as of 14:15, 18 June 2026 by Haars0011 (talk | contribs) (Add Software and workflows section with Workflow Engines task (via update-page on MediaWiki MCP Server))

This page lists Anunna documentation pages that have been published but still need current or authoritative information. The pages below are usable as they stand, but each contains placeholders (marked with TODO comments in the page source) where someone with current operational knowledge needs to add or confirm details.

To pick up a task: open the page, find the TODO comment(s) in its source, add the information, and remove the item from this list.

Hardware and network

  • Cluster Architecture Overview — confirm the current interconnect (Omnipath and/or InfiniBand) and topology; add a current architecture diagram (the old schematics were out of date and were removed).
  • Compute Hardware Overview — fill in the per-node specifications: number of CPU nodes, cores and memory per node, CPU model(s), and whether there are high-memory "fat" nodes; number of NVIDIA GPU nodes, GPUs per node, and host CPU/memory; AMD GPU model and node details.
  • Network & Security — confirm the interconnect (as above) and document the data-security posture: how data is protected at rest and in transit, access control, handling of confidential or personal data, and backup/retention.
  • Performance Optimization/Multiple nodes (MPI) — confirm the recommended MPI library and module names, any required srun --mpi=... plugin flag, and whether specific interconnect/fabric tuning is needed. The old mvapich2 + InfiniBand (ib0) example was Breed4Food-era and was removed.

Storage

  • Quotas — document the Lustre quota model (per-user or per-group, and the default sizes for /lustre/backup, /lustre/nobackup, /lustre/scratch), the command(s) users run to check their current usage, and how to request more space. Only the home-directory quota (200 GB) is currently documented.
  • Archival Storage — clarify how the manual iRODS/itape workflow relates to Tapeworm, now that Tapeworm manages archival from /archive to tape automatically: is itape still the recommended way to push data to tape, or is iRODS now used mainly for retrieving archived datasets?

Software and workflows

  • Workflow Engines (Snakemake, Nextflow) — confirm the recommended way to run Snakemake on Anunna (the documented profile is the older Snakemake <8 style; Snakemake 8+ uses the --executor slurm plugin), and write the Nextflow section (loading/installing Nextflow, the SLURM executor configuration, and a minimal example).

Policy and governance

  • Mission and Governance — confirm and document the current governance bodies (for example a steering group and user group) and their membership. The previous roster was from the Breed4Food era and was removed.
  • Roadmap — add the current roadmap, or link the latest roadmap document. The 2019–2020 version was removed.
  • Sustainability — add the sustainability / green-HPC policy and any concrete measures: energy sourcing, cooling, hardware lifecycle, and institutional targets.
  • Policies and Terms of Use — supply the formal terms of use / acceptable-use policy (data ownership and responsibilities, security obligations, consequences of misuse) and confirm the current account-request route and per-group access contacts. The previous access-contact list was from the Breed4Food era and was removed.
  • Account Application Process — confirm whether WUR users request an Anunna account through the general "HPC (Anunna)" support service or a dedicated account-request form, and what details are needed (group/affiliation, sponsor, intended use).

Nice to have

  • FAQ — extend with the questions the HPC team is asked most often (for example quota limits, GPU access, course accounts, data transfer).

See also