Workflow Migration from Laptop to HPC

From HPCwiki
Revision as of 14:22, 18 June 2026 by Haars0011

Moving your work from a laptop or workstation to Anunna is mostly a change of mindset: instead of running things interactively and waiting at your screen, you describe what you want, hand it to the scheduler, and let it run. This page is a high-level guide to making that transition.

What changes

  • You do not run work at your screen. On your laptop you run a script and watch it. On the cluster you submit it as a job and the scheduler runs it when resources are free (see Batch Jobs). Interactive jobs exist for short tests, but they also go through the scheduler. The login nodes are only for editing, transferring data, and submitting jobs.
  • You ask for resources explicitly. CPUs, memory, time, and GPUs are all requested up front — see Scheduler Overview (Slurm).
  • Storage is tiered. Active data goes on the fast Lustre filesystem, not your home directory, and you need to know what is backed up — see Storage Systems Overview and Backup Policy.
  • Software comes from modules. Instead of installing everything yourself, you load it with modules, or install into your own environment — see Installing Personal Software.
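
To make "asking for resources explicitly" concrete, the sketch below builds the text of a small Slurm batch script from a handful of parameters. It is an illustration only: the function name `render_batch_script`, the command `python analyse.py input.csv`, the module name `example-module`, and all resource values are hypothetical, and Anunna may need further options (such as a partition or account) that this page does not specify.

```python
def render_batch_script(command, cpus=1, mem="4G", time_limit="01:00:00",
                        job_name="first-job", modules=()):
    """Return the text of a Slurm batch script that runs one command.

    Every resource (CPUs, memory, wall time) is stated up front as an
    #SBATCH directive, which is the main difference from a laptop script.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={time_limit}",
        # %x expands to the job name and %j to the job ID.
        "#SBATCH --output=%x-%j.out",
        "",
    ]
    # Software comes from modules; the names passed in are placeholders.
    lines += [f"module load {name}" for name in modules]
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical example values; adjust them to your own workload.
script = render_batch_script(
    "python analyse.py input.csv",
    cpus=4,
    mem="8G",
    time_limit="02:00:00",
    modules=["example-module"],
)
print(script)
```

Saving the printed text to a file and passing that file to `sbatch` is the usual way to submit it. Writing the script by hand works just as well; the point is that every resource appears as an explicit line.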

A typical first migration

  1. Get an account and log in — see Who Can Access? and SSH Access.
  2. Move your data and code across — see Data Transfer Methods.
  3. Find your software as modules, or set up your own environment — see Software Overview.
  4. Test on something small in an interactive job to check it runs on the cluster.
  5. Wrap the working commands in a batch script and submit it — see Batch Jobs.
  6. Scale up: many inputs at once with job arrays, or more cores and nodes with threads and MPI.
  7. For multi-step analyses, consider a workflow engine.
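
As a sketch of the scaling step with job arrays: Slurm sets the environment variable `SLURM_ARRAY_TASK_ID` for each task in an array, and a script can use it to decide which input to process. The helper name `pick_input` and the file names below are hypothetical, and the sketch assumes the array indices start at 0.

```python
import os


def pick_input(inputs, environ=os.environ):
    """Return the input assigned to this array task.

    The task index comes from SLURM_ARRAY_TASK_ID. Outside a job array the
    variable is unset, so the sketch falls back to the first input, which
    is handy for a quick test run.
    """
    task_id = int(environ.get("SLURM_ARRAY_TASK_ID", "0"))
    if not 0 <= task_id < len(inputs):
        raise IndexError(f"task {task_id} has no input (have {len(inputs)})")
    return inputs[task_id]


# Hypothetical input files, one per array task.
inputs = ["sample_a.csv", "sample_b.csv", "sample_c.csv"]

# Simulate array task 1 by passing the variable explicitly.
print(pick_input(inputs, {"SLURM_ARRAY_TASK_ID": "1"}))  # prints sample_b.csv
```

With three inputs, the batch script would be submitted with `sbatch --array=0-2`, so that each task receives one index and therefore one file.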

Tips

  • Start small. Get one step working as a job before scaling to your whole dataset.
  • Request only the resources you need — over-requesting just means longer waits in the queue.
  • Keep your work reproducible from the start — see Reproducibility Guidelines.
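
One way to act on the first two tips is to measure a small test job and derive the next request from it. The sketch below assumes you already know the peak memory and elapsed time of a representative test (for example from Slurm's accounting output) and adds a safety margin. The function name `suggest_request` and the 25% margin are assumptions, not cluster policy, and resource use does not always grow linearly with input size.

```python
import math


def suggest_request(peak_mem_mb, elapsed_seconds, margin=1.25):
    """Turn measurements from a test job into --mem and --time values.

    The margin leaves headroom so the job is not killed for slightly
    exceeding its request, without asking for far more than it uses.
    The time string uses HH:MM:SS and is intended for runs under a day.
    """
    mem_mb = math.ceil(peak_mem_mb * margin)
    seconds = math.ceil(elapsed_seconds * margin)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{mem_mb}M", f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical measurements: 3200 MB peak memory, 25 minutes elapsed.
mem, time_limit = suggest_request(peak_mem_mb=3200, elapsed_seconds=1500)
print(f"--mem={mem} --time={time_limit}")  # prints --mem=4000M --time=00:31:15
```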

See also