Data storage best practices

From HPCwiki
Revision as of 12:15, 19 June 2026 by Haars0011

The practices below keep your data safe, your costs down, and the cluster fast for everyone. For details on the storage areas themselves, see Storage Systems Overview.

Put data in the right place

  • Keep active compute data on Lustre (/lustre/nobackup by default), not in your home directory.
  • Keep small, important files (programs, scripts, and configuration) in your home directory.
  • Move data you have finished with, but want to keep, to archival storage.
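
As a rough illustration of the placement rules above, a job script could check where its working directory lives before it starts writing output. This is only a sketch: the Lustre prefixes are the ones named on this page, while the function names `is_inside` and `storage_area` are hypothetical helpers and not part of any cluster tooling.

```python
from pathlib import Path

# Prefixes taken from this page; adjust them if your mount points differ.
LUSTRE_ROOTS = (Path("/lustre/nobackup"), Path("/lustre/scratch"))


def is_inside(path: Path, root: Path) -> bool:
    """Return True if path is root itself or lies somewhere below it."""
    path, root = path.resolve(), root.resolve()
    return path == root or root in path.parents


def storage_area(path: Path, home: Path) -> str:
    """Classify a path as 'lustre', 'home', or 'other'."""
    if any(is_inside(path, root) for root in LUSTRE_ROOTS):
        return "lustre"
    if is_inside(path, home):
        return "home"
    return "other"
```

For example, `storage_area(Path.cwd(), Path.home())` returning `"home"` at the start of a compute job would suggest that the job should be moved to Lustre first.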

Protect what matters

  • Know what is backed up and what is not (see Backup Policy). Treat /lustre/nobackup and /lustre/scratch as expendable.
  • Keep your own copy of anything you cannot regenerate.
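
When you make your own copy of irreplaceable data, it can help to confirm that the copy is byte-for-byte identical before relying on it. The following is a minimal sketch using only the Python standard library; `sha256sum` and `copy_verified` are illustrative names, and the destination is whatever location you choose, not one prescribed by this page.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src: Path, dst: Path) -> str:
    """Copy src to dst, then confirm that both files hash identically."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also preserves timestamps where possible
    src_hash, dst_hash = sha256sum(src), sha256sum(dst)
    if src_hash != dst_hash:
        raise IOError(f"checksum mismatch copying {src} to {dst}")
    return dst_hash
```

Storing the returned digest next to the copy lets you re-check the file later without needing the original.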

Keep it tidy and cheap

  • Clean up /lustre/scratch and node-local /tmp when you are done: scratch may be purged automatically, and storage is charged per TB (see Tariffs).
  • Delete intermediate files you no longer need.
  • Compress large datasets that you are keeping but rarely touch.
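
The clean-up and compression steps above could be sketched as follows. The first helper only lists candidates so that you can review them before deleting anything; it judges age by modification time, which is an assumption and not necessarily the criterion an automatic purge would use. The names `stale_files` and `compress_dir` are hypothetical.

```python
import tarfile
import time
from pathlib import Path


def stale_files(root: Path, min_age_days: float, now=None):
    """Return sorted (path, size_in_bytes) pairs for old regular files under root."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds per day
    found = []
    for path in root.rglob("*"):
        if path.is_file() and not path.is_symlink():
            info = path.stat()
            if info.st_mtime < cutoff:  # assumption: age means modification time
                found.append((path, info.st_size))
    return sorted(found)


def compress_dir(src_dir: Path, archive: Path) -> Path:
    """Pack src_dir into a gzip-compressed tar archive; the original stays in place."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    return archive
```

Write the archive outside the directory being packed, and remove the original only after you have checked that the archive opens and lists the expected files.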

Work efficiently

  • Avoid creating large numbers of tiny files; Lustre performs best with fewer, larger files.
  • Do not run heavy I/O (including writing logs) against your home directory; use Lustre instead.
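
To see whether a project directory suffers from the many-tiny-files problem, a quick audit like the one below may help. It is a sketch only: `small_file_report` is a hypothetical helper, and the 1 MiB default threshold is an arbitrary illustration, not a limit documented for this cluster.

```python
from pathlib import Path


def small_file_report(root: Path, threshold_bytes: int = 1 << 20) -> dict:
    """Count regular files under root and how many are below threshold_bytes."""
    total = small = 0
    for path in root.rglob("*"):
        if path.is_file() and not path.is_symlink():
            total += 1
            if path.stat().st_size < threshold_bytes:
                small += 1
    return {"total": total, "small": small}
```

If most files fall below the threshold, consider bundling them into a single archive or having your workflow write fewer, larger output files. Note that the scan itself touches every file, so run it sparingly on very large trees.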
