Data storage best practices
A few practices that keep your data safe, your costs down, and the cluster fast for everyone. For the storage areas themselves, see Storage Systems Overview.
Put data in the right place
- Keep active compute data on Lustre (/lustre/nobackup by default), not in your home directory.
- Keep programs, scripts, and configuration (small, important files) in your home directory.
- Move data you have finished with, but want to keep, to archival storage.
Protect what matters
- Know what is backed up and what is not; see Backup Policy. Treat /lustre/nobackup and /lustre/scratch as expendable.
- Keep your own copy of anything you cannot regenerate.
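When making your own copy of irreplaceable data, it can help to confirm that the copy matches the original. The following is a minimal Python sketch, not a site-provided tool: the function names `sha256_of` and `copy_and_verify` are hypothetical, and it assumes both source and destination are reachable as ordinary file paths.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst (hypothetical helper) and raise if the checksums differ."""
    copied = shutil.copy2(src, dst)  # copy2 also preserves timestamps where possible
    if sha256_of(src) != sha256_of(copied):
        raise OSError(f"checksum mismatch after copying {src} to {copied}")
    return Path(copied)
```

Only remove the original once a check like this has passed and the copy sits on storage you know to be backed up.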
Keep it tidy and cheap
- Clean up /lustre/scratch and node-local /tmp when you are done: scratch may be purged automatically, and storage is charged per TB (see Tariffs).
- Delete intermediate files you no longer need.
- Compress large datasets that you are keeping but rarely touch.
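One way to compress a dataset you are keeping but rarely touch is to pack its directory into a single gzip-compressed tar archive. The sketch below uses Python's standard `tarfile` module; the function names `archive_directory` and `list_archive` are hypothetical, and gzip is only one possible choice of compression.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir, archive_path):
    """Pack src_dir into a gzip-compressed tar archive and return the archive path."""
    src = Path(src_dir)
    archive = Path(archive_path)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the full path
        tar.add(src, arcname=src.name)
    return archive


def list_archive(archive_path):
    """Return the member names so the archive can be checked before deleting the source."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Checking the member list before deleting the source directory guards against an incomplete archive. A single archive also replaces many small files with one large one, which suits Lustre better.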
Work efficiently
- Avoid creating very many tiny files; Lustre performs best with fewer, larger files.
- Do not run heavy I/O (including writing logs) against your home directory — use Lustre.
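As an illustration of preferring fewer, larger files, many small results can be written as lines of one file instead of one file per result. This is a minimal sketch using the JSON Lines convention (one JSON object per line); the function names are hypothetical, and it assumes each result is JSON-serializable.

```python
import json
from pathlib import Path


def write_records(path, records):
    """Write all records to a single JSON Lines file, one JSON object per line."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return path


def read_records(path):
    """Yield records back from a JSON Lines file, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)
```

Writing to one open file handle avoids the per-file metadata operations that make very many tiny files slow on a parallel filesystem.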