Filesystems

Anunna currently has multiple filesystem mounts that are available cluster-wide:


== Global ==
* /home - This mount uses NFS to serve the home directories directly from nfs01. Each user has a 200G quota on this filesystem, as it is regularly backed up to tape and can reliably be restored from up to a week's history.
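
A quick way to see where you stand against the 200G quota is to sum up your home directory; a minimal sketch (the quota command only gives useful output where NFS quota reporting is enabled, which is an assumption here):
<pre>
# Total size of your home directory, to compare against the 200G quota
du -sh "$HOME"

# Where NFS quota reporting is enabled, this shows usage and limits directly
quota -s
</pre>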


* /shared - This mount provides a consistent set of binaries for the entire cluster.


* /lustre - This large mount uses the Lustre filesystem to provide files from multiple redundant servers. Access is provided per group (see the sketch after this list), thus:
  /lustre/[level]/[partner]/[unit]
e.g.
  /lustre/backup/WUR/ABGC/
It comprises two major parts (and some minor ones):
* /lustre/backup - In case of disaster, this data is stored a second time on a separate machine. This backup exists purely for catastrophic events (such as a severe filesystem error or multiple component failures), but it can potentially be used to revert mistakes if you report them very quickly. There is, however, no guarantee of this service.
* /lustre/nobackup - This is the 'normal' Lustre filesystem: no backups, just data stored on the filesystem. Because no backup is kept, storing data here costs less than under /lustre/backup, but in case of disaster it cannot be recovered.
* /lustre/shared - Same as /lustre/backup, except publicly available. This is where truly shared data lives that isn't assigned to a specific group.
And additionally:
* /lustre/scratch - A separate, low-resilience filesystem. Files here may be removed after some time if the filesystem gets too full (typically after 30 days). You should tidy up this data yourself once your work is complete.
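
To see which of these directories you can actually reach, check your group memberships and, where group quotas are enforced (an assumption here), query them with the standard Lustre client tools. A minimal sketch using the ABGC example path from above:
<pre>
# List the Unix groups your account belongs to
groups

# Check that you can enter your group's directory (example path from above)
ls -ld /lustre/backup/WUR/ABGC/

# If group quotas are enforced, report usage and limits for your group
lfs quota -h -g ABGC /lustre

# Overall space left on the Lustre filesystem
lfs df -h /lustre
</pre>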


=== Private shared directories ===
If you are working with a group of users on a similar project, you might consider making a shared directory to coordinate. Information on how to do so is in the linked article.
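
As an illustration only (the linked article describes the actual procedure), one common POSIX pattern, assuming your project members already share a Unix group, is a group-owned directory with the setgid bit set so that new files inherit the group. The group name and path below are placeholders:
<pre>
# 'myproject' and the path are hypothetical placeholders
mkdir /lustre/nobackup/WUR/ABGC/shared_dir
chgrp myproject /lustre/nobackup/WUR/ABGC/shared_dir
chmod 2770 /lustre/nobackup/WUR/ABGC/shared_dir   # rwx for owner and group, setgid so new files keep the group
</pre>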
== Local ==
== Local ==
Some other filesystems are available to you only on specific machines:
* /archive - an archive mount only accessible from the login nodes. Files here are sent to the Isilon for deeper storage. The cost of storing data here is much less than on the Lustre, but it cannot be used for compute work. This location is only available to WUR users. Files can be reverted via snapshot, and there is a separate backup, though it only runs at fortnightly (14-day) intervals. See the sketch below for copying data there from a login node.
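
Because /archive is not mounted on the compute nodes, copy finished results there from a login node rather than from within a job. A minimal sketch (both paths are placeholders; your actual /archive layout may differ):
<pre>
# Run this on a login node; /archive is not visible from the worker nodes
rsync -av --progress /lustre/nobackup/WUR/ABGC/myresults/ /archive/myresults/
</pre>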


* /tmp - On each worker node there is a /tmp mount that can be used for temporary local caching. Be advised that you should clean this up, lest your files become a hindrance to other users. You can request a node with free space in your sbatch script like so:
<pre>
#SBATCH --tmp=<required space>
</pre>
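
Putting this together, a minimal sketch of a job script that requests local scratch space, works in a job-specific directory under /tmp and cleans up afterwards (the 10G figure and all paths are placeholders):
<pre>
#!/bin/bash
#SBATCH --job-name=tmp-example
#SBATCH --tmp=10G                  # ask for a node with at least 10G free in /tmp

# Job-specific working directory; $SLURM_JOB_ID keeps it unique per job
WORKDIR=/tmp/${USER}_${SLURM_JOB_ID}
mkdir -p "$WORKDIR"
trap 'rm -rf "$WORKDIR"' EXIT      # remove the directory even if the job fails

cd "$WORKDIR"
# ... run your analysis here ...
# cp -r results /lustre/nobackup/WUR/ABGC/   # copy output back to Lustre (placeholder destination)
</pre>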




* /dev/shm - On each worker you may also create a virtual filesystem directly into memory, for extremely fast data access. Be advised that this will count against the memory used for your job, but it is also the fastest available filesystem if needed.
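
For example, a file that is read over and over again can be staged into memory at the start of a job; remember that its size counts against the memory you requested for the job (the paths below are placeholders):
<pre>
# Stage a heavily re-read file into memory-backed storage
cp /lustre/nobackup/WUR/ABGC/reference.fa /dev/shm/
# ... point your tools at /dev/shm/reference.fa ...
rm /dev/shm/reference.fa           # free the memory again when done
</pre>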


== See also ==
* [[B4F_cluster | B4F cluster]]
* [[Tariffs | Costs associated with resource usage]]


== External links ==
* [http://wiki.lustre.org/index.php/Main_Page Lustre website]
