Computer cluster

== HPC infrastructure==
The HPC infrastructure at the Animal Breeding and Genomics Centre is currently in a transition phase. Over the past years the main infrastructure consisted of the [[Lx6_and_Lx7_compute_nodes | Lx6 and Lx7 compute nodes]]. As of December 2013 the new [[B4F_cluster | B4F cluster]] has come online and is expected to replace the old infrastructure in early 2014. The Lx6 and Lx7 compute nodes, attached to a 100TB storage facility, will remain online for the foreseeable future (expected until mid-2014).
The '''ABGC high performance computer infrastructure''' comprises two machines:
 
    scomp1095/lx6: a 48-core machine with 192GB of RAM
 
    scomp1090/lx7: a 48-core machine with 512GB of RAM
 
For '''more information''' see the [[ABGC_bioinformatics | general ABGC bioinformatics page]].
 
== Access ==
Accessing the '''computer cluster through the SSH protocol''' is easy:
 
    ssh username@scompXXXX.wurnet.nl
 
If you need visualisation (e.g. R graphs), use X forwarding:
 
    ssh -X username@scompXXXX.wurnet.nl
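
A quick way to check that X forwarding works, assuming an X client such as xterm is installed on the node, is:

    echo $DISPLAY    # inside the -X session; should print something like localhost:10.0
    xterm            # a terminal window should open on your local screen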
 
== Basic Bash programming ==
 
For basic Bash programming, please refer to:
 
    http://en.wikibooks.org/wiki/Bash_Shell_Scripting
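
As a minimal sketch of what such a script looks like (the file names here are only examples), the following loops over a set of text files and counts their lines:

    #!/bin/bash
    # count_lines.sh - print the number of lines in every .txt file
    for f in *.txt; do
        lines=$(wc -l < "$f")
        echo "$f has $lines lines"
    done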
 
== Submitting jobs ==
Jobs on the cluster must be submitted through the SGE (Sun Grid Engine), which manages them: it assigns priorities to jobs and distributes them across the available cores.
 
'''Using the SGE through qsub''':
 
'''qsub''' has many options; a few crucial ones are described here.
 
  -l h_vmem=XG
 
This option lets you specify in advance how much memory should be allocated to your job. Note that the SGE will kill your job if you underestimate the amount of memory needed. The default is 1G.
 
  -cwd
 
Run the job from the current working directory. This allows the SGE to resolve relative paths (e.g. ../my_data/).
 
  -q all.q
 
Send your job to the all.q queue.
 
  -S /path/to/interpreter
 
Sometimes the SGE has trouble finding the interpreter for a script; -S lets you tell the SGE where to find it (e.g. -S /bin/sh).
 
  -b y
 
Tells the SGE that you are running a binary program rather than a script. This is particularly useful when you run a program that you compiled yourself in your own bin.
 
 
Examples of qsub commands:
 
  qsub -l h_vmem=10G -q all.q -cwd -S /usr/bin/perl myscript.pl
  qsub -l h_vmem=10G -q all.q -cwd -b y ~/bin/asreml
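
The same options can also be embedded in the job script itself with #$ directives, so the submission command stays short. A minimal sketch (the script and program names are only examples):

  #!/bin/bash
  #$ -l h_vmem=10G
  #$ -q all.q
  #$ -cwd
  #$ -S /bin/bash
  # myjob.sh - submit with: qsub myjob.sh
  ./my_analysis input.txt > output.txt

Once submitted, you can follow your jobs with qstat and remove one with qdel <job_id>.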
 
== Installing programs ==
 
There are two ways to install a program on the clusters:
 
1) If the program is going to be used by a wide range of users, it is better to ask one of the administrators to install it.
 
2) If you are going to be the only user of the program, you can install it in your home directory. Create your own ~/bin directory and compile things there, then copy the executable directly into your ~/bin/. The next step is to add the path of your new bin to your .bashrc. To do so, open the file:
 
  vim ~/.bashrc
 
Then add the following line in this file:
 
  export PATH=$PATH:~/bin/
 
This puts all executables in ~/bin on your PATH, so they can be invoked simply by typing their name on the command line.
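
Putting the steps together, a minimal sketch of installing a self-compiled program (the program and source file names are hypothetical):

  mkdir -p ~/bin
  gcc -O2 -o ~/bin/myprog myprog.c   # compile straight into ~/bin
  source ~/.bashrc                   # reload PATH in the current shell
  which myprog                       # should print something like /home/username/bin/myprog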
 
== When to use the computer clusters ==
 
There are many applications that benefit from computer clusters. If you have a large number of jobs to run in parallel, or one job that uses multiple cores, you should use the cluster. Jobs that require a large amount of memory (RAM), or that would otherwise require transferring large datasets over the network (from the server to your own computer), should also be run on the cluster.
 
However, the situation is sometimes difficult to assess. Even when you have a lot of jobs to run, running them all on the cluster might not be the best strategy. If the cluster is overloaded, the SGE might allocate your job less than one core, which will result in a '''DRASTIC''' slowdown of your job. The servers have many cores, but a single core on these machines is much slower than a single core on your own local machine. Therefore, when the cluster is overloaded, be careful not to run too many jobs, as they might take longer on the server than on your own machine. Moreover, this will also slow down existing jobs on the server, and everyone loses time.
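
Before submitting a large batch it is therefore worth checking how busy the cluster is. With the SGE this can be done with qstat (the exact output depends on the local configuration):

  qstat -g c     # per-queue summary of load and used/available slots
  qstat -u '*'   # jobs of all users, running and pending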
 
== Out of University access ==
 
To access the cluster from outside the University intranet, you can use the access point '''scomp1038'''.
 
Note that if you '''install OpenSSH''' on your own machine, you should also be able to access your own computer from scomp1038 and therefore from outside the University network. Here is a tutorial for installing the OpenSSH client and server on Ubuntu:
 
    https://help.ubuntu.com/10.04/serverguide/C/openssh-server.html
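
For example, assuming your own machine runs an OpenSSH server and is reachable as mydesktop on the University network (both names below are placeholders), you can reach it from home in two hops:

    ssh username@scomp1038.wur.nl    # first hop: the access point
    ssh username@mydesktop           # second hop: from scomp1038 to your own machine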
 
 
Another nice feature of SSH is that you can redirect web traffic through a specified port to scomp1038. This allows you to consult journals directly from home without having to log in to the University network. Here is a small tutorial for '''Linux + Firefox'''.
 
'''1) Connect to scomp1038'''
 
    ssh -D 9999 username@scomp1038.wur.nl
 
'''2) Change proxy settings in Firefox''': go to '''Edit > Preferences > Advanced > Network'''.  
 
Then click on '''Settings'''.
 
Then '''Manual proxy configuration'''.
 
There, set '''SOCKS Host''' to localhost and '''Port''' to 9999.
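
You can verify that the tunnel works from another local terminal, assuming curl is installed on your machine:

    curl --socks5-hostname localhost:9999 http://example.com   # fetches the page through scomp1038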
 
 
''For directions on using PuTTY, check this link: http://www.hacktabs.com/how-to-setup-ssh-tunneling-in-firefox-and-surf-anonymously/''
