Maintenance and Management: Difference between revisions

From HPCwiki
m (Update about the 26th.)
No edit summary
 
(4 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Maintenance and Management ==


As of April 2014, questions should be directed to the Service Desk IT.
The cluster is maintained for firmware and software updates on a six-monthly basis. Typically one downtime is scheduled in May, whilst the other is scheduled in November, in order to disrupt usage as little as possible.
 
Any issues should be directed to the WUR IT Servicedesk:


This can be done via the mail: servicedesk.it@wur.nl  
And it can be done via the telephone: +31 317 488888
Please give your name and phonenumber and tell that your mail/call is about the HPC for Agrogenomics and give the company you are working for.
Please give your name and phonenumber and tell that your mail/call is about Anunna and give the company you are working for.
When you call the servicedesk, give also your email address.


== Previous Maintenance Windows ==
=== Maintenance May 24th 2017 ===
This update moved Bright Cluster Manager to version 7.3, and SLURM to 16.08. Downtime was 8am to 8pm.
=== Maintenance November 16th 2016 ===
This was a firmware update and OS update for the system. Downtime was 8am to 8pm.
=== Maintenance May 23rd-29th 2016 ===
This update moved the OS from Scientific Linux 6 to Scientific Linux 7, and updated Bright Cluster manager to version 7.2. A week was taken to reconstruct the entire environment from scratch, thanks to assistance from Clustervision for this expedience.
=== Maintenance March 23rd 2016 ===
There will be mainly firmware maintenance between 8h and 20h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!
=== Maintenance June 17th 2015 ===
There will be mainly firmware maintenance between 8h and 20h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!


== Maintenance June 11th 2014 ==
=== Maintenance November 26th 2014 ===
There will be mainly firmware maintenance between 8h and 13h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, the HPC will be shut down during this maintenance window. Running jobs will be killed!
There will be mainly firmware maintenance between 8h and 20h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!


== Maintenance November 26th 2014 ==
=== Maintenance June 11th 2014 ===
There will be mainly firmware maintenance between 8h and 13h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, the HPC will be shut down during this maintenance window. Running jobs will be killed!
There will be mainly firmware maintenance between 8h and 13h CET . Because network controller and storage controller firmware will be upgrades, all servers need to be rebooted and also will have network hick ups. So to prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!

Latest revision as of 18:43, 19 February 2019

Maintenance and Management

The cluster is taken down for firmware and software updates every six months. Typically one downtime is scheduled in May and the other in November, in order to disrupt usage as little as possible.

Any issues should be directed to the WUR IT Servicedesk:

By email: servicedesk.it@wur.nl
By telephone: +31 317 488888

Please give your name and phone number, state that your email or call is about Anunna, and name the company you work for. When you call the Servicedesk, also give your email address.

Previous Maintenance Windows

Maintenance May 24th 2017

This update moved Bright Cluster Manager to version 7.3, and SLURM to 16.08. Downtime was 8am to 8pm.

Maintenance November 16th 2016

This was a firmware update and OS update for the system. Downtime was 8am to 8pm.

Maintenance May 23rd-29th 2016

This update moved the OS from Scientific Linux 6 to Scientific Linux 7, and updated Bright Cluster Manager to version 7.2. The entire environment was rebuilt from scratch in a week, with thanks to Clustervision for their assistance in getting this done quickly.

Maintenance March 23rd 2016

Maintenance, mainly of firmware, will take place between 8h and 20h CET. Because network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!

Maintenance June 17th 2015

Maintenance, mainly of firmware, will take place between 8h and 20h CET. Because network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!

Maintenance November 26th 2014

Maintenance, mainly of firmware, will take place between 8h and 20h CET. Because network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!

Maintenance June 11th 2014

Maintenance, mainly of firmware, will take place between 8h and 13h CET. Because network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!
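
Because running jobs are killed when a maintenance window opens, it can help to check beforehand whether a job's requested wall time fits before the shutdown. The following is only a minimal sketch of that arithmetic: the function name `finishes_before_window` is hypothetical, the dates are illustrative, and it assumes the job starts running at the given time rather than waiting in the queue.

```python
from datetime import datetime, timedelta


def finishes_before_window(start_time, wall_time, window_start):
    """Return True if a job starting at start_time and running for its
    full wall_time would end no later than window_start."""
    return start_time + wall_time <= window_start


# Illustrative values only: a window opening at 08:00, as in the entries above,
# and a job assumed to start the evening before.
window_start = datetime(2017, 5, 24, 8, 0)
start_time = datetime(2017, 5, 23, 20, 0)

# A 10-hour job would end at 06:00, before the window opens.
print(finishes_before_window(start_time, timedelta(hours=10), window_start))
# A 14-hour job would still be running at 08:00 and would be killed.
print(finishes_before_window(start_time, timedelta(hours=14), window_start))
```

A job that does not fit is better submitted after the window closes, or with a shorter wall time if the work can be split.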