Maintenance Schedule: Difference between revisions

From HPCwiki
Vaend001 (talk | contribs)
No edit summary
IA migration §10: rewrite as maintenance schedule (6-monthly downtimes); reporting via How to Get Help; TODO for live dates (via update-page on MediaWiki MCP Server)
 
(5 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Maintenance and Management ==
Anunna is taken down for planned maintenance — firmware and software updates — roughly twice a year. Typically one downtime is scheduled in April or May, in a week free of teaching, and another in October, to disrupt usage as little as possible.


The cluster is maintained for firmware and software updates on a six-monthly basis. Typically one downtime is scheduled in May, whilst the other is scheduled in November, in order to disrupt usage as little as possible.
During a downtime the cluster is unavailable. Jobs are not scheduled to run across a planned downtime, so check the announced dates when planning long jobs.


Any issues should be directed to the WUR IT Servicedesk:
== Announcements ==


This can be done by email: servicedesk.it@wur.nl
Planned maintenance is announced in advance — watch the Message of the Day on the [[Portal Overview|Apps Portal]] and the usual support channels.
It can also be done by telephone: +31 317 488888
Please give your name, your phone number and the company you work for, and mention that your mail or call is about Anunna.
When you call the Servicedesk, also give your email address.


== Previous Maintenance Windows ==
<!-- TODO: confirm where maintenance dates are announced (Apps Portal MOTD, mailing list, Teams) and add the next scheduled downtime, or a link to a live maintenance schedule. -->


=== Maintenance May 24th 2017 ===
== Reporting problems ==
This update moved Bright Cluster Manager to version 7.3, and SLURM to 16.08. Downtime was 8am to 8pm.


=== Maintenance November 16th 2016 ===
If something is not working — during or outside maintenance — see [[How to Get Help]].
This was a firmware update and OS update for the system. Downtime was 8am to 8pm.


=== Maintenance May 23rd-29th 2016 ===
== See also ==
This update moved the OS from Scientific Linux 6 to Scientific Linux 7, and updated Bright Cluster Manager to version 7.2. A week was taken to rebuild the entire environment from scratch; thanks go to Clustervision for their assistance in getting this done quickly.
* [[How to Get Help]]
 
* [[Cluster Architecture Overview]]
=== Maintenance March 23rd 2016 ===
There will be maintenance, mainly firmware updates, between 8h and 20h CET. Because the network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will also experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!
 
=== Maintenance June 17th 2015 ===
There will be maintenance, mainly firmware updates, between 8h and 20h CET. Because the network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will also experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!
 
=== Maintenance November 26th 2014 ===
There will be maintenance, mainly firmware updates, between 8h and 20h CET. Because the network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will also experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!
 
=== Maintenance June 11th 2014 ===
There will be maintenance, mainly firmware updates, between 8h and 13h CET. Because the network controller and storage controller firmware will be upgraded, all servers need to be rebooted and will also experience network hiccups. To prevent job and/or data corruption, Anunna will be shut down during this maintenance window. Running jobs will be killed!

Latest revision as of 07:16, 19 June 2026

Anunna is taken down for planned maintenance — firmware and software updates — roughly twice a year. Typically one downtime is scheduled in April or May, in a week free of teaching, and another in October, to disrupt usage as little as possible.

During a downtime the cluster is unavailable. Jobs are not scheduled to run across a planned downtime, so check the announced dates when planning long jobs.
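
As a rough illustration of checking a long job against an announced downtime, the sketch below tests whether a job's requested walltime would end before the downtime begins. It assumes the scheduler holds back any job whose requested time limit overlaps the maintenance window. The dates and function names are hypothetical and not taken from this page; always use the announced dates.

```python
from datetime import datetime, timedelta


def fits_before_downtime(start: datetime, walltime: timedelta,
                         downtime_start: datetime) -> bool:
    """Return True if a job starting at `start` ends before the downtime begins."""
    return start + walltime <= downtime_start


def max_walltime(start: datetime, downtime_start: datetime) -> timedelta:
    """Longest walltime that still ends before the downtime (never negative)."""
    return max(downtime_start - start, timedelta(0))


# Hypothetical example dates, for illustration only.
downtime_start = datetime(2030, 5, 6, 8, 0)   # assumed start of the downtime
submit_time = datetime(2030, 5, 4, 8, 0)      # assumed job start time

print(fits_before_downtime(submit_time, timedelta(days=3), downtime_start))    # False
print(fits_before_downtime(submit_time, timedelta(hours=36), downtime_start))  # True
print(max_walltime(submit_time, downtime_start))                               # 2 days, 0:00:00
```

If the requested walltime does not fit, either request a shorter time limit or expect the job to stay queued until after the downtime.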

Announcements

Planned maintenance is announced in advance — watch the Message of the Day on the Apps Portal and the usual support channels.


Reporting problems

If something is not working — during or outside maintenance — see How to Get Help.

See also