31 October 2006

Network storage outages

Delays in the transition of responsibility for hosting network storage are causing growing concern among Divisional staff.

The Division’s file server, dcenas, is becoming increasingly unreliable. dcenas houses staff home directories and other volumes shared between Divisional staff (and in some cases, students). In April this year the Division arranged with ICT Services to migrate these services to new server infrastructure, which ICT Services is providing to host home directories and shares for staff in the non-Academic Divisions (and eventually the other Academic Divisions as well).

On current estimates, the replacement facility will not be available until early December. The existing facility run by ICT Services to provide network storage to the non-Academic Divisions can’t cope with the Division’s requirements: existing users of the service on Windows already occasionally see lengthy delays, and Macintosh file services are not available. It would be unwise to migrate our Divisional users to the service because the extra load would probably compromise the service even further, and providing the Macintosh users with Macintosh file services could take a lot of effort and increase the unreliability of the system for all users.

Last week dcenas failed a number of times. On Thursday it took the combined effort of the Technical Services Unit and ICT Services to revive dcenas when it looked terminal. A ‘cold reboot’ (removing the power cord, waiting a minute or so, then reconnecting it and starting the computer up again) restored it, and it has been operational since then.

Action Plan

We will continue to use dcenas in its current state until the services can be migrated to ICT Services’ replacement facility in early December, unless of course dcenas fails completely and can’t be revived. If two cold reboots are required within any twenty-four-hour period, dcenas will be pronounced dead and an emergency migration to the existing ICT Services facility will be undertaken. This should take about a day, and the service will be subject to the shortcomings outlined above.
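For anyone keeping a log of reboots, the two-reboots-in-twenty-four-hours rule above can be checked mechanically. The following is only a rough sketch: the function name, the constants and the example times are hypothetical, and it assumes that two reboots exactly twenty-four hours apart count as falling within the period.

```python
from datetime import datetime, timedelta

# Assumed reading of the rule: 2 cold reboots inside any 24-hour window.
WINDOW = timedelta(hours=24)
REBOOT_LIMIT = 2


def emergency_migration_needed(reboot_times):
    """Return True if REBOOT_LIMIT cold reboots fall within one WINDOW."""
    times = sorted(reboot_times)
    span = REBOOT_LIMIT - 1
    # Compare each reboot with the one `span` places later in time order.
    return any(
        times[i + span] - times[i] <= WINDOW
        for i in range(len(times) - span)
    )


# Hypothetical log entries, for illustration only.
example_log = [
    datetime(2006, 11, 1, 10, 0),
    datetime(2006, 11, 2, 9, 0),  # 23 hours after the first reboot
]
print(emergency_migration_needed(example_log))
```

With the hypothetical log above, the two reboots are 23 hours apart, so the check reports that the emergency migration would be triggered.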

To lower the load on dcenas and reduce the effort for migration (emergency or planned), TSU staff will:

  1. move shared volumes (that is, not home directories) to cestaff.canberra.edu.au.
  2. ask Divisional staff to clean up their home directories on dcenas, reducing the number of files and folders to be handled in any planned or emergency migration.

We will continue to track the installation of the new ICT Services facility, and will keep the Division informed of any changes to the current plan to move the directories and volumes to ICT Services in early December. In the meantime, daily backups are being made, so in the event of a total failure of dcenas, very little work, if any, should be lost.