08 July 2003

Network outages

The recent network outages were caused by building work in the Library.

Over the weekend of 27-29 June 2003 the electrical switchboard in the Library was replaced, and the network was unavailable in the Division while the work was carried out. The network comes into the University in Building 10, goes through the Library to Building 5, then to Building 20 and finally to Building 9, where the Division's servers and other network resources are housed.

No power in Building 8 meant the network routers and switches were down, and so there was no network communication to or from the Division.

When power was restored to the equipment in Building 8, one of the vital pieces of equipment failed, and our network stayed down until a part could be replaced on Tuesday 1 July.

While the network was down, staff were unable to send or receive email, had no access to central file storage and, by and large, could not print. There was no access to any of the Division's web servers from on or off campus, so for four days no-one had access to information about the Division and its services.

With the network becoming an increasingly important service to the Division, such outages need to be avoided if at all possible. To minimise outages, there needs to be a more professional approach to the management of the University's network, with better network planning and the building of redundancies or alternative routes into the system.

Web servers that must be reliably available around the clock should be placed (in a network sense) as close as possible to the University's connections to the Internet so that, among other things, the availability of the Web pages is unaffected by outages within the University's internal network.

Contingency planning should be undertaken so that, if the network is broken in one place, services can continue through alternative routing. Completing the loop between Building 9 and Building 1 may be required to help provide alternative routes.
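
The effect of a loop can be illustrated with a small sketch. It models the route described earlier (Building 10, the Library, Building 5, Building 20, Building 9) as a simple graph and lists the locations whose loss would cut Building 9 off from Building 10. The function names are made up for illustration, and the extra link is an assumption: this notice does not say how Building 1 connects to the rest of the network, so a direct link from Building 9 back to Building 10 stands in for the completed loop.

```python
from collections import deque

# Route as described in this notice: entry at Building 10, ending at Building 9.
CHAIN = [
    ("Building 10", "Library"),
    ("Library", "Building 5"),
    ("Building 5", "Building 20"),
    ("Building 20", "Building 9"),
]

# ASSUMPTION: a stand-in for the completed loop. The real proposal runs via
# Building 1, whose connections are not described in the notice.
HYPOTHETICAL_LOOP = [("Building 9", "Building 10")]


def reachable(links, start, goal, failed=None):
    """Breadth-first search from start to goal, treating one node as failed."""
    if failed in (start, goal):
        return False
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, start, goal):
    """Intermediate locations whose failure alone disconnects start from goal."""
    nodes = {n for link in links for n in link} - {start, goal}
    return sorted(n for n in nodes if not reachable(links, start, goal, failed=n))


if __name__ == "__main__":
    print(single_points_of_failure(CHAIN, "Building 10", "Building 9"))
    print(single_points_of_failure(CHAIN + HYPOTHETICAL_LOOP, "Building 10", "Building 9"))
```

With the route as it stands, every intermediate location is a single point of failure. With the assumed extra link in place, none is, because traffic can take the other way around the loop.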

Centralised services would not have helped in this instance: with the Library routers down, the Division would have been unable to reach them. As it was, centralised services like DHCP and WebCT were also unavailable during the outage.