Power cuts are inevitable, and within a server room or datacentre environment they can result in system downtime, errors and hardware failures. The electricity utilities continue to invest in their critical infrastructure to ensure that power outages are quickly identified and corrected. Substations are automated for rapid-response power redistribution and engineering teams can be quickly despatched to sites.
Servers within a computing facility therefore require some form of backup power. Most often this is in the form of an uninterruptible power supply and a standby power generator. The UPS system provides backup power from its battery set and the generator (if installed) provides power to the UPS system for longer runtime periods.
For more information on live power cuts as recorded by a UK electricity utility, their causes and frequency of occurrence visit:
Specifying UPS Systems
One of the first steps when deciding on a UPS system for a server room or datacentre environment is to assess the total load power demands. This is required in order to ensure that the uninterruptible power supply is sized correctly from day one and to provide capacity for future expansion. Key factors to assess include:
- Critical Power loads: the first step is to identify the key IT network items that need to be protected by the UPS system. These are the ‘critical loads’ that must be kept running during a power outage. Examples include not only servers but any associated IT peripherals including network routers and switches, SAN devices, Wi-Fi access points and environment monitoring. The UPS is sized to support the critical power load.
- Essential Loads: these are the loads that support the overall IT environment. Key here are the cooling and lighting used within the server room or data centre environment. Security, access control and fire suppression could also fall into this area. These loads do not tend to be supported by the UPS system. They are typically powered by a suitably sized backup power generator. This is because they can be less sensitive to short power outages, and including them in the UPS sizing would lead to an oversized system due to the need to factor in pump and motor sizes.
- Non-essential Loads: these are building support systems considered non-mission critical. Emergency lighting systems have their own standby power backup, and other electrical circuits for printing or ventilation may not be considered critical enough to be powered from a generator.
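The load assessment above can be sketched as a short calculation. This is a minimal illustration only: the load names, wattages and the 25% growth allowance are hypothetical figures chosen for the example, and a real sizing exercise would also consider power factor, inrush currents and the manufacturer's guidance.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    watts: float
    category: str  # "critical", "essential" or "non-essential"


def ups_load_watts(loads, growth_margin=0.25):
    """Sum the critical loads and add headroom for future expansion.

    growth_margin is an assumed allowance (25% here), not a standard value.
    """
    critical = sum(load.watts for load in loads if load.category == "critical")
    return critical * (1 + growth_margin)


# Hypothetical inventory for a small server room.
loads = [
    Load("servers", 4000, "critical"),
    Load("switches and routers", 600, "critical"),
    Load("SAN", 1200, "critical"),
    Load("cooling", 5000, "essential"),      # generator-backed, not on the UPS
    Load("printers", 300, "non-essential"),  # not backed up at all
]

required_watts = ups_load_watts(loads)
print(f"UPS should support at least {required_watts:.0f} W")
```

Only the loads tagged as critical contribute to the UPS figure; the essential loads would be added separately when sizing the generator.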
Mono block and Modular UPS Systems
Once the IT load has been calculated, decisions then have to be made as to how to provide the load(s) with uninterruptible power. Mono block UPS systems are single uninterruptible power supplies. These are typically desktop, rack mount, tower and floor standing systems that can be installed as single backup power units.
Some larger mono block UPS systems (from 10kVA upwards) can be installed with parallel cards to allow parallel/redundant configurations of N+X. Should one of the mono block UPS systems fail or be placed in an overload condition, the load remains protected. Mono block UPS with a parallel option can also be scaled by adding similarly sized UPS systems into the installation.
Modular UPS systems are different. These UPS systems are designed to use a UPS frame into which UPS modules are placed. The frame sizes are designed to house a maximum number of UPS modules. A typical system can run with just one module or with two or more to provide N+X resilience or power scalability.
Modular UPS systems provide a means to vertically scale the backup power protection provided by an uninterruptible power supply. Horizontal scalability can be achieved by adding another UPS frame in parallel to the installation.
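As a rough illustration of N+X sizing for a modular system, the sketch below works out how many modules and frames an installation would need. The function names, module rating and frame capacity are assumptions made for the example rather than figures from any particular manufacturer.

```python
import math


def modules_required(load_kw, module_kw, redundant_modules=1):
    """Return N + X: modules needed to carry the load, plus X spares."""
    n = math.ceil(load_kw / module_kw)
    return n + redundant_modules


def frames_required(modules, slots_per_frame):
    """Return how many frames are needed to house the given modules."""
    return math.ceil(modules / slots_per_frame)


# Hypothetical example: 70 kW load, 25 kW modules, 6-slot frames, N+1.
modules = modules_required(load_kw=70, module_kw=25, redundant_modules=1)
frames = frames_required(modules, slots_per_frame=6)
print(f"{modules} modules in {frames} frame(s)")
```

Here three modules carry the load (N = 3) and a fourth provides redundancy, all within a single frame. Growth beyond the frame's slot count is the point at which a second frame would be added in parallel.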
Backup Runtime Periods
Whether the UPS installation uses a mono block or modular UPS system, the question of how long to provide battery power for must be assessed. It may be possible to shut down a small computer room in a relatively short space of time, especially if there are only one or two servers. A 10-15 minute UPS battery may suffice if the shutdown process is automated using UPS monitoring and shutdown software.
For larger server rooms and datacentres, it may be impossible to shut down a facility easily. This is due to the complex nature of IT operations. Some datacentres may never have experienced a complete power shutdown and require a period of at least 24 hours to rebuild their networks if there is no disaster recovery or mirror site. Server failure at power-on can be an issue, as can the loss of software patches and special configuration settings.
In these instances a 10-15 minute battery only provides enough power for the start-up of a local backup generator. Most generators can start within 1-2 minutes, from which point they can provide full power. The UPS battery provides enough time for the generator to start, with a safety margin should a problem occur. Even then the margin of error in the battery runtime period may not be enough for a poorly maintained or irregularly tested generator.
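The relationship between battery runtime and generator start-up can be checked with some simple arithmetic. The sketch below is an assumption-laden illustration: the number of start attempts and the minimum margin are hypothetical planning values, not recommendations.

```python
def generator_start_margin(battery_minutes, generator_start_minutes,
                           start_attempts=1):
    """Minutes of battery left once the generator is supplying power.

    start_attempts models a generator that needs several tries to start,
    each assumed to take generator_start_minutes.
    """
    return battery_minutes - generator_start_minutes * start_attempts


def battery_is_sufficient(battery_minutes, generator_start_minutes,
                          start_attempts=3, min_margin_minutes=2):
    """True if the battery covers the assumed start attempts plus a margin."""
    margin = generator_start_margin(
        battery_minutes, generator_start_minutes, start_attempts
    )
    return margin >= min_margin_minutes


# A 10 minute battery with a generator that takes 2 minutes per attempt.
print(generator_start_margin(10, 2))                   # clean first start
print(battery_is_sufficient(10, 2))                    # up to three attempts
print(battery_is_sufficient(10, 2, start_attempts=5))  # poorly maintained set
```

With a clean start the battery has plenty in reserve, but the margin disappears quickly if the generator needs repeated attempts, which is the risk with a set that is not tested regularly.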
Removing Single Points of Failure
The design of the critical power path and its resilience is key to the ability of the server room or datacentre to operate during a power outage. Redundancy can be built into each layer within a power protection plan and this is one aspect of the Uptime Institute’s Tier-rating and certification programme (https://uptimeinstitute.com/).
A building incomer can be powered from two separate substations to provide ‘A’ and ‘B’ supplies. This concept can be extended all the way to the power supplies within the servers themselves, which are supplied from separate PDUs, UPS systems and backup power generators.
Any chain is only as strong as its weakest link, and this is the point of looking for single points of failure. Even with a UPS system installed, it is important to identify what could go wrong and to mitigate it. As well as the hardware, the electrical installation should be scrutinised for short circuit clearances and discrimination.
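One way to look for single points of failure is to model the power path as a simple graph and test what happens when each component fails in turn. The sketch below assumes a hypothetical layout and component names; it checks connectivity only and does not consider capacity, overload or electrical discrimination.

```python
from collections import deque


def is_powered(feeds, sources, load, failed=None):
    """Return True if the load can be reached from any source.

    feeds maps each component to the components it supplies.
    failed names one component to treat as out of service.
    """
    queue = deque(s for s in sources if s != failed)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        if node == load:
            return True
        for downstream in feeds.get(node, []):
            if downstream != failed and downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return False


def single_points_of_failure(feeds, sources, load):
    """List components whose failure alone would drop the load."""
    components = set(feeds) | {n for outs in feeds.values() for n in outs}
    components.discard(load)
    return sorted(
        c for c in components
        if not is_powered(feeds, sources, load, failed=c)
    )


# Hypothetical single-feed design: one utility supply and one UPS.
single_feed = {
    "utility": ["ups"],
    "ups": ["pdu_a", "pdu_b"],
    "pdu_a": ["server"],
    "pdu_b": ["server"],
}
print(single_points_of_failure(single_feed, ["utility"], "server"))

# Hypothetical A/B design: two fully separate paths to a dual-corded server.
dual_feed = {
    "substation_a": ["ups_a"], "ups_a": ["pdu_a"], "pdu_a": ["server"],
    "substation_b": ["ups_b"], "ups_b": ["pdu_b"], "pdu_b": ["server"],
}
print(single_points_of_failure(
    dual_feed, ["substation_a", "substation_b"], "server"
))
```

In the single-feed layout the utility supply and the UPS are both flagged, whereas the A/B layout returns an empty list because no single component failure removes power from the server.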
The design and installation process for a secure power plan using backup power generators and UPS systems is relatively straightforward. Single points of failure should be removed by using redundancy and by correctly sizing electrical distribution components and the connected power backup systems. It is also vital to regularly test power systems under live or load bank conditions to ensure the integrity of the overall power protection plan, and to have preventative maintenance carried out by manufacturer-certified engineers.