A power outage is a break in the mains power supply which can last from milliseconds to minutes or even hours. In recent years the number of power outages recorded within Western Europe has increased. Momentary breaks in the electrical supply are increasing due to more severe and disruptive weather conditions, the switching of substation transformers, grid breakdowns and a rising demand for electricity. Demand for electricity continues to rise as economies decarbonise and move to electric transportation whilst also introducing more power from renewables. This is in addition to an increasing dependence on datacentre services fuelled by Edge and 5G connectivity.
The Uptime Institute is well known for its datacentre resilience Tier-rating system and has now announced a new system for rating the severity of power outages. The Tier-rating system has four ratings, running from 1 to 4, with each representing a progressively higher level of resilience and uptime.
The Tier-rating system can be just as easily applied to server rooms as to datacentres. The rating system not only allows for certification by the Uptime Institute but also for colocation datacentres to differentiate themselves when competing for business.
The Tier-ratings also provide an indirect measure of how capably a datacentre can ride through a power outage, as the critical power path can include not only power distribution and UPS systems but also generating sets and even substation transformers.
How the use of ‘UPS as a Reserve’ will impact the Tier-rating system has yet to be decided. Here a UPS system is installed with a lithium-ion battery and may be connected to a National Grid demand side response (DSR) program. The UPS and li-ion battery system can be used to run the entire datacentre load to help reduce demand on the local grid. In return, the datacentre operator receives an annual service connection fee and then a feed-in-tariff-like payment for every minute of operation on battery when the grid mains power supply is available. The use of lithium-ion batteries also allows the UPS system to function as an energy storage system, storing energy either from the grid, as a standard lead-acid battery does, or from renewable power sources including local solar PV, wind turbine or hydro power installations.
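The payment structure described above (a fixed annual fee plus a per-minute payment while running on battery) can be illustrated with a minimal sketch. The `DsrContract` class, the `annual_dsr_income` function and all of the figures are hypothetical placeholders chosen for illustration; they do not represent actual National Grid DSR terms or rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DsrContract:
    """Hypothetical DSR contract terms; every figure is a placeholder."""
    annual_connection_fee: float  # fixed payment per year, in currency units
    payment_per_minute: float     # paid for each minute the load runs on battery


def annual_dsr_income(contract: DsrContract, battery_minutes: list[float]) -> float:
    """Return yearly income: the fixed fee plus per-minute payments for all events."""
    if any(minutes < 0 for minutes in battery_minutes):
        raise ValueError("event durations must be non-negative")
    return contract.annual_connection_fee + contract.payment_per_minute * sum(battery_minutes)


# Illustrative values only: three DSR events in a year, measured in minutes on battery.
contract = DsrContract(annual_connection_fee=5000.0, payment_per_minute=2.5)
events = [30, 45, 15]
income = annual_dsr_income(contract, events)
print(f"Estimated annual DSR income: {income:.2f}")
```

A real assessment would also need to account for battery wear and the autonomy that must be held back for genuine mains failures, neither of which is modelled here.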
The new Outage Severity Rating (OSR) from the Uptime Institute is designed to help critical infrastructure and datacentre operators to better understand and classify the severity of outages in terms of how the outage incidents affect operations. The OSR system is the outcome of a three-year project monitoring power outages and investigating their causes and impact for digital infrastructure owners. From 2016 to 2017, the Uptime Institute recorded a 288% increase in power outages, with the increase proportionally related to the increasing complexity of the digital operation, whether this was in-house, colocation, cloud or some combination of these. The single biggest cause of outages was, however, power-related.
In less complex server room and datacentre environments an outage could be considered as a binary event, with the services provided being either ‘online’ or ‘offline’. The Uptime Institute OSR is designed to help organisations focus on service resilience and interdependencies and to build in the appropriate Tier-rating to ensure business continuity. There are five classification ratings for outage severity.
For more information visit: https://missioncriticalpower.uk/uptime-institute-announces-outage-severity-rating/
Both the Tier and OSR systems are linked by their use of critical power and cooling systems. Power-related incidents are the biggest cause of outages and can be mitigated by adopting a progressive approach that builds redundancy into each section of the critical power path.
The same can be said for cooling systems where, even when power is present, a failure of a computer room air conditioner or liquid cooling system can lead to over-temperature conditions, failure of a cold-aisle containment system and a potential cooling-related outage.
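The benefit of building redundancy into each stage of a power or cooling path can be illustrated with standard availability arithmetic. The sketch below assumes that failures are independent and that any one unit in a redundant pair can carry the full load; the 0.999 availability figure and the three-stage path are hypothetical and are not taken from the Uptime Institute Tier definitions.

```python
from math import prod

HOURS_PER_YEAR = 8760


def series_availability(stages: list[float]) -> float:
    """Availability of a chain in which every stage must be working."""
    return prod(stages)


def redundant_availability(unit: float, count: int) -> float:
    """Availability of `count` identical parallel units where any one can carry the load."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return 1.0 - (1.0 - unit) ** count


# Hypothetical three-stage path (for example distribution, UPS, generator),
# each stage assumed to be 99.9% available on its own.
single_path = series_availability([0.999, 0.999, 0.999])

# The same path with every stage duplicated.
dual_path = series_availability([redundant_availability(0.999, 2)] * 3)

for label, availability in (("single path", single_path), ("duplicated stages", dual_path)):
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    print(f"{label}: availability {availability:.6f}, expected downtime {downtime_hours:.2f} h/year")
```

In practice failures are rarely fully independent (a shared switchboard or a common control fault can take out both units), so figures like these are best treated as an upper bound.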
Using the Tier-rating system a server room or datacentre operator can judge how to provide the most appropriate level of power protection. The system is relatively straightforward from both a design and an audit or certification perspective. The Outage Severity Rating system can also be used as an audit tool for predicting how severe an outage could be to the operation of an installation, as well as for classifying the actual impact for organisational or industry reporting.
The datacentre industry is poor at sharing learning from outages and there is no statutory or association-led obligation to do so. Only when a severe outage is publicly reported in the press is information sometimes shared later as part of an investigation. The introduction of the Outage Severity Rating system may help to turn the tide on this, as the ratings provide a standardised way for organisations to classify the outages they experience.
As the number of outages continues to increase, it is conceivable that most server rooms and datacentres experience the less severe classifications of outage at least once a year. Shared experience will help to improve “server room and datacentre design”: https://www.serverroomenvironments.co.uk/server-room-design as well as increase their availability within an ever more complex environment.