Data centres provide managed and controlled environments in which to run IT servers and networks. Whether the data centre is an on-premise enterprise facility or a cloud or colocation facility, environmental monitoring is critical to its availability and energy efficiency. Whilst no two data centres are the same, there are shared best practices in terms of the most commonly monitored environmental factors and critical infrastructure systems.
Over 40% of the energy used in a typical data centre is consumed by cooling systems. All electronic devices generate heat and within a data centre, servers account for most of the heat generated. Without sufficient cooling, the ambient temperature within the server room and server racks can quickly become critical and present a fire hazard.
More information on data centre energy usage: https://www.sciencedirect.com/science/article/pii/S1876610217306331
Temperature is the most monitored environmental factor in a data centre or server room. ASHRAE, the American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc., provides several publications to guide the design and implementation of energy efficient data centres and server rooms.
For a data centre or server room, the recommended temperature range is 18-27°C (64-80°F). This range provides a suitably cool environment both for critical servers and other electronic and electrochemical devices (including UPS batteries), and for the engineers and technicians who work there.
Data centre temperature monitoring should be a relatively straightforward concept to deploy. A typical system will consist of an environmental monitoring device or base station and one or more temperature probes. The challenge is where to place the probe or probes to gather sufficient information for accurate ambient temperature readings.
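The basic monitoring logic is simple: compare each probe reading against the ASHRAE-recommended band and raise an alarm when a reading falls outside it. A minimal sketch, with illustrative probe names and values standing in for reads from a real base station:

```python
# Sketch: classifying probe readings against the ASHRAE-recommended band.
# Probe names and values are illustrative; a real deployment would read
# them from the monitoring base station's API or SNMP interface.

ASHRAE_LOW_C = 18.0   # recommended lower bound (64°F)
ASHRAE_HIGH_C = 27.0  # recommended upper bound (80°F)

def classify_reading(temp_c: float) -> str:
    """Return an alarm level for a single ambient temperature reading."""
    if temp_c < ASHRAE_LOW_C:
        return "too-cold"
    if temp_c > ASHRAE_HIGH_C:
        return "too-hot"
    return "ok"

# Illustrative readings from three probe positions (hypothetical values).
readings = {"rack-front-top": 24.5, "rack-rear-top": 29.1, "cold-aisle": 19.0}
alarms = {probe: classify_reading(t) for probe, t in readings.items()}
```

In practice the thresholds would usually be configurable per probe, since a rear-of-rack exhaust probe legitimately runs hotter than a cold-aisle probe.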
Within a server rack, for example, it may be necessary to monitor ambient temperatures at both the front and rear, with pairs of sensors covering the base, middle and top of the rack. This allows a thermal cabinet map to be generated for each server rack and more accurate data to be collected to avoid thermal hot spots.
Whether hot- and cold-aisle containment is deployed can also affect where temperature sensors are placed and how many are used. Containment paths should be monitored to ensure that the cool air presented to the front of the racks has a sufficient Delta-T to the air exhausted into the hot aisle and returned to the cooling system for efficient heat exchange. ASHRAE recommends that the exhaust air be less than 20°C (36°F) above the inlet temperature.
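With paired inlet and exhaust probes per rack, the Delta-T check described above reduces to a simple comparison. A sketch, using illustrative rack names and readings:

```python
# Sketch: checking containment efficiency from paired inlet/exhaust probes.
# The 20°C limit follows the guideline quoted above; readings are illustrative.

MAX_DELTA_T_C = 20.0  # exhaust should be less than 20°C above inlet

def containment_ok(inlet_c: float, exhaust_c: float) -> bool:
    """True if the rack's exhaust-to-inlet Delta-T is within the limit."""
    return (exhaust_c - inlet_c) < MAX_DELTA_T_C

# Hypothetical (inlet, exhaust) pairs per rack.
rack_pairs = {"rack-01": (19.0, 34.0), "rack-02": (19.5, 41.0)}
flagged = [rack for rack, (inlet, exhaust) in rack_pairs.items()
           if not containment_ok(inlet, exhaust)]
```

Here "rack-02" would be flagged, since its 21.5°C Delta-T exceeds the limit; a persistently high Delta-T on one rack often points to a containment or airflow problem local to that cabinet.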
Differential air pressures may also be monitored alongside temperatures to ensure optimal cooling efficiency. Poor air pressures and air flow within a server rack can reduce cooling efficiency, leading to higher energy usage as the cooling system has to ‘pump’ greater volumes of cool air into the space(s). Visible signs of poor containment in a server rack include open front spaces that should be filled with blanking panels, poorly fitted side panels and open front or rear rack doors.
In a smaller server room, a single temperature sensor may be all that is required. Here the sensor may be placed close to the air conditioning unit or on the top or side of a server cabinet.
More information on ANSI/ASHRAE Standard 90.4-2019, Energy Standard for Data Centers:
The recommended humidity level for a data centre or server room is 40-60% rH (relative humidity). The amount of moisture in the air of a cooled environment is controlled by the air conditioner and cooling system. If the air is too dry, static electricity can build up, leading to potential discharges when earthed. If the air is too humid, condensation can build up in hard-to-reach spaces, such as ceiling voids and raised access floor plenums, or when the cool air hits hotter exposed surfaces. High humidity can lead to corrosion, fire risk and mould.
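The condensation risk mentioned above can be estimated from temperature and relative humidity using the standard Magnus dew-point approximation: if a surface is at or below the dew point, moisture will condense on it. A minimal sketch with illustrative readings:

```python
import math

# Sketch: flagging humidity outside the 40-60% rH band and estimating the
# dew point with the Magnus approximation, so condensation risk on cooler
# surfaces can be assessed. Readings are illustrative.

def dew_point_c(temp_c: float, rh_percent: float) -> float:
    """Magnus approximation for dew point (reasonable for 0-60°C)."""
    a, b = 17.62, 243.12
    gamma = (a * temp_c) / (b + temp_c) + math.log(rh_percent / 100.0)
    return (b * gamma) / (a - gamma)

def humidity_status(rh_percent: float) -> str:
    if rh_percent < 40.0:
        return "too-dry"    # static discharge risk
    if rh_percent > 60.0:
        return "too-humid"  # condensation / corrosion / mould risk
    return "ok"

# Example: at 22°C and 50% rH, surfaces below ~11°C risk condensation.
risk_threshold = dew_point_c(22.0, 50.0)
```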
A typical server room or data centre environmental monitoring system will offer humidity sensors or combined temperature and humidity probes. As with temperature monitoring, where to place a humidity sensor depends on the size of the server room or data centre and its layout. The most common placements are central to the room, within the racks and at the points furthest from the doors.
Uncontrolled water and other such liquids within any electronic environment are a ‘no-no’, as they can lead to electrical short circuits and fire hazards. Some data centres are now deploying liquid-based cooling systems, refrigerated doors and even servers immersed in cooling liquids.
In a more typical air-cooled data centre or server room environment, water build-up or ingress can occur due to higher-than-normal humidity, a local burst pipe or external flooding. Water leakage sensors are typically deployed as either spot-sensors or lengths of water leakage detection rope. A spot-sensor is, as its name implies, a single sensor that will alarm if exposed to a drop of water or other similar liquid. A spot-sensor can be used under an air conditioner, for example. Water leakage rope is ideal for the perimeter of a server room and/or in the suspended ceilings or under-floor plenums of a data centre. If any part of the rope is exposed to a drop or more of water or a similar liquid, the sensor signals an alarm to the monitoring base station.
A server room or data centre security system will typically consist of access-controlled door entry. Only people with authorised credentials are allowed into the server room and access-controlled spaces within the data centre.
Access control does not have to stop with door entry and exit. Swing handles with built-in RFID readers can be used on server racks to control who can open the doors and gain access to the equipment within the cabinets. Cabinet security can also be run as a separate access control solution to the main building security system via an appropriate environmental monitoring base unit.
Camera security can also be deployed alongside cabinet and room-door entry control. Continuous camera footage can be streamed into a suitable access control system using an ONVIF feed. If an access control incident occurs, snapshots of the camera feed can be stored alongside the event for later analysis. Alternatively, motion detection cameras can be used which only activate if someone enters the working space being monitored.
A fire suppression system is mandatory in most server rooms and data centres. It is not only good practice but is often a requirement from insurance companies. Whilst the fire suppression system will have its own monitoring and alarm notifications, additional smoke detectors can be connected to an appropriate environmental monitoring system base unit.
The smoke detectors connect to the base unit and provide a digital input signal (dry contact/volt-free contact) when smoke is detected. This leads to the monitoring device issuing an alarm notification through the connected monitoring software platform.
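A dry-contact input is just a binary state the base unit can poll: open means normal, closed means the detector has triggered. A sketch of that polling logic, with a hypothetical `read_digital_input` function simulated in place of a real base-unit API:

```python
# Sketch: treating a smoke detector as a dry-contact digital input.
# `read_digital_input` is a hypothetical stand-in for the base unit's API;
# here it is backed by a simulated state table so the logic is runnable.

contact_states = {"DI1": False, "DI2": True}  # True = contact closed (smoke)

def read_digital_input(channel: str) -> bool:
    """Hypothetical read of a volt-free contact on the base unit."""
    return contact_states[channel]

def poll_smoke_inputs(channels):
    """Return the channels whose contacts have closed (smoke detected)."""
    return [ch for ch in channels if read_digital_input(ch)]

triggered = poll_smoke_inputs(["DI1", "DI2"])
```

The same pattern applies to any dry-contact device, including water leakage controllers and generator fault relays.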
When designing a power protection solution for a server facility, it is best practice to classify and group loads into critical, essential and non-essential loads. Critical loads include servers and IT devices that must be supported by uninterruptible power supplies to provide protection from short power outages and a stable source of AC power. Loads are generally connected to the UPS system(s) either directly or via power distribution units (PDUs).
Essential loads include air conditioning, lighting and security systems that require power, but not of the same quality as that delivered by an uninterruptible power supply. Essential loads must be powered when the mains power supply fails, and these are typically supported by a local standby power generator. For the critical loads, the batteries within the UPS systems provide sufficient backup time, typically up to 30 minutes, for the standby generator to start, or for a starter problem to be resolved if the generator fails to fire up automatically when the mains power supply fails. Non-essential loads are those that can be allowed to ‘drop’ when the mains power supply fails.
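The load classification above can be captured in a simple inventory so the UPS-carried critical load and generator-carried essential load can be totalled during design. All load names and kW figures below are illustrative:

```python
# Sketch: grouping loads by class and totalling what the UPS must carry
# while the generator starts. All load figures are illustrative.

loads_kw = {
    "server-racks": ("critical", 120.0),
    "core-network": ("critical", 15.0),
    "air-conditioning": ("essential", 60.0),
    "lighting": ("essential", 8.0),
    "office-sockets": ("non-essential", 10.0),
}

def total_by_class(loads, load_class):
    """Sum the kW of all loads in the given class."""
    return sum(kw for cls, kw in loads.values() if cls == load_class)

critical_kw = total_by_class(loads_kw, "critical")    # carried by the UPS
essential_kw = total_by_class(loads_kw, "essential")  # carried by the generator
```

The critical total sizes the UPS and its battery autonomy; the critical plus essential total sizes the standby generator.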
UPS systems can be monitored remotely over SNMP and other communications interfaces, including MODBUS and volt-free contact signals. UPS manufacturers provide sophisticated UPS management software to allow data logging, power parameter graphing, email alerts and automatic server shutdown scripts.
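Many UPS SNMP cards expose the standard UPS MIB defined in RFC 1628, whose upsBatteryStatus object reports the battery state as an integer. A sketch of mapping those codes to alert severities; the SNMP fetch itself is device-specific, so a raw integer stands in for it here:

```python
# Sketch: interpreting the upsBatteryStatus value from the standard UPS MIB
# (RFC 1628). The OID below is the standard object; the SNMP fetch is
# device-specific and is replaced here by a plain integer.

UPS_BATTERY_STATUS_OID = "1.3.6.1.2.1.33.1.2.1.0"  # upsBatteryStatus

STATUS_NAMES = {
    1: "unknown",
    2: "batteryNormal",
    3: "batteryLow",
    4: "batteryDepleted",
}

def battery_alert(status_code: int) -> str:
    """Map an upsBatteryStatus code to an alert severity."""
    name = STATUS_NAMES.get(status_code, "unknown")
    if name == "batteryNormal":
        return "info"
    if name == "batteryLow":
        return "warning"
    if name == "batteryDepleted":
        return "critical"
    return "warning"  # treat unknown states cautiously
```

Vendor-specific MIBs usually expose richer data (runtime remaining, input/output voltages), but the standard MIB gives a common baseline across manufacturers.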
Whilst their connected battery sets may be monitored and automatically tested every 24 hours, individual battery blocks are not. For some installations this can be a weakness, because a battery string can collapse under load if one or more batteries are failing. Battery blocks can be tested once or twice a year using hand-held battery testers and the results compared to analyse battery health trends. Alternatively, 24/7 battery monitoring can be deployed using individual sensors attached to each battery that report back to a central battery monitoring unit. This can then be interfaced to the overall environmental monitoring base unit, with alerts issued for alarms and failures. Individual battery health monitoring can also be applied to generator starter batteries: without a sufficiently charged battery, a generator cannot start when the mains power supply fails.
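One simple way to spot a failing block in a string, whether from annual hand-held readings or continuous per-block sensors, is to flag any block whose voltage sags noticeably below the string average. A sketch with illustrative 12V block readings:

```python
# Sketch: flagging weak blocks in a battery string by comparing each block's
# voltage to the string average. Voltages are illustrative float-charge
# readings for nominal 12V blocks; the 5% tolerance is an assumed threshold.

def weak_blocks(voltages, tolerance=0.05):
    """Return indices of blocks more than `tolerance` (as a fraction)
    below the string average - a common sign of a failing block."""
    avg = sum(voltages) / len(voltages)
    return [i for i, v in enumerate(voltages) if v < avg * (1 - tolerance)]

string_voltages = [13.4, 13.5, 12.1, 13.4, 13.5, 13.4]  # block 2 is sagging
suspect = weak_blocks(string_voltages)
```

Trend analysis over successive test rounds is more telling than any single reading: a block that drifts downward between tests is a replacement candidate even before it crosses the threshold.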
Power monitoring can also be deployed in the form of watchdog devices that simply monitor for power failure and report it through their environmental monitoring base units.
There are two ways to monitor sensors and detectors centrally within a server room or data centre.
Environmental monitoring system manufacturers provide software that can run on-premises or in a Cloud environment and connects directly to their devices. The software will provide data capture, logging, analysis, reporting and alert distribution lists, and will be available either free of charge or for a yearly licence fee. The annual fee may depend on the number of devices monitored and how long the data is retained. This type of monitoring software is designed for comms rooms, small-to-medium server rooms and data centres.
An alternative approach for medium-to-large data centres is to deploy a data centre infrastructure management (DCIM) package. Whilst this can be more complex and costly, a DCIM can provide a more comprehensive view of the entire data centre estate to allow for load and capacity planning.
Cost per sensor is an important metric that allows a comparative approach to be taken when looking at how to best deploy server room or data centre environmental monitoring.
Monitoring base units tend to have a fixed number of hardware RJ45 ports for sensors to connect to. A 1-Wire UNI bus can extend this by allowing multiple sensors to be connected to a port at the same time. Digital input (DI) and digital output (DO) connections are also important considerations if external devices are to be monitored using dry contacts and volt-free signals, and if devices are to be instructed to power ON/OFF as an automated response to an alarm condition.
The greater the number of sensors, detectors, DI and DO connections, the more costly the overall installation in terms of fixed wiring. Costs can be reduced using wireless technologies. As well as reducing cost, wireless sensor monitoring allows for faster deployment and provides the option to move monitoring sensors and detectors around a server room or data centre.
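The cost-per-sensor comparison is straightforward arithmetic: spread the base unit cost plus per-sensor hardware and installation costs over the sensor count. A sketch with purely illustrative placeholder prices, not vendor figures:

```python
# Sketch: comparing cost per monitored point for a wired vs wireless layout.
# All prices are illustrative placeholders, not vendor figures.

def cost_per_sensor(base_unit, sensors, unit_sensor_cost, install_per_sensor):
    """Total installed cost divided by the number of sensors deployed."""
    total = base_unit + sensors * (unit_sensor_cost + install_per_sensor)
    return total / sensors

# Wired: cheaper sensors but higher fixed-wiring installation cost per point.
wired = cost_per_sensor(base_unit=500.0, sensors=20,
                        unit_sensor_cost=40.0, install_per_sensor=60.0)

# Wireless: dearer sensors but minimal installation cost per point.
wireless = cost_per_sensor(base_unit=650.0, sensors=20,
                           unit_sensor_cost=70.0, install_per_sensor=10.0)
```

With these assumed figures the wireless layout works out cheaper per sensor; the break-even point shifts with sensor count, cable run lengths and labour rates, which is why the comparison should be rerun per site.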
In a server room or data centre environment, critical infrastructure systems provide the managed and controlled environment in which to run IT operations. The two most critical being power and cooling, followed by emergency protection systems including fire suppression.
Server rooms and data centres are designed to provide secure, managed, and controlled environments in which to run IT operations. In order to ensure uptime and availability, it is important to have a suitable environmental monitoring system in place, but what should you monitor, and how should the data collected be reported and acted upon?