With many businesses and organisations shifting to remote working during the lockdown period, there has been a dramatic increase in the number of server and computer rooms supported remotely. This is a major change in how organisations work and for many means that their critical IT infrastructures are even more reliant on remote monitoring and control solutions.
For any building, one of the key built environment measurements is ‘temperature’. Most of us can relate to this when we think about our heating systems and the thermostatic control units we use to control them. Thermostats are generally pre-programmed to switch our heating ON/OFF at pre-set times, with a manual override to accommodate weather-related temperature changes.
Within a server room or datacentre environment it is of course the cooling system that dominates the ambient environment. As with a heating system, the cooling solution installed is thermostatically controlled with a manual override facility. What differs is the dramatic effect that a cooling failure can have on the operation of critical IT servers and networking devices.
Most server racks house between 2-5kW of IT devices. Some high-density solutions have far higher power requirements, from 15-30kW at the top end. Within an IT environment, power in kW equates to heat output, and that heat requires constant cooling to maintain an ideal temperature range of 18-25°C.
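As a rough illustration of how rack power translates into a cooling load, the short Python sketch below converts kW into BTU/hr, a unit commonly quoted for air conditioners. The function names are made up for this example, and the calculation assumes that all of the electrical power drawn by the rack ends up as heat in the room.

```python
# Standard conversion: 1 kW is approximately 3412.14 BTU/hr.
BTU_PER_HOUR_PER_KW = 3412.14


def rack_heat_btu_per_hour(rack_load_kw: float) -> float:
    """Approximate heat output of one rack in BTU/hr.

    Assumption: every kW drawn by the IT load is released as heat.
    """
    return rack_load_kw * BTU_PER_HOUR_PER_KW


def room_heat_kw(rack_loads_kw: list[float]) -> float:
    """Total heat load in kW for a room, summed across its racks."""
    return sum(rack_loads_kw)


# Rack sizes taken from the ranges mentioned in the article.
for load_kw in (2, 5, 15, 30):
    btu = rack_heat_btu_per_hour(load_kw)
    print(f"{load_kw} kW rack: about {btu:,.0f} BTU/hr of heat to remove")
```

The figures are only a starting point for sizing; a real cooling design would also allow for lighting, people, solar gain and redundancy.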
One issue for most installations is that the temperature varies within both their server rooms and server racks. The rear exhaust channel behind a server rack will have a higher ambient temperature than the front of the rack, which is usually presented with a cool, air-conditioned airflow. Within a server rack, temperatures can also vary, with hot-spots caused by poor equipment layout and inefficient airflow. Blanking plates, panels and sealing tapes can help to improve airflow efficiency, but there will almost always be temperature differences between the bottom, middle and top of the cabinet as well as between the front and rear channels.
If the room suffers an air conditioner failure, either complete or partial (N+X system), the overall ambient temperature could rise rapidly. Within a server rack, the temperature rise could be dramatic and create a potential fire risk if the IT systems are not powered down. The higher the kW power draw within the server rack, the higher the risk of fire.
One of the simplest solutions to this problem is to install an environment monitoring device. The monitoring device is a data concentrator that gathers information from probes and sensors installed within the room or server racks. The device connects to a local area network via an IP connection, through which its data can be read and reported to a Cloud software platform. Alarm messages can be issued for readings outside pre-set thresholds.
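The threshold check at the heart of such a device can be sketched in a few lines of Python. This is an illustration only, not any vendor's firmware or API: the `Threshold` class, the `check_reading` function and the sensor name are hypothetical, and the 18-25°C limits are simply the ideal range quoted earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Pre-set lower and upper limits for a temperature sensor, in °C."""
    low_c: float
    high_c: float


def check_reading(sensor: str, value_c: float, threshold: Threshold) -> Optional[str]:
    """Return an alarm message if the reading is outside the thresholds, else None."""
    if value_c < threshold.low_c:
        return f"ALARM {sensor}: {value_c:.1f}°C is below {threshold.low_c:.1f}°C"
    if value_c > threshold.high_c:
        return f"ALARM {sensor}: {value_c:.1f}°C is above {threshold.high_c:.1f}°C"
    return None


# Hypothetical sensor name and readings, using the 18-25°C ideal range.
limits = Threshold(low_c=18.0, high_c=25.0)
for reading in (22.0, 27.5):
    message = check_reading("rack-1-front", reading, limits)
    print(message or f"rack-1-front: {reading:.1f}°C is within limits")
```

In practice the alarm message would be passed to an email or text notification service rather than printed.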
Most environment monitoring systems are installed with at least a digital temperature probe. These can typically provide temperature readings within 10-80°C, so the ideal target temperature range for a server room of 18-25°C falls well within their measuring range. Other sensors include combined temperature and humidity probes, and humidity, water leakage, airflow, motion, smoke and power failure detectors.
A temperature probe typically has a 1m cable length and connects to the monitoring device via an RJ45 connector. Other standard cable lengths can include 3m and 10m options. The length of cable is important as the number of temperature probes required is relative to the size of the computer or server room and the number of server racks installed.
Where a server room has a single rack, the probe can be installed on the rack to provide an overall room temperature reading. Temperature monitoring probes can also be placed inside a rack. For high power (kW) density rack installations, six measurement points are ideally recommended, covering the front and rear airflows at the bottom, middle and top of the rack.
Where more than one server cabinet is installed, the temperature monitoring regime could be extended to every other rack and include, in its simplest form, one probe for the room and/or internal temperature probes.
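The six-point layout for a high-density rack can be pictured as two airflow faces crossed with three heights. The Python sketch below lists those positions and works out the front-to-rear temperature rise at each height. The readings are invented for illustration and the function names are hypothetical.

```python
from itertools import product

FACES = ("front", "rear")
LEVELS = ("bottom", "middle", "top")


def probe_positions() -> list[tuple[str, str]]:
    """All six recommended probe positions as (face, level) pairs."""
    return list(product(FACES, LEVELS))


def front_to_rear_rise(readings: dict[tuple[str, str], float]) -> dict[str, float]:
    """Temperature rise across the rack (rear minus front) at each level, in °C."""
    return {
        level: readings[("rear", level)] - readings[("front", level)]
        for level in LEVELS
    }


# Illustrative readings only, in °C.
sample = {
    ("front", "bottom"): 19.0, ("rear", "bottom"): 27.0,
    ("front", "middle"): 20.0, ("rear", "middle"): 30.0,
    ("front", "top"): 22.0, ("rear", "top"): 35.0,
}
for level, rise in front_to_rear_rise(sample).items():
    print(f"{level}: {rise:.1f}°C rise from front to rear")
```

A noticeably larger rise at one height than at the others is the kind of pattern that can point to a hot-spot or a missing blanking plate.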
What should you do when you receive a temperature warning alert? The first thing to recognise is that an alert means that the ambient temperature reading leading to the alarm is outside of the pre-set thresholds. The rise may only be temporary and be related to other factors or it may be significant and last for a longer duration, leading to a potential impact on overall IT resilience.
For example, in a small computer room, a rise to 27°C may take place on a warm and sunny day if the room is south facing. The air conditioning system will work harder (use more energy) to compensate for the temperature rise and the overall IT system resilience remains unimpaired.
If the server room temperature rises to 30°C and stays there for a longer duration, this can indicate a partial loss of cooling or excessive heat being generated by a component within a server rack. Under these conditions it would be necessary to consider powering down IT servers and networking devices to reduce the server rack load.
A longer-term rise above 30°C could indicate total failure of the air conditioning system and the need for a complete remote power down of the facility. Under these conditions a site visit would be required to identify the cause.
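The three scenarios above can be summarised as a simple decision rule. The Python sketch below is one possible way to express it. The 25°C and 30°C figures come from the article, but the `sustained_minutes` and `prolonged_minutes` values are assumptions chosen for the example, since what counts as a "longer duration" will depend on the site.

```python
def classify_alert(
    temp_c: float,
    minutes_elevated: float,
    sustained_minutes: float = 15.0,   # assumption: when a rise stops being "temporary"
    prolonged_minutes: float = 60.0,   # assumption: when a rise counts as "longer-term"
) -> str:
    """Suggest a response to a temperature alert based on level and duration."""
    if temp_c <= 25.0:
        return "normal"
    # Modest or short-lived rises, e.g. 27°C on a warm and sunny day.
    if temp_c < 30.0 or minutes_elevated < sustained_minutes:
        return "monitor"
    # Long-term rise above 30°C: possible total air conditioning failure.
    if temp_c > 30.0 and minutes_elevated >= prolonged_minutes:
        return "power down facility and arrange site visit"
    # Sustained rise to around 30°C: possible partial loss of cooling.
    return "reduce rack load"


print(classify_alert(27.0, 120))  # warm day in a south facing room
print(classify_alert(30.0, 20))   # sustained rise to 30°C
print(classify_alert(33.0, 90))   # long-term rise above 30°C
```

The output is a suggestion for the person receiving the alert, not an automatic action; any shutdown decision should follow the diagnostics described below.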
For any computer room or server rack temperature rise, it is important to carry out detailed diagnostics and to understand the data being presented. If only temperature is monitored, then the range of data may be limited. A more complex monitoring solution can provide more detailed digital and analogue inputs and provide digital outputs for additional control.
A typical server room environment monitoring solution can be installed with a range of sensors and detectors covering temperature, humidity, water leakage, smoke, pressure, vibration, frost, light, airflow and motion. In addition, the monitoring device may be able to monitor digital inputs and provide digital output control.
The term ‘digital’ refers to a simple ‘ON/OFF’ state. Some air conditioners have a relay contact interface. The relays may be normally open (NO) or normally closed (NC). The relay card can be hard wired to the environment monitoring device. If the air conditioner fails, a relay on the interface card opens or closes to signal an alarm condition that requires investigation. Airflow detectors can also be used to monitor the airflow speed and pressure output from an air conditioner. Other digital inputs can include door entry access control panels, fire alarm and smoke detectors, UPS backup battery usage and mains power failures.
Digital outputs provide the monitoring device with the facility to issue a relay signal to a third-party system. A typical feature here could be emergency power off (EPO), which facilitates a complete remote shutdown of a facility.
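How a monitoring device might interpret a relay contact can be shown with a short Python sketch. It assumes the usual convention that a relay is in alarm when its contact has left its normal (resting) state, and the function name is hypothetical. Always check the air conditioner's interface documentation, as wiring conventions vary.

```python
def relay_in_alarm(contact_closed: bool, normally_open: bool) -> bool:
    """Return True if the relay contact has left its normal state.

    Assumed convention:
    - Normally open (NO): open when healthy, closes on alarm.
    - Normally closed (NC): closed when healthy, opens on alarm.
    """
    if normally_open:
        return contact_closed
    return not contact_closed


# The four possible combinations of contact type and measured state.
for normally_open in (True, False):
    for contact_closed in (False, True):
        kind = "NO" if normally_open else "NC"
        state = "closed" if contact_closed else "open"
        status = "ALARM" if relay_in_alarm(contact_closed, normally_open) else "ok"
        print(f"{kind} contact measured {state}: {status}")
```

One practical difference between the two types is that a cut or disconnected cable looks like an open contact, so an NC circuit reports a broken wire as an alarm, whereas an NO circuit would not.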
A temperature monitoring solution for a small computer or server room can cost less than £200 and is relatively easy to install. More complex environment monitoring devices can be used with a larger number of sensors and detectors to provide a more comprehensive overview. The data gathered in real time can be used to ensure IT facilities can be remotely monitored and controlled to support the business continuity of the organisation. Any spikes in operational data will lead to alarm notifications via email and/or text alerts and should be investigated. A small and short-term temperature rise may be due to external ambient changes. Larger and longer duration temperature changes can indicate failure of a critical room component such as an air conditioner or a component within the cooling system.