Many businesses and organisations are rapidly checking their IT systems in preparation for remote working and the instigation of their business continuity plans. At the best of times, constantly running and accessible IT infrastructure underpins operational performance. In testing circumstances like these, it is even more important to ensure server room resilience and to be able to monitor and control remote systems safely.
Servers and IT networking devices generate heat when running. The greater the processing power of their CPUs, the larger the amount of heat released into their environments. This heat is normally managed by a constantly running server room air conditioner or datacentre-type cooling system, which maintains an overall ambient temperature of 18-25°C.
Whilst datacentres will have dedicated on-site engineering teams and infrastructure management systems, this is not always the case for smaller server rooms. Many businesses with one to five server racks can be classed as smaller-scale IT operations, and it is these types of organisation that will need to quickly re-evaluate their remote working set-ups and monitoring systems.
One of the easiest systems to deploy is a server room temperature monitoring system. Most server rooms operate within a temperature range of 18-25°C. This is achieved through some form of room cooling, typically a server room air conditioner. Most rooms will have only one air conditioner and therefore no resilience. Without cooling, the overall room temperature will start to rise, and within packed server racks the rise can be rapid.
A suitable temperature monitoring system will raise an alarm whenever the room ambient temperature moves outside a pre-set range. For example, an alert could be issued when the temperature rises above 25°C, with the monitoring system sending an email or text message to the IT manager and engineers. Some monitoring systems accept more than one temperature probe, allowing multiple probes to be deployed around the server room and amongst the server racks. Additional probes and detectors can be connected to suitable monitoring devices for humidity, water leakage, smoke and security monitoring.
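The alerting logic described above can be illustrated with a minimal Python sketch. It assumes the 18-25°C range quoted in this article as the alarm limits; the probe names and the `Reading`, `check_reading` and `collect_alerts` helpers are hypothetical and do not correspond to any particular monitoring product.

```python
from dataclasses import dataclass

# Assumed alarm limits, taken from the 18-25°C range quoted in this article.
LOW_LIMIT_C = 18.0
HIGH_LIMIT_C = 25.0


@dataclass
class Reading:
    probe: str            # hypothetical probe label, e.g. "rack-1-top"
    temperature_c: float  # measured temperature in degrees Celsius


def check_reading(reading, low=LOW_LIMIT_C, high=HIGH_LIMIT_C):
    """Return an alert message if the reading is out of range, else None."""
    if reading.temperature_c > high:
        return f"HIGH: {reading.probe} at {reading.temperature_c:.1f}°C (limit {high:.1f}°C)"
    if reading.temperature_c < low:
        return f"LOW: {reading.probe} at {reading.temperature_c:.1f}°C (limit {low:.1f}°C)"
    return None


def collect_alerts(readings):
    """Check every probe and return the list of alert messages."""
    alerts = []
    for reading in readings:
        message = check_reading(reading)
        if message is not None:
            alerts.append(message)
    return alerts


# Illustrative readings from three hypothetical probes.
readings = [
    Reading("room-ambient", 22.4),
    Reading("rack-1-top", 27.1),
    Reading("rack-2-rear", 17.2),
]

for alert in collect_alerts(readings):
    print(alert)  # a real system would send this by email or text instead
```

In practice the limits would be set per probe, since a probe at the rear of a rack will normally read warmer than one measuring the room ambient.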
Once an alarm has been raised, the IT team can then decide how best to respond. If there is an air conditioning failure, then the complete site may have to be powered down remotely. If the temperature spike is temporary, then the situation may just require continuous monitoring.
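One way to tell a temporary spike from a sustained rise is to look at several consecutive readings rather than a single value. The sketch below is illustrative only: the `classify_event` function, the 25°C limit and the choice of three consecutive readings are assumptions, not settings from any specific monitoring system.

```python
def classify_event(temps_c, high_limit_c=25.0, sustained_count=3):
    """Classify a series of temperature readings (oldest first).

    Returns "sustained" if the most recent `sustained_count` readings are
    all above the limit, "temporary" if some reading exceeded the limit
    but the recent ones have not all done so, and "normal" otherwise.
    """
    recent = temps_c[-sustained_count:]
    if len(recent) == sustained_count and all(t > high_limit_c for t in recent):
        return "sustained"
    if any(t > high_limit_c for t in temps_c):
        return "temporary"
    return "normal"


print(classify_event([22.0, 26.0, 23.0, 22.0]))  # brief spike that recovered
print(classify_event([24.0, 26.0, 27.0, 29.0]))  # temperature keeps climbing
```

A "temporary" result suggests continued monitoring, whereas a "sustained" result is the kind of event that may call for a remote power-down.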
When there is no-one onsite, a server room temperature monitoring system can therefore help to underpin the security of the IT installation and ensure remote working continuity.
The most popular type of server room air conditioner is a wall mounted type. For larger volume rooms or those that are L-shaped or with obstructions, two or more AC units may be installed to provide duty-standby cycling and N+1 resilience. If there is only one air conditioner for the room then this can represent a single-point-of-failure for the entire IT facility. If this is the case, then the AC system should be remotely monitored for alarms and alerts.
A server room temperature monitoring system provides one way to monitor air conditioning unit failure. Any failure will lead to a higher ambient within the room. Some AC units also provide for signal contact card monitoring. Such cooling systems can be connected to a suitable room monitoring system i.e. one that can accept several digital and analogue input probes.
More sophisticated server room air conditioners may also have network-IP connectivity. This is provided by a plug-in card that allows the air conditioning system to be monitored over an Ethernet network. If there is an air conditioning alarm, the unit can be inspected remotely and possibly reset or reconfigured.
Portable air conditioning units are not recommended as they require regular inspection and emptying of their moisture collection trays. It is also not always possible to ensure their operational efficiency. A portable air conditioner generally uses a hose to exhaust hot air, which must be routed to the outside world via a doorway or window. Neither can be efficiently sealed.
Within confined spaces, temperatures can rise quickly if there is insufficient or uneven airflow. Temperature imbalances can be seen in most server racks and cabinets, in-row installations and the wider room environments themselves. Some of this is to be expected and represents ‘normal’ running i.e. a difference in temperature between the cool air intake of a server cabinet as opposed to the warmer temperatures at the rear of the rack.
In preparing for remote working, server rack layouts should be checked to ensure as small a temperature difference as possible from top to middle to bottom. Whilst most electronic devices will operate without performance degradation above 25°C, higher temperatures are to be avoided for sensitive devices.
For server racks with space reserved for future installations, it may be possible to reposition the existing rack mounted devices to provide more space between them and ensure good air flow. Front and rear doors and the use of blanking and side panels (with or without sealing strips) can also help to improve air flow and cooling from front to rear.
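A simple way to quantify the top-to-bottom difference is to compare probe readings taken at three heights in the rack. In the sketch below, the function names and the 5°C review threshold are assumptions chosen purely for illustration; a suitable threshold depends on the rack and the equipment installed.

```python
def vertical_spread(top_c, middle_c, bottom_c):
    """Return the difference between the warmest and coolest rack position."""
    temps = (top_c, middle_c, bottom_c)
    return max(temps) - min(temps)


def needs_review(top_c, middle_c, bottom_c, max_spread_c=5.0):
    """Flag a rack whose vertical spread exceeds the assumed threshold."""
    return vertical_spread(top_c, middle_c, bottom_c) > max_spread_c


# Illustrative intake temperatures for two hypothetical racks.
print(needs_review(27.5, 24.0, 21.0))  # large spread, worth checking airflow
print(needs_review(23.0, 22.0, 21.5))  # small spread
```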
For more information on server room and datacentre thermal guidelines:
Uninterruptible power supplies (UPS) typically have valve regulated lead acid (VRLA) batteries whose working life degrades at high temperature. For reliability, UPS batteries require an ambient temperature of 20-25°C. This requirement applies to all types of UPS system installed with lead acid batteries, including monobloc centralised or modular UPS systems powering a complete room and rack mounted UPS sitting within a server rack.
Any UPS system should be checked for general operating and battery alarms, battery charge percentage and load in terms of Watts or VA. This is important because it allows the estimated battery runtime to be checked and provides an opportunity to increase it, for example by removing non-critical devices from UPS support.
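The effect of removing non-critical loads can be illustrated with a rough runtime estimate. The sketch below is a simplified linear approximation: the battery capacity in watt-hours, the 90% inverter efficiency and the function name are all assumptions for illustration. Real lead acid batteries discharge non-linearly with load, so the runtime shown on the UPS itself or in the manufacturer's runtime tables should take precedence.

```python
def estimated_runtime_minutes(battery_wh, charge_percent, load_w,
                              inverter_efficiency=0.9):
    """Rough linear runtime estimate in minutes.

    battery_wh          -- assumed usable battery capacity in watt-hours
    charge_percent      -- battery charge reported by the UPS (0-100)
    load_w              -- load on the UPS in watts
    inverter_efficiency -- assumed conversion efficiency (hypothetical 0.9)
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    usable_wh = battery_wh * (charge_percent / 100.0) * inverter_efficiency
    return usable_wh / load_w * 60.0


# Hypothetical UPS with 1000 Wh of battery at full charge.
before = estimated_runtime_minutes(1000, 100, 600)
after = estimated_runtime_minutes(1000, 100, 400)  # 200 W of load removed
print(f"Runtime at 600 W: {before:.0f} min")
print(f"Runtime at 400 W: {after:.0f} min")
```

Even in this simplified model, shedding a third of the load extends the estimated runtime by half, which is why identifying non-critical devices is worthwhile before relying on remote operation.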
UPS systems can also be remotely monitored and configured. Simple volt-free contact cards allow for remote alarm signalling and shutdown. An SNMP card provides more sophisticated monitoring and control as the UPS becomes an IP-network managed device. Either of these can be used remotely to notify of a UPS issue with the SNMP route providing remote diagnostics.
It is paramount at the best of times to ensure that there is efficient airflow within a server rack or data cabinet. With the potential for prolonged remote working it is even more imperative to review:
Whichever type of rack or cooling option is chosen, it is important to ensure that all electrical devices are powered from secure backup power supplies, including UPS systems and, where possible, a standby power generator. Without backup power, air conditioners will fail and cooling fans will stand idle during a power outage. Temperatures will rise rapidly, possibly leading to equipment failure and presenting a fire risk.
A final point that can assist remote working is adequate labelling and documentation for the IT server room. This can help anyone looking at a server rack on behalf of a remote worker to quickly identify the equipment or connection cable that they are being asked to review. Where there is not sufficient time to implement a clear labelling and documentation plan, mobile phones can provide a useful fallback through photos, videos, email and support calls.
The most important aspect of any server room environment is the monitoring of its temperature. Sudden temperature rises can indicate operational issues, including overloading or a system component failure. Monitoring allows for a fast response and, if there is no built-in resilience, a fast return to operation (RTO) status for those working on site and for remote workers. What we have tried to provide here is a general overview of the critical infrastructure items to consider before implementing a remote site working plan. Other potential items to consider (outside the scope of this article) include software licences and internet bandwidth.
The Server Room Environments projects team are available 24/7. We are operating as normal (though remotely) and will continue to support not only our clients but anyone needing assistance with their server rooms or datacentre operations or any aspect of their IT systems.
The next few months will see many of us take part in the biggest home working experiment ever. Our local fibre and broadband connections should be able to supply the levels of connectivity needed for remote working. Many local communications services have already been upgraded to support greater use of the internet and media streaming. However, home working could leave remote workers more vulnerable to local power outages. So, what can we do to ensure power availability and uptime?