22/10/2020

How Data Centre Monitoring Systems Reduce Downtime

Data Centre Monitoring DCIM Platforms

Temperature and humidity levels in data centres and server rooms are two of the most critical aspects to monitor continuously. For most small-to-medium sized facilities, temperature-related issues cause almost 35% of facility downtime. In the UK we have seen an increase in demand for our server room monitoring solutions as organisations have moved to remote working and have fewer staff, including IT and facilities engineers, on site to respond to alarm conditions. Even with COVID-19 secure practices in place, it can also take more time to get remote service companies to respond to an emergency call out, leading to a greater reliance on internal resources.

Data Centre Environment Monitoring

Power and energy demand within data centres continues to rise. Ten years ago you might expect to see server rack densities demanding 3-5kW of electrical power. Today up to 15kW may be the norm, with some larger facilities pushing 20-30kW per rack. These power levels produce large amounts of heat that must be removed from the racks to maintain operations. If heat levels are allowed to rise, there is a potential risk to the long-term reliability of the servers and storage devices and, more immediately, the potential for a fire.

Not surprisingly, data centres invest large amounts of time and money in their monitoring architectures. Complete facility-wide platforms may be deployed and are often referred to as Data Centre Infrastructure Management (DCIM) platforms. DCIM software is used not just for monitoring but also for capacity planning and other aspects of operational management. A typical example of a DCIM package is the Schneider EcoStruxure platform. Connecting compatible products within the APC and Schneider range to EcoStruxure is straightforward, and the data collected can have a dramatic impact when looking to improve data centre efficiency and capacity utilisation.

More information:
https://www.se.com/uk/en/work/campaign/innovation/platform.jsp

Server Room Monitoring using Digital Sensors

Smaller data centres and server rooms may not have the budget to invest in a DCIM package. For these sites, our project engineers will turn to dedicated environment monitoring devices and software packages that are designed to monitor key operational aspects using plug-in sensors and data collection devices.

Temperature is often the most critical environmental factor for server room operators. A fast-rising temperature often indicates failure of the local air conditioning system. Responding quickly to such an alarm event can safeguard the capital investment in the IT network and the data it is processing.
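To illustrate how a fast-rising temperature might be detected, the following is a minimal sketch rather than the logic of any particular monitoring device. The `Reading` type, the function names, the 10-minute window and both thresholds (27°C and 0.5°C per minute) are assumptions chosen for illustration and should be set to suit the room being monitored.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    minute: float   # time of the reading, in minutes since monitoring started
    temp_c: float   # temperature in degrees Celsius


def rate_of_rise(readings, window_minutes=10.0):
    """Return the temperature change in °C per minute over the trailing window."""
    if len(readings) < 2:
        return 0.0
    latest = readings[-1]
    # Keep only the readings that fall inside the trailing window.
    in_window = [r for r in readings if latest.minute - r.minute <= window_minutes]
    oldest = in_window[0]
    elapsed = latest.minute - oldest.minute
    if elapsed <= 0:
        return 0.0
    return (latest.temp_c - oldest.temp_c) / elapsed


def should_alarm(readings, max_temp_c=27.0, max_rise_per_min=0.5):
    """Alarm when the latest reading is too hot or the temperature is climbing fast.

    Both thresholds are hypothetical defaults, not vendor recommendations.
    """
    if not readings:
        return False
    too_hot = readings[-1].temp_c >= max_temp_c
    rising_fast = rate_of_rise(readings) >= max_rise_per_min
    return too_hot or rising_fast


steady = [Reading(0, 22.0), Reading(5, 22.2), Reading(10, 22.1)]
rising = [Reading(0, 22.0), Reading(2, 23.5), Reading(4, 25.0)]
print(should_alarm(steady), should_alarm(rising))
```

Checking the rate of rise as well as the absolute level means an alarm can be raised while the room is still below its upper temperature limit, which gives more time to respond.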

A typical monitoring device is shipped with temperature monitoring as standard. The sensor is either built into the environment monitor or is a plug-in digital sensor attached to a 1-3m cable. Digital sensor cables can also be supplied for combined temperature and humidity monitoring. These combined sensors can be used to collect data on temperature, humidity, heat index and dew point. Combination sensors like these provide a wider set of environmental data, which can lead to faster response times to changes in the local air-conditioned environment and more accurate data for trending and analysis. For most IT departments, temperature and humidity related events can collectively be the cause of over 50% of environment-related operational issues.

Temperature Monitoring

Extremely low or high temperatures in a server room or data centre IT space can lead to extensive and expensive downtime. High heat levels can damage system reliability and lead to erratic operation. Within servers, storage devices and UPS systems, cooling fans must run faster to draw air into the devices for cooling, and this can lead to increased wear and a greater need for regular service inspections and preventative maintenance.

As well as monitoring temperature levels at the room level, it is also important to consider monitoring temperature at the rack level. With rising power demands within server racks, there is an increased potential for hot spots to occur and for potentially damaging thermal energy to build up. At rack level, temperature should be monitored at three to six points.

These are the top, middle and bottom of a server rack or cabinet, measured at the front, the rear or both. Front and rear temperature monitoring is important in high-kW racks. A high difference between the front and rear readings indicates that heat is being expelled from the racks. A low differential indicates potential issues and possible leakage through poor airflow management design, e.g. leaky side panel fittings.
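The front-to-rear comparison can be sketched in a few lines. This is an illustrative example only: the function names, the sample readings and the 5°C minimum differential are assumptions, and a suitable threshold will depend on the rack load and cooling design.

```python
def rack_differentials(front_c, rear_c):
    """Return the rear-minus-front temperature difference at each height."""
    return {pos: rear_c[pos] - front_c[pos] for pos in front_c}


def flag_low_differential(front_c, rear_c, min_delta_c=5.0):
    """List the positions where the differential is below the threshold.

    min_delta_c is a hypothetical threshold chosen for illustration.
    """
    deltas = rack_differentials(front_c, rear_c)
    return [pos for pos, delta in deltas.items() if delta < min_delta_c]


# Six monitoring points: top, middle and bottom at the front and the rear.
front = {"top": 24.0, "middle": 22.5, "bottom": 21.0}
rear = {"top": 27.0, "middle": 33.0, "bottom": 31.5}

print(rack_differentials(front, rear))
print(flag_low_differential(front, rear))
```

In this sample data only the top of the rack shows a low differential, which would be the place to start checking for air leakage or recirculation.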

Humidity Levels and Dew Points

Humidity and temperature are related environmental issues. If the humidity level is too low, static electricity can build up, with the potential for an electrostatic discharge (ESD). If the humidity level is too high, the air moisture content is high and this can lead to a build-up of condensation and water droplets on cooler surfaces, electrical and network wiring, and connection sockets, and within IT devices themselves. Corrosion, mould and mildew, potential short-circuits, and fire risk can result from this moisture build-up. The ‘Dew Point’ is the temperature to which the air needs to be cooled to reach a relative humidity (RH) of 100%. Dew point is related to humidity level, and when it is too high condensation becomes more likely, which can lead to operational issues.
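As a rough illustration of how dew point relates to temperature and relative humidity, the sketch below uses the Magnus approximation, a widely used formula and not necessarily the method used inside any particular sensor. The function name and the constants chosen (17.62 and 243.12°C, one common parameter set) are assumptions for this example.

```python
import math


def dew_point_c(temp_c, rh_percent):
    """Approximate the dew point in °C using the Magnus formula.

    temp_c is the air temperature in °C and rh_percent the relative
    humidity as a percentage (greater than 0, up to 100).
    """
    if not 0 < rh_percent <= 100:
        raise ValueError("relative humidity must be in the range (0, 100]")
    b = 17.62     # Magnus coefficient (dimensionless), assumed parameter set
    c = 243.12    # Magnus coefficient in °C, assumed parameter set
    gamma = math.log(rh_percent / 100.0) + (b * temp_c) / (c + temp_c)
    return (c * gamma) / (b - gamma)


# Air at 20°C and 50% RH has a dew point of a little over 9°C.
print(round(dew_point_c(20.0, 50.0), 1))
```

Any surface in the room that is colder than the calculated dew point is at risk of condensation, which is why the value is useful alongside the raw humidity reading.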

Heat Index

The ‘heat index’ is an important factor for technicians and engineers working within the IT white space. If the heat index is high, the temperature level may not be comfortable for those working in the environment. This can put those people working within the space at risk and impair their productivity, mood and efficiency. Health & Safety issues can result from not monitoring the heat index. Most people can operate comfortably with a heat index of less than 30°C but 20-25°C is more appropriate for anyone working long term within the environment.
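For readers who want to see how a heat index figure can be derived from temperature and humidity, the following sketch uses the Rothfusz regression published by the US National Weather Service. It is an assumption that this matches what a given sensor reports, and the sketch is simplified: it omits the NWS adjustments for very dry or very humid air and simply returns the air temperature below 80°F (about 26.7°C), where the regression is not intended to be used. The function name is hypothetical.

```python
def heat_index_c(temp_c, rh_percent):
    """Approximate the heat index in °C from temperature and relative humidity.

    Uses the Rothfusz regression, which is defined in °F. Below 80°F the
    air temperature is returned unchanged (a simplification in this sketch).
    """
    t = temp_c * 9.0 / 5.0 + 32.0   # convert to °F for the regression
    rh = rh_percent
    if t < 80.0:
        return temp_c
    hi_f = (
        -42.379
        + 2.04901523 * t
        + 10.14333127 * rh
        - 0.22475541 * t * rh
        - 0.00683783 * t * t
        - 0.05481717 * rh * rh
        + 0.00122874 * t * t * rh
        + 0.00085282 * t * rh * rh
        - 0.00000199 * t * t * rh * rh
    )
    return (hi_f - 32.0) * 5.0 / 9.0   # convert back to °C


# A warm aisle at about 32°C feels noticeably hotter as humidity rises.
print(round(heat_index_c(32.22, 50.0), 1))
print(round(heat_index_c(32.22, 80.0), 1))
```

The same air temperature gives a higher heat index at higher humidity, which is why a combined temperature and humidity sensor is needed to report it.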

Summary

Environment monitoring devices for server rooms and small-to-medium sized data centres are relatively easy to deploy and can make use of the best practice principles that underpin DCIM packages. The entry-level for any computer or server room operation should be temperature monitoring and a typical system can be deployed for under £200. These devices can not only help to improve operational reliability but can provide an almost instantaneous alert to temperature related problems with the local air conditioning system. In addition to temperature monitoring, humidity level, dew point and heat index are other environmental factors, the monitoring of which can provide a comprehensive overview as to the operating environment for critical IT servers, storage devices and networking bridges, routers and switches.

As we move towards a more automated and remote working culture, there is a greater need for onsite monitoring in order to protect uptime and respond quickly to any changes to critical infrastructure systems, including the cooling and power systems. Our project engineers can deliver environmental projects to suit your application, whether it is a small server room temperature monitoring application or the provision of an EcoStruxure data centre monitoring platform.
