Introduction
After taking delivery of a container data center from Huawei, you should test it. The goal is to find possible problems and fix them before you start using the new data center.
Description of the infrastructure under test
Once the installation of the Huawei FusionModule 1000A modular data center was complete, we started comprehensive testing. As part of these tests, we checked how the power and air conditioning systems behave.
The structure of our data center is as follows:
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | C1 | B1 | A1
D2 | C2 | B2 | A2
Layout of cabinets and separation into hot and cold aisles
Crack-A to Crack-D are the indoor units of the air conditioners, and A1, A2 and so on up to D2 are the cabinets for information and communication equipment (IT cabinets). As the layout shows, the data center has four air conditioners that take air from the hot aisles, cool it and blow it into the cold aisles. On the ceiling of the cold aisles, between the rows of cabinets, there are temperature and humidity (T/H) sensors whose readings are used to determine the optimal operating mode of the air conditioners. The air conditioners can work in "rotation mode", in which one of them is always switched off and the rest work as needed; the idle air conditioner changes automatically once a day. Since we planned to use this mode, we tested it as well. In our old server rooms, such a mode alternates conventional air conditioners, which can either cool at full power or not cool at all.
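The rotation mode can be pictured as a simple daily schedule. The sketch below is only an illustration: the text does not say in which order the units are rested, so the A, B, C, D order and the names `standby_unit` and `active_units` are hypothetical.

```python
# Hypothetical rotation order; the real controller's order is not documented here.
UNITS = ["A", "B", "C", "D"]

def standby_unit(day_index, units=UNITS):
    """Return the unit that is switched off on the given day (0-based)."""
    return units[day_index % len(units)]

def active_units(day_index, units=UNITS):
    """Return the units that remain available for cooling on the given day."""
    off = standby_unit(day_index, units)
    return [u for u in units if u != off]

if __name__ == "__main__":
    for day in range(5):
        print(day, standby_unit(day), active_units(day))
```

Whatever the real order is, the point is the same: on any given day only three of the four units are available to cool the aisles.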
How we ran the tests
Once the temperature inside the container had stabilized, we connected a 3 kW heat gun in each of the IT cabinets to simulate the operation of servers. The heat guns drew air from the cold aisle and blew it into the hot aisle. This let us load the uninterruptible power supplies (UPS) and the air conditioners at the same time.
The tests were run in several stages that differed in which IT cabinets had heat guns connected. Each stage lasted a couple of hours. By moving the load between IT cabinets, we deliberately simulated the most challenging load imbalance conditions. At first, of course, we simply loaded all the IT cabinets, and then we moved on to loading individual rows of IT cabinets.
Let's consider the stages of testing in more detail.
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | 3 kW | 3 kW | 3 kW
3 kW | 3 kW | 3 kW | 3 kW
Scheme of the first stage
In the first stage, we applied the maximum load planned for the tests to all IT cabinets, which allowed us to check the overall performance of the cooling and power systems.
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | 3 kW | B1 | 3 kW
D2 | 3 kW | B2 | 3 kW
Scheme of the second stage
Starting with the second stage, we reduced the overall load but began to unbalance it. First, we loaded only the IT cabinets on one side of each cold aisle.
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | C1 | 3 kW | A1
6 kW | C2 | 3 kW | A2
Scheme of the third stage
Then we moved the load to the opposite cabinets. This shows how the air conditioning copes with a strong imbalance of heated air flows.
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | C1 | B1 | A1
3 kW | 3 kW | 3 kW | 3 kW
Scheme of the fourth stage
Then we loaded the cabinets farthest from the air conditioners. This lengthens the air path between the air conditioners and the equipment that consumes the cooled air.
Crack-D | T/H | Crack-C | Crack-B | T/H | Crack-A
UPS 1 | UPS 2 | Battery | 3 kW | 3 kW | A1
D2 | 3 kW | 3 kW | A2
Scheme of the fifth stage
In the last stage, we loaded only the "internal cabinets". This should create a very high heat load between two of the air conditioners and virtually none for the other two.
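The five stage schemes can be written down as a small load map to compare the total load and its distribution across the cabinet columns. This is a minimal sketch: the kW values are read from the schemes, assuming each cell sits in the same position as in the first layout, and the names `STAGES`, `total_load_kw` and `load_by_column` are illustrative.

```python
# Heat-gun load per IT cabinet (kW) for each test stage, read from the schemes.
# Assumption: cell positions match the cabinet positions in the first layout.
# Cabinets not listed carried no load during that stage.
STAGES = {
    1: {"C1": 3, "B1": 3, "A1": 3, "D2": 3, "C2": 3, "B2": 3, "A2": 3},
    2: {"C1": 3, "A1": 3, "C2": 3, "A2": 3},
    3: {"B1": 3, "D2": 6, "B2": 3},
    4: {"D2": 3, "C2": 3, "B2": 3, "A2": 3},
    5: {"C1": 3, "B1": 3, "C2": 3, "B2": 3},
}

def total_load_kw(stage):
    """Total heat-gun load applied during a stage, in kW."""
    return sum(STAGES[stage].values())

def load_by_column(stage):
    """Load per cabinet column (A-D), keyed by the cabinet's letter."""
    columns = {letter: 0 for letter in "ABCD"}
    for cabinet, kw in STAGES[stage].items():
        columns[cabinet[0]] += kw
    return columns

if __name__ == "__main__":
    for stage in STAGES:
        print(stage, total_load_kw(stage), load_by_column(stage))
```

Under these assumptions the first stage totals 21 kW and each later stage totals 12 kW, so stages two to five differ only in where the heat is applied.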
Verification process
As early as possible, we went to the data center and took readings from the UPS and the air conditioners. Then we switched on all the heat guns, checked that all the doors separating the cold and hot aisles were closed, and began to monitor what was happening.
After a while, it became clear that the air conditioning system was not behaving quite as we would like, and not quietly either. One of the air conditioners would sometimes start working at full capacity while the rest did not help it. To find out why, we changed which air conditioner was switched off by the rotation mode and continued testing.
It is worth noting that it was much quieter outside the data center than inside, although the air conditioning running at maximum cooling was clearly audible from outside. It became clear that the problem was not in a specific air conditioner but somewhere else. Even after we changed which air conditioner was switched off, one of the units would from time to time start freezing the entire data center while the other two stood idle. The symptom was that the air conditioners that did not overcool all showed the same temperature, while the overcooling unit showed a different one. After analyzing this, the Huawei engineer found that the cause was the temperature setting. The air conditioners in the container data center were configured in pairs: A with D and B with C. The first pair takes the cold aisle temperature to be the maximum of the T/H sensor readings, while the second pair uses the average of the readings.
One of the air conditioners that was supposed to lower the temperature in the cold aisle based on the maximum T/H reading was switched off. The second air conditioner responsible for this aisle calculated the average temperature (16°C), which did not exceed its switch-on threshold. Meanwhile, the air conditioner set to the maximum temperature but located in the other cold aisle turned on and did its best to bring down the heat in the data center (it assumed the temperature was 24°C). But because it was far from the place where that temperature was measured, it could not do so effectively. To solve this, all air conditioners were switched to maximum temperature mode. This fixed the incorrect behavior of individual air conditioners, and further testing was more predictable. Below are graphs of the power consumption on the lines of the two UPS units and of the temperatures.
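The effect of the mixed settings can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: the two sensor readings (24°C and 8°C) are hypothetical values chosen so that their maximum and average match the 24°C and 16°C mentioned in the text, and the 20°C switch-on threshold and the function names are assumptions, not Huawei settings.

```python
from statistics import mean

# Hypothetical readings of the two T/H sensors, in degrees Celsius.
# Chosen so that max = 24 and average = 16, as in the text.
READINGS_C = [24.0, 8.0]

# Hypothetical switch-on threshold; the real setpoint is not given in the text.
THRESHOLD_C = 20.0

def control_temperature(mode, readings):
    """Temperature a unit assumes for the cold aisle in the given mode."""
    if mode == "max":
        return max(readings)
    if mode == "avg":
        return mean(readings)
    raise ValueError(f"unknown mode: {mode}")

def wants_cooling(mode, readings, threshold=THRESHOLD_C):
    """True if the unit would switch on under the assumed threshold."""
    return control_temperature(mode, readings) > threshold

if __name__ == "__main__":
    for mode in ("max", "avg"):
        print(mode, control_temperature(mode, READINGS_C),
              wants_cooling(mode, READINGS_C))
```

With these numbers a unit in maximum mode assumes 24°C and starts cooling, while a unit in average mode assumes 16°C and stays idle, regardless of which aisle either unit actually serves.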

Graph of electrical power consumed during testing (green line - UPS 1, blue - UPS 2)

Graph of temperature recorded by air conditioners
This graph clearly shows how the air conditioner readings split at the beginning of the tests. One pair used the temperature from the warmer aisle, and the other averaged the readings.

Graph of air temperature reported by the UPS
At the beginning of the graph, the temperature is clearly high: at this time the air conditioner near the UPS was switched off. Then the temperature began to drop sharply; this is the period when the air conditioners were operating incorrectly. At the end, the temperature stabilized.
Conclusion
In conclusion, I want to note that the rotation mode in the Huawei FusionModule 1000A should not be considered the main mode of operation. This became clear almost immediately after testing started: the air conditioners adjust their power very smoothly, so even when there are more of them than needed they do not freeze the data center into an icicle when they turn on. Their behavior is much smarter than that of conventional on/off air conditioners.
At the same time, the way they work doesn't seem quite right to me. In my opinion, it would be better to use two temperature sources in each air conditioner: the nearest T/H sensor and the calculated value (maximum or average). The operating mode would then be calculated from the maximum of these two values. This would avoid the problem we found when one of the air conditioners is switched off.
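The rule proposed here can be sketched as follows. This only illustrates the idea and is not an existing FusionModule setting: the function name, the parameter names and the sample readings are assumptions.

```python
from statistics import mean

def proposed_control_temperature(nearest_c, all_readings_c, mode="max"):
    """Combine the nearest T/H sensor with the calculated value.

    nearest_c      -- reading of the T/H sensor closest to the unit
    all_readings_c -- readings of all T/H sensors
    mode           -- how the calculated value is derived: "max" or "avg"
    """
    if mode == "max":
        calculated = max(all_readings_c)
    elif mode == "avg":
        calculated = mean(all_readings_c)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # The unit reacts to whichever of the two sources is warmer.
    return max(nearest_c, calculated)

if __name__ == "__main__":
    readings = [24.0, 8.0]  # hypothetical sensor values, degrees Celsius
    # Unit in average mode next to the warm sensor: reacts to 24, not 16.
    print(proposed_control_temperature(24.0, readings, mode="avg"))
    # Unit in average mode next to the cool sensor: still sees the average.
    print(proposed_control_temperature(8.0, readings, mode="avg"))
```

With this rule, a unit in average mode that sits next to an overheated aisle would still react to its local sensor, so the aisle would not depend on a more distant unit.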