- AI data centers are overwhelming air cooling as power draw and heat continue to rise
- Liquid cooling is becoming essential as server density increases with the growth of AI
- A new hybrid cooling design cuts energy and water use, but faces adoption hesitancy
As AI transforms everything from search engines to logistics, its hidden costs are becoming increasingly difficult to ignore, especially in the data center. The power required to run generative AI is pushing infrastructure beyond what traditional cooling can handle.
To explore the scale of the challenge, I spoke with Daren Shumate, founder of Shumate Engineering, and Stephen Spinazzola, the firm's director of mission critical services.
With decades of experience building major data centers, they are now focused on solving the power and cooling demands of AI. From the failure points of air systems to the promise of a new hybrid cooling design, they explained why AI is forcing data centers into a new era.
What are the biggest challenges in cooling a data center?
Stephen Spinazzola: The biggest challenges in cooling data centers are power, water and space. With high-density computing, such as data centers running artificial intelligence, comes immense heat that a conventional air cooling system cannot remove.
Typical cabinet loads have doubled and tripled with the deployment of AI. An air cooling system simply cannot capture the heat generated by the high kW-per-cabinet loads of AI cabinet clusters.
We have run computational fluid dynamics (CFD) modeling in numerous data centers, and with an air cooling system it shows temperatures above acceptable levels. The airflows we map with CFD show temperatures greater than 115 degrees F, which can cause servers to shut down.
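To make the airflow problem concrete, here is a rough back-of-the-envelope sketch (mine, not from the interview) using the standard sensible-heat relationship for air; the 40 kW AI cabinet load cited later in the interview and a 20°F air temperature rise are illustrative assumptions.

```python
# Back-of-the-envelope: airflow needed to remove rack heat with air cooling.
# Standard sensible-heat relation for air: BTU/hr ~= 1.08 * CFM * delta_T(F).
# The rack loads and temperature rise below are illustrative assumptions.

def required_cfm(rack_kw: float, delta_t_f: float) -> float:
    """Airflow (CFM) needed to carry away rack_kw of heat at a given air delta-T."""
    btu_per_hr = rack_kw * 1000 * 3.412   # convert kW to BTU/hr
    return btu_per_hr / (1.08 * delta_t_f)

if __name__ == "__main__":
    for kw in (10, 40):                   # ~10 kW legacy rack vs. 40+ kW AI rack
        cfm = required_cfm(kw, delta_t_f=20.0)
        print(f"{kw:>3} kW rack -> ~{cfm:,.0f} CFM of supply air")
    # ~1,600 CFM for a 10 kW rack vs. ~6,300 CFM for a 40 kW rack --
    # the latter is very hard to deliver and contain with conventional air systems.
```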
Water cooling can be done in a smaller space with less power, but it requires an enormous amount of water. A recent study determined that a single hyperscale facility would need 1.5 million liters of water per day to provide cooling and humidification.
These constraints pose major challenges for engineers planning the next generation of data centers capable of supporting the unprecedented demand we are seeing for AI.
How is AI changing the picture when it comes to data center heat dissipation?
Stephen Spinazzola: With CFD modeling showing potential server shutdowns under conventional air cooling within AI cabinet clusters, direct liquid cooling (DLC) becomes a requirement. AI is typically deployed in clusters of 20 to 30 cabinets at more than 40 kW per cabinet. That represents a fourfold increase in kW per cabinet with the deployment of AI. The difference is staggering.
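As a quick illustration of what those figures imply (my arithmetic, not the interviewee's), a single AI cluster at the densities quoted above already approaches a megawatt of heat; the 10 kW legacy baseline is assumed from the stated fourfold increase.

```python
# Illustrative cluster heat load using the figures quoted above
# (20-30 cabinets at 40+ kW each; ~10 kW legacy baseline is an assumption).
legacy_kw_per_cabinet = 10
ai_kw_per_cabinet = 40

for cabinets in (20, 30):
    ai_load = cabinets * ai_kw_per_cabinet
    legacy_load = cabinets * legacy_kw_per_cabinet
    print(f"{cabinets} cabinets: AI cluster ~{ai_load} kW vs. legacy ~{legacy_load} kW")
# 20 cabinets -> ~800 kW, 30 cabinets -> ~1,200 kW of heat in a single cluster.
```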
A typical ChatGPT query uses approximately 10 times more energy than a Google search, and that is only for a basic generative function. More advanced queries require substantially more power, routed through an AI cluster farm to handle large-scale computing across multiple machines.
It is changing the way we think about power. Consequently, these energy demands are pushing the industry toward liquid cooling techniques rather than traditional air cooling.
We have talked a lot about cooling; what about actually delivering the power?
Daren Shumate: There are two broad new challenges in delivering power to AI computing: how to move power from UPS output boards to high-density racks, and how to creatively deliver those high power densities from the UPS in the first place.
Power to racks is still delivered either with branch circuits from distribution PDUs to rack PDUs (plug strips), or with plug-in busway above the racks, with the rack PDUs connecting to the busway over each rack. The nuance now is what busway ampacity makes sense for the plug strips and what is commercially available.
Even with plug-in busway available at an ampacity of 1,200 A, power density is forcing the deployment of more separate busway circuits to meet the density and plug-strip requirements. Additional power distribution is a specific and variable requirement driven by individual end users' branch-circuit and distribution preferences.
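To put the 1,200 A figure in context, here is a rough sketch (assuming a 415 V three-phase distribution voltage, unity power factor and an 80% continuous-load derating, none of which are specified in the interview) of how quickly one busway run is consumed by 40 kW racks.

```python
import math

# Rough busway capacity check. Assumptions (not from the interview):
# 415 V three-phase distribution, unity power factor, 80% continuous-load derating.
VOLTAGE_LL = 415.0     # line-to-line volts
AMPACITY = 1200.0      # busway rating from the interview
DERATE = 0.8           # continuous-load derating assumption
RACK_KW = 40.0         # AI rack load from the interview

usable_kw = math.sqrt(3) * VOLTAGE_LL * (AMPACITY * DERATE) / 1000.0
print(f"Usable busway capacity: ~{usable_kw:,.0f} kW")
print(f"Racks per busway at {RACK_KW:.0f} kW each: ~{usable_kw / RACK_KW:.0f}")
# ~690 kW usable -> roughly 17 racks per 1,200 A busway run, before any A/B
# redundancy, which is why dense AI rows end up needing multiple busway circuits.
```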
Depending on site constraints, data center power designs may also employ medium-voltage (MV) UPS. Driven by voltage-drop concerns, MV UPS resolves the need for very large feeder duct banks, but it also introduces new medium-voltage substations into the program. And when considering medium-voltage UPS, another consideration is the applicability of rotary MV UPS versus static MV solutions.
What are the advantages/disadvantages of the various cooling techniques?
Stephen Spinazzola: There are two types of DLC on the market today: immersion cooling and cold plate cooling. Immersion cooling uses large tanks of a non-conductive fluid, with the servers placed vertically and fully immersed in the liquid.
The heat generated by the servers is transferred to the fluid and then to the building's chilled water system through a closed-loop heat exchanger. Immersion tanks occupy less space, but they require servers configured for this type of cooling.
Cold plate cooling uses a heat sink attached to the underside of the chip stack that transfers the chip stack's energy to a fluid piped throughout the cabinet. The fluid is routed to an end-of-row cooling distribution unit (CDU) that transfers the energy to the building's chilled water system.
The CDU contains a heat exchanger to transfer the energy, plus 2N pumps on the secondary side of the heat exchanger to guarantee continuous fluid flow to the servers. Cold plate cooling is effective for server cooling, but it requires a large number of fluid piping connectors, which must have leak-stop technology when disconnected.
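For a sense of the fluid volumes involved, here is a rough sketch (my assumptions: water-like coolant properties and a 10°C temperature rise across the cold plates, neither specified in the interview) of the flow a CDU would circulate for one 40 kW cabinet.

```python
# Rough coolant flow estimate for one DLC cabinet: P = m_dot * c_p * delta_T.
# Assumptions (not from the interview): water-like coolant, 10 C rise across cold plates.
RACK_KW = 40.0
CP_J_PER_KG_K = 4186.0     # specific heat of water
DENSITY_KG_PER_L = 1.0     # approximate density of water
DELTA_T_C = 10.0

mass_flow_kg_s = (RACK_KW * 1000.0) / (CP_J_PER_KG_K * DELTA_T_C)
liters_per_min = mass_flow_kg_s / DENSITY_KG_PER_L * 60.0
print(f"~{mass_flow_kg_s:.2f} kg/s, i.e. ~{liters_per_min:.0f} L/min per 40 kW cabinet")
# ~0.96 kg/s, ~57 L/min per cabinet -- multiplied across a 20-30 cabinet cluster,
# this is why CDUs need redundant (2N) pumps and leak-safe quick-disconnect fittings.
```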
Air cooling is the proven technique for cooling data centers and has been around for decades; however, it is inefficient for the high-density racks needed in AI data centers. As loads increase, it becomes more difficult to prove, using CFD modeling, that the design will not fail.
You are proposing a different cooling approach. How does it work, and what are the current challenges to adoption?
Stephen Spinazzola: Our patent-pending Hybrid Dry/Adiabatic Cooling (HDAC) design solution uniquely provides two cooling-fluid temperatures from a single closed loop, allowing a higher-temperature fluid to cool DLC servers and a lower-temperature fluid for conventional air cooling.
Because HDAC simultaneously uses 90 percent less water than a chiller/cooling tower system and 50 percent less energy than an air-cooled chiller system, we have been able to achieve a power usage effectiveness (PUE) as low as approximately 1.1 annualized for the type of hyperscale data center needed to process AI. Typical AI data centers produce PUEs ranging from 1.2 to 1.4.
With the lower PUE, HDAC delivers approximately 12% more usable IT power from the same size utility feed. The economic and environmental benefits are significant. On top of that, HDAC requires only "a sip of water."
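The arithmetic behind that roughly 12% figure works out as follows; this is a sketch assuming an illustrative 100 MW utility feed and a 1.25 baseline PUE (the middle of the 1.2 to 1.4 range quoted above), both my assumptions, and the exact gain depends on the baseline chosen.

```python
# PUE = total facility power / IT power, so usable IT power = utility feed / PUE.
# Assumptions (mine): 100 MW utility feed, 1.25 baseline PUE vs. HDAC's ~1.1 annualized.
UTILITY_FEED_MW = 100.0

def usable_it_mw(feed_mw: float, pue: float) -> float:
    """IT power available from a given utility feed at a given PUE."""
    return feed_mw / pue

baseline = usable_it_mw(UTILITY_FEED_MW, 1.25)   # ~80 MW of IT load
hdac = usable_it_mw(UTILITY_FEED_MW, 1.10)       # ~91 MW of IT load
gain_pct = (hdac - baseline) / baseline * 100.0
print(f"Baseline: {baseline:.1f} MW IT, HDAC: {hdac:.1f} MW IT, gain: ~{gain_pct:.0f}%")
# Gain is ~14% against a 1.25 baseline and ~9% against 1.2 -- the same ballpark
# as the ~12% figure cited above.
```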
The challenge to adoption is simple: nobody wants to go first.