The Rise of AI Drives Nine-Fold Surge in Liquid Cooling Technology

AI servers, driven by Nvidia’s GB200 superchip, have experienced significant growth. The cutting-edge B200 chip requires direct-to-chip cooling because of its high thermal design power (TDP). Supermicro announced that it had shipped over 2,000 direct-liquid-cooled AI server racks by the end of August 2024, and that it has expanded its manufacturing capacity to 5,000 racks per month. Supermicro reported that it holds around 75% of the liquid-cooled AI server rack market, and IDTechEx believes that this capacity expansion will lead to a surge in the number of liquid-cooled server racks and, with them, the number of cold plates. The projections for the number of cold plates for AI servers in IDTechEx’s new report, “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities”, align with Supermicro’s latest announcement.

IDTechEx expects this production capacity expansion to drive a rapid increase in the deployment of liquid-cooled racks across the AI and high-performance computing (HPC) sectors, along with a notable rise in the use of cold plates. Cold plates are integral to direct-liquid-cooling systems because they absorb the significant heat generated by high-performance chips like Nvidia’s B200 and pass it to the coolant. IDTechEx’s recent research into thermal management for data centers echoes Supermicro’s projections, highlighting the increasing importance of liquid cooling technologies in managing the heat loads associated with next-generation AI and HPC hardware.

Direct-to-chip (D2C) cooling, also known as cold plate cooling, is a sophisticated cooling method wherein a cold plate is mounted directly onto the chip (GPU or CPU). The plate facilitates the transfer of heat from the chip to a circulating coolant, which then dissipates the heat. D2C cooling can be divided into two main categories: single-phase and two-phase systems, depending on the type of coolant used. Single-phase D2C typically uses a water-glycol mixture, which circulates through the system and transfers heat away from the chip via convection. This type of cooling is efficient for systems with moderate TDPs, as the coolant remains in a liquid state throughout the process. In contrast, two-phase D2C cooling uses a coolant like fluorinated refrigerant, which absorbs heat through a phase change. As the coolant transitions from liquid to gas, it provides significantly greater cooling power, making it well-suited for systems with extremely high TDPs.
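The difference between the two categories can be sketched with a simple energy balance. The Python snippet below is a minimal illustration, not a design tool: it assumes single-phase coolant carries heat purely as sensible heat (Q = m * cp * dT), while two-phase coolant carries it purely as latent heat (Q = m * x * h_fg). The function names and every numeric value (heat load, specific heat, latent heat, temperature rise) are hypothetical placeholders, not figures from IDTechEx or from any coolant datasheet.

```python
def single_phase_mass_flow(heat_w, cp_j_per_kg_k, delta_t_k):
    """Coolant mass flow (kg/s) needed to remove heat_w as sensible heat.

    Energy balance: Q = m * cp * dT, so m = Q / (cp * dT).
    """
    if cp_j_per_kg_k <= 0 or delta_t_k <= 0:
        raise ValueError("specific heat and temperature rise must be positive")
    return heat_w / (cp_j_per_kg_k * delta_t_k)


def two_phase_mass_flow(heat_w, latent_heat_j_per_kg, vapor_fraction=1.0):
    """Coolant mass flow (kg/s) needed to remove heat_w as latent heat.

    Energy balance: Q = m * x * h_fg, where x is the fraction of the
    coolant that evaporates in the cold plate (0 < x <= 1).
    """
    if latent_heat_j_per_kg <= 0 or not 0 < vapor_fraction <= 1:
        raise ValueError("latent heat must be positive and 0 < vapor_fraction <= 1")
    return heat_w / (latent_heat_j_per_kg * vapor_fraction)


if __name__ == "__main__":
    # All values below are illustrative assumptions, not measured data.
    chip_heat_w = 1000.0         # hypothetical chip heat load
    cp_liquid = 4000.0           # assumed specific heat of a water-glycol mix, J/(kg*K)
    allowed_rise_k = 10.0        # assumed coolant temperature rise across the cold plate
    latent_heat = 100_000.0      # assumed latent heat of a two-phase coolant, J/kg

    m_single = single_phase_mass_flow(chip_heat_w, cp_liquid, allowed_rise_k)
    m_two = two_phase_mass_flow(chip_heat_w, latent_heat)
    print(f"single-phase flow: {m_single:.4f} kg/s")
    print(f"two-phase flow:    {m_two:.4f} kg/s")
```

Under these assumed values the two-phase loop needs less coolant mass flow for the same heat load, which is one way to read the "greater cooling power" described above. Real systems also depend on cold plate geometry, pressure drop, and fluid properties that this sketch ignores.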

The rapid increase in chip TDPs is driving the demand for more advanced cooling solutions. AI and HPC applications, in particular, are pushing the limits of current cooling technologies, as these workloads require chips with significantly higher power consumption to handle complex computations. Nvidia’s GPU roadmap, combined with Intel’s recent announcement of its Falcon Shores GPU, which is expected to have a TDP of 1,500W, suggests that GPUs and CPUs with TDPs exceeding 1,500W will likely become common within the next one to two years. IDTechEx predicts that this ongoing rise in TDP will eventually lead to a shift from single-phase to two-phase D2C cooling systems, as the latter offer the superior heat dissipation that these high-power chips require, although the timeline for this shift remains unclear.

In addition to direct-to-chip cooling, immersion cooling has garnered significant attention as an alternative solution for high-performance systems. Similar to D2C, immersion cooling can be split into two categories: single-phase immersion cooling (1-PIC) and two-phase immersion cooling (2-PIC). However, unlike D2C, immersion cooling involves submerging the entire server into a bath of coolant, which absorbs heat directly from all components. This method is highly effective for cooling densely packed systems with high power requirements, as it eliminates the need for air-based cooling entirely. In single-phase immersion cooling, the coolant remains in a liquid state, similar to single-phase D2C. Two-phase immersion, however, leverages a phase change in the coolant, similar to two-phase D2C, to provide even more efficient heat dissipation.

While immersion cooling offers numerous advantages in terms of thermal efficiency, it comes with several challenges. The process of submerging servers requires extensive retrofitting of existing infrastructure, as well as rigorous material compatibility tests to ensure that the components can withstand prolonged exposure to the coolant. This results in higher upfront costs compared to D2C cooling systems. Additionally, immersion cooling systems, especially two-phase variants, face regulatory challenges. For example, 3M’s Novec™ products, commonly used as two-phase coolants, are set to be discontinued by the end of 2025. As of now, no PFAS-free or “forever chemical”-free two-phase coolants have been officially announced, adding another layer of complexity for companies considering immersion cooling solutions.

Cooling in data centers occurs at various levels, ranging from chip-level to facility-level cooling. Each level requires different cooling strategies, with technologies like D2C and immersion cooling primarily focusing on chip, server, and rack-level thermal management. At the room and facility levels, air-based cooling remains the most common approach in 2024. Computer room air conditioning (CRAC) units and computer room air handling (CRAH) units are widely used to cool entire server rooms or data center floors. However, the growing heat loads generated by high-performance AI and HPC systems are pushing the limits of air cooling, prompting the adoption of more efficient liquid-based solutions.

One such solution is liquid-to-liquid (L2L) cooling, which is becoming increasingly popular for facility-level heat management. In L2L cooling, a cooling distribution unit (CDU) transfers heat from one liquid loop to another, enhancing heat exchange efficiency. This system is particularly effective for data centers dealing with higher heat loads from AI and HPC workloads. Supermicro’s CEO has predicted that liquid-cooled data centers, which currently represent around 1% of the market, will grow to 30% by 2026. IDTechEx shares this optimistic outlook, noting that while L2L cooling is gaining traction, its widespread adoption will likely be concentrated in newly constructed data centers due to the significant retrofitting required for existing facilities. However, many existing data centers, particularly those using CRAH units, already have facility water systems in place, which can be leveraged for L2L cooling retrofits. These existing water systems are often the starting point for upgrading older data centers to accommodate more advanced liquid cooling technologies.
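The heat hand-off inside a CDU can be sketched as an energy balance between the two liquid loops. The Python snippet below is a rough illustration under strong assumptions: the heat exchanger is treated as ideal and lossless, so all heat leaving the server-side loop enters the facility water loop, and approach temperatures and pumping power are ignored. The names `LiquidLoop` and `cdu_balance` and all numeric values are hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LiquidLoop:
    """One side of a liquid-to-liquid heat exchanger (hypothetical model)."""
    mass_flow_kg_s: float   # coolant mass flow in the loop
    cp_j_per_kg_k: float    # specific heat of the loop's coolant

    def temperature_change(self, heat_w: float) -> float:
        """Temperature change (K) when the loop gains or loses heat_w: dT = Q / (m * cp)."""
        return heat_w / (self.mass_flow_kg_s * self.cp_j_per_kg_k)


def cdu_balance(heat_w, server_loop, facility_loop, facility_supply_c):
    """Ideal CDU: heat removed from the server loop equals heat added to the facility loop."""
    return {
        # How much the server-side coolant cools down while passing through the CDU.
        "server_loop_drop_k": server_loop.temperature_change(heat_w),
        # Temperature of facility water leaving the CDU after absorbing the heat.
        "facility_return_c": facility_supply_c + facility_loop.temperature_change(heat_w),
    }


if __name__ == "__main__":
    # Illustrative assumptions only: an 80 kW heat load and round-number flows.
    server_side = LiquidLoop(mass_flow_kg_s=2.0, cp_j_per_kg_k=4000.0)
    facility_side = LiquidLoop(mass_flow_kg_s=4.0, cp_j_per_kg_k=4000.0)
    result = cdu_balance(80_000.0, server_side, facility_side, facility_supply_c=20.0)
    print(result)
```

Keeping the two loops as separate objects mirrors the physical separation that a CDU provides: the server-side coolant and the facility water exchange heat but never mix, so each loop can use its own fluid and flow rate.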

Cooling on the server/rack level and on the room and facility level. Source: IDTechEx

In conclusion, the rapid rise of AI and HPC applications is driving a fundamental shift in data center cooling strategies. As chips like Nvidia’s B200 and Intel’s Falcon Shores GPU push the limits of thermal design power, direct-to-chip and immersion cooling solutions are becoming critical to managing the heat loads in modern data centers. This unprecedented transition brings significant opportunities to players in the data center cooling value chain, including but not limited to coolant suppliers, server makers, system integrators, cold plate manufacturers, materials suppliers, and cooling equipment (e.g., HVAC) suppliers.

Source: idtechex.com