Cooling latest AI chips with liquids

Posted on
September 4, 2024

Liquid cooling is the only practical solution at high power

Moving heat with a given volume of liquid is far more efficient than moving it with the same volume of air: for water, the advantage is a factor of about 3,600.
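The size of that factor can be checked with a rough comparison of volumetric heat capacity (density times specific heat). The sketch below uses approximate room-temperature textbook values, which are assumptions rather than figures from this article; the exact ratio shifts with temperature and pressure, and these inputs give roughly 3,500, the same order as the figure quoted above.

```python
# Rough comparison of how much heat one cubic metre of water vs. air
# carries per kelvin of temperature rise.
# All property values are approximate, assumed room-temperature figures.
WATER_DENSITY = 997.0  # kg/m^3
WATER_CP = 4182.0      # J/(kg*K)
AIR_DENSITY = 1.2      # kg/m^3, about 20 C at sea level
AIR_CP = 1005.0        # J/(kg*K)


def volumetric_heat_capacity(density, specific_heat):
    """Heat stored per unit volume per kelvin, in J/(m^3*K)."""
    return density * specific_heat


def water_to_air_ratio():
    """Ratio of volumetric heat capacities, water over air."""
    water = volumetric_heat_capacity(WATER_DENSITY, WATER_CP)
    air = volumetric_heat_capacity(AIR_DENSITY, AIR_CP)
    return water / air


if __name__ == "__main__":
    print(f"Water carries about {water_to_air_ratio():,.0f}x more heat per unit volume")
```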

This makes liquid cooling through the die heat spreader a highly effective method. It is generally necessary when heat dissipation exceeds around 50 W per cm² of die area. Given that the GB200 has an estimated area of about 9 cm², any dissipation over 450 W (50 W/cm² × 9 cm²) indicates the need for pumped liquid cooling.
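The rule of thumb above is simple arithmetic, sketched here as a small check. The 50 W/cm² limit and the 9 cm² area come from the text; the function names and the example power levels are illustrative assumptions.

```python
# Rule-of-thumb check: liquid cooling is indicated when the heat flux
# over the die area exceeds roughly 50 W/cm^2 (figure from the article).
FLUX_LIMIT_W_PER_CM2 = 50.0


def power_threshold_w(die_area_cm2, flux_limit=FLUX_LIMIT_W_PER_CM2):
    """Power above which the flux limit is exceeded for a given die area."""
    return die_area_cm2 * flux_limit


def needs_liquid_cooling(power_w, die_area_cm2, flux_limit=FLUX_LIMIT_W_PER_CM2):
    """True if the dissipated power exceeds the rule-of-thumb threshold."""
    return power_w > power_threshold_w(die_area_cm2, flux_limit)


if __name__ == "__main__":
    area = 9.0  # cm^2, the article's estimate for the GB200
    print(power_threshold_w(area))            # threshold in watts
    print(needs_liquid_cooling(1000.0, area))  # hypothetical 1 kW load
```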

In ‘Direct-to-Chip’ cooling, the liquid is routed through channels in a cold plate attached to the chip’s heat spreader via a thermal interface. If the liquid does not evaporate along the way, the operation is called ‘single-phase’: the medium, typically water, is pumped through a heat exchanger cooled by fans.
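For single-phase operation, the coolant flow needed follows from the standard heat balance Q = ṁ · c_p · ΔT. The sketch below assumes water at room temperature; the 1,000 W load and 10 K coolant temperature rise are hypothetical example values, not figures from this article.

```python
# Single-phase heat balance: Q = m_dot * c_p * delta_T.
# Water properties are approximate, assumed room-temperature values.
WATER_CP = 4182.0      # J/(kg*K)
WATER_DENSITY = 997.0  # kg/m^3


def required_flow_l_per_min(heat_w, coolant_rise_k,
                            cp=WATER_CP, density=WATER_DENSITY):
    """Volume flow (litres/minute) needed to carry heat_w watts away
    while the coolant warms by coolant_rise_k kelvin."""
    if coolant_rise_k <= 0:
        raise ValueError("coolant temperature rise must be positive")
    mass_flow_kg_s = heat_w / (cp * coolant_rise_k)
    volume_flow_m3_s = mass_flow_kg_s / density
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


if __name__ == "__main__":
    # Hypothetical example: 1 kW chip, coolant allowed to warm by 10 K.
    print(f"{required_flow_l_per_min(1000.0, 10.0):.2f} L/min")
```

Halving the allowed temperature rise doubles the required flow, which is one reason pump sizing matters as chip power climbs.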

Alternatively, heat can be transferred to a second liquid loop, which can provide hot water to the building and potentially to local consumers.

Two-phase operation offers better heat transfer by allowing the liquid, typically a fluorocarbon, to evaporate as it absorbs heat and then re-condense at the heat exchanger. This method can provide a dramatic improvement in performance.

However, system fans are still needed to cool other components, although some, like the DC/DC converters, can be integrated into the liquid cooling loop using their own baseplates. This aligns with the ‘vertical power delivery’ concept, where DC/DC converters are positioned directly below the processor to minimize voltage drops.

A practical limitation of the direct-to-chip approach is the thermal resistance of the interface between the chip and the cold plate. Precisely flat surfaces and high-performance thermal paste are necessary, but at the multi-kilowatt level the temperature differential across the interface can still be problematic.
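The interface problem can be illustrated with the basic relation ΔT = P × R_th, where R_th is the thermal resistance of the interface. The resistance value below is a hypothetical placeholder chosen only to show the scaling; real values depend on the paste, surface flatness, and contact area.

```python
# Temperature drop across a thermal interface: delta_T = P * R_th.
# R_TH_EXAMPLE is a hypothetical value for illustration only.
R_TH_EXAMPLE_K_PER_W = 0.02  # K/W, assumed


def interface_temp_drop_k(power_w, r_th_k_per_w=R_TH_EXAMPLE_K_PER_W):
    """Temperature difference (kelvin) across the interface at a given power."""
    return power_w * r_th_k_per_w


if __name__ == "__main__":
    # The same interface costs four times the temperature headroom
    # when power rises from 500 W to 2 kW.
    for power in (500.0, 1000.0, 2000.0):
        print(f"{power:6.0f} W -> {interface_temp_drop_k(power):.1f} K drop")
```

Because the drop scales linearly with power, an interface that is comfortable at a few hundred watts can consume most of the available temperature margin at multi-kilowatt levels.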

This constraint looks set to limit heat dissipation and, consequently, performance. Immersion cooling is one possible solution. Here, the whole server is placed in an open bath of dielectric fluid, which is pumped via a reservoir around a loop to a heat exchanger. Again, two-phase operation is possible for the best performance.

Those Intel engineers in 1971 would have been astonished by the performance levels achieved in data centers in 2024. But is there a cliff edge coming? There are practical limits to chip feature size and temperature increase, as well as constraints on energy supply and environmental impact, especially if performance continues to rely on simply replicating hardware.

Ultimately, investors seek a return on their investment. Given the extreme complexity of cooling, the high energy costs, and the expense of acquiring chips (the GB200 reportedly costs up to $70,000 each), commercial viability could soon become a pressing issue. Maybe AI will tell us what the solution is.