Huawei recently unveiled its latest AI system architecture, the CloudMatrix 384 Supernode. The system is powered by 384 Ascend 910C chips and delivers a total of 300 PFLOPS of compute, approximately 1.7 times the performance of NVIDIA's GB200 NVL72. The milestone marks a robust counterstrike by China amid U.S. chip sanctions. However, this "make up for quality with quantity" strategy comes at a steep cost: the system's power consumption is nearly four times that of its competitor.
The CloudMatrix 384 Supernode is now operational at a data center in Wuhu, Anhui, China. Huawei refers to the system internally as its "Atomic Energy Level" AI solution and positions it as a major competitor to NVIDIA's NVL72 architecture. The NVL72 features 72 Blackwell GPUs interconnected via high-speed NVLink and delivers 180 PFLOPS overall. Although a single Ascend 910C offers roughly one-third the performance of a Blackwell GPU, Huawei has surpassed NVIDIA's flagship system by using about five times as many chips, which also gives it 3.6 times the memory capacity and 2.1 times the memory bandwidth.
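The scaling argument can be checked with a few lines of arithmetic. The sketch below uses only the chip counts and PFLOPS figures quoted in this article; the variable names are illustrative, and the per-chip values are simple averages rather than official chip specifications.

```python
# Figures quoted in the article; names are illustrative only.
CM384_CHIPS = 384       # Ascend 910C chips in CloudMatrix 384
NVL72_CHIPS = 72        # Blackwell GPUs in GB200 NVL72
CM384_PFLOPS = 300.0    # system-level compute, CloudMatrix 384
NVL72_PFLOPS = 180.0    # system-level compute, GB200 NVL72

# How many more chips Huawei uses, and how much more compute it gets.
chip_ratio = CM384_CHIPS / NVL72_CHIPS        # about 5.3x the chips
system_ratio = CM384_PFLOPS / NVL72_PFLOPS    # about 1.7x the compute

# Average compute per chip (system total divided by chip count).
per_chip_cm384 = CM384_PFLOPS / CM384_CHIPS
per_chip_nvl72 = NVL72_PFLOPS / NVL72_CHIPS
per_chip_ratio = per_chip_cm384 / per_chip_nvl72  # roughly one-third

print(f"chips: {chip_ratio:.2f}x, system: {system_ratio:.2f}x, "
      f"per chip: {per_chip_ratio:.2f}x")
```

In other words, about 5.3 times the chips at roughly one-third the per-chip throughput yields the 1.7 times system-level advantage.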
Huawei is also collaborating with the Chinese startup SiliconFlow, which plans to use the CloudMatrix architecture to serve DeepSeek-R1, the domestically developed reasoning model. The deployment highlights China's steady progress in decoupling from U.S.-based AI computing infrastructure, further intensifying the technological rivalry between the two countries.
Although the CloudMatrix 384 showcases innovation in system design, including large-scale optical interconnects and software optimization, its power efficiency remains poor. Its total power consumption is 3.9 times that of the NVL72. Relative to the NVL72, it draws 2.3 times as much power per FLOP, 1.8 times as much per TB/s of memory bandwidth, and 1.1 times as much per unit of memory capacity. While these figures may raise concerns in Europe and the U.S., power supply is not considered a primary constraint in China.
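The efficiency ratios follow from dividing the total power ratio by each capability ratio. The sketch below reproduces them from the rounded figures quoted in this article (3.9 times the power, 300 versus 180 PFLOPS, 2.1 times the bandwidth, 3.6 times the capacity). Because the inputs are rounded, the results only approximate the reported 2.3, 1.8, and 1.1.

```python
# Ratios of CloudMatrix 384 to GB200 NVL72, as quoted in the article.
power_ratio = 3.9              # total system power
compute_ratio = 300.0 / 180.0  # PFLOPS
bandwidth_ratio = 2.1          # memory bandwidth
capacity_ratio = 3.6           # memory capacity

# Power per unit of capability = power ratio / capability ratio.
power_per_flop = power_ratio / compute_ratio         # article reports 2.3
power_per_bandwidth = power_ratio / bandwidth_ratio  # article reports 1.8
power_per_capacity = power_ratio / capacity_ratio    # article reports 1.1

for label, value in [
    ("per FLOP", power_per_flop),
    ("per TB/s of bandwidth", power_per_bandwidth),
    ("per unit of capacity", power_per_capacity),
]:
    print(f"power {label}: {value:.2f}x")
```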
According to a report by SemiAnalysis, China remains heavily reliant on coal-fired power while continuously expanding its solar, hydro, wind, and nuclear capacity. The country's energy growth rate is the fastest in the world, and the grid capacity it has added since 2011 is equivalent to the entire U.S. grid. This energy advantage allows China to trade efficiency for greater scale in its AI build-out.
The report notes that the CloudMatrix architecture consists of 16 cabinets. Twelve are dedicated to computation, each housing 32 Ascend chips, while the remaining four serve as optical interconnect cores. The entire system employs 6,912 400G LPO (Linear Pluggable Optics) transceivers, which replace traditional copper wiring to improve interconnect density and scalability. The design bears some resemblance to NVIDIA's DGX H100 NVL256 "Ranger" architecture, which was planned but never mass-produced.
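The cabinet layout is easy to sanity-check. The sketch below uses only the counts given in this article; the transceivers-per-chip value is a plain average and says nothing about how the optics are actually wired.

```python
# Cabinet and optics counts as described in the article.
TOTAL_CABINETS = 16
COMPUTE_CABINETS = 12
CHIPS_PER_COMPUTE_CABINET = 32
TRANSCEIVERS = 6912  # 400G LPO modules in the whole system

interconnect_cabinets = TOTAL_CABINETS - COMPUTE_CABINETS
total_chips = COMPUTE_CABINETS * CHIPS_PER_COMPUTE_CABINET

# Average number of transceivers per Ascend chip (integer division is
# exact here because 6912 is a multiple of 384).
transceivers_per_chip = TRANSCEIVERS // total_chips

print(f"{total_chips} chips, {interconnect_cabinets} interconnect cabinets, "
      f"{transceivers_per_chip} transceivers per chip on average")
```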
Although the Ascend 910C is designed entirely by Huawei, its manufacturing remains highly dependent on foreign supply chains, including high-bandwidth memory (HBM) from South Korea, wafers from Taiwan's TSMC, and semiconductor manufacturing equipment from the United States, the Netherlands, and Japan. TSMC reportedly may face fines of up to $1 billion over allegations that it supplied wafers in circumvention of sanctions.
Huawei has also procured approximately 2.9 million dies from TSMC through a third-party company, Sophgo, enough to produce 800,000 Ascend 910B units and 1.05 million Ascend 910C units. At the same time, Samsung has become a key supplier of HBM for China. Huawei has reportedly stockpiled up to 13 million HBM stacks, enough to support the packaging of 1.6 million Ascend chips.
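The die and HBM figures are mutually consistent under one assumption that the article does not state: that each Ascend 910B uses one compute die and each Ascend 910C packages two. The sketch below treats those die counts as assumptions and checks the arithmetic.

```python
# Unit counts quoted in the article.
ASCEND_910B_UNITS = 800_000
ASCEND_910C_UNITS = 1_050_000
HBM_STACKS = 13_000_000
CHIPS_SUPPORTED_BY_HBM = 1_600_000

# ASSUMPTION (not stated in the article): dies per packaged chip.
DIES_PER_910B = 1
DIES_PER_910C = 2

dies_needed = (ASCEND_910B_UNITS * DIES_PER_910B
               + ASCEND_910C_UNITS * DIES_PER_910C)

# Average HBM stacks available per packaged chip.
stacks_per_chip = HBM_STACKS / CHIPS_SUPPORTED_BY_HBM

print(f"dies needed: {dies_needed:,}")
print(f"HBM stacks per chip: {stacks_per_chip:.3f}")
```

Under these assumptions the die total matches the reported 2.9 million, and the HBM stockpile works out to about eight stacks per chip.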
Although China's domestic foundry SMIC has yet to fully master advanced process technology, it is actively expanding capacity in Shanghai, Shenzhen, and Beijing. Its monthly output is expected to reach 50,000 wafers this year. If it can continue to secure foreign-supplied photoresist materials and maintenance support for its tools, SMIC's production volume still has room to grow.
Overall, the CloudMatrix 384 highlights China's strategy of compensating for its lack of advanced chip fabrication through system integration. Although a single Ascend chip cannot match NVIDIA's in performance, Huawei has used large-scale stacking and optical network expansion to "overtake on the curve" in overall computational performance, narrowing the gap with Western tech giants.



