Published in Network

Huawei wants to kill PCIe, TCP/IP and NVLink

28 August 2025


Chinese giant pitches UB-Mesh as a single unifying protocol

Huawei has rolled out a grand plan to scrap half the data centre’s plumbing and replace it with its shiny new UB-Mesh protocol.

Speaking at Hot Chips 2025, the company introduced UB-Mesh as a single interconnect that promises to unify all links inside and outside AI nodes, making it the only language AI hardware will need to speak.

HiSilicon chief boffin Heng Liao told Tom's Hardware the company plans to give away the spec under a free licence and open it to the world at a conference next month.

“This is a very new technology; we are seeing competing standardisation efforts from different camps. Depending on how successful we are in deploying actual systems and demand from partners and customers, we can talk about turning it into some kind of standard.”

The idea is to dump PCIe, NVLink, CXL and TCP/IP in favour of one fat unified pipe, slashing latency, improving uptime, and dodging the cost traps of conventional interconnects in what Huawei calls “gigawatt-class” data centres.

Modern AI clusters look like a Frankenstein’s monster of buses and protocols stitched together. They rely on UPI, PCIe, CXL, RoCE, NVLink, UALink and TCP/IP, plus upcoming experiments like Ultra Ethernet.

Each conversion burns power, adds latency and makes failure points multiply like rabbits on Viagra. UB-Mesh aims to fix that by making everything talk to everything else with no translators.

Huawei reckons its tech can connect up to one million processors, including CPUs, GPUs and NPUs, into a single “SuperNode” with hop latency down to 150 ns and bandwidth per chip up to 10 Tbps. That’s 1.25 TB/s, far ahead of anything PCIe 8.0 is threatening to deliver.
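The unit conversion behind that claim is simple enough to check in a couple of lines (the 10 Tbps figure is Huawei's; the arithmetic is ours):

```python
# Sanity-check the claimed per-chip bandwidth figure.
claimed_tbps = 10                  # Huawei's claimed bandwidth per chip, in terabits/s
tb_per_s = claimed_tbps / 8        # 8 bits per byte

print(f"{claimed_tbps} Tbps = {tb_per_s} TB/s")  # 10 Tbps = 1.25 TB/s
```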

The company’s plan is to move from the current cobbled-together setup to a synchronous load/store model and ditch DMA-based approaches. It’s designed to make every SERDES lane multipurpose, support backward compatibility over Ethernet, and move the whole shebang from rack-scale to hall-scale.

There are problems with the vision. While copper still works fine inside racks, anything going across the floor will need optical links. Fibre is fast, but it’s also flakier and more expensive. To cope, Huawei is throwing in retry logic, backup lanes and redundancy in optical modules. That’ll keep things ticking over if links fail, but it raises the bill.

Huawei’s topology involves a hybrid approach. A CLOS structure ties racks together at the macro level, while a multi-dimensional mesh does the job within each rack. The aim is to avoid cost spirals when scaling up to tens of thousands of nodes.
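The logic of the hybrid is easy to see in hop counts. A flat 2D mesh gets cheap per-port but its worst-case path grows with the grid, while a folded CLOS keeps hops bounded at the cost of more switching. This rough sketch is our own illustration, not Huawei's numbers:

```python
# Our own illustration of the trade-off, not Huawei's figures:
# worst-case hops grow with mesh size, but stay bounded in a CLOS fabric.

def mesh_worst_case_hops(k: int) -> int:
    # In a k x k 2D mesh, the worst case traverses both dimensions end to end.
    return 2 * (k - 1)

def clos_worst_case_hops() -> int:
    # In a folded three-stage CLOS, any leaf reaches any other leaf via a
    # spine switch: leaf -> spine -> leaf, regardless of how many leaves exist.
    return 2

for k in (4, 16, 64):
    print(f"{k}x{k} mesh: {mesh_worst_case_hops(k)} hops worst case; "
          f"CLOS: {clos_worst_case_hops()}")
```

Hence the split: the mesh handles the short, dense traffic inside a rack, and the CLOS layer caps path length between racks.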

Redundancy features include hot spare racks that automatically jump in when something goes pear-shaped, rotating faulty racks out and back in to keep uptime high. Huawei claims this could stretch mean time between failures by orders of magnitude.

Cost-wise, Huawei insists that traditional interconnects scale linearly with node count, eventually costing more than the AI accelerators they serve. UB-Mesh, in theory, scales sub-linearly. The company pointed to a real-world 8,192-node system that blends CLOS and 2D mesh elements as a working example.

If the tech gains traction beyond Huawei’s own installs, it could loosen Nvidia’s grip on interconnects and force the industry to take the SuperNode model seriously. But it’s going to be a hard sell.

The competition is entrenched. Nvidia has NVLink and uses Ethernet or InfiniBand for cross-rack traffic. Chipzilla, Broadcom and AMD are pushing UALink and Ultra Ethernet, with both backed by open consortia and broad industry support.

Whether Huawei’s customers are ready to bet the data centre on a single-supplier protocol remains the open question. But if enough big systems get built and enough third parties start kicking the tyres, UB-Mesh might just find itself turning from a Huawei side project into a global standard.
