Why OpenAI, AMD, NVIDIA, Intel, Broadcom, and Microsoft all agreed on one networking protocol

The AI industry’s biggest rivals somehow managed to agree on something. That alone is worth paying attention to.

OpenAI has released MRC – Multipath Reliable Connection – a new open networking protocol co-developed with AMD, NVIDIA, Intel, Broadcom, and Microsoft. Getting five competing tech giants to align on a single standard is no small feat. The reason they did so tells you a lot about where the bottlenecks in AI development actually are.


The problem

Discussions of the obstacles to AI progress usually mention compute, data, and talent, but the network tends to slip through unnoticed. As GPU clusters have grown into supercomputers of 100,000 GPUs or more, the connections between those GPUs have quietly begun to sabotage performance.


During large-scale AI training, even a single data transfer arriving late can stall the entire process, leaving GPUs idle. For a company like OpenAI, whose networks carry millions of connections, delays occur often, and fixing them becomes increasingly difficult.

These networks were not originally designed for this task. A traditional network can take several seconds, sometimes tens of seconds, to recover from a failure. Synchronous training requires every GPU to stay in lockstep, so every second spent in recovery is an enormous waste of compute.
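To get a feel for the scale of that waste, here is a back-of-envelope estimate. The cluster size comes from the article; the 30-second recovery time is an assumption for illustration, not a figure from OpenAI.

```python
# Estimate compute wasted by a single network failure during synchronous
# training: every GPU in the cluster sits idle while the network recovers.
gpus = 100_000        # cluster size mentioned in the article
recovery_s = 30       # assumed recovery time for a traditional network

gpu_hours_wasted = gpus * recovery_s / 3600
print(f"{gpu_hours_wasted:.0f} GPU-hours idle per failure")  # 833 GPU-hours
```

At millions of connections, failures like this are routine rather than rare, which is why shaving recovery from seconds to microseconds matters so much.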

What MRC Actually Does

MRC attacks the problem from two directions: congestion and failure recovery.

Rather than sending traffic down a single path, MRC sprays packets across hundreds of paths at once, spreading the load through the core of the network and easing congestion. If a path degrades, or a link or switch fails, MRC detects the problem and reroutes around it within microseconds.

On top of that, MRC reimagines the physical structure of the network. In a traditional network, each network interface acts as a single 800 Gb/s link. With MRC, the same interface can be split into multiple lower-speed links connecting to eight different switches. That makes it possible to build a fully connected network of roughly 131,000 GPUs with just two tiers of switches, where a conventional design would need three or four. Fewer switches mean lower energy consumption and lower costs.
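The tier math follows from standard fat-tree topology arithmetic: a two-tier (leaf/spine) fat-tree built from radix-k switches can connect k²/2 endpoints at full bandwidth. The radix below is an assumption chosen to match the ~131,000-GPU figure in the article, not a published MRC parameter.

```python
def two_tier_capacity(radix: int) -> int:
    # Standard two-tier fat-tree result: each leaf switch uses half its
    # ports for endpoints and half for uplinks, giving k*k/2 endpoints.
    return radix * radix // 2

print(two_tier_capacity(512))  # 131072, roughly the ~131,000 figure cited
```

Flattening the network from three or four tiers to two is where the switch-count, power, and cost savings come from.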

Why share it

So why would rivals give this away? Self-interest, dressed up as altruism, and that’s not a criticism. OpenAI’s networking lead Mark Handley put it plainly: the infrastructure industry has reached a point where it’s worth establishing open standards, “as opposed to each of these large companies doing their own thing.”

Proprietary networking protocols fragment the ecosystem, raise costs, and slow everyone down. A shared baseline lifts all boats. OpenAI has released MRC through the Open Compute Project, enabling the broader industry to use and build on it.

MRC is already in production. It has been used to train GPT-5.5 and other models across OpenAI and Microsoft’s largest clusters.

The rivals agreed because the real competition isn’t in the wiring. It’s in what you build once the network stops getting in the way.


Vyom Ramani
