
Nvidia headquarters in Santa Clara, California. Photo by Justin Sullivan/Getty Images
NVLink has become a second, and perhaps deeper, moat for Nvidia, after its CUDA software tools and libraries. There’s a reason the AI community prefers NVLink for connecting multiple GPUs: it is mind-blowingly fast, with more than twice the per-GPU bandwidth of the open-standard UALink alternative, which won’t even ship until 2026. In fact, by the time UALink 1.0 is deployed in data centers, NVLink 6 will likely be shipping at perhaps twice the speed of NVLink 5, or 3,600 GB/s, versus the 800 GB/s of UALink 1.0. (Like many other semiconductor firms, Nvidia and Qualcomm are both clients of my firm, Cambrian-AI Research.)
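For a back-of-envelope feel for that gap, here is a minimal sketch using the figures above. The NVLink 5 number (1,800 GB/s per GPU) is the Blackwell-generation figure implied by the doubling to 3,600 GB/s; NVLink 6 remains a projection, not a published spec.

```python
# Napkin math only: per-GPU scale-up bandwidth figures cited above.
# NVLink 6 is a projection; NVLink 5 is the Blackwell-generation figure.
links_gb_s = {
    "UALink 1.0 (2026)": 800,        # per accelerator, per the cited spec
    "NVLink 5 (shipping today)": 1800,
    "NVLink 6 (projected ~2x)": 3600,
}

baseline = links_gb_s["UALink 1.0 (2026)"]
for name, bw in links_gb_s.items():
    print(f"{name}: {bw:,} GB/s = {bw / baseline:.2f}x UALink 1.0")
```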
But UALink is open, and NVLink is closed. That’s why UALink was invented, right?
Wrong. Nvidia opened up NVLink earlier this year with NVLink Fusion, announced at GTC, which lets other silicon builders add NVLink’s speed and scale to their SoCs, accelerators, and CPUs. Qualcomm is one of the first to buy in.
Qualcomm Jumps On Board the NVLink Train
Qualcomm has already become a leader in edge AI, spearheaded by the Snapdragon processors for mobile, its automotive chips and software, and the Cloud AI 100 Ultra PCIe card for data center inference. The Cloud AI 100 Ultra is no wimpy solution, delivering up to 870 TOPS (trillions of operations per second) of INT8 performance and up to 288 TFLOPS of FP16 performance.
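To put those peak numbers in context, here is a rough, assumption-heavy sketch of what 870 INT8 TOPS could translate to for large-language-model serving. The utilization figure and model size are hypothetical, and real throughput is often bound by memory bandwidth rather than compute:

```python
# Compute-bound napkin math only; real throughput depends on memory
# bandwidth, batch size, and software. Utilization and model size are
# hypothetical placeholders, not Qualcomm figures.
peak_ops = 870e12           # Cloud AI 100 Ultra peak INT8 ops/sec (cited above)
utilization = 0.30          # assumed sustained fraction of peak
params = 8e9                # hypothetical 8B-parameter model, INT8 weights

ops_per_token = 2 * params  # ~2 ops (multiply + add) per weight per token
tokens_per_sec = peak_ops * utilization / ops_per_token
print(f"~{tokens_per_sec:,.0f} tokens/sec (aggregate, compute-bound estimate)")
```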
Now the company also wants its fair share, or better, of the data center CPU market. The Oryon CPU was originally designed as a data center CPU by Nuvia, which Qualcomm acquired in 2021. Today, Nvidia is capturing a significant portion of the Arm head-node market with its Grace CPU. Could Qualcomm muscle in? Only if Nvidia helps.

The Oryon CPU is the configurable compute engine for multiple markets. QUALCOMM
So, imagine you are Qualcomm, looking for an interconnect that can turn your fast Oryon Arm-based CPU into a future data center processor. You can add your own neural processing unit (NPU) to the Oryon SoC, and that will work fine for lightweight inference; think of it as a Cloud AI 200. But if you want to enter the far larger market, with much bigger dollars and TAM, you also need a fast scale-up interconnect. And you will need a fast GPU.
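To see why the link matters as much as the silicon, compare a conventional PCIe-attached host CPU with one sitting directly on the GPU fabric. A minimal sketch: the PCIe 5.0 x16 figure is the standard per-slot peak, and attaching a hypothetical Fusion CPU at full NVLink 5 speed is my assumption, not a disclosed design:

```python
# Illustrative host-attach comparison; the "Fusion-attached CPU" is a
# hypothetical configuration, not an announced product.
host_links_gb_s = {
    "PCIe 5.0 x16 (typical host attach)": 2 * 64,  # 64 GB/s per direction
    "NVLink 5 fabric (Fusion-attached CPU)": 1800,
}

base = host_links_gb_s["PCIe 5.0 x16 (typical host attach)"]
for name, bw in host_links_gb_s.items():
    print(f"{name}: {bw:,} GB/s ({bw / base:.1f}x PCIe)")
```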
You could pick a second-tier GPU, such as the AMD MI350, which appears to be closing the performance gap with Nvidia’s Blackwell, at least according to AMD’s specs. But then you have to follow the rest of your competition off the UALink cliff, accepting far less performance and zero proven adoption. Ultra Ethernet, meanwhile, was designed for scale-out, not scale-up; it is not yet shipping and will deliver half the performance of UALink and roughly one-sixth that of today’s NVLink.
Or, you could use NVLink Fusion. Nvidia would provide proven link logic for your SoC, along with tooling, testbeds, and support. But what GPU would you put on the other end of the link or switch? Qualcomm picked the fastest GPUs on the planet: Nvidia’s Blackwell Ultra and the upcoming Rubin.
Having gone down this path, Qualcomm will now enjoy a level playing field with Nvidia, while leveraging and continuing its investments in Snapdragon for the edge, Cloud AI for edge clouds, and the broader Snapdragon AI portfolio.
And Qualcomm’s AI strategy has evolved as a result of this arrangement. (Or, perhaps Nvidia decided to launch NVLink Fusion as a result of conversations with Qualcomm?) The company’s AI strategy centers on leveraging edge and on-device AI to drive a new phase of distributed, hybrid intelligence, fundamentally distinct from the cloud-centric approaches favored by many competitors. And now, with NVLink and Nvidia GPUs, Qualcomm can broaden this hybrid strategy to include training and large-scale data center inference.
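As an illustration (not Qualcomm code), the hybrid idea boils down to a routing decision: serve a request on-device when it fits the NPU’s budget, and send it to data center CPUs and GPUs when it doesn’t. The thresholds below are hypothetical:

```python
# Illustrative sketch of hybrid edge/cloud AI routing; not Qualcomm code.
# Model-size and context thresholds are hypothetical placeholders.
def route_request(prompt_tokens: int, model_params_b: float,
                  needs_private_data: bool) -> str:
    ON_DEVICE_MAX_PARAMS_B = 8    # assumed ceiling for the edge NPU
    ON_DEVICE_MAX_TOKENS = 4096   # assumed on-device context budget
    if needs_private_data:
        return "on-device NPU"    # keep sensitive data local
    if (model_params_b <= ON_DEVICE_MAX_PARAMS_B
            and prompt_tokens <= ON_DEVICE_MAX_TOKENS):
        return "on-device NPU"    # low-latency, low-cost path
    return "data center (CPU + GPU over NVLink)"

print(route_request(512, 3, False))     # small job -> on-device NPU
print(route_request(20000, 70, False))  # big job -> data center
```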
What’s next?
First, the arguments presented above are likely to be weighed by many hyperscalers and other chip design teams. Imagine a cloud accelerator provider, tired of trying to compete with NVLink, deciding to throw in the towel and adding NVLink Fusion to a next-generation Amazon Trainium 4 or Microsoft Maia 3. Sounds crazy, I know, but at least those teams can now consider more options than they had before NVLink Fusion.
Second, Qualcomm could enjoy a first-mover advantage with NVLink Fusion, combining a future Oryon CPU with the world’s best GPUs.
Think hyperscalers would bite on NVLink Fusion? Yeah, so do I.
Disclosures: This article expresses the opinions of the author and is not to be taken as advice to purchase from or invest in the companies mentioned. My firm, Cambrian-AI Research, is fortunate to have many semiconductor firms as our clients, including Baya Systems, BrainChip, Cadence, Cerebras Systems, D-Matrix, Esperanto, Flex, Groq, IBM, Intel, Micron, NVIDIA, Qualcomm, Graphcore, SiMa.ai, Synopsys, Tenstorrent, Ventana Micro Systems, and scores of investors. I have no investment positions in any of the companies mentioned in this article.