Nvidia Leads The HPC Landscape At SuperComputing ’24

by | Nov 18, 2024 | In the News

Nvidia’s modern headquarters, Endeavor, in Santa Clara. NVIDIA

Nvidia has gone from a niche provider of HPC technology to a dominant force in the industry. This year’s SC event reinforces that leadership.

The annual SuperComputing event in North America is taking place this week in Atlanta, Georgia, and as usual the show is full of thousands of scientists exploring the future, along with the vendors selling them technologies to accelerate High Performance Computing, or HPC. Artificial Intelligence is the new tool helping those scientists in new ways as AI models continue to evolve. With AI and accelerated computing, scientists can speed simulations by 10-100X, or even predict what a classical simulation would produce without running it at all.

The Accelerated Computing Paradigm

The percentage of the Top 100 supercomputers in the world that are accelerated, primarily by Nvidia GPUs, with AMD contributing a few as well. NVIDIA

In the last five years, the high-performance computing community has embraced accelerated computing (primarily GPUs) in a steady march toward dominance. GPUs were a rarity in the Top 100 in 2019, while today accelerators rule the roost, representing over 65% of the prestigious list. Consequently, the CPU-centric supercomputer is becoming a rare beast. Nvidia and AMD GPUs have become essential to HPC.

AI: The new tool for Science

AI: The new tool for Science. NVIDIA

NIMs

HPC scientists need the performance that GPUs can deliver, but they don’t want to become AI scientists. They are weather scientists, geneticists, chemists, and physicists. Nvidia is increasingly promoting “NIM Microservices” to let these scientists tap into the power of AI, and it continually adds new domains to its NIM catalog for faster inference processing and lower TCO. Scientists increasingly do not have to worry about the optimizations AI normally demands to get the best performance out of their hardware; NIM Microservices let Nvidia engineers do all that work for them.
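
NIM microservices expose a standard OpenAI-compatible HTTP interface, so calling a domain model looks like calling any chat-completion endpoint. A minimal sketch of what a scientist's client code might look like, assuming a hypothetical locally deployed NIM at `localhost:8000` and an illustrative model name:

```python
import json

# Hypothetical endpoint for a locally deployed NIM container (assumption:
# the service is running on this host and port).
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_nim_request(model: str, prompt: str) -> dict:
    """Construct an OpenAI-style chat-completion payload for a NIM service."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

# Model name is illustrative; the actual catalog entry depends on the domain.
payload = build_nim_request("meta/llama-3.1-8b-instruct",
                            "Summarize recent advances in protein folding.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to `NIM_URL` with an ordinary HTTP client; the point is that the scientist writes generic API calls while Nvidia's engineers handle the inference optimization inside the container.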

Nvidia NIM inference microservices NVIDIA

Nvidia in Quantum Computing

All companies producing quantum processing units (QPUs) now believe that quantum computing needs classical computing to help it become usable and, perhaps someday, ubiquitous. Nvidia’s quantum play is CUDA-Q, which is essentially CUDA for developing and simulating quantum circuits that will run on someone’s QPU. Nvidia has added “Dynamics” to CUDA-Q, which enables developers to accelerate the creation and simulation of quantum circuits from taking a year on CPUs to running in under an hour on Nvidia GPUs.
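
Why does circuit simulation need GPUs at all? A state vector over n qubits holds 2^n complex amplitudes, so cost explodes exponentially with circuit width. A toy NumPy sketch (not the CUDA-Q API) that builds a two-qubit Bell state the brute-force way makes the scaling visible:

```python
import numpy as np

# Toy state-vector simulator: an n-qubit state is a vector of 2**n amplitudes.
# This exponential blow-up is why production simulators (like CUDA-Q's
# GPU backends) matter; this sketch is purely illustrative.

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

def apply_gate(state, gate, qubit, n):
    """Apply a single-qubit gate to `qubit` of an n-qubit state vector."""
    op = np.eye(1)
    for q in range(n):
        op = np.kron(op, gate if q == qubit else np.eye(2))
    return op @ state

n = 2
state = np.zeros(2**n)
state[0] = 1.0                                  # start in |00>
state = apply_gate(state, H, 0, n)              # Hadamard on qubit 0

# CNOT (control = qubit 0, target = qubit 1) entangles the pair
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])
state = cnot @ state
print(np.round(state, 3))  # amplitudes of (|00> + |11>)/sqrt(2)
```

At 40 qubits the same approach needs 2^40 amplitudes (terabytes of memory), which is where GPU-accelerated and multi-node simulators earn their keep.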

CUDA-Q is used to accelerate the design of quantum computing. NVIDIA

And it’s not just Nvidia saying that. Google, a developer of quantum hardware, has adopted CUDA-Q in its Quantum AI program to perform the world’s largest device simulations.

Google is using CUDA-Q to simulate circuits on its Quantum AI platform. NVIDIA

New Hardware

As we wrote in another blog today, Nvidia realizes that not everyone needs, or can handle, the infrastructure required for its most advanced system, the GB200 NVL72. Many can adequately handle their HPC and AI needs with traditionally connected and configured servers; they just need a faster accelerator. Nvidia announced that its H200 NVL, a four-card PCIe package, is now generally available. The package increases performance by 30-70% using NVLink-connected Hopper GPUs in virtually any server.

The Nvidia H200 NVL is a four-card PCIe package connected with NVLink for smaller AI installations. NVIDIA

And for those needing the performance of Blackwell, which is 1.8 to 2.2 times faster for simulation and AI, Nvidia has announced a new GB200 NVL4.

The new Grace Blackwell four-way “superchip”. NVIDIA

Other Nvidia announcements from this week at SC’24

We can’t cover it all, but Nvidia made a slew of other announcements, the depth and breadth of which reinforce our view that Nvidia has regained its leadership in HPC, if it ever lost it. These announcements include:

  • Earth-2 NIM Microservices for FourCastNet are 500 times faster and 10,000 times more efficient than prior CPU-based weather forecasting
  • NVIDIA Omniverse adds a Blueprint for real-time CAE physical digital twins
  • Ansys adopts Omniverse for real-time fluid simulation
  • BioNeMo framework in biopharma is now OpenSource
  • DiffDock 2.0 NVIDIA NIM for molecule protein docking
  • NVIDIA ALCHEMI NIM APIs for AI-driven chemistry and materials discovery, reducing time to solution from months to hours
  • NVIDIA cuPyNumeric accelerates NumPy, the leading foundation library of the Python ecosystem, with six times better performance
  • Spectrum-X Ethernet accelerates the world’s largest AI supercomputer, built by X
  • Dell announced the first Ethernet systems to break into the Top 50 SuperComputers, using Spectrum-X.
  • NVIDIA expanded its Foxconn collaboration with new manufacturing plants for GB200
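
The cuPyNumeric pitch is that existing NumPy code runs accelerated by swapping only the import. The sketch below uses stock NumPy for a classic HPC stencil kernel; under cuPyNumeric you would (assuming the package is installed) write `import cupynumeric as np` and leave the rest unchanged:

```python
# Stock NumPy here; cuPyNumeric is billed as a drop-in replacement, so the
# only change for GPU execution would be the import line (an assumption
# about your environment -- the kernel itself is standard array code).
import numpy as np

def jacobi_step(grid):
    """One Jacobi relaxation sweep on a 2-D grid -- a classic HPC stencil."""
    interior = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                       grid[1:-1, :-2] + grid[1:-1, 2:])
    new = grid.copy()
    new[1:-1, 1:-1] = interior
    return new

g = np.zeros((64, 64))
g[0, :] = 100.0                 # hot boundary row
for _ in range(50):
    g = jacobi_step(g)          # heat diffuses into the interior
print(g.shape, round(float(g[1, 32]), 2))
```

Array-slicing kernels like this are exactly the pattern such libraries target: no explicit loops, so the whole sweep maps onto bulk GPU operations.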

What might the future hold?

Nvidia strongly believes, and has demonstrated, that AI and HPC are “Full Stack” problems, demanding that technology providers think through the entire system of systems, including the networking and software needed to deliver performant solutions. And one of the most important pieces in HPC and AI is the network that ties these fast CPUs and GPUs together.

While Ethernet and InfiniBand do a great job of solving many node-to-node and rack-to-rack networking problems, GPU-to-GPU traffic demands far faster links. NVLink with NVSwitch delivers significantly higher bandwidth than standard Ethernet, with the latest generations reaching several terabytes per second, while even high-end Ethernet currently maxes out at around 800 gigabits per second. NVLink also provides lower latency than Ethernet, which is crucial for applications requiring fast data exchange between GPUs.
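
A back-of-envelope calculation makes the gap concrete, using the figures cited above: 800 gigabits per second for high-end Ethernet versus terabytes per second for NVLink (the 1.8 TB/s value below is an assumed per-GPU NVLink figure, and the 100 GB payload is an illustrative choice). Note the units: Ethernet is quoted in giga*bits*, not gigabytes.

```python
# Transfer-time comparison for a 100 GB payload (e.g. a set of model weights).
ETHERNET_GBPS = 800        # gigabits/s, the figure cited in the text
NVLINK_TBPS = 1.8          # terabytes/s, assumed per-GPU NVLink bandwidth

ethernet_bytes_per_s = ETHERNET_GBPS / 8 * 1e9   # 800 Gb/s -> 100 GB/s
nvlink_bytes_per_s = NVLINK_TBPS * 1e12

payload_bytes = 100 * 1e9                        # illustrative 100 GB payload
t_eth = payload_bytes / ethernet_bytes_per_s
t_nvl = payload_bytes / nvlink_bytes_per_s
print(f"Ethernet: {t_eth:.2f} s, NVLink: {t_nvl:.3f} s")
```

Under these assumptions the same payload moves roughly 18 times faster over NVLink, which is why tightly coupled GPU-to-GPU traffic stays on the NVLink fabric while Ethernet and InfiniBand handle the scale-out tiers.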

Perhaps the most important slide shows how Ethernet, even Nvidia’s Ethernet, compares with InfiniBand. NVIDIA

Net-Net

Nvidia’s leadership in HPC appeared to take a hit a few years ago when the DOE’s fastest systems were built on AMD GPU and CPU technologies. They offered the 64-bit floating-point performance and prices the DOE required, and those systems have endured as some of the fastest in the world. But as AI has become more mainstream in supercomputing centers, Nvidia’s full-stack approach with the fastest AI GPUs has become relatively more attractive.

In this article, we have barely scratched the surface of what Nvidia is showing off in Atlanta, but thanks for reading!