MLPerf, an industry consortium of over 70 companies and institutions, has released the second round of AI Inference processing results. These benchmarks now represent production applications from all major areas of AI deployment today. But only a few technologies participated in the contest.
Many observers and investors expect NVIDIA to encounter more competition in the market for inference accelerators. The hypothesis is that GPUs consume too much energy, are expensive, and cannot match purpose-built AI chips’ performance. But NVIDIA has never lost an MLPerf benchmark, and very few companies even try. Let’s look briefly at the results, explore why NVIDIA has continued to lead, and speculate why everyone else is a no-show. Once again, I will use NVIDIA’s graphics because, well, the company does graphics.
MLPerf 0.7 Results
The inference benchmark suite includes seven benchmarks deployed at the edge and the data center in multiple deployment models. If one does the math, MLPerf now encompasses some 35 scenarios.
The four new applications include areas where CPUs have historically dominated but where accelerators are beginning to exhibit promising results. In particular, the recommendation engine market is vast, dominated by Intel, and ripe for acceleration. But this application can prove challenging to accelerate because the massive tables needed to predict consumer preferences accurately require a lot of memory, or a creative solution.
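To make the memory point concrete, here is a rough back-of-the-envelope sketch in Python. The table sizes, the 4-byte values, and the 40 GB memory budget are all hypothetical numbers chosen for illustration; they are not taken from MLPerf or from any vendor's submission.

```python
# Rough estimate of how much memory a recommender's embedding tables need.
# All sizes below are hypothetical, picked only to illustrate the arithmetic.

def embedding_table_bytes(num_rows: int, embedding_dim: int, bytes_per_value: int = 4) -> int:
    """Bytes for one table: one vector of embedding_dim values per row."""
    return num_rows * embedding_dim * bytes_per_value


def fits_in_memory(total_bytes: int, memory_budget_bytes: int) -> bool:
    """True if the tables alone fit in the given memory budget."""
    return total_bytes <= memory_budget_bytes


# (rows, embedding dimension) for three made-up tables, e.g. users, items, categories
hypothetical_tables = [
    (100_000_000, 128),
    (10_000_000, 64),
    (1_000_000, 32),
]

total_bytes = sum(embedding_table_bytes(rows, dim) for rows, dim in hypothetical_tables)
memory_budget_bytes = 40 * 10**9  # assumed 40 GB of accelerator memory

print(f"Embedding tables: {total_bytes / 10**9:.1f} GB")
print(f"Fits in budget: {fits_in_memory(total_bytes, memory_budget_bytes)}")
```

With these made-up numbers, the tables alone exceed the assumed budget before any model weights or activations are counted, which is why this workload tends to need either a lot of memory or a creative way to split or compress the tables.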
As the following figure shows, NVIDIA dominated the data center runs and did so by a wide margin. However, there were few competitive submissions with which we can compare; only Intel and Xilinx provided results. We will explore why this is once again the case later in this blog.
If NVIDIA is unbeatable, why?
There are three reasons why NVIDIA continues to be the only game in town for AI in the data center.
- NVIDIA has amazingly fast hardware and optimized software.
- NVIDIA builds optimized platforms that are easy to access and deploy.
- NVIDIA nurtures and enjoys a massive ecosystem of cloud service providers, software, and researchers worldwide.
While many argue that newer technologies are faster and more power-efficient, NVIDIA provides higher value due to its platforms and ecosystem. And that value is immediately accessible. The company offers hardware tuned for specific workloads, on which it provides training and inference platforms and application-specific frameworks for six industry and horizontal applications. While almost everyone else is just trying to get their chip working, NVIDIA has spent many years creating the entire hardware and software stack, including the seventh-fastest AI supercomputer, Selene, dedicated to enabling research in and outside of NVIDIA. How many competitors are willing to invest tens of millions in a private AI supercomputer based on technology that nobody is using? It is a catch-22.
Where the heck is everyone else?
I get this question a lot, so let’s explore the landscape.
First, many players such as Intel Habana and Tenstorrent are just not ready to run complex benchmarks such as MLPerf. Their chips and software stacks must be finished and optimized before they can even consider undertaking this endeavor.
Second, it takes a tremendous effort to run and optimize these benchmarks and participate in the peer-review process.
Third, and most importantly, new entrants must focus on getting traction in select accounts. This priority applies to giants like Qualcomm and Intel, and especially to startups such as Groq, SambaNova, and Cerebras. These companies must spend precious resources working with early customers, not running and publicizing benchmarks that do not drive immediate revenue.
Conclusions
NVIDIA did great against a shallow field of competitors. The A100 results were terrific compared to the V100, demonstrating the value of its enhanced tensor core architecture. And I commend MLPerf for adding new benchmarks that are increasingly representative of fast-growing inference opportunities such as recommendation engines.
That being said, the competition is too busy with first-customer projects, or the new chips are just not yet ready. For example, SambaNova announced a new partnership with LLNL, and Intel Habana is still in the oven. If I were still at a chip startup, I would wait to run MLPerf (an expensive project) until I had already secured a few lighthouse customers.
MLPerf is certainly the right answer, but it will remain mostly irrelevant until more players are further along. Until then, follow the money!
Note: Moor Insights & Strategy writers and editors may have contributed to this article.