The explosive growth of AI processing in data center and edge environments has led AI startups and established firms alike to develop silicon that can handle the massive processing demands of neural networks. Inference processing, in which a trained deep neural network is run to predict characteristics of new data samples, is a particularly promising emerging opportunity. This processing is typically performed on CPUs. That will have to change, however, to keep pace with the exponential growth in model size and with new applications that depend on multiple neural networks to solve complex problems. We believe that the market for inference processing will exceed that of data center AI training in three to four years, surpassing $5B in annual chip sales by 2025.
The paper was published by Moor Insights & Strategy, where it is available for download.
Table Of Contents:
- Introduction
- Tenstorrent’s Holistic Strategy
- Tenstorrent Product Roadmap
- Conclusions: The Holistic Approach Holds Tremendous Promise And Challenges
Figures And Tables:
- Figure 1: Tenstorrent Grayskull Processing Element (Single Core)
- Figure 2: Packet Manager
- Figure 3: O(N) Matrix Multiplication
- Figure 4: ML vs Moore’s Law (Optimistic)
- Figure 5: Flexible Scheduling & Parallelization
- Figure 6: Tenstorrent Silicon Roadmap
- Table 1: 65W Grayskull BERT Inference Performance
Companies Cited
- AMD
- Graphcore
- NVIDIA
- OpenAI
- Qualcomm
- Tenstorrent