Select customers are now evaluating the chip; early results look promising across a broad range of AI workloads
Esperanto has been talking about its edge AI chips for several years, and now the company can demonstrate working AI acceleration for image, language, and recommendation processing. I had the chance to watch a demo of the platform and came away quite impressed with the performance and power efficiency of the RISC-V-based platform. I was also pleased to see that the Esperanto device is not a one-trick pony, as the team demonstrated ResNet-50, DLRM, and the Transformer network underlying BERT.
As it stands now, the chip is running only as a single accelerator; additional tuning and engineering should increase frequency substantially and extend the fabric to other chips for larger networks and more throughput. I can’t share the benchmark results just yet, but the performance was rock solid on all three models while only sipping power. At full frequency we should expect a 20-watt power envelope.
The Esperanto chip, which we covered here, has nearly 1,100 power-efficient RISC-V cores that Esperanto has extended with vector and tensor operations for AI. “On one 7nm chip,” said Founder Dave Ditzel, “we put 1,088 energy-efficient ET-Minion RISC-V processors, each with its own vector/tensor unit; four high-performance ET-Maxion RISC-V processors; over 160 million bytes of on-chip SRAM; and interfaces for external DRAM and flash memory.”
What really makes this approach unique is that the RISC-V cores are actually doing the heavy lifting, not offloading the matrix multiplies to a MAC core or a GPU. Consequently, I for one was skeptical that they could rise to the performance level needed to do serious inference processing, but early results validate the approach. Moreover, by using full-instruction-set RISC-V cores, these chips could, in theory, support massive parallelism on virtually any workload, not just AI.
Conclusions
I think it is fair to say that Esperanto has finally arrived, perhaps a bit late but with the promised performance and energy efficiency. We will watch the company closely and are eager to see the results of early customer access projects, both for AI and for other highly parallel workloads.