Cerebras Systems Enables Brain-scale AI

Sep 21, 2021

Wafer-Scale Engine Inventor Lays the Foundations for Running Massive AIs

Many technology prognosticators have predicted it would take at least until 2045 for the industry to create Artificial Intelligence (AI) technology that could rival the human brain, as measured by the number of synapses in the brain or parameters in an AI model. Note that we are not talking about Terminator-style AI, or “General AI,” here. We are talking about an AI that can process a single, though complex, task, such as natural language processing. The human brain has roughly 80-100 billion neurons, and its cerebral cortex contains an estimated 120 trillion synapses. For the sake of argument, let’s assume the parameters in an AI model roughly equate to synapses. The largest AI model ever trained is the GPT-3 natural language model from OpenAI, at 175 billion parameters, or roughly 1/1000th the size of a brain. So, 120 trillion is enormous, approximately 1000 times larger than today’s state of the art.
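To make the scale gap concrete, here is a back-of-the-envelope calculation using the article's own figures (the synapse and parameter counts are rough estimates, not precise measurements):

```python
# Rough scale comparison: brain synapses vs. GPT-3 parameters
# (illustrative only; both counts are approximate estimates).
SYNAPSES_IN_CORTEX = 120e12   # ~120 trillion synapses
GPT3_PARAMETERS = 175e9       # GPT-3: 175 billion parameters

ratio = SYNAPSES_IN_CORTEX / GPT3_PARAMETERS
print(f"A brain-scale model would be ~{ratio:.0f}x larger than GPT-3")
# The exact ratio is about 686x, i.e. roughly three orders of magnitude.
```

The "1/1000th" figure in the text is therefore a round-number approximation; the precise ratio is closer to 700x, but either way the gap is about three orders of magnitude.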

This research paper explores Cerebras Systems’ approach to creating a brain-scale AI and the new technologies that could enable that feat. But first, let’s put this discussion into the proper context. Just how big is a 120 trillion-parameter model? You can download this exclusive Cambrian AI research paper by clicking below.

Table Of Contents

  • Introduction
  • How Does Cerebras Create Brain-scale AI?
  • Memory
  • Scalability
  • Working Smarter, Not Just Harder
    • The Weight Streaming Execution Model
    • Sparsity
  • Do We Really Need Such Massive AIs? Do We Want Them?
  • Conclusions
  • Figure 1: AI Model sizes have been doubling every 3.5 months…
  • Figure 2: The MemoryX platform streams weights to a CS-2 system or an entire cluster of CS-2 systems.
  • Figure 3: The SwarmX fabric enables scaling to 192 CS-2 systems.
  • Figure 4: Projected scalability of the Cerebras data center predicts near-linear scalability up to…
  • Figure 5: The Cerebras Weight Streaming Execution Model enables the CS-2 to train huge models.
  • Figure 6: Cerebras designed sparsity awareness into each core with a trigger that skips the…
  • Figure 7: This image shows how a four CS-2 system works…