Neuro-inspired Computing Using Emerging Non-Volatile Memories

Abstract

Data movement between separate processing and memory units in traditional von Neumann computing systems is costly in terms of time and energy. The problem is aggravated by the recent explosive growth of data-intensive applications related to artificial intelligence. In-memory computing has been proposed as an alternative approach in which computational tasks are performed directly in memory, without shuttling data back and forth between the processing and memory units. Memory is at the heart of in-memory computing. Technology scaling of mainstream memory technologies, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), is increasingly constrained by fundamental technology limits. Recent research progress on emerging non-volatile memory (eNVM) device technologies, such as resistive random-access memory (RRAM), phase-change memory (PCM), conductive bridging random-access memory (CBRAM), ferroelectric random-access memory (FeRAM), and spin-transfer torque magnetoresistive random-access memory (STT-MRAM), has drawn tremendous attention owing to their high speed, low cost, excellent scalability, and enhanced storage density. Moreover, an eNVM-based crossbar array can perform in-memory matrix-vector multiplications in an analog manner with high energy efficiency (sketched below), opening opportunities for accelerating computation in fields such as deep learning, scientific computing, and computer vision. This dissertation presents research demonstrating a wide range of emerging memory device technologies (CBRAM, RRAM, and STT-MRAM) for neuro-inspired in-memory computing in several real-world applications, using a software-hardware co-design approach.
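As a minimal illustration of this crossbar primitive, the Python sketch below computes a matrix-vector product the way an idealized eNVM array would: weights stored as device conductances, inputs applied as read voltages, and outputs read out as column currents. The linear, noise-free device model and all names and values here are illustrative assumptions, not a model of any chip in this dissertation.

```python
# Minimal sketch of the analog in-memory matrix-vector multiplication
# an eNVM crossbar performs: with weights stored as device conductances
# G and inputs applied as row voltages V, each column current is
# I_j = sum_i G[i, j] * V[i] by Ohm's and Kirchhoff's laws.
# Idealized device model (linear conductances, no wire resistance,
# no noise); all names and values are illustrative assumptions.
import numpy as np

rows, cols = 128, 64
G = np.random.uniform(1e-6, 1e-4, size=(rows, cols))  # conductances (S)
V = np.random.uniform(0.0, 0.2, size=rows)            # read voltages (V)

# Every column wire sums its devices' currents in parallel, so the
# full matrix-vector product completes in a single read operation.
I = G.T @ V                                           # column currents (A)
print(I.shape)  # (64,)
```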
Chapter 1 presents low-energy subquantum CBRAM devices and a network pruning technique that reduces network-level energy consumption by hundreds- to thousands-fold. We showed low-energy (10×-100× less than conventional memory technologies) and gradual switching characteristics of CBRAM synaptic devices, and developed a network pruning algorithm that can be employed during spiking neural network (SNN) training to further reduce energy by 10×. Using a 512 Kbit subquantum CBRAM array, we experimentally demonstrated high recognition accuracy on the MNIST dataset for a digital implementation of unsupervised learning.

Chapter 2 details the SNN pruning algorithm used in Chapter 1. The algorithm exploits features of the network weights and prunes weights during training based on the neurons' spiking characteristics, leading to significant energy savings when implemented in eNVM-based in-memory computing hardware.

Chapter 3 presents a benchmarking analysis of the potential of STT-MRAM for in-memory computing against SRAM at deeply scaled technology nodes (14 nm and 7 nm). A C++-based benchmarking platform is developed around LeNet-5, a popular convolutional neural network (CNN) model. The platform maps STT-MRAM-based in-memory computing architectures onto LeNet-5 and estimates inference accuracy, energy, latency, and area for the proposed architectures at different technology nodes, compared against SRAM.

Chapter 4 presents an adaptive quantization technique that compensates for the accuracy loss caused by the limited conductance levels of PCM-based synaptic devices and enables high-accuracy unsupervised SNN learning with low-precision PCM devices (see the sketch after this abstract). The technique follows the software-hardware co-design approach, designing software algorithms with real synaptic device characteristics and hardware limitations in mind.

Chapter 5 presents a real-world neural engineering application of in-memory computing: an interface between eNVM-based crossbar arrays and neural electrodes that implements a real-time, energy-efficient in-memory spike sorting system. A real-time hardware demonstration using a CuOx-based eNVM crossbar sorts spike data from different brain regions recorded with multi-electrode arrays in animal experiments, extending eNVM technologies to neural engineering applications.

Chapter 6 presents a real-world deep learning application of in-memory computing. We demonstrated direct integration of Ag-based conductive bridging random-access memory (Ag-CBRAM) crossbar arrays with Mott-ReLU activation neurons for a scalable, energy- and area-efficient hardware implementation of deep neural networks (DNNs).

Chapter 7 concludes the dissertation and discusses future directions for in-memory computing systems based on eNVM technologies.
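The abstract does not spell out the adaptive quantization procedure of Chapter 4, so the sketch below is only a hypothetical illustration of the general idea: snapping continuous weights onto a handful of conductance levels whose placement adapts to the weight distribution. The percentile-based heuristic and the function adaptive_quantize are assumptions made for illustration, not the dissertation's algorithm.

```python
# Hypothetical sketch of adaptive quantization to a few conductance
# levels. The dissertation's actual algorithm is not reproduced here;
# this illustrates one plausible scheme in which level placement
# adapts to the empirical weight distribution instead of being uniform.
import numpy as np

def adaptive_quantize(weights, n_levels=4):
    """Map continuous weights onto n_levels discrete values placed at
    evenly spaced percentiles of the weight distribution."""
    # Percentile placement gives dense weight regions finer resolution
    # than uniform spacing over [min, max] would.
    levels = np.percentile(weights, np.linspace(0, 100, n_levels))
    # Snap each weight to its nearest available conductance level.
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]

w = np.random.randn(256, 64)             # example synaptic weight matrix
w_q = adaptive_quantize(w, n_levels=4)   # e.g., 4 PCM conductance levels
print(np.unique(w_q))                    # the 4 levels actually used
```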
