Design and Optimization of Resistive RAM-based Storage and Computing Systems
Description
Resistive Random Access Memory (ReRAM) is an emerging non-volatile memory
technology with attractive attributes, including excellent scalability (< 10 nm), low
programming voltage (< 3 V), fast switching speed (< 10 ns), high OFF/ON ratio (> 10),
good endurance (up to 10^12 cycles) and great compatibility with silicon CMOS technology [1].
However, ReRAM suffers from higher write latency and energy, and from more reliability issues, than
Dynamic Random Access Memory (DRAM). To improve the energy efficiency, latency and reliability of ReRAM storage systems, a low-cost cross-layer approach that spans the device, circuit, architecture and system levels is proposed.
For the 1T1R 2D ReRAM system, the effect of both retention and endurance errors on
ReRAM reliability is considered. The proposed approach is to design circuit-level and architecture-level techniques that significantly reduce the raw bit error rate (BER) and then employ low-cost error control coding (ECC) to achieve the desired lifetime.
For the 1S1R 2D ReRAM system, a cross-point array with “multi-bit per access” per subarray
is designed for high energy efficiency and good reliability. Errors due to cell-level as well
as array-level variations are analyzed, and a low-cost scheme that maintains reliability and latency
with low energy consumption is proposed.
For the 1S1R 3D ReRAM system, access schemes that activate multiple subarrays, with
multiple layers in each subarray, are used to achieve high energy efficiency by activating fewer
subarrays, and good reliability is achieved through innovative data organization.
Finally, a novel ReRAM-based accelerator design is proposed to support multiple
Convolutional Neural Network (CNN) topologies, including VGGNet, AlexNet and ResNet.
The multi-tiled architecture consists of 9 processing elements per tile, where each tile
implements the dot product operation using ReRAM as the computation unit. The processing
elements operate in a systolic fashion, thereby maximizing input feature map reuse and
minimizing interconnection cost. The system-level evaluation on several network benchmarks
shows that the proposed architecture can improve computation efficiency and energy efficiency
compared to a state-of-the-art ReRAM-based accelerator.
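As an illustration of the dot product operation mentioned above, the following is a minimal sketch of how an idealized ReRAM crossbar computes it: each cell stores a weight as a conductance, input voltages are applied to the rows, and each column current is the sum of voltage times conductance. The function name, array size and numeric values are hypothetical and are not taken from the thesis; wire resistance, device variation and the analog-to-digital conversion are ignored.

```python
def crossbar_dot_product(voltages, conductances):
    """Column currents of an ideal crossbar: I[j] = sum_i V[i] * G[i][j].

    voltages     -- one input voltage per crossbar row (volts)
    conductances -- row-major matrix of cell conductances (siemens)
    """
    if len(voltages) != len(conductances):
        raise ValueError("one input voltage is needed per crossbar row")
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for v, row in zip(voltages, conductances):
        if len(row) != n_cols:
            raise ValueError("all crossbar rows must have the same width")
        # Each cell contributes V * G to the current of its column.
        for j, g in enumerate(row):
            currents[j] += v * g
    return currents


# Hypothetical 3x2 array: three inputs, two stored weight columns.
voltages = [0.2, 0.0, 0.1]
conductances = [
    [1e-6, 2e-6],
    [3e-6, 4e-6],
    [5e-6, 6e-6],
]
currents = crossbar_dot_product(voltages, conductances)
print(currents)  # one current (amperes) per column
```

Every column produces its result in the same read step, which is why a crossbar is attractive as the computation unit for convolution layers.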
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
2019
Agent
- Author (aut): Mao, Manqing
- Thesis advisor (ths): Chakrabarti, Chaitali
- Committee member: Yu, Shimeng
- Committee member: Cao, Yu
- Committee member: Ogras, Umit
- Publisher (pbl): Arizona State University