Design and Optimization of Resistive RAM-based Storage and Computing Systems

157305-Thumbnail Image.png
Description
The Resistive Random Access Memory (ReRAM) is an emerging non-volatile memory

technology because of its attractive attributes, including excellent scalability (< 10 nm), low

programming voltage (< 3 V), fast switching speed (< 10 ns), high OFF/ON ratio (> 10),

good endurance (u

The Resistive Random Access Memory (ReRAM) is an emerging non-volatile memory

technology because of its attractive attributes, including excellent scalability (< 10 nm), low

programming voltage (< 3 V), fast switching speed (< 10 ns), high OFF/ON ratio (> 10),

good endurance (up to 1012 cycles) and great compatibility with silicon CMOS technology [1].

However, ReRAM suffers from larger write latency, energy and reliability issue compared to

Dynamic Random Access Memory (DRAM). To improve the energy-efficiency, latency efficiency and reliability of ReRAM storage systems, a low cost cross-layer approach that spans device, circuit, architecture and system levels is proposed.

For 1T1R 2D ReRAM system, the effect of both retention and endurance errors on

ReRAM reliability is considered. Proposed approach is to design circuit-level and architecture-level techniques to reduce raw Bit Error Rate significantly and then employ low cost Error Control Coding to achieve the desired lifetime.

For 1S1R 2D ReRAM system, a cross-point array with “multi-bit per access” per subarray

is designed for high energy-efficiency and good reliability. The errors due to cell-level as well

as array-level variations are analyzed and a low cost scheme to maintain reliability and latency

with low energy consumption is proposed.

For 1S1R 3D ReRAM system, access schemes which activate multiple subarrays with

multiple layers in a subarray are used to achieve high energy efficiency through activating fewer

subarray, and good reliability is achieved through innovative data organization.

Finally, a novel ReRAM-based accelerator design is proposed to support multiple

Convolutional Neural Networks (CNN) topologies including VGGNet, AlexNet and ResNet.

The multi-tiled architecture consists of 9 processing elements per tile, where each tile

implements the dot product operation using ReRAM as computation unit. The processing

elements operate in a systolic fashion, thereby maximizing input feature map reuse and

minimizing interconnection cost. The system-level evaluation on several network benchmarks

show that the proposed architecture can improve computation efficiency and energy efficiency

compared to a state-of-the-art ReRAM-based accelerator.
Date Created
2019
Agent