In the last decade, the advent of the Internet of Things (IoT) and cloud computing has driven rapid growth in the number of smart devices connected to the internet. However, the requirements imposed by new applications, such as the ever-increasing amount of data exchanged over the network, are demanding a transition to a more distributed infrastructure. This has promoted research into new technologies and computing architectures that aim to address the main sources of time and energy inefficiency in conventional von Neumann computing architectures, namely the memory wall (i.e., the large discrepancy in performance between the memory and the processing unit) and the von Neumann bottleneck (VNB) (i.e., the need to exchange data between the memory and the processing unit over a slow and inefficient bus). Several new non-volatile memory (NVM) technologies have therefore been proposed to overcome the memory wall (i.e., to improve speed and reduce energy consumption). These new memories include Phase Change Memories (PCM), Ferroelectric Random Access Memories (FRAM), Magnetic Random Access Memories (MRAM) and Resistive Random Access Memories (RRAM). Despite achieving encouraging performance, all these new technologies pose challenges to circuit designers and require in-depth circuit simulations and the implementation of device-circuit co-design strategies to determine technological barriers and limitations.
Among these NVMs, RRAMs are perhaps the most promising and can be used to implement non-von Neumann computing architectures. In particular, RRAMs can be used to realize logic-in-memory (LiM) architectures based on material implication logic (IMPLY), which enable the energy-efficient execution of logic operations directly inside the memory with a high degree of parallelism, avoiding the VNB.
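As a purely Boolean illustration of the material implication mentioned above, the following minimal sketch shows the IMPLY truth function and how, together with a FALSE (reset) operation, it can compose a NAND in two steps. It assumes the conventional IMPLY-plus-FALSE formulation in which the result overwrites the second operand; the function names `imply` and `nand_via_imply` are hypothetical, and the sketch does not model RRAM devices, driving voltages, or the SIMPLY scheme.

```python
def imply(p: bool, q: bool) -> bool:
    """Material implication: p IMPLY q = (NOT p) OR q."""
    return (not p) or q


def nand_via_imply(a: bool, b: bool) -> bool:
    """Compose NAND(a, b) from one FALSE and two IMPLY steps (illustrative)."""
    # FALSE operation: the work cell s is reset to logic 0.
    s = False
    # Step 1: s <- a IMPLY s, which equals NOT a because s was 0.
    s = imply(a, s)
    # Step 2: s <- b IMPLY s = (NOT b) OR (NOT a) = NAND(a, b).
    s = imply(b, s)
    return s


if __name__ == "__main__":
    # Print the NAND truth table obtained from the two IMPLY steps.
    for a in (False, True):
        for b in (False, True):
            print(int(a), int(b), int(nand_via_imply(a, b)))
```

Since NAND is functionally complete, this is one way to see why IMPLY combined with FALSE suffices to express arbitrary logic functions.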
In the LiM framework, the execution of logic operations requires appropriate peripheral circuitry and control logic to provide the required driving voltages to specific RRAM devices. However, only a few studies in the literature have attempted to design this peripheral circuitry, and its energy contribution is typically neglected. This constitutes an open issue, as there is still a debate in the scientific community as to whether the energy overhead introduced by this peripheral circuitry results in inefficiencies that mask the benefits of overcoming the VNB. In addition, a smart IMPLY (SIMPLY) architecture, which improves the energy efficiency and solves the reliability issues of conventional IMPLY architectures, was recently proposed, but the design of its complete peripheral circuitry has never been attempted.
In this work, we fill this gap by designing the peripheral circuitry required to implement the complete set of operations of the SIMPLY architecture using a 45 nm CMOS technology. By using a physics-based RRAM compact model calibrated on three RRAM technologies from the literature, we implement device-circuit co-design strategies and highlight important technological trade-offs that need to be considered in the design phase. We perform extensive circuit simulations that include RRAM device non-idealities (i.e., resistive state variability, random telegraph noise, and self-heating) to estimate circuit performance and reliability. Finally, considering the direction of future developments in RRAM technologies, we project the energy performance and footprint of the circuit for an RRAM technology with a low current compliance (i.e., Ic = 10 nA), and devise a roadmap for device-circuit co-optimization in view of further technological scaling of both CMOS circuits and RRAM devices. Results suggest that the SIMPLY architecture is indeed a promising candidate for the development of ultra-low-power reconfigurable hardware for IoT applications.