Computation-in-Memory
From Circuits to Compilers
Abstract
Memristive devices are promising candidates as a complement to CMOS devices. They offer several advantages, such as non-volatility, high density, good scalability, and CMOS compatibility, and they enable in-memory computing paradigms because they can be used for both storage and computation. However, building in-memory computing systems with memristive devices is still at an early research stage, and challenges remain in the development of devices, circuits, architectures, design automation, and applications. This thesis focuses on developing memristive device-based circuits, their use in in-memory computing architectures, and design automation methodologies to create or use such circuits.

Circuit Level – We propose two logical operation schemes based on memristive devices. The first uses resistive sensing to perform logical operations: the sense amplifier is modified so that it compares the overall read current with reference currents and outputs the result of the logical operation. Because the resistance of the memristive devices remains unchanged during sensing, endurance and lifetime are not reduced. This scheme therefore offers a way to maintain a relatively long lifetime when performing logic operations with memristive devices that have low endurance. The second scheme is an enhanced version of the first: it uses two different sensing paths for AND and OR operations, so that the correctness of the logic operations is guaranteed even when the memristive devices exhibit large resistance variation.

Architecture Level – We present three in-memory computing architectures based on memristive devices. The first is a heterogeneous architecture that combines a CPU with an accelerator for vector bit-wise logical operations; the accelerator communicates with the CPU or accesses the external memory directly. The second accelerates automata processing. In this architecture, the memristive memory arrays both store the configuration information and perform the computation, and it outperforms similar architectures built with conventional memory technologies. The third is an improved version of the second: it breaks the routing network into multiple pipeline stages, each processing a different input sequence, which yields higher throughput with negligible area overhead.

Design Automation – We present a synthesis flow for computation-in-memory architectures and a compiler for automata processors. The synthesis flow is based on the concept of skeletons, where each skeleton relates an algorithmic structure to a pre-defined solution template containing the scheduling, placement, and routing information needed for hardware generation. After the user rewrites the algorithm using skeletons, the tool generates the desired circuit by instantiating the corresponding solution templates. The automata processor compiler generates configuration bits from the input automata: it applies multiple transformation strategies to the given automata so that constraint conflicts are resolved automatically, and it optimizes the mapping to improve storage utilization.
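As a rough illustration of the first logic scheme, the following Python sketch models resistive sensing behaviorally: two memristive cells are read with a single voltage pulse, and the summed current is compared against two reference levels, one for OR and one for AND. The conductance values, read voltage, and reference placement below are illustrative assumptions, not the circuit parameters of the thesis.

# Behavioral sketch of resistive-sensing logic (not the thesis circuit itself):
# two memristive cells on a shared bit line are read with one voltage pulse,
# and a modified sense amplifier compares the summed read current against
# reference currents to produce AND / OR results without altering cell states.
G_LRS = 1e-4   # assumed low-resistance-state conductance (S), stores logic 1
G_HRS = 1e-6   # assumed high-resistance-state conductance (S), stores logic 0
V_READ = 0.2   # assumed read voltage (V), low enough not to disturb the cells

def cell_current(bit: int) -> float:
    """Read current contributed by one cell under the read voltage."""
    return V_READ * (G_LRS if bit else G_HRS)

def sense_logic(a: int, b: int) -> tuple[int, int]:
    """Compare the summed current with two references: the OR reference lies
    between zero and one LRS cell current, the AND reference between one and
    two LRS cell currents."""
    i_total = cell_current(a) + cell_current(b)
    i_ref_or = 0.5 * V_READ * G_LRS
    i_ref_and = 1.5 * V_READ * G_LRS
    return int(i_total > i_ref_and), int(i_total > i_ref_or)

for a in (0, 1):
    for b in (0, 1):
        and_out, or_out = sense_logic(a, b)
        print(f"a={a} b={b} -> AND={and_out} OR={or_out}")

Because only read currents are sensed, no cell is switched during the operation, which is the property the abstract credits for preserving endurance and lifetime.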
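The skeleton concept behind the synthesis flow can be pictured with a small, hypothetical data model: an algorithmic structure (here a generic element-wise "map") is bound to a pre-defined solution template carrying scheduling, placement, and routing fields, and instantiating that template yields a configuration a back end could turn into hardware. All identifiers, fields, and template contents below are invented for illustration.

# Hypothetical sketch of the skeleton idea; names and fields are assumptions.
from dataclasses import dataclass

@dataclass
class SolutionTemplate:
    name: str
    schedule: list[str]        # ordered controller steps
    placement: dict[str, str]  # logical operand -> crossbar/array location
    routing: list[str]         # interconnect segments to activate

TEMPLATES = {
    "map": SolutionTemplate(
        name="elementwise_map",
        schedule=["load_operands", "activate_rows", "sense", "write_back"],
        placement={"in_a": "array0.row0", "in_b": "array0.row1",
                   "out": "array0.row2"},
        routing=["array0 -> sense_amps", "sense_amps -> array0"],
    ),
}

def instantiate(skeleton: str, op: str, width: int) -> dict:
    """Expand a skeleton call such as map(AND, 64) into a configuration that
    a back end could lower to circuits or configuration bits."""
    t = TEMPLATES[skeleton]
    return {"template": t.name, "operation": op, "vector_width": width,
            "schedule": t.schedule, "placement": t.placement,
            "routing": t.routing}

print(instantiate("map", op="AND", width=64))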
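In the same spirit, a minimal, hypothetical sketch of configuration-bit generation for an automata processor: each state is compiled into a per-symbol match vector and a row of a transition matrix, a layout a memory array could both store and evaluate in place. The bit layout and the toy automaton are assumptions for illustration, not the compiler's actual output format.

# Hypothetical configuration-bit generation for a small NFA; the layout is an
# illustrative assumption, not the thesis compiler's format.
NUM_SYMBOLS = 256

def compile_automaton(states, transitions, accepting):
    """states: list of (state_name, set_of_matching_symbols);
    transitions: dict state -> set of successor states;
    returns per-state match vectors, a transition matrix, and an accept vector."""
    index = {name: i for i, (name, _) in enumerate(states)}
    match_bits = {name: [1 if chr(s) in syms else 0 for s in range(NUM_SYMBOLS)]
                  for name, syms in states}
    n = len(states)
    trans_matrix = [[0] * n for _ in range(n)]
    for src, dsts in transitions.items():
        for dst in dsts:
            trans_matrix[index[src]][index[dst]] = 1
    accept_vector = [1 if name in accepting else 0 for name, _ in states]
    return match_bits, trans_matrix, accept_vector

# Toy automaton for the pattern "ab": s0 matches 'a', s1 matches 'b' and accepts.
states = [("s0", {"a"}), ("s1", {"b"})]
transitions = {"s0": {"s1"}}
match_bits, trans_matrix, accept = compile_automaton(states, transitions, {"s1"})
print(sum(match_bits["s0"]), trans_matrix, accept)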