TY - JOUR
T1 - A 818–4094 TOPS/W Capacitor-Reconfigured Analog CIM for Unified Acceleration of CNNs and Transformers
AU - Yoshioka, Kentaro
N1 - Publisher Copyright: © 1966-2012 IEEE.
PY - 2025
Y1 - 2025
AB - The rapid evolution of machine learning has led to the emergence of diverse neural network architectures, such as CNNs, Transformers, and their hybrid models, each with unique computational precision requirements. Transformers, in particular, demand higher precision compared to CNNs. Existing analog compute-in-memory (ACIM) solutions primarily cater to CNNs and struggle to achieve the high precision necessary for Transformers, despite their promise in addressing the memory bottleneck. To bridge this gap, we propose a capacitor-reconfigured CIM (CR-CIM) macro that introduces dual-mode operation, dynamically switching between high-precision and high-efficiency modes based on the active DNN layer. In the CNN mode, the CR-CIM employs bit-parallel computation and an 8-bit ADC to maximize power efficiency, exploiting the inherent error tolerance of CNNs. In contrast, for the Transformer mode, the CR-CIM switches to bit-serial computation and a 10-bit ADC to boost the compute signal-to-noise ratio (CSNR), ensuring the higher precision required by Transformers. This dual-mode functionality of the proposed CR-CIM is enabled by three key technologies: 1) a novel CR-CIM architecture and cell structure; 2) a resource-efficient multi-bit driver for bit-parallel computation; and 3) a software-analog co-design (SAC) strategy for enhanced Transformer computation. Our CR-CIM prototype is the first ACIM design to enable optimized operation for both Transformers and CNNs. CR-CIM achieves 45-dB signal-to-quantization-noise ratio (SQNR) and 31-dB CSNR (8-bit input and 8-bit weight bit-serial MAC) in the Transformer mode and a peak power efficiency of 4094 TOPS/W (normalized to 1-bit × 1-bit MAC) in the CNN mode.
KW - Analog compute-in-memory (ACIM)
KW - capacitor reconfigured CIM (CR-CIM)
KW - CNN
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=105003653542&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105003653542&partnerID=8YFLogxK
U2 - 10.1109/JSSC.2024.3457898
DO - 10.1109/JSSC.2024.3457898
M3 - Article
AN - SCOPUS:105003653542
SN - 0018-9200
VL - 60
SP - 1844
EP - 1855
JO - IEEE Journal of Solid-State Circuits
JF - IEEE Journal of Solid-State Circuits
IS - 5
ER -