TY - GEN
T1 - A 297MOPS/0.4mW ultra low power coarse-grained reconfigurable accelerator CMA-SOTB-2
AU - Masuyama, Koichiro
AU - Fujita, Yu
AU - Okuhara, Hayate
AU - Amano, Hideharu
N1 - Funding Information:
This work was performed as part of the Ultra-Low Voltage Device Project, funded and supported by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO). This work was also partially supported by JSPS KAKENHI S Grant Number 25220002.
Publisher Copyright:
© 2015 IEEE.
PY - 2016/1/25
Y1 - 2016/1/25
N2 - Cool mega array-SOTB-2 (CMA-SOTB-2) is an ultra-low energy coarse-grained reconfigurable architecture (CGRA) for advanced sensor networks, the Internet of Things, and wearable computing. It uses a large processing element (PE) array with combinatorial circuits and a micro-controller for data transfer between data memory and the PE array. To improve on the energy efficiency of the previous prototype, the CMA-SOTB, the performance of the micro-controller was improved by introducing parallel data memory access with data manipulators and by optimizing both the instruction sets and the micro-architecture. A delay learning mechanism that finds the optimal delay time for the computation in the PE array is also introduced. Standard cell libraries of the 65-nm silicon on thin buried oxide (SOTB) process have been optimized for under-milliwatt operation. A real chip evaluation shows that more than 250-MOPS performance was achieved with only a 0.4-mW power budget by independently controlling the body-bias voltage for the micro-controller and the PE array. The energy efficiency is almost double that of the previous prototype, the CMA-SOTB.
AB - Cool mega array-SOTB-2 (CMA-SOTB-2) is an ultra-low energy coarse-grained reconfigurable architecture (CGRA) for advanced sensor networks, the Internet of Things, and wearable computing. It uses a large processing element (PE) array with combinatorial circuits and a micro-controller for data transfer between data memory and the PE array. To improve on the energy efficiency of the previous prototype, the CMA-SOTB, the performance of the micro-controller was improved by introducing parallel data memory access with data manipulators and by optimizing both the instruction sets and the micro-architecture. A delay learning mechanism that finds the optimal delay time for the computation in the PE array is also introduced. Standard cell libraries of the 65-nm silicon on thin buried oxide (SOTB) process have been optimized for under-milliwatt operation. A real chip evaluation shows that more than 250-MOPS performance was achieved with only a 0.4-mW power budget by independently controlling the body-bias voltage for the micro-controller and the PE array. The energy efficiency is almost double that of the previous prototype, the CMA-SOTB.
UR - http://www.scopus.com/inward/record.url?scp=84964324558&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964324558&partnerID=8YFLogxK
U2 - 10.1109/ReConFig.2015.7393280
DO - 10.1109/ReConFig.2015.7393280
M3 - Conference contribution
AN - SCOPUS:84964324558
T3 - 2015 International Conference on ReConFigurable Computing and FPGAs, ReConFig 2015
BT - 2015 International Conference on ReConFigurable Computing and FPGAs, ReConFig 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - International Conference on ReConFigurable Computing and FPGAs, ReConFig 2015
Y2 - 7 December 2015 through 9 December 2015
ER -