QUEST: Multi-purpose log-quantized DNN inference engine stacked on 96-MB 3-D SRAM using inductive coupling technology in 40-nm CMOS

Kodai Ueyoshi, Kota Ando, Kazutoshi Hirose, Shinya Takamaeda-Yamazaki, Mototsugu Hamada, Tadahiro Kuroda, Masato Motomura

    Research output: Contribution to journal › Article › peer-review

    40 Citations (Scopus)

    Abstract

    QUEST is a programmable multiple instruction, multiple data (MIMD) parallel accelerator for general-purpose state-of-the-art deep neural networks (DNNs). It is stacked die-to-die on eight SRAMs totaling 96 MB, accessed with three-cycle latency at 28.8 GB/s through an inductive coupling technology called the ThruChip interface (TCI). Stacking SRAMs instead of DRAMs is expected to give lower memory access latency and simpler hardware. This helps balance memory capacity, latency, and bandwidth, all of which cutting-edge DNNs demand at a high level. QUEST also introduces log-quantized processing with programmable bit precision, which allows faster DNN computation (or a larger DNN size) within the 3-D module. Log quantization can sustain higher recognition accuracy in the low-bitwidth region than linear quantization. The prototype QUEST chip is fabricated in 40-nm CMOS technology and achieves a peak performance of 7.49 tera operations per second (TOPS) in binary precision and 1.96 TOPS in 4-bit precision at a 300-MHz clock.
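The contrast between log and linear quantization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give QUEST's actual number format, so the encoding below is an assumption made only for illustration: one sign bit, the remaining bits selecting a power-of-two magnitude, and one code reserved for zero. The function names and the `max_exp` and `max_val` parameters are hypothetical.

```python
import math


def log_quantize(x, bits=4, max_exp=0):
    """Round x to the nearest signed power of two (nearest in the log domain).

    Assumed format (not taken from the paper): 1 sign bit, and
    2**(bits-1) - 1 nonzero magnitudes 2**max_exp, 2**(max_exp-1), ...
    Values too small to represent flush to zero.
    """
    levels = 2 ** (bits - 1) - 1          # nonzero magnitude levels
    min_exp = max_exp - levels + 1        # smallest representable exponent
    if x == 0:
        return 0.0
    exp = round(math.log2(abs(x)))        # nearest exponent
    if exp < min_exp:
        return 0.0                        # underflow
    exp = min(exp, max_exp)               # saturate large values
    return math.copysign(2.0 ** exp, x)


def linear_quantize(x, bits=4, max_val=1.0):
    """Symmetric uniform quantization onto evenly spaced levels."""
    levels = 2 ** (bits - 1) - 1          # positive steps available
    step = max_val / levels
    q = max(-levels, min(levels, round(x / step)))  # round, then clip
    return q * step


if __name__ == "__main__":
    for value in (0.3, -0.7, 0.02):
        print(value, log_quantize(value), linear_quantize(value))
```

Under these assumptions, a small value such as 0.02 is kept as 2^-6 by the 4-bit log quantizer, while the 4-bit linear quantizer rounds it to zero. This is one intuition for why a log scale may hold up better at low bitwidths when values cluster near zero; it is not a reproduction of the paper's accuracy results.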

    Original language: English
    Article number: 8492341
    Pages (from-to): 186-196
    Number of pages: 11
    Journal: IEEE Journal of Solid-State Circuits
    Volume: 54
    Issue number: 1
    Publication status: Published - 2019 Jan

    Keywords

    • Accelerator
    • deep learning
    • deep neural networks (DNNs)
    • logarithmic-quantized neural networks
    • processor architecture

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
