TY - JOUR
T1 - A Compression Router for Low-Latency Network-on-Chip
AU - Niwa, Naoya
AU - Shikama, Yoshiya
AU - Amano, Hideharu
AU - Koibuchi, Michihiro
N1 - Funding Information:
This work was supported by JSPS KAKENHI 19H01106 and by the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with SYNOPSYS Corporation.
Publisher Copyright:
© 2023 The Institute of Electronics.
PY - 2023/2
Y1 - 2023/2
AB - Network-on-Chips (NoCs) are important components of scalable many-core processors. Because the performance of parallel applications is usually sensitive to NoC latency, reducing that latency is a primary requirement. In this study, a compression router that hides the (de)compression-operation delay is proposed. The compression router (de)compresses the contents of an incoming packet before switch arbitration is completed, thus shortening the packet length without a latency penalty and reducing the network injection-and-ejection latency. Evaluation results show that, at a compression ratio of 1.8 on the NoC, the compression router yields improvements of up to 33% in parallel application performance (conjugate gradients (CG), fast Fourier transform (FT), integer sort (IS), and traveling salesman problem (TSP)) and 63% in effective network throughput. The cost is an increase in router area and energy consumption of 0.22 mm² and 1.6 times, respectively, compared with a conventional virtual-channel router. Another finding is that off-loading the decompressor onto a network interface decreases the compression-router area by 57% at the expense of a moderate increase in communication latency.
KW - lossy data compression
KW - Network-on-Chips
KW - router architecture
UR - http://www.scopus.com/inward/record.url?scp=85150457059&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85150457059&partnerID=8YFLogxK
U2 - 10.1587/transinf.2022EDP7080
DO - 10.1587/transinf.2022EDP7080
M3 - Article
AN - SCOPUS:85150457059
SN - 0916-8532
VL - E106D
SP - 170
EP - 180
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
IS - 2
ER -