TY - GEN
T1 - Acceleration of deep recurrent neural networks with an FPGA cluster
AU - Sun, Yuxi
AU - Ben Ahmed, Akram
AU - Amano, Hideharu
N1 - Funding Information:
This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO).
Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019/6/6
Y1 - 2019/6/6
N2 - In this paper, we propose an acceleration methodology for deep recurrent neural networks (RNNs) implemented on a multi-FPGA platform called Flow-in-Cloud (FiC). RNNs have been proven effective for modeling temporal sequences, such as human speech and written text. However, the implementation of RNNs on traditional hardware is inefficient due to their long-range dependencies and irregular computation patterns. This inefficiency manifests itself in the proportional increase of run time with respect to the number of layers of deep RNNs when running on traditional hardware platforms such as CPUs. Previous works have mostly focused on the optimization of a single RNN cell. In this work, we take advantage of the multi-FPGA system to demonstrate that we can reduce the run time of deep RNNs from O(k) to O(1).
AB - In this paper, we propose an acceleration methodology for deep recurrent neural networks (RNNs) implemented on a multi-FPGA platform called Flow-in-Cloud (FiC). RNNs have been proven effective for modeling temporal sequences, such as human speech and written text. However, the implementation of RNNs on traditional hardware is inefficient due to their long-range dependencies and irregular computation patterns. This inefficiency manifests itself in the proportional increase of run time with respect to the number of layers of deep RNNs when running on traditional hardware platforms such as CPUs. Previous works have mostly focused on the optimization of a single RNN cell. In this work, we take advantage of the multi-FPGA system to demonstrate that we can reduce the run time of deep RNNs from O(k) to O(1).
KW - Acceleration
KW - FPGAs
KW - Recurrent Neural Networks
UR - http://www.scopus.com/inward/record.url?scp=85070566081&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070566081&partnerID=8YFLogxK
U2 - 10.1145/3337801.3337804
DO - 10.1145/3337801.3337804
M3 - Conference contribution
AN - SCOPUS:85070566081
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019
PB - Association for Computing Machinery
T2 - 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2019
Y2 - 6 June 2019 through 7 June 2019
ER -