TY - GEN
T1 - Accelerating ODE-Based Neural Networks on Low-Cost FPGAs
AU - Watanabe, Hirohisa
AU - Matsutani, Hiroki
N1 - Funding Information:
This work was partially supported by JSPS KAKENHI Grant Number 19H04117, Japan.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/6
Y1 - 2021/6
N2 - ODENet is a deep neural network architecture in which the stacked structure of ResNet is implemented with an ordinary differential equation (ODE) solver. It can reduce the number of parameters and, by selecting a proper solver, strike a balance between accuracy and performance. It can also improve accuracy while keeping the same number of parameters on resource-limited edge devices. In this paper, using the Euler method as the ODE solver, a part of ODENet is implemented as dedicated logic on a low-cost FPGA (Field-Programmable Gate Array) board such as the PYNQ-Z2. As ODENet variants, reduced ODENets (rODENets) are proposed and analyzed for low-cost FPGA implementation; each variant heavily uses a part of the ODENet layers and reduces or eliminates some layers differently. They are evaluated in terms of parameter size, accuracy, execution time, and resource utilization on the FPGA. The results show that the overall execution time of an rODENet variant is improved by up to 2.66 times compared to a pure software execution while keeping accuracy comparable to that of the original ODENet.
AB - ODENet is a deep neural network architecture in which the stacked structure of ResNet is implemented with an ordinary differential equation (ODE) solver. It can reduce the number of parameters and, by selecting a proper solver, strike a balance between accuracy and performance. It can also improve accuracy while keeping the same number of parameters on resource-limited edge devices. In this paper, using the Euler method as the ODE solver, a part of ODENet is implemented as dedicated logic on a low-cost FPGA (Field-Programmable Gate Array) board such as the PYNQ-Z2. As ODENet variants, reduced ODENets (rODENets) are proposed and analyzed for low-cost FPGA implementation; each variant heavily uses a part of the ODENet layers and reduces or eliminates some layers differently. They are evaluated in terms of parameter size, accuracy, execution time, and resource utilization on the FPGA. The results show that the overall execution time of an rODENet variant is improved by up to 2.66 times compared to a pure software execution while keeping accuracy comparable to that of the original ODENet.
KW - CNN
KW - FPGA
KW - Neural ODE
KW - Neural network
KW - ODE
UR - http://www.scopus.com/inward/record.url?scp=85112083662&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112083662&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW52791.2021.00021
DO - 10.1109/IPDPSW52791.2021.00021
M3 - Conference contribution
AN - SCOPUS:85112083662
T3 - 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021
SP - 88
EP - 95
BT - 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021
Y2 - 17 May 2021
ER -