Abstract
The computational requirements for training state-of-the-art neural network models on vision tasks keep increasing, because computationally expensive design factors have proven effective at improving quality. Since research in the image-processing field demands many experimental trials, this trend makes it difficult for researchers in computationally restricted environments to test their hypotheses. Convolution with a wide receptive field (a large kernel) is one such computationally expensive factor that improves quality. This study aims to accelerate the training of large-kernel convolutions by resizing both the training images and the convolution filters to a smaller scale. Applying this strategy as a replacement for conventional training at the target scale requires careful training design, and we propose four techniques to improve the quality of the trained models. In our experiments, we apply our proposals to train an image classifier modified from RepLKNet-31B on the CIFAR-10, CIFAR-100, and STL-10 image classification datasets. Our proposed framework trains nearly identical models 4.62-4.91 times faster than standard training at the target spatial scale while maintaining accuracy, and provides 2.61-2.79 times further training acceleration, with more stable accuracy, compared to Progressive Learning. Beyond the training acceleration, our framework can simultaneously train models for multiple scales without any scale-specific tuning, enabling scalable use under computational cost constraints.
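The abstract gives only the high-level idea; the paper's four proposed techniques are not detailed here. As a minimal PyTorch sketch of the core mechanism (resizing both the training images and the large convolution kernels to a common smaller scale), the snippet below uses hypothetical helpers `downscale_batch` and `downscale_kernel` and assumes bilinear interpolation as the resizing method; it is an illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def downscale_batch(images: torch.Tensor, scale: float) -> torch.Tensor:
    """Resize a batch of training images (N, C, H, W) to a smaller spatial scale."""
    return F.interpolate(images, scale_factor=scale,
                         mode="bilinear", align_corners=False)

def downscale_kernel(weight: torch.Tensor, scale: float) -> torch.Tensor:
    """Resize a convolution filter bank (out_ch, in_ch, kH, kW) so its
    receptive field shrinks in proportion to the downscaled images."""
    k = weight.shape[-1]
    new_k = max(1, round(k * scale))
    if new_k % 2 == 0:  # keep the kernel size odd for symmetric padding
        new_k += 1
    return F.interpolate(weight, size=(new_k, new_k),
                         mode="bilinear", align_corners=False)

# Hypothetical example: a 31x31 large-kernel filter bank used at half scale.
weight = torch.randn(64, 64, 31, 31)      # filter bank (out_ch, in_ch, kH, kW)
images = torch.randn(8, 64, 96, 96)       # training batch
w_small = downscale_kernel(weight, 0.5)   # 31 -> 17 (15.5 rounded up to odd)
x_small = downscale_batch(images, 0.5)    # 96x96 -> 48x48
out = F.conv2d(x_small, w_small, padding=w_small.shape[-1] // 2)
```

The point of shrinking the kernel along with the images is that the ratio of receptive field to image size stays roughly constant, so the downscaled training regime remains a stand-in for training at the target scale at a fraction of the compute.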
Original language | English |
---|---|
Pages (from-to) | 161312-161328 |
Number of pages | 17 |
Journal | IEEE Access |
Volume | 12 |
DOI | |
Publication status | Published - 2024 |
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering