TY - JOUR
T1 - Effects of sample size and data augmentation on U-Net-based automatic segmentation of various organs
AU - Nemoto, Takafumi
AU - Futakami, Natsumi
AU - Kunieda, Etsuo
AU - Yagi, Masamichi
AU - Takeda, Atsuya
AU - Akiba, Takeshi
AU - Mutu, Eride
AU - Shigematsu, Naoyuki
N1 - Funding Information:
This study was partially funded by JSPS KAKENHI (Grant Number 20K08034).
Publisher Copyright:
© 2021, Japanese Society of Radiological Technology and Japan Society of Medical Physics.
PY - 2021/9
Y1 - 2021/9
AB - Deep learning has demonstrated high efficacy for automatic segmentation in contour delineation, which is crucial in radiation therapy planning. However, the collection, labeling, and management of medical imaging data can be challenging. This study aims to elucidate the effects of sample size and data augmentation on the automatic segmentation of computed tomography images using U-Net, a deep learning method. For the chest and pelvic regions, 232 and 556 cases are evaluated, respectively. We investigate multiple conditions by changing the sum of the training and validation datasets across a broad range of values: 10–200 and 10–500 cases for the chest and pelvic regions, respectively. A U-Net is constructed, and horizontal-flip data augmentation, which produces left and right inverse images resulting in twice the number of images, is compared with no augmentation for each training session. All lung cases and more than 100 prostate, bladder, and rectum cases indicate that adding horizontal-flip data augmentation is almost as effective as doubling the number of cases. The slope of the Dice similarity coefficient (DSC) in all organs decreases rapidly until approximately 100 cases, stabilizes after 200 cases, and shows minimal changes as the number of cases is increased further. The DSCs stabilize at a smaller sample size with the incorporation of data augmentation in all organs except the heart. This finding is applicable to the automation of radiation therapy for rare cancers, where large datasets may be difficult to obtain.
KW - Automatic segmentation
KW - Data augmentation
KW - Radiation therapy
KW - Sample size
UR - http://www.scopus.com/inward/record.url?scp=85110862234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110862234&partnerID=8YFLogxK
U2 - 10.1007/s12194-021-00630-6
DO - 10.1007/s12194-021-00630-6
M3 - Article
C2 - 34254251
AN - SCOPUS:85110862234
SN - 1865-0333
VL - 14
SP - 318
EP - 327
JO - Radiological Physics and Technology
JF - Radiological Physics and Technology
IS - 3
ER -