TY - JOUR
T1 - Identifying neurocognitive disorder using vector representation of free conversation
AU - Horigome, Toshiro
AU - Hino, Kimihiro
AU - Toyoshiba, Hiroyoshi
AU - Shindo, Norihisa
AU - Funaki, Kei
AU - Eguchi, Yoko
AU - Kitazawa, Momoko
AU - Fujita, Takanori
AU - Mimura, Masaru
AU - Kishimoto, Taishiro
N1 - Funding Information:
This research is supported by the Japan Agency for Medical Research and Development (AMED) under Grant Number JP18he1102004.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
AB - In recent years, studies on the use of natural language processing (NLP) approaches to identify dementia have been reported. Most of these studies used picture description tasks or other similar tasks to encourage spontaneous speech, but free conversation, which requires no task, might be easier to perform in a clinical setting. Moreover, free conversation is unlikely to induce a learning effect. Therefore, the purpose of this study was to develop a machine learning model to discriminate subjects with and without dementia by extracting features from unstructured free conversation data using NLP. We recruited healthy volunteers and patients who visited a specialized outpatient clinic for dementia. Participants’ conversations were transcribed, and the text data were decomposed from natural sentences into morphemes by morphological analysis and then converted into real-valued vectors that were used as features for machine learning. A total of 432 datasets were used, and the resulting machine learning model classified the data for dementia and non-dementia subjects with an accuracy of 0.900, a sensitivity of 0.881, and a specificity of 0.916. Using sentence vector information, it was possible to develop a machine learning algorithm capable of discriminating dementia from non-dementia subjects with high accuracy based on free conversation.
UR - http://www.scopus.com/inward/record.url?scp=85135259763&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135259763&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-16204-4
DO - 10.1038/s41598-022-16204-4
M3 - Article
C2 - 35922457
AN - SCOPUS:85135259763
SN - 2045-2322
VL - 12
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 12461
ER -