TY - JOUR
T1 - Understanding psychiatric illness through natural language processing (UNDERPIN)
T2 - Rationale, design, and methodology
AU - the UNDERPIN Collaborators
AU - Kishimoto, Taishiro
AU - Nakamura, Hironobu
AU - Kano, Yoshinobu
AU - Eguchi, Yoko
AU - Kitazawa, Momoko
AU - Liang, Kuo Ching
AU - Kudo, Koki
AU - Sento, Ayako
AU - Takamiya, Akihiro
AU - Horigome, Toshiro
AU - Yamasaki, Toshihiko
AU - Sunami, Yuki
AU - Kikuchi, Toshiaki
AU - Nakajima, Kazuki
AU - Tomita, Masayuki
AU - Bun, Shogyoku
AU - Momota, Yuki
AU - Sawada, Kyosuke
AU - Murakami, Junichi
AU - Takahashi, Hidehiko
AU - Mimura, Masaru
N1 - Funding Information:
This research was supported by the Japan Science and Technology Agency CREST under Grant Numbers JPMJCR1684 and JPMJCR19F4. The first grant was awarded in 2016 and ended in 2018, the second was awarded in 2019 and ended in 2021, and the third was awarded in 2022 and will end in 2024. The funding source did not participate in the design of this study and will not be involved in the study's execution, analyses, or submission of results.
Funding Information:
We gratefully acknowledge the UNDERPIN collaborators: Yuki Tazawa, Yuki Ito, Yuriko Kaise, Sayaka Hanashiro, Yoshitaka Yamaoka, Noriko Maegaichi, Kaori Okubo, Kiko Shiga, Sakura Takeuchi, Shimpei Isa, Kelley Cortright (Keio University), Akiko Goto (Tsurugaoka Garden Hospital), Yoshino Humihiro (Tsurugaoka Garden Hospital), Nobuya Ishida (Biwako Hospital), Yuka Oba (Sato Hospital).
Publisher Copyright:
Copyright © 2022 Kishimoto, Nakamura, Kano, Eguchi, Kitazawa, Liang, Kudo, Sento, Takamiya, Horigome, Yamasaki, Sunami, Kikuchi, Nakajima, Tomita, Bun, Momota, Sawada, Murakami, Takahashi and Mimura.
PY - 2022/12/1
Y1 - 2022/12/1
N2 - Introduction: Psychiatric disorders are diagnosed through observations of psychiatrists according to diagnostic criteria such as the DSM-5. Such observations, however, are mainly based on each psychiatrist's level of experience and often lack objectivity, potentially leading to disagreements among psychiatrists. In contrast, specific linguistic features can be observed in some psychiatric disorders, such as a loosening of associations in schizophrenia. Some studies have explored biomarkers, but biomarkers have yet to be used in clinical practice. Aim: The purposes of this study are to create a large dataset of Japanese speech data labeled with detailed information on psychiatric disorders and neurocognitive disorders, to quantify the linguistic features of those disorders using natural language processing, and, finally, to develop objective and easy-to-use biomarkers for diagnosing these disorders and assessing their severity. Methods: This study will have a multi-center prospective design. The DSM-5 or ICD-11 criteria for major depressive disorder, bipolar disorder, schizophrenia, and anxiety disorder and for major and minor neurocognitive disorders will be regarded as the inclusion criteria for the psychiatric disorder samples. For the healthy subjects, the absence of a history of psychiatric disorders will be confirmed using the Mini-International Neuropsychiatric Interview (M.I.N.I.). The absence of current cognitive decline will be confirmed using the Mini-Mental State Examination (MMSE). A psychiatrist or psychologist will conduct 30-to-60-min interviews with each participant; these interviews will include free conversation, a picture-description task, and a story-telling task, all of which will be recorded using a microphone headset. In addition, the severity of disorders will be assessed using clinical rating scales. Data will be collected from each participant at least twice during the study period and up to a maximum of five times, at intervals of at least one month.
Discussion: This study is unique in its large sample size and the novelty of its method, and has potential for applications in many fields. Challenges remain regarding inter-rater reliability and the linguistic peculiarities of Japanese. As of September 2022, we have collected a total of >1000 records from >400 participants. To the best of our knowledge, this data sample is one of the largest in this field. Clinical Trial Registration: Identifier: UMIN000032141.
KW - biomarker
KW - language
KW - machine learning
KW - natural language processing (computer science)
KW - neurocognitive disorders
KW - psychiatric disorders
UR - http://www.scopus.com/inward/record.url?scp=85144012376&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144012376&partnerID=8YFLogxK
U2 - 10.3389/fpsyt.2022.954703
DO - 10.3389/fpsyt.2022.954703
M3 - Article
AN - SCOPUS:85144012376
SN - 1664-0640
VL - 13
JO - Frontiers in Psychiatry
JF - Frontiers in Psychiatry
M1 - 954703
ER -