Abstract
We present the JNV (Japanese Nonverbal Vocalizations) corpus, a corpus of Japanese nonverbal vocalizations (NVs) with diverse phrases and emotions. Existing Japanese NV corpora either lack phrase diversity or cover only a small number of emotions, which makes it difficult to analyze the characteristics of Japanese NVs and to support downstream tasks such as emotion recognition. We first propose a corpus-design method consisting of two phases: (1) collecting NV phrases through crowdsourcing; (2) recording NVs by stimulating speakers with emotional scenarios. Using the proposed method, we then collect 420 audio clips from 4 speakers that cover 6 emotions. Comprehensive objective and subjective experiments demonstrate that (1) the emotions of the collected NVs can be recognized with high accuracy by both human evaluators and statistical models; (2) the collected NVs have high authenticity, comparable to that of previous corpora of English NVs. Additionally, we analyze the distributions of vowel types in Japanese and conduct a feature importance analysis to identify acoustic features that discriminate between emotion categories in Japanese NVs. We publish JNV to advance further development in this field.
Original language | English |
---|---|
Article number | 103004 |
Journal | Speech Communication |
Volume | 156 |
Publication status | Published - 2024 Jan |
Externally published | Yes |
Keywords
- Corpus design
- Emotion
- Japanese
- Nonverbal expression
- Nonverbal vocalization
ASJC Scopus subject areas
- Software
- Modelling and Simulation
- Communication
- Language and Linguistics
- Linguistics and Language
- Computer Vision and Pattern Recognition
- Computer Science Applications