JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions

Detai Xin, Shinnosuke Takamichi, Hiroshi Saruwatari

Research output: Contribution to journal > Article > peer-review

1 Citation (Scopus)

Abstract

We present the JNV (Japanese Nonverbal Vocalizations) corpus, a corpus of Japanese nonverbal vocalizations (NVs) with diverse phrases and emotions. Existing Japanese NV corpora either lack phrase diversity or focus on a small number of emotions, which makes it difficult to analyze the characteristics of Japanese NVs and to support downstream tasks such as emotion recognition. We first propose a corpus-design method with two phases: (1) collecting NV phrases through crowdsourcing; (2) recording NVs by stimulating speakers with emotional scenarios. Using the proposed method, we then collect 420 audio clips from 4 speakers, covering 6 emotions. Results of comprehensive objective and subjective experiments demonstrate that (1) the emotions of the collected NVs can be recognized with high accuracy by both human evaluators and statistical models; (2) the collected NVs have high authenticity, comparable to that of previous corpora of English NVs. Additionally, we analyze the distributions of vowel types in Japanese and conduct a feature importance analysis to identify the acoustic features that discriminate between emotion categories in Japanese NVs. We make JNV publicly available to advance further development in this field.

Original language: English
Article number: 103004
Journal: Speech Communication
Volume: 156
DOIs
Publication status: Published - 2024 Jan
Externally published: Yes

Keywords

  • Corpus design
  • Emotion
  • Japanese
  • Nonverbal expression
  • Nonverbal vocalization

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
