Utterance classification using linguistic and non-linguistic information for network-based speech-to-speech translation systems

Komei Sugiura, Ryong Lee, Hideki Kashioka, Koji Zettsu, Yutaka Kidawara

Research output: Conference article (peer-reviewed)

Abstract

Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log databases that include speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in its size and in its spatio-temporal information: it contains information on anonymized user locations. This spatio-temporal corpus can be used to improve the accuracy of speech recognition and machine translation, and it opens the door to studying the location dependency of vocabulary and to new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized logistic regression is used for utterance classification. Our experiments on the VoiceTra log corpus showed that the proposed method outperformed baseline methods in terms of F-measure.
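To make the classifier named in the abstract concrete, the following is a minimal sketch of L2-regularized logistic regression scored with the F-measure. It is not the authors' implementation: the abstract does not list the features, so the two binary features (a hypothetical linguistic cue and a hypothetical location cue), the toy data, and the hyperparameters `lam`, `lr`, and `epochs` are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_l2_logreg(X, y, lam=0.01, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The bias term b is left unpenalized, a common convention.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Data gradient is averaged; the L2 term adds lam * w_j.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Threshold the predicted probability at 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def f_measure(y_true, y_pred):
    # Balanced F-measure (F1): harmonic mean of precision and recall.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: [linguistic cue present, location cue present].
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]

w, b = train_l2_logreg(X, y)
y_pred = [predict(w, b, x) for x in X]
score = f_measure(y, y_pred)
```

In this sketch a larger `lam` shrinks the weights more strongly toward zero, which is the usual motivation for L2 regularization when the feature set is large and sparse. The score here is computed on the training data only, so it says nothing about generalization.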

Original language: English
Article number: 6569092
Pages (from-to): 212-216
Number of pages: 5
Journal: Proceedings - IEEE International Conference on Mobile Data Management
2
DOI
Publication status: Published - 2013
Externally published: Yes
Event: 14th International Conference on Mobile Data Management, MDM 2013 - Milan, Italy
Duration: 3 Jun 2013 to 6 Jun 2013

ASJC Scopus subject areas

  • General Engineering
