TY - JOUR
T1 - Utterance classification using linguistic and non-linguistic information for network-based speech-to-speech translation systems
AU - Sugiura, Komei
AU - Lee, Ryong
AU - Kashioka, Hideki
AU - Zettsu, Koji
AU - Kidawara, Yutaka
PY - 2013
Y1 - 2013
N2 - Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log databases that include speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in its size and in its spatio-temporal information: it contains information on anonymized user locations. This spatio-temporal corpus can be used to improve the accuracy of speech recognition and machine translation, and it will open the door to the study of the location dependency of vocabulary and to new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized logistic regression is used for utterance classification. Experiments performed on the VoiceTra log corpus showed that the proposed method outperformed baseline methods in terms of F-measure.
AB - Network-based mobile services, such as speech-to-speech translation and voice search, enable the construction of large-scale log databases that include speech. We have developed a smartphone application called VoiceTra for speech-to-speech translation and have collected 10,000,000 utterances so far. This huge corpus is unique in its size and in its spatio-temporal information: it contains information on anonymized user locations. This spatio-temporal corpus can be used to improve the accuracy of speech recognition and machine translation, and it will open the door to the study of the location dependency of vocabulary and to new applications for location-based services. This paper first analyzes the corpus and then presents a novel method for classifying utterances using linguistic and non-linguistic information. L2-regularized logistic regression is used for utterance classification. Experiments performed on the VoiceTra log corpus showed that the proposed method outperformed baseline methods in terms of F-measure.
KW - GIS
KW - smartphone
KW - speech-to-speech translation
UR - http://www.scopus.com/inward/record.url?scp=84883535622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883535622&partnerID=8YFLogxK
U2 - 10.1109/MDM.2013.96
DO - 10.1109/MDM.2013.96
M3 - Conference article
AN - SCOPUS:84883535622
SN - 1551-6245
VL - 2
SP - 212
EP - 216
JO - Proceedings - IEEE International Conference on Mobile Data Management
JF - Proceedings - IEEE International Conference on Mobile Data Management
M1 - 6569092
T2 - 14th International Conference on Mobile Data Management, MDM 2013
Y2 - 3 June 2013 through 6 June 2013
ER -