TY - GEN
T1 - Robot-directed speech detection using multimodal semantic confidence based on speech, image, and motion
AU - Zuo, Xiang
AU - Iwahashi, Naoto
AU - Taguchi, Ryo
AU - Matsuda, Shigeki
AU - Sugiura, Komei
AU - Funakoshi, Kotaro
AU - Nakano, Mikio
AU - Oka, Natsuki
PY - 2010
Y1 - 2010
AB - In this paper, we propose a novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure. The MSC measure is used to decide whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression. Experimental results show that, compared with a baseline method that uses speech confidence only, MSC achieved an absolute increase of 5% for clean speech and 12% for noisy speech in terms of average maximum F-measure.
KW - Human-robot interaction
KW - Multimodal semantic confidence
KW - Robot-directed speech detection
UR - http://www.scopus.com/inward/record.url?scp=78049359510&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78049359510&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2010.5494889
DO - 10.1109/ICASSP.2010.5494889
M3 - Conference contribution
AN - SCOPUS:78049359510
SN - 9781424442966
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 2458
EP - 2461
BT - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
Y2 - 14 March 2010 through 19 March 2010
ER -