Detecting Robot-Directed Speech by Situated Understanding in Physical Interaction
Abstract
In this paper, we propose a novel method for a robot to detect robot-directed speech, i.e., to distinguish speech that users address to a robot from speech that users address to other people or to themselves. The originality of this work is the introduction of a multimodal semantic confidence (MSC) measure, which is used for domain classification of input speech based on whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. The measure is calculated by integrating speech, object, and motion confidences with weightings optimized by logistic regression. We then combine this measure with gaze tracking and conduct experiments under natural human-robot interaction conditions. Experimental results show that the proposed method achieves average recall and precision rates of 94% and 96%, respectively, for robot-directed speech detection.
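The abstract describes the MSC measure as a logistic-regression combination of speech, object, and motion confidences. A minimal sketch of such a combination is shown below; the log-feature transform, weight values, and decision threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def multimodal_semantic_confidence(c_speech, c_object, c_motion, weights, bias):
    """Combine per-modality confidences into one MSC score with a
    logistic (sigmoid) model, as the abstract describes. The log-feature
    transform, weights, and bias here are illustrative assumptions."""
    features = np.log(np.array([c_speech, c_object, c_motion]))
    z = float(np.dot(weights, features) + bias)
    return 1.0 / (1.0 + np.exp(-z))  # MSC in (0, 1)

# Hypothetical usage: treat an utterance as robot-directed when the MSC
# exceeds a threshold (0.5 here, purely illustrative). In the paper the
# weightings are optimized by logistic regression on labeled data.
weights = np.array([0.8, 1.2, 1.0])
bias = 0.3
msc = multimodal_semantic_confidence(0.9, 0.7, 0.8, weights, bias)
is_robot_directed = msc > 0.5
```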
Authors
- Sugiura Komei (National Institute of Information and Communications Technology)
- Matsuda Shigeki (National Institute of Information and Communications Technology, Keihanna Science City)
- Funakoshi Kotaro (Honda Research Institute Japan Co., Ltd.)
- Nakano Mikio (Honda Research Institute Japan Co., Ltd.)
- Taguchi Ryo (Advanced Telecommunication Research Labs and Nagoya Institute of Technology)
- Iwahashi Naoto (Advanced Telecommunication Research Labs and National Institute of Information and Communications Technology)
- Oka Natsuki (Kyoto Institute of Technology)
- Zuo Xiang (Advanced Telecommunication Research Labs and Kyoto Institute of Technology)
Related Papers
- Learning, Generation and Recognition of Motions by Reference-Point-Dependent Probabilistic Models
- A Method for Predicting Stressed Words in Teaching Materials for English Jazz Chants
- Automatic Allocation of Training Data for Speech Understanding Based on Multiple Model Combinations
- CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments
- Collecting Colloquial and Spontaneous-like Sentences from Web Resources for Constructing Chinese Language Models of Speech Recognition
- Situated Spoken Dialogue with Robots Using Active Learning