Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems
Abstract
If a dialog system can respond to the user as reasonably as a human does, the interaction becomes smoother. The timing of responses such as back-channels and turn-taking plays an important role in smooth dialog, just as it does in human-human interaction. We developed a response timing generator for such a dialog system. The generator uses a decision tree to detect response timing from prosodic and linguistic features, and it decides the system's action every 100 ms during the user's pause. In this paper, we describe a robust spoken dialog system that uses the timing generator. A subjective evaluation showed that almost all of the subjects experienced a friendly feeling from the system.
- Paper of the Japanese Society for Artificial Intelligence
- 2005-11-01
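
The polling scheme in the abstract (a decision tree consulted every 100 ms during the user's pause) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature names (`f0_slope`, `ends_with_particle`), the action labels, and every threshold are hypothetical, and the hand-written `decide` function only stands in for a tree that would be trained on dialog data. Only the 100 ms interval comes from the abstract.

```python
from dataclasses import dataclass

POLL_INTERVAL_MS = 100  # from the abstract: one decision every 100 ms of pause


@dataclass
class PauseFeatures:
    # All fields are hypothetical stand-ins for "prosodic and linguistic information".
    pause_ms: int             # how long the user has been silent so far
    f0_slope: float           # pitch slope at the end of the utterance (negative = falling)
    ends_with_particle: bool  # linguistic cue: utterance ends with a sentence-final particle


def decide(f: PauseFeatures) -> str:
    """Hand-written stand-in for a trained decision tree; thresholds are invented."""
    if f.ends_with_particle:
        # Falling pitch plus a long enough pause is treated as the end of a turn.
        if f.f0_slope < 0.0 and f.pause_ms >= 300:
            return "take_turn"
        return "wait"
    # No final particle: flat or rising pitch invites a short back-channel.
    if f.f0_slope >= 0.0 and f.pause_ms >= 200:
        return "backchannel"
    return "wait"


def poll_pause(f0_slope: float, ends_with_particle: bool, pause_length_ms: int):
    """Consult the tree every POLL_INTERVAL_MS until it acts or the pause ends."""
    for elapsed in range(POLL_INTERVAL_MS, pause_length_ms + 1, POLL_INTERVAL_MS):
        action = decide(PauseFeatures(elapsed, f0_slope, ends_with_particle))
        if action != "wait":
            return elapsed, action
    return pause_length_ms, "wait"
```

With these invented thresholds, `poll_pause(-1.5, True, 1000)` returns `(300, "take_turn")`: the tree answers "wait" at 100 ms and 200 ms and takes the turn at the third poll.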
Authors
- KITAOKA Norihide (Nagoya University; Toyohashi University of Technology)
- NAKAGAWA Seiichi (Toyohashi University of Technology, Toyohashi, Japan)
- TAKEUCHI Masashi (Toyohashi University of Technology)
- NISHIMURA Ryota (Toyohashi University of Technology, Aichi, Japan)
Related papers
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- TEXT-INDEPENDENT SPEAKER IDENTIFICATION ON TIMIT DATABASE
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition (Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition
- Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN
- Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs
- A Survey on Automatic Speech Recognition (Special Issue on the 2000 IEICE Excellent Paper Award)
- Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions
- Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
- Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems
- INVESTIGATIONS ON TEXT-INDEPENDENT SPEAKER IDENTIFICATION
- Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
- Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition