Comparison of Methods for Topic Classification of Spoken Inquiries
Abstract
In this work, we address the topic classification of spoken inquiries in Japanese received by a speech-oriented guidance system operating in a real environment. The classification of spoken inquiries is often hindered by automatic speech recognition (ASR) errors, the sparseness of features, and the shortness of spontaneous speech utterances. We compare the performance of three supervised learning methods: a support vector machine (SVM) with a radial basis function (RBF) kernel, PrefixSpan boosting (pboost), and the maximum entropy (ME) method. We also combine their predictions using a stacked generalization (SG) scheme, and we evaluate both words and characters as features for the classifiers. Using characters as features is possible in Japanese owing to the presence of kanji, ideograms originating from Chinese characters that represent not only sounds but also meanings. We analyze how well the individual methods and their combination cope with the problems listed above. Experimental results with the SG scheme and character features show an F-measure of 86.87% for the classification of ASR results from children's inquiries, an average improvement of 2.81% over the individual classifiers, and an F-measure of 93.96% for adults' inquiries, an average improvement of 1.89%.
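The difference between word and character features, and the F-measure used to report results, can be illustrated with a small sketch. This is not the authors' implementation: the sample inquiry, its hand-made word segmentation, the topic labels, and the function names are all hypothetical, and the F-measure is assumed to be the balanced F1 score computed for a single topic.

```python
from collections import Counter


def word_features(segmented: str) -> Counter:
    """Bag of words; assumes the utterance is already split on spaces."""
    return Counter(segmented.split())


def char_features(segmented: str, n: int = 1) -> Counter:
    """Bag of character n-grams taken over the unsegmented text."""
    text = segmented.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def f_measure(gold, predicted, topic) -> float:
    """Balanced F-measure (F1) for one topic label (assumed definition)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == topic and p == topic)
    fp = sum(1 for g, p in pairs if g != topic and p == topic)
    fn = sum(1 for g, p in pairs if g == topic and p != topic)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical inquiry ("Where is the library?"), segmented by hand.
utterance = "図書館 は どこ です か"
print(word_features(utterance))   # one count per word
print(char_features(utterance))   # one count per character, kanji included

# Hypothetical gold and predicted topic labels for four inquiries.
gold = ["place", "place", "time", "greeting"]
pred = ["place", "time", "time", "greeting"]
print(f_measure(gold, pred, "time"))
```

With character features, each kanji in 図書館 becomes its own feature, so no word segmentation of the ASR output is required. In a stacked generalization setup, feature vectors like these would feed the individual classifiers, whose predictions a second-level learner then combines.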
Authors
- MATSUI Tomoko, The Institute of Statistical Mathematics
- SARUWATARI Hiroshi, Nara Institute of Science and Technology
- SHIKANO Kiyohiro, Nara Institute of Science and Technology
- KAWANAMI Hiromichi, Nara Institute of Science and Technology
- TORRES Rafael, Nara Institute of Science and Technology
Related papers
- Speaker Recognition without Feature Extraction Process
- Development of real-time audio localization control system (Applied Acoustics)
- EA2010-24 Development of real-time audio localization control system
- Sound reproduction based on multi-channel inverse filtering and WFS
- Building an Effective Speech Corpus by Utilizing Statistical Multidimensional Scaling Method
- Cost Reduction of Acoustic Modeling for Real-Environment Applications Using Unsupervised and Selective Training
- Reducing Computation Time of the Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics(Speech and Hearing)
- Improving Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics in Noisy Environments Using Multi-Template Models(Speech Recognition, Statistical Modeling for Speech Processing)
- Utterance-Based Selective Training for the Automatic Creation of Task-Dependent Acoustic Models(Speech Recognition, Statistical Modeling for Speech Processing)
- Designing Target Cost Function Based on Prosody of Speech Database(Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- Designing Target Cost Function Based on Prosody of Speech Database
- A MAP Estimator for the Enhancement of Speech Signal Separated by ICA Algorithm (International Workshop: Frontiers in Speech and Hearing Research)
- Effect of Central Limit Theorem non-compliance on blind separation of speech by negentropy maximization
- Blind Separation of Speech by Fixed-Point ICA with Source Adaptive Negentropy Approximation(Blind Source Separation, Multi-channel Acoustic Signal Processing)
- Robots that can hear, understand and talk
- Probability Distribution of Time-Series of Speech Spectral Components(Audio/Speech Coding)(Applications and Implementations of Digital Signal Processing)
- A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources
- Evaluation of a 3-D N-Best Search Method Using the Distance between Sound Source Direction Paths for Speech Recognition of Multiple Speakers
- Non-Audible Murmur (NAM) Recognition Exploiting Adaptation Techniques
- An HMM State Duration Control Algorithm Applied to Large-Vocabulary Spontaneous Speech Recognition
- Development and evaluation of pocket-size real-time blind source separation microphone
- Objective sound quality comparison based on higher-order statistics for nonlinear noise reduction methods (Applied Acoustics)
- Objective sound quality evaluation for combination method of beamforming and spectral subtraction (Applied Acoustics)
- Fast Convergence Blind Source Separation Using Frequency Subband Interpolation by Null Beamforming
- Rapid Compensation of Temperature Fluctuation Effect for Multichannel Sound Field Reproduction System
- Development, Long-Term Operation and Portability of a Real-Environment Speech-Oriented Guidance System
- Interface for Barge-in Free Spoken Dialogue System Using Nullspace Based Sound Field Control and Beam forming (Speech/Audio Processing, Multidimensional Signal Processing and Its Application)
- On-Line Relaxation Algorithm Applicable to Acoustic Fluctuation for Inverse Filter in Multichannel Sound Reproduction System(Sound Field Reproduction, Multi-channel Acoustic Signal Processing)
- A Study on Clustering of Training Speakers in Unsupervised Speaker Adaptation Based on Sufficient Statistics Using Multiple Models
- Iterative Inverse Filter Relaxation Algorithm for Adaptation to Acoustic Fluctuation in Sound Reproduction System
- Sound Reproduction System Including Adaptive Compensation of Temperature Fluctuation Effect for Broad-Band Sound Control(Special Section on Digital Signal Processing)
- A Self-Generator Method for Initial Filters of SIMO-ICA Applied to Blind Separation of Binaural Sound Mixtures(Blind Source Separation, Multi-channel Acoustic Signal Processing)
- Multistage SIMO-Model-Based Blind Source Separation Combining Frequency-Domain ICA and Time-Domain ICA(Adaptive Signal Processing and Its Applications)
- Evaluation of Extremely Small Sound Source Signals Used in Speaking-Aid System with Statistical Voice Conversion
- Improvements of the One-to-Many Eigenvoice Conversion System
- Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
- Adaptive Training for Voice Conversion Based on Eigenvoices
- Blind Separation and Deconvolution for Convolutive Mixture of Speech Combining SIMO-Model-Based ICA and Multichannel Inverse Filtering(Engineering Acoustics)
- High-Fidelity Blind Separation of Acoustic Signals Using SIMO-Model-Based Independent Component Analysis(Engineering Acoustics)
- A Speech Dialogue System with Multimodal Interface for Telephone Directory Assistance
- Overdetermined Blind Separation for Real Convolutive Mixtures of Speech Based on Multistage ICA Using Subarray Processing(Speech/Acoustic Signal Processing)(Digital Signal Processing)
- Stable Learning Algorithm for Blind Separation of Temporally Correlated Acoustic Signals Combining Multistage ICA and Linear Prediction(Digital Signal Processing)
- Blind Source Separation of Acoustic Signals Based on Multistage ICA Combining Frequency-Domain ICA and Time-Domain ICA
- Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing
- An Iterative Inverse Filter Design Method for the Multichannel Sound Field Reproduction System(Special Section on Acoustic Signal Processing)
- Sound Field Reproduction by Wavefront Synthesis Using Directly Aligned Multi Point Control
- Theoretical Analysis of Amounts of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array
- Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator
- Comparison of Methods for Topic Classification of Spoken Inquiries
- Evaluation Framework Design of Spoken Term Detection Study at the NTCIR-9 IR for Spoken Documents Task
- Robust Sound Field Reproduction against Listeners Movement Utilizing Image Sensor