Predicting the Degradation of Speech Recognition Performance from Sub-band Dynamic Ranges (Special Issue: Spoken Language Information Processing and Its Applications)
Abstract
An acoustic measure for predicting the degradation of speech recognition performance due to noise contamination is developed. The merits of the proposed measure over conventional SNR are that 1) the measure does not require the original clean signal as a reference, 2) the measure takes the spectral shape of the noise into account, and 3) the measure can be used to predict recognition performance directly. The basic idea of the measure is to use the dynamic range of each sub-band signal as an estimate of its SNR and to predict the degradation of recognition performance by taking the product of the recognition accuracies of the sub-bands. The proposed measure is evaluated experimentally using white Gaussian noise and human-speech-like noise (HSN). In the experiment, the correlations between the predicted and the actual recognition accuracies are 0.96 and 0.99 for white noise and HSN, respectively. The results confirm the effectiveness of the proposed measure.
- Paper published by the Information Processing Society of Japan (IPSJ)
- 2002-07-15
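The idea in the abstract (sub-band dynamic range as an SNR estimate, then a product of per-band accuracies) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform FFT band split, the percentile-based definition of dynamic range, the function names, and in particular the logistic mapping `band_accuracy` with its `midpoint` and `slope` parameters, which stands in for whatever dynamic-range-to-accuracy relation the authors actually fitted.

```python
import numpy as np

def subband_dynamic_ranges(signal, n_bands=4, frame_len=256, hop=128,
                           lo_pct=5, hi_pct=95):
    """Dynamic range (dB) of frame log-energies in each sub-band.

    Assumption: dynamic range = gap between a high and a low percentile
    of the per-frame band energy; the paper may define it differently.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)
    ranges = []
    for band in np.array_split(power, n_bands, axis=1):  # uniform bands
        log_e = 10.0 * np.log10(band.sum(axis=1) + 1e-12)
        ranges.append(np.percentile(log_e, hi_pct) - np.percentile(log_e, lo_pct))
    return np.array(ranges)

def band_accuracy(dr_db, midpoint=15.0, slope=0.3):
    """Hypothetical logistic map from dynamic range (dB) to band accuracy."""
    return 1.0 / (1.0 + np.exp(-slope * (dr_db - midpoint)))

def predict_accuracy(signal, **kwargs):
    """Predicted accuracy as the product of per-band accuracies."""
    return float(np.prod(band_accuracy(subband_dynamic_ranges(signal, **kwargs))))
```

Because only the observed signal enters `predict_accuracy`, no clean reference is needed, and noise concentrated in one band lowers only that band's factor in the product, which is how the noise's spectral shape is taken into account.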
Authors
- Takeda K, Nagoya Univ. Nagoya Jpn
- Takeda Kazuya, Graduate School Of Information Science Nagoya University
- Takeda Kazuya, Department Of Information Electronics Graduate School Of Engineering Nagoya University
- Takeda K, Center For Integrated Acoustic Information Research Graduate School Of Engineering Nagoya University
- ITAKURA Fumitada, Graduate School of Information Engineering, Meijo University
- Kondo M, Graduate School Of Engineering Nagoya University
- Itakura F, Graduate School Of Information Engineering Meijo University
- KONDO MASATO, Graduate School of Engineering, Nagoya University
- Itakura Fumitada, Graduate School Of Information Engineering Meijo University
- Takeda Kazuya, Graduate School Of Information Science At Nagoya University
- Itakura Fumitada, Graduate School of Engineering, Nagoya University:Center for Integrated Acoustic Information Research, Nagoya University
- Takeda Kazuya, Graduate School of Engineering, Nagoya University:Center for Integrated Acoustic Information Research, Nagoya University
- TAKEDA Kazuya, Graduate School of Engineering, Nagoya University
- Itakura Fumitada, Graduate School of Engineering, Nagoya University/CIAIR
Related papers
- Basic experiments on a gas divertor using magnetized sheet plasma
- AN INTEGRATED AUDIO-VISUAL VIEWER FOR A LARGE SCALE MULTIPOINT CAMERAS AND MICROPHONES(International Workshop on Advanced Image Technology 2007)
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Driver Identification Using Driving Behavior Signals(Human-computer Interaction)
- AN INTEGRATED AUDIO-VISUAL VIEWER FOR A LARGE SCALE MULTIPOINT CAMERAS AND MICROPHONES
- G_007 Arbitrary Listening-point Generation Using Acoustic Transfer Function Interpolation in A Large Microphone Array
- THE SUB-BAND SOUND WAVE RAY-SPACE REPRESENTATION(International Workshop on Advanced Image Technology 2006)
- A-16-24 3D Sound Wave Field Representation Based on Ray-Space Method(A-16. Multimedia and Virtual Environment Fundamentals, Fundamentals and Boundaries)
- IMPROVEMENT OF CHOLEDOCHOSCOPY : CHROMOENDOCHOLEDOCHOSCOPY, AUTOFLUORESCENCE IMAGING, OR NARROW-BAND IMAGING
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Selective Listening Point Audio Based on Blind Signal Separation and Stereophonic Technology
- Head-Related Transfer Function measurement in sagittal and frontal coordinates
- CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments(Speech and Hearing)
- Evaluation of HRTFs estimated using physical features
- MC-32 Development of microdrive assembly process
- Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- Multichannel Speech Enhancement Based on Generalized Gamma Prior Distribution with Its Online Adaptive Estimation
- SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements (6th Spoken Language Symposium)
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Driver's irritation detection using speech recognition results (Speech, 10th Spoken Language Symposium)
- Driver's irritation detection using speech recognition results (Spoken Language Information Processing)
- Driver's irritation detection using speech recognition results (Natural Language Understanding and Communication, 10th Spoken Language Symposium)
- Estimation of the frequency components contained in sub-bands based on instantaneous frequency
- Lack of Interaction Between Cefdinir and Calcium Polycarbophil : In vitro and In vivo Studies
- Predicting the Degradation of Speech Recognition Performance from Sub-band Dynamic Ranges (Special Issue: Spoken Language Information Processing and Its Applications)
- A model of perceptual distance for group delays based on ellipsoidal mapping
- The effect of group delay spectrum on timbre
- Direction of Arrival Estimation Using Nonlinear Microphone Array
- Speech Enhancement Using Nonlinear Microphone Array Based on Noise Adaptive Complementary Beamforming
- Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming (Special Section on Digital Signal Processing)
- Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis
- An Acoustically Oriented Vocal-Tract Model
- Comparison of acoustic measures for evaluating speech recognition performance in an automobile
- Estimation of speaker and listener positions in a car using binaural signals
- Sound localization under conditions of covered ears on the horizontal plane
- Single-Channel Multiple Regression for In-Car Speech Enhancement
- Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition(Speech Enhancement, Multi-channel Acoustic Signal Processing)
- Speech Recognition Using Finger Tapping Timings(Speech and Hearing)
- CIAIR In-Car Speech Corpus : Influence of Driving Status(Corpus-Based Speech Technologies)
- Construction and Evaluation of a Large In-Car Speech Corpus(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Blind Source Separation Using Dodecahedral Microphone Array under Reverberant Conditions
- FOREWORD : Special Section on Robust Speech Processing in Realistic Environments
- Method for determining sound localization by auditory masking
- On the use of two-mass vocal cord model in characterizing the stress speech (Speech)
- Classification of speech under stress by physical modeling
- Particle Size Distribution Measurement of Free-Falling Fine Particles in a Dusty Plasma Experiment
- Classification of speech under stress using physical features based on two-mass model
- Relaxation behavior of laser-peening residual stress under tensile loading investigated by X-ray and neutron diffraction