Filter Bank Subtraction for Robust Speech Recognition (Special Issue on Speech Information Processing)
Abstract
In this paper, we propose a new filter bank subtraction technique for robust speech recognition under various acoustic conditions. Spectral subtraction is a simple and useful technique for reducing the influence of additive noise. Conventional spectral subtraction assumes that the noise spectrum is estimated accurately and that speech and noise are uncorrelated. These assumptions, however, are rarely satisfied in practice, which degrades speech recognition accuracy. Moreover, conventional methods yield only a slight improvement in recognition when the input SNR changes sharply. We propose a new method in which the output values of the filter banks are used for noise estimation and subtraction. By estimating noise at each filter bank, instead of at each frequency point, the method reduces the need for precise noise estimation. We also take into account the expected phase differences between the speech and noise spectra in the subtraction, and we control the subtraction coefficient theoretically. Recognition experiments on test sets at several SNRs showed that the filter bank subtraction technique improved word accuracy significantly and outperformed conventional spectral subtraction on all the test sets. In further experiments on recognizing speech from TV news field reports with environmental noise, the proposed subtraction method yielded better results than the conventional method.
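The general idea of subtracting noise at the filter bank outputs rather than at each frequency bin can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the filter shapes, the noise estimator, or the theoretically derived subtraction coefficient, so the linearly spaced triangular filters, the noise estimate from the leading frames, and the constants `alpha` and `floor` below are all assumptions chosen for illustration.

```python
import numpy as np

def triangular_filter_bank(n_bands, n_bins):
    """Linearly spaced triangular filters over n_bins spectral bins (assumed layout)."""
    edges = np.linspace(0, n_bins - 1, n_bands + 2)
    bins = np.arange(n_bins)
    bank = np.zeros((n_bands, n_bins))
    for b in range(n_bands):
        left, center, right = edges[b], edges[b + 1], edges[b + 2]
        rising = (bins - left) / (center - left)
        falling = (right - bins) / (right - center)
        # Triangle: minimum of the two slopes, clipped at zero outside the band.
        bank[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def filter_bank_subtract(power_spec, bank, n_noise_frames=10, alpha=1.0, floor=0.01):
    """Subtract a per-band noise estimate from filter bank energies.

    power_spec: array of shape (frames, bins) holding power spectra.
    The first n_noise_frames frames are assumed to contain noise only.
    alpha is a placeholder for the subtraction coefficient; the paper
    derives its coefficient theoretically, which is not reproduced here.
    """
    band_energy = power_spec @ bank.T                  # (frames, bands)
    noise = band_energy[:n_noise_frames].mean(axis=0)  # one estimate per band
    cleaned = band_energy - alpha * noise
    # Floor the result so band energies stay positive for later log compression.
    return np.maximum(cleaned, floor * band_energy)
```

Because each band pools many frequency bins, the per-band noise estimate averages out bin-level fluctuations, which is the intuition behind the abstract's claim that precise noise estimation becomes less critical.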
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2003-03-01
Authors
-
Imai Toru
NHK Science and Technical Research Laboratories
-
Imai T
NHK Science and Technical Research Laboratories
-
Ando A
Science and Technical Research Laboratories
-
ONOE Kazuo
NHK Science and Technical Research Laboratories
-
SATO Shoei
NHK Science and Technical Research Laboratories
-
ONOE Kazuo
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
SEGI Hiroyuki
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
KOBAYAKAWA Takeshi
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
SATO Shoei
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
HOMMA Shinichi
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
IMAI Toru
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
ANDO Akio
Science and Technical Research Laboratories, NHK Japan Broadcasting Corporation
-
Segi Hiroyuki
Science and Technical Research Laboratories
-
Homma Shinichi
NHK Science and Technical Research Laboratories
-
Kobayashi Tetsunori
Waseda Univ. Tokyo Jpn
-
Kobayakawa Takeshi
Science and Technical Research Laboratories
-
Ando Akio
Science and Technical Research Laboratories
Related papers
- Robust Speech Recognition by Using Compensated Acoustic Scores(Speech Recognition, Statistical Modeling for Speech Processing)
- Ears of the Robot : Direction of Arrival Estimation Based on Pattern Recognition Using Robot-Mounted Microphones
- Mutual Information Based Dynamic Integration of Multiple Feature Streams for Robust Real-Time LVCSR
- Bi-Spectral Acoustic Features for Robust Speech Recognition
- Online Speech Detection and Dual-Gender Speech Recognition for Captioning Broadcast News(Speech and Hearing)
- Word Error Rate Minimization Using an Integrated Confidence Measure(Speech and Hearing)
- Filter Bank Subtraction for Robust Speech Recognition (Special Issue on Speech Information Processing)
- Simultaneous Subtitling System for Broadcast News Programs with a Speech Recognizer(Special Issue on the 2001 IEICE Excellent Paper Award)
- Acoustic Model Adaptation by Selective Training Using 2-Stage Clustering
- An HMM learning algorithm for minimizing an error function on all training data
- Ears of the Robot : Three Simultaneous Speech Segregation and Recognition Using Robot-Mounted Microphones(Speech and Hearing)
- Genetic Algorithm Based Optimization of Partly-Hidden Markov Model Structure Using Discriminative Criterion(Speech Recognition, Statistical Modeling for Speech Processing)
- Learning Speech Variability in Discriminative Acoustic Model Adaptation
- Language Models for Continuous Speech Recognition (連続発話認識のための言語モデル)
- Spectral Features for Perceptually Natural Phoneme Replacement by Another Speaker's Speech
- Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
- Decoder for Japanese broadcast news transcription