The Use of Overlapped Sub-Bands in Multi-Band, Multi-SNR, Multi-Path Recognition of Noisy Word Utterances
Abstract
This paper presents an approach to improving the noise robustness of automatic speech recognition within a framework that combines multi-band, multi-SNR, and multi-path techniques. In our word recognizer, the whole frequency band is divided into seven overlapped sub-bands, and noisy sub-band phoneme HMMs are trained on speech data mixed with filtered white Gaussian noise at multiple SNRs. The acoustic model of a word is built as a set of concatenations of clean and noisy sub-band phoneme HMMs arranged in parallel. The Viterbi decoder allows a search path to move to a different SNR condition at a phoneme boundary. The recognition scores of the sub-bands are then recombined to give the score for a word. Experiments show that the overlapped seven-band system yields the best performance under nonstationary ambient noise. They also show that using filtered white Gaussian noise is advantageous for training noisy phoneme HMMs.
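Two steps named in the abstract, mixing noise into speech at a chosen SNR and recombining sub-band scores into a word score, can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the function names `mix_at_snr`, `measured_snr_db`, and `recombine` are hypothetical, the recombination is assumed to be a weighted sum of log scores, and the sub-band filtering of the noise is omitted.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Assumption: SNR is defined over the whole signal as a ratio of mean powers.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR in dB from the clean signal and the noisy mixture."""
    residual = [y - s for s, y in zip(speech, noisy)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


def recombine(sub_band_scores, weights=None):
    """Weighted sum of per-sub-band log scores (equal weights by default).

    Assumption: the paper's recombination rule may differ from this one.
    """
    if weights is None:
        weights = [1.0 / len(sub_band_scores)] * len(sub_band_scores)
    if len(weights) != len(sub_band_scores):
        raise ValueError("one weight per sub-band is required")
    return sum(w * s for w, s in zip(weights, sub_band_scores))


if __name__ == "__main__":
    rng = random.Random(0)
    speech = [math.sin(0.05 * i) for i in range(2000)]   # stand-in for a speech signal
    noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # white Gaussian noise
    for snr in (20, 10, 0):                              # illustrative SNR conditions
        noisy = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, noisy), 3))
    # Hypothetical log scores from seven sub-bands for one candidate word.
    print(recombine([-12.0, -9.5, -11.0, -10.2, -13.1, -9.9, -10.8]))
```

In a recognizer of the kind described, a score like the one returned by `recombine` would be computed for each candidate word and the highest-scoring word selected; the weights would be a natural place to reflect how reliable each sub-band is under the current noise.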
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-06-01
Authors
- TAKAGI Kazuyuki, University of Electro-Communications
- OZEKI Kazuhiko, University of Electro-Communications
- TSUBOI Yutaka, University of Electro-Communications
- IHARA Takehiro, University of Electro-Communications
Related papers
- Effectiveness of Word String Language Models on Noisy Broadcast News Speech Recognition
- The Use of Overlapped Sub-Bands in Multi-Band, Multi-SNR, Multi-Path Recognition of Noisy Word Utterances
- Automatic Adjustment of Subband Likelihood Recombination Weights for Improving Noise-Robustness of a Multi-SNR Multi-Band Speaker Identification System (Speech and Hearing)