Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
スポンサーリンク
概要
- 論文の詳細を見る
We propose a blind dereverberation method based on spectral subtraction using a multi-channel least mean squares (MCLMS) algorithm for distant-talking speech recognition. In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. By treating the late reverberation as additive noise, a noise reduction technique based on spectral subtraction was proposed to estimate the power spectrum of the clean speech using power spectra of the distorted speech and the unknown impulse responses. To estimate the power spectra of the impulse responses, a variable step-size unconstrained MCLMS (VSS-UMCLMS) algorithm for identifying the impulse responses in a time domain is extended to a frequency domain. To reduce the effect of the estimation error of the channel impulse response, we normalize the early reverberation by cepstral mean normalization (CMN) instead of spectral subtraction using the estimated impulse response. Furthermore, our proposed method is combined with conventional delay-and-sum beamforming. We conducted recognition experiments on a distorted speech signal simulated by convolving multi-channel impulse responses with clean speech. The proposed method achieved a relative error reduction rate of 22.4% in relation to conventional CMN. By combining the proposed method with beamforming, a relative error reduction rate of 24.5% in relation to the conventional CMN with beamforming was achieved using only an isolated word (with duration of about 0.6s) to estimate the spectrum of the impulse response.
- 2011-03-01
著者
-
KITAOKA Norihide
Nagoya University
-
Nakagawa Seiichi
Toyohashi Univ. Technol. Toyohashi‐shi Jpn
-
Nakagawa Seiichi
Department Of Information And Computer Sciences Toyohashi University
-
Nakagawa Seiichi
Toyohashi Univ. Of Technol. Toyohashi‐shi Jpn
-
WANG Longbiao
Toyohashi University of Technology
-
Wang Longbiao
Shizuoka Univ. Hamamatsu‐shi Jpn
-
Kitaoka Norihide
Nagoya Univ.
関連論文
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- Topic dependent language model based on on-line voting (言語理解とコミュニケーション)
- A transitive translation for Indonesian-Japanese CLQA (自然言語処理)
- A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System(Natural Language Processing)
- Indonesian-Japanese Transitive Translation using English for CLIR
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Topic dependent language model based on on-line voting (音声)
- Topic dependent language model based on clustering of noun word history