Local Peak Enhancement for In-Car Speech Recognition in Noisy Environment
Abstract
The accuracy of automatic speech recognition in a car is significantly degraded in a very low SNR (Signal to Noise Ratio) situation such as “Fan high” or “Window open”. In such cases, speech signals are often buried in broadband noise. Although several existing noise reduction algorithms are known to improve the accuracy, other approaches that can work with them are still required for further improvement. One of the candidates is enhancement of the harmonic structures in human voices. However, most conventional approaches are based on comb filtering, and it is difficult to use them in practical situations, because their assumptions for F0 detection and for voiced/unvoiced detection are not accurate enough in realistic noisy environments. In this paper, we propose a new approach that does not rely on such detection. An observed power spectrum is directly converted into a filter for speech enhancement, by retaining only the local peaks considered to be harmonic structures in the human voice. In our experiments, this approach reduced the word error rate by 17% in realistic automobile environments. Also, it showed further improvement when used with existing noise reduction methods.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-01
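The abstract's central idea, turning an observed power spectrum into an enhancement filter by keeping only its local peaks, can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes the simplest possible reading: a bin counts as a local peak if it is strictly larger than both neighbours, peaks get a gain of 1.0, and every other bin is attenuated to a hypothetical `floor` gain. The function names `local_peak_filter` and `enhance` and the `floor` parameter are made up for this example.

```python
import numpy as np


def local_peak_filter(power, floor=0.1):
    """Build a per-bin gain from one frame of a power spectrum.

    Assumption (not from the paper): a bin is a "local peak" when it is
    strictly greater than both of its neighbours. Peaks get gain 1.0,
    all other bins (including the two edge bins) get `floor`.
    """
    power = np.asarray(power, dtype=float)
    gain = np.full(power.shape, floor)
    # Compare every interior bin with its left and right neighbour.
    is_peak = (power[1:-1] > power[:-2]) & (power[1:-1] > power[2:])
    gain[1:-1] = np.where(is_peak, 1.0, floor)
    return gain


def enhance(power, floor=0.1):
    """Apply the peak-derived gain to the same spectrum it came from."""
    power = np.asarray(power, dtype=float)
    return power * local_peak_filter(power, floor)


# Synthetic frame: flat broadband noise with three "harmonics".
frame = np.ones(64)
frame[[10, 20, 30]] = 5.0
gain = local_peak_filter(frame)
enhanced = enhance(frame)
```

Note that the filter is derived directly from the observed spectrum, so no F0 estimate or voiced/unvoiced decision is needed, which is the property the abstract emphasises. A practical system would need smoothing and a way to reject noise-induced peaks, and this sketch does neither.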
Authors
- FUKUDA Takashi, Tokyo Research Laboratory, IBM Japan Ltd.
- ICHIKAWA Osamu, Tokyo Research Laboratory, IBM Japan Ltd.
- NISHIMURA Masafumi, Tokyo Research Laboratory, IBM Japan Ltd.
Related papers
- Canonicalization of Feature Parameters for Robust Speech Recognition Based on Distinctive Phonetic Feature (DPF) Vectors
- Photocurrent Excitation Spectra Observed with An-Al Heteroelectrodes Biased Reversely and Reflection Spectra in Trans-Polyacetylene
- Automatic Prosody Labeling Using Multiple Models for Japanese (Speech and Hearing)
- PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR (Speech Recognition, Statistical Modeling for Speech Processing)
- Confidence Scoring for Accurate HMM-Based Speech Recognition by Using Monophone-Level Normalization Based on Subspace Method (Special Issue on Speech Information Processing)
- Local Peak Enhancement for In-Car Speech Recognition in Noisy Environment
- Simultaneous Adaptation of Echo Cancellation and Spectral Subtraction for In-Car Speech Recognition (Speech Enhancement, Multi-channel Acoustic Signal Processing)
- Sound Source Localization Using a Profile Fitting Method with Sound Reflectors (Speech Dynamics by Ear, Eye, Mouth and Machine)
- Speech Enhancement by Profile Fitting Method (Special Issue on Speech Information Processing)