Multichannel Two-Stage Beamforming with Unconstrained Beamformer and Distortion Reduction
Abstract
This paper proposes a novel multichannel speech enhancement technique for reverberant rooms that is effective when the noise sources are spatially stationary, such as projector fan noise, air-conditioner noise, and unwanted speech sources behind the microphones. The speech enhancement performance of the conventional multichannel Wiener filter (MWF) degrades when the signal-to-noise ratio (SNR) of the current microphone input signal changes relative to the noise-only period. Furthermore, the MWF structure is computationally inefficient, because the MWF must update the whole spatial beamformer periodically to track changes of the active speaker (e.g., turn-taking). In contrast to the MWF, the proposed method reduces noise independently of the SNR. The proposed method has a novel two-stage structure, which reduces noise and then distortion of the desired source signal in a cascade by using two different beamformers. The first beamformer focuses on noise reduction without any constraint on the desired source, which makes it insensitive to SNR variation. However, the output signal of the first beamformer is distorted. The second beamformer focuses on reducing the distortion of the desired source signal. Theoretically, complete elimination of the distortion is assured. Additionally, the proposed method has a computationally efficient structure optimized for spatially stationary noise reduction problems: the first beamformer is computed only when the speech enhancement system is initialized, and only the second beamformer is updated periodically to track changes of the active speaker. The experimental results indicate that the proposed method can reduce spatially stationary noise source signals effectively, with less distortion of the desired source signal, even in a reverberant conference room.
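The cascade described in the abstract can be illustrated with a minimal single-frequency-bin sketch. This is not the authors' formulation, which the abstract does not give. It assumes, purely for illustration, that the first stage is an orthogonal projector that nulls the dominant noise subspace estimated during a noise-only initialization period, and that the second stage is a filter constrained to pass the (distorted) desired source with unit gain. The function names, the steering vector `a`, and the rank-1 noise model are all hypothetical.

```python
import numpy as np

def stage1_noise_projector(noise_cov, n_noise):
    """Assumed stage 1: null the n_noise dominant noise directions.

    No constraint is placed on the desired source, so it gets distorted.
    """
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    _, vecs = np.linalg.eigh(noise_cov)
    u = vecs[:, -n_noise:]
    return np.eye(noise_cov.shape[0]) - u @ u.conj().T

def stage2_distortion_filter(projector, steering):
    """Assumed stage 2: filter w with w^H (P a) = 1, undoing the distortion."""
    pa = projector @ steering
    return pa / np.vdot(pa, pa)

def enhance(x, projector, w2):
    """Apply both stages to multichannel samples x of shape (mics, frames)."""
    y1 = projector @ x          # noise reduced, desired source distorted
    return w2.conj() @ y1       # distortion of the desired source removed

# Synthetic demo: 4 microphones, one desired source, one fixed noise source.
rng = np.random.default_rng(0)
n_mics, n_frames = 4, 200

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a = crandn(n_mics)              # hypothetical steering vector of the talker
b = crandn(n_mics)              # hypothetical steering vector of the noise
s = crandn(n_frames)            # desired source signal
n = crandn(n_frames)            # noise source signal

noise_only = np.outer(b, n)     # initialization period: noise only
noise_cov = noise_only @ noise_only.conj().T / n_frames

P = stage1_noise_projector(noise_cov, n_noise=1)   # computed once at init
w2 = stage2_distortion_filter(P, a)                # recomputed per talker

x = np.outer(a, s) + np.outer(b, n)                # noisy observation
est = enhance(x, P, w2)
print("max abs error:", np.max(np.abs(est - s)))
```

In this sketch only `w2` depends on the active speaker, so a change of talker requires recomputing a single vector while the projector `P` stays fixed, which mirrors the division of labor the abstract describes. Real recordings would need per-frequency processing, an estimated rather than known steering vector, and a noise subspace that is not exactly rank one.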
Authors
- Obuchi Yasunari
  Central Research Laboratory, Hitachi Ltd.
- Togami Masahito
  Central Research Laboratory, Hitachi Ltd.
- Kawaguchi Yohei
  Central Research Laboratory, Hitachi Ltd.
Related papers
- Multi-Input Feature Combination in the Cepstral Domain for Practical Speech Recognition Systems
- Intentional Voice Command Detection for Trigger-Free Speech Interface
- Emotion Recognition using Mel-Frequency Cepstral Coefficients
- Stepwise Phase Difference Restoration Method for DOA Estimation of Multiple Sources
- Multichannel Two-Stage Beamforming with Unconstrained Beamformer and Distortion Reduction
- Noise suppression method for preprocessor of time-lag speech recognition system based on bidirectional optimally modified log spectral amplitude estimation