Stepwise Phase Difference Restoration Method for DOA Estimation of Multiple Sources
Abstract
We propose a new method of DOA (direction of arrival) estimation named SPIRE (Stepwise Phase dIfference REstoration) that can estimate sound source directions even when there is more than one source in a reverberant environment. DOA estimation in reverberant environments is difficult because the variance of the estimated source direction increases. A long distance between microphones is therefore desirable. However, because of the spatial aliasing problem, the distance cannot be longer than half the wavelength at the maximum frequency of a source. The DOA estimation performance of SPIRE is not limited by the spatial aliasing problem. The major feature of SPIRE is restoration of the phase difference of one microphone pair (M1) by using the phase difference of another microphone pair (M2), under the condition that the distance between the M1 microphones is longer than the distance between the M2 microphones. This restoration process reduces the variance of the estimated source direction and alleviates the spatial aliasing that occurs in the M1 phase difference by using the direction estimate from the M2 microphones. Experimental results in a reverberant environment (reverberation time of about 300 ms) indicate that, even when there are multiple sources, the proposed method estimates the source direction more accurately than conventional methods. In addition, the DOA estimation performance of SPIRE with an array length of 0.2 m is shown to be almost equivalent to that of GCC-PHAT with an array length of 0.5 m, so SPIRE can perform DOA estimation with a smaller microphone array than GCC-PHAT. From the viewpoint of hardware size and the coherence problem, the array length should be as small as possible, so this property of SPIRE is advantageous.
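The restoration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single far-field source, one frequency bin, a linear array, and a speed of sound of 343 m/s. It also assumes that "restoration" means picking the multiple of 2π for the wrapped M1 phase difference that best agrees with the M2 phase difference scaled by the spacing ratio. All function names, spacings, and noise values are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed value, not taken from the abstract)


def wrap_phase(phi):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi


def restore_phase(short_phase, long_phase_wrapped, d_short, d_long):
    """Resolve the 2*pi ambiguity of the long-pair (M1) phase difference.

    The short-pair (M2) phase is scaled by the spacing ratio to predict
    the unwrapped long-pair phase; the nearest integer number of cycles
    is then added back to the wrapped long-pair observation.
    """
    predicted = short_phase * d_long / d_short
    cycles = round((predicted - long_phase_wrapped) / (2 * math.pi))
    return long_phase_wrapped + 2 * math.pi * cycles


def phase_to_doa(phase, freq_hz, spacing_m):
    """Convert an unwrapped phase difference to a direction in radians."""
    s = SPEED_OF_SOUND * phase / (2 * math.pi * freq_hz * spacing_m)
    return math.asin(max(-1.0, min(1.0, s)))  # clamp against noise


def simulate(theta_true_deg=40.0, freq_hz=3000.0,
             d_short=0.04, d_long=0.2, phase_error=0.1):
    """Return (short-pair estimate, restored long-pair estimate) in degrees.

    The same hypothetical phase error (radians) is added to both pairs to
    show that it perturbs the long-pair estimate less.
    """
    theta = math.radians(theta_true_deg)
    k = 2 * math.pi * freq_hz * math.sin(theta) / SPEED_OF_SOUND
    short_obs = wrap_phase(k * d_short + phase_error)  # below half wavelength
    long_obs = wrap_phase(k * d_long + phase_error)    # spatially aliased
    restored = restore_phase(short_obs, long_obs, d_short, d_long)
    short_est = math.degrees(phase_to_doa(short_obs, freq_hz, d_short))
    long_est = math.degrees(phase_to_doa(restored, freq_hz, d_long))
    return short_est, long_est


if __name__ == "__main__":
    short_est, long_est = simulate()
    print(f"short pair only: {short_est:.2f} deg")
    print(f"restored long pair: {long_est:.2f} deg")
```

In this toy setting the same phase error shifts the short-pair estimate by several degrees but the restored long-pair estimate by less than one degree, which is the variance reduction the abstract attributes to the longer spacing. The sketch only works while the scaled short-pair error stays below half a cycle. How SPIRE handles multiple sources and reverberation is not covered here.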
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-11-01
Authors
- Obuchi Yasunari, Central Research Laboratory, Hitachi Ltd.
- Togami Masahito, Central Research Laboratory, Hitachi Ltd.
Related papers
- Multi-Input Feature Combination in the Cepstral Domain for Practical Speech Recognition Systems
- Intentional Voice Command Detection for Trigger-Free Speech Interface
- Emotion Recognition using Mel-Frequency Cepstral Coefficients
- Stepwise Phase Difference Restoration Method for DOA Estimation of Multiple Sources
- Multichannel Two-Stage Beamforming with Unconstrained Beamformer and Distortion Reduction
- Noise suppression method for preprocessor of time-lag speech recognition system based on bidirectional optimally modified log spectral amplitude estimation