Robust Talker Direction Estimation Based on Weighted CSP Analysis and Maximum Likelihood Estimation (Speech Enhancement, <Special Section> Statistical Modeling for Speech Processing)
Abstract
This paper describes a new talker direction estimation method for front-end processing to capture distant-talking speech with a microphone array. The proposed method consists of two algorithms. The first is a TDOA (Time Delay of Arrival) estimation algorithm based on weighted CSP (Cross-power Spectrum Phase) analysis with an average speech spectrum and CSP coefficient subtraction. The second is a talker direction estimation algorithm based on ML (Maximum Likelihood) estimation over a time sequence of the estimated TDOAs. To evaluate the effectiveness of the proposed method, talker direction estimation experiments were carried out in an actual office room. The results confirmed that the proposed method outperforms the conventional methods at talker direction estimation in both diffuse- and directional-noise environments.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2006-03-01
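The TDOA step named in the abstract can be illustrated with a minimal sketch of plain (unweighted) CSP analysis, also known as GCC-PHAT, for one microphone pair. This is not the authors' method: the average-speech-spectrum weighting, the CSP coefficient subtraction, and the ML estimation over the TDOA sequence are not specified in this record and are left out. The function names `csp_tdoa` and `tdoa_to_direction`, the far-field assumption, and the default sound speed of 343 m/s are assumptions made for illustration.

```python
import numpy as np


def csp_tdoa(x1, x2, fs, max_delay_s=None):
    """Estimate the delay of x2 relative to x1, in seconds, with plain CSP.

    Assumption: unweighted CSP (GCC-PHAT), not the weighted variant
    proposed in the paper.
    """
    n = len(x1) + len(x2)              # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # keep the phase only (whitening)
    csp = np.fft.irfft(cross, n=n)     # CSP coefficients over lag

    max_shift = n // 2
    if max_delay_s is not None:
        max_shift = max(1, min(int(round(max_delay_s * fs)), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    csp = np.concatenate((csp[-max_shift:], csp[:max_shift + 1]))
    lag = int(np.argmax(csp)) - max_shift
    return lag / fs


def tdoa_to_direction(tau, mic_distance_m, c=343.0):
    """Convert a TDOA to a direction in degrees (far-field assumption)."""
    s = np.clip(c * tau / mic_distance_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))
```

A positive return value from `csp_tdoa` means the sound reached the second microphone later than the first. In a frame-by-frame setting, one TDOA would be estimated per frame, which yields the kind of time sequence that the abstract's second algorithm operates on.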
Authors
- DENDA Yuki (Ritsumeikan University; Graduate School of Science and Engineering, Ritsumeikan University)
- NISHIURA Takanobu (College of Information Science and Engineering, Ritsumeikan University; Graduate School of Science and Engineering, Ritsumeikan University; Ritsumeikan University, Kusatsu, Japan)
- YAMASHITA Yoichi (College of Information Science and Engineering, Ritsumeikan University; Graduate School of Science and Engineering, Ritsumeikan University; Ritsumeikan University, Kusatsu, Japan)
Related papers
- CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition (Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Omnidirectional Audio-Visual Talker Localization Based on Dynamic Fusion of Audio-Visual Features Using Validity and Reliability Criteria
- Robust Talker Direction Estimation Based on Weighted CSP Analysis and Maximum Likelihood Estimation (Speech Enhancement, Statistical Modeling for Speech Processing)
- Multiple Sound Source Localization Based on Inter-Channel Correlation Using a Distributed Microphone System in a Real Environment
- Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data
- Multiple-nulls-steering beamformer based on both talker and noise direction-of-arrival estimation