Robust Speaker Identification System Based on Multilayer Eigen-Codebook Vector Quantization (Special Section: Speech Dynamics by Ear, Eye, Mouth and Machine)
Abstract
This paper presents several methods for improving the performance of a speaker identification system. Exploiting the multiresolution property of the wavelet transform, the input speech signal is decomposed into multiple frequency subbands so that noise distortions are not spread over the entire feature space. To capture the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower frequency subband are calculated at each decomposition step. A hard threshold is also applied to the lower frequency subband at each decomposition step to suppress noise interference. Furthermore, cepstral domain feature vector normalization is applied to all computed features so that the parameter statistics are similar across acoustic environments. To make effective use of these multiband speech features, we propose a modified vector quantization as the identifier. This model uses a multilayer concept to eliminate interference among the multiband speech features, and then uses principal component analysis (PCA) to evaluate the codebooks so that they capture a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated on the KING speech database for text-independent speaker identification. Experimental results show that the proposed method achieves better recognition performance than vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficient (MFCC) features, in both clean and noisy environments. Satisfactory performance is also achieved in low-SNR environments.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2004-05-01
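
The front-end steps named in the abstract (repeated decomposition of the lower frequency subband, hard thresholding, and feature normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the threshold value, the choice to threshold before the next split, and per-dimension mean and variance normalization are all assumptions made for illustration, and the function names are hypothetical. LPCC extraction and the multilayer eigen-codebook identifier are omitted.

```python
import math
from statistics import mean, pstdev


def haar_step(signal):
    """One level of Haar analysis: return (low, high) subbands at half the rate."""
    if len(signal) % 2:
        signal = signal[:-1]  # drop the last sample so pairs line up
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high


def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr; keep the rest unchanged."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]


def lowband_pyramid(signal, levels, thr):
    """Split the low band repeatedly, hard-thresholding it at each level.

    Assumption: the thresholded low band is what gets split at the next level.
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        low, _high = haar_step(current)
        low = hard_threshold(low, thr)
        bands.append(low)
        current = low
    return bands


def normalize_features(frames):
    """Normalize each feature dimension to zero mean and unit variance across frames."""
    dims = list(zip(*frames))
    mus = [mean(d) for d in dims]
    sigmas = [pstdev(d) or 1.0 for d in dims]  # guard against constant dimensions
    return [[(x - m) / s for x, m, s in zip(f, mus, sigmas)] for f in frames]
```

In a fuller pipeline, each band returned by `lowband_pyramid` would be framed and converted to cepstral features before `normalize_features` is applied per band; those steps are left out here because the abstract does not give their parameters.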
Authors
-
HSIEH Ching-Tang
Department of Electrical Engineering, Tamkang University
-
LAI Eugene
Department of Electrical Engineering, Tamkang University
-
CHEN Wan-Chen
Department of Electrical Engineering, Tamkang University
Related papers
- Continuous Speech Segmentation Based on a Self-Learning Neuro-Fuzzy System (Special Section on Digital Signal Processing)
- A Novel Bandelet-Based Image Inpainting
- Progressive Image Inpainting Based on Wavelet Transform (Image Coding, Information Theory and Its Applications)
- Robust Speaker Identification System Based on Multilayer Eigen-Codebook Vector Quantization (Speech Dynamics by Ear, Eye, Mouth and Machine)
- Generalized Fuzzy Kohonen Clustering Networks (Special Section on Information Theory and Its Applications)