Integrated Use of LPC Cepstrum, Pitch, and LPC Residual in a Text-Independent Speaker Recognition System
Abstract
In speaker recognition, when the cepstral coefficients are calculated from the LPC analysis parameters, the prediction error, or LPC residual signal, is usually ignored. However, there is evidence that it contains speaker-specific information. The fundamental frequency of the speech signal, or pitch, which is usually extracted from the LPC residual, has been used for speaker recognition, but because of its high intraspeaker variability it is also often ignored. This paper describes our approach to integrating the pitch and LPC residual with the LPC cepstrum in a Gaussian Mixture Model (GMM) based speaker recognition system. The pitch and/or LPC residual are treated as additional features alongside the main LPC-derived cepstral coefficients and are represented as the logarithm of F_0 and as a filter-bank mel-frequency cepstral coefficient (MFCC) vector, respectively. The second task of this research was to verify whether the correlation between the different information sources is useful for the speaker recognition task. For the experiments we used the NTT database, which consists of high-quality speech samples. The speaker recognition system was evaluated in three modes: integrating only the pitch, integrating only the LPC residual, and integrating both. The results showed that adding the pitch gives a significant improvement only when the correlation between the pitch and the cepstral coefficients is used. Adding only the LPC residual also gives a significant improvement, but in contrast to the pitch, using its correlation with the cepstral coefficients has little effect. The best results were achieved using both the pitch and the LPC residual: a 98.5% speaker identification rate and a 0.21% speaker verification equal error rate, compared to 97.0% and 1.07%, respectively, for the baseline system.
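The feature construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `lpc_to_cepstrum` and `stack_features` are hypothetical, the cepstrum is obtained with the standard LPC-to-cepstrum recursion for an all-pole model H(z) = 1 / (1 - sum_k a_k z^-k), and dropping unvoiced frames (marked here by F_0 = 0) is an assumption, since the abstract does not say how such frames are handled.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients a_1..a_p into cepstral coefficients c_1..c_n_ceps.

    Assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k); the gain term c_0 is omitted.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Recursion term: sum over k of (k/n) * c_k * a_{n-k}, restricted to valid a indices
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def stack_features(cepstra, f0_hz):
    """Append log F_0 to each frame's cepstral vector.

    Assumption (not from the abstract): frames with F_0 <= 0 are unvoiced and are dropped.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    f0_hz = np.asarray(f0_hz, dtype=float)
    voiced = f0_hz > 0
    log_f0 = np.log(f0_hz[voiced])[:, None]
    return np.hstack([cepstra[voiced], log_f0])
```

Modeling the stacked vectors with a joint density lets a model capture the correlation between pitch and cepstral coefficients, whereas modeling the two streams separately and combining their scores does not; the abstract reports that this distinction matters for the pitch but much less for the LPC residual.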
Authors
-
Konstantin Markov
Toyohashi University of Technology
-
Seiichi Nakagawa
Toyohashi Univ. Technol. Aichi Jpn
-
Konstantin Markov
Department of Information and Computer Sciences, Toyohashi University of Technology
-
Seiichi Nakagawa
Department of Information and Computer Sciences, Toyohashi University of Technology
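Related papers
- Integrated Use of LPC Cepstrum, Pitch, and LPC Residual in a Text-Independent Speaker Recognition System
- Comparison of LPC Cepstrum and MFCC for Speaker Recognition Using Clean Speech and Telephone Speech
- Maximum Normalized Likelihood Estimation: A New Discriminative Training Method for Speaker Identification
- Evaluation of Speaker Verification Based on Frame-Level Likelihood Normalization
- A Text-Independent Speaker Identification System Using Frame-Level Likelihood Processing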