Text-Independent Speaker Identification Utilizing Likelihood Normalization Technique

Abstract

In this paper we describe a method that allows the likelihood normalization technique, widely used for speaker verification, to be implemented in a text-independent speaker identification system. The essence of this method is to apply likelihood normalization at the frame level instead of at the utterance level, as is usually done. Every frame of the test utterance is input to all the reference models in parallel. With this procedure, the likelihoods from all the models are available for each frame, so they can be normalized at every frame. A special kind of likelihood normalization, called Weighting Models Rank, is also evaluated. We have implemented these techniques in speaker identification systems based on VQ-distortion codebooks or Gaussian Mixture Models. Evaluation results showed that the frame-level likelihood normalization technique gives higher speaker identification rates than the standard accumulated likelihood approach.
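
The difference between utterance-level accumulation and frame-level normalization can be illustrated with a minimal Python sketch. The abstract does not give the normalization formula or the rank weights, so everything specific here is an assumption: the function names are hypothetical, the normalization divides each frame likelihood by the sum over all models, the rank-based variant uses a made-up geometric weight `decay ** rank`, and the toy log-likelihoods are invented for illustration rather than taken from the paper.

```python
import math

def accumulated_scores(frame_loglik):
    """Baseline: sum the log-likelihoods over all frames for each model."""
    n_models = len(frame_loglik[0])
    return [sum(frame[m] for frame in frame_loglik) for m in range(n_models)]

def frame_normalized_scores(frame_loglik):
    """Normalize across models at every frame, then sum over frames.

    Assumption: each frame likelihood is divided by the sum of the
    likelihoods of all models for that frame.
    """
    n_models = len(frame_loglik[0])
    scores = [0.0] * n_models
    for frame in frame_loglik:
        peak = max(frame)  # subtract the max before exp() for numerical stability
        lik = [math.exp(v - peak) for v in frame]
        total = sum(lik)
        for m in range(n_models):
            scores[m] += lik[m] / total
    return scores

def rank_weighted_scores(frame_loglik, decay=0.5):
    """Rank-based variant: at each frame, a model receives decay**rank,
    where rank 0 is the best-scoring model (hypothetical weighting)."""
    n_models = len(frame_loglik[0])
    scores = [0.0] * n_models
    for frame in frame_loglik:
        order = sorted(range(n_models), key=lambda m: frame[m], reverse=True)
        for rank, m in enumerate(order):
            scores[m] += decay ** rank
    return scores

def identify(scores):
    """Return the index of the model with the highest score."""
    return max(range(len(scores)), key=lambda m: scores[m])

# Toy data: rows are frames, columns are per-model log-likelihoods.
# Model 0 is slightly better on three frames; model 1 dominates one frame.
frames = [[-1.0, -2.0], [-1.0, -2.0], [-1.0, -2.0], [-10.0, -1.0]]

print(identify(accumulated_scores(frames)))
print(identify(frame_normalized_scores(frames)))
print(identify(rank_weighted_scores(frames)))
```

In this toy case a single outlier frame decides the accumulated score, while both frame-level variants bound the contribution of each frame. Normalized values are summed in the linear domain here because subtracting the same per-frame constant from every model's log-likelihood and then summing would leave the ranking of the models unchanged.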
Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
1997-05-25