Text-Independent/Text-Prompted Speaker Recognition by Combining Speaker-Specific GMM with Speaker Adapted Syllable-Based HMM (Speaker Recognition, <Special Section> Statistical Modeling for Speech Processing)
Abstract
We presented a new text-independent/text-prompted speaker recognition method that combines a speaker-specific Gaussian Mixture Model (GMM) with a syllable-based hidden Markov model (HMM) adapted by MLLR or MAP. This paper evaluates the robustness of that method to changes in speaking style. We conducted a speaker identification experiment on the NTT database, which consists of sentences uttered at three speaking rates (normal, fast, and slow) by 35 Japanese speakers (22 male and 13 female) in five sessions over ten months. Each speaker provided only 5 training utterances (about 20 seconds in total). The combined method reduced the identification error rate by about 50%. Using a short test utterance (about 4 seconds), we obtained an accuracy of 98.8% for text-independent speaker identification across the three speaking modes (normal, fast, and slow), and 99.4% for the normal speaking mode in particular. This result is superior to those of conventional methods on the same database. We show that this result stems from the mutually compensating effect between the speaker-specific GMM and the speaker-adapted syllable-based HMM.
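The abstract does not state how the GMM and HMM scores are combined, so the following is only a minimal sketch of one common choice: score-level fusion by a weighted sum of per-speaker log-likelihoods, followed by picking the best-scoring speaker. The function names, the `weight` parameter, and the toy scores are all hypothetical and are not taken from the paper.

```python
def fuse_scores(gmm_loglik, hmm_loglik, weight=0.5):
    """Weighted sum of per-speaker log-likelihoods from two models.

    Assumption: a linear fusion rule; the paper's actual rule may differ.
    weight=1.0 uses the GMM score only, weight=0.0 the HMM score only.
    """
    return {
        spk: weight * gmm_loglik[spk] + (1.0 - weight) * hmm_loglik[spk]
        for spk in gmm_loglik
    }


def identify(gmm_loglik, hmm_loglik, weight=0.5):
    """Return the speaker with the highest fused score."""
    fused = fuse_scores(gmm_loglik, hmm_loglik, weight)
    return max(fused, key=fused.get)


# Toy log-likelihoods for one test utterance (illustrative values only).
gmm = {"spk_a": -41.0, "spk_b": -40.0, "spk_c": -45.0}
hmm = {"spk_a": -38.0, "spk_b": -42.0, "spk_c": -44.0}

print(identify(gmm, hmm, weight=1.0))  # GMM alone
print(identify(gmm, hmm, weight=0.5))  # equal-weight fusion
```

In this toy case the two models disagree when used alone, and the fused score lets the stronger evidence from one model offset the weaker evidence from the other, which is the kind of compensating effect the abstract refers to.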
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2006-03-01
Authors
-
NAKAGAWA Seiichi
the Department of Information and Computer Sciences, Toyohashi University of Technology
-
ZHANG Wei
the Department of Information and Computer Sciences, Toyohashi University of Technology
-
TAKAHASHI Mitsuo
the Department of Information and Computer Sciences, Toyohashi University of Technology
-
NAKAGAWA Seiichi
the Department of Computer Science and Engineering, Toyohashi University of Technology