Automatic Language Identification with Discriminative Language Characterization Based on SVM
Abstract
Robust automatic language identification (LID) is the task of identifying the language of a short utterance spoken by an unknown speaker. The mainstream approaches include parallel phone recognition language modeling (PPRLM), support vector machines (SVMs), and Gaussian mixture models (GMMs). These systems use classifiers to map the cepstral features of spoken utterances into high-level scores. In this paper, to increase the dimension of the score vector and to alleviate inter-speaker variability within the same language, multiple data groups obtained by supervised speaker clustering are employed to generate discriminative language characterization score vectors (DLCSV). Back-end SVM classifiers are then used to model the probability distribution of each target language in the DLCSV space. Finally, the output scores of the back-end classifiers are calibrated by a pair-wise posterior probability estimation (PPPE) algorithm. The proposed language identification frameworks are evaluated on the 2003 NIST Language Recognition Evaluation (LRE) databases, and the experiments show that the system described in this paper produces results comparable to those of existing systems. In particular, the SVM framework achieves an equal error rate (EER) of 4.0% in the 30-second task and outperforms state-of-the-art systems by more than 30% relative error reduction. The proposed PPRLM and GMM frameworks achieve EERs of 5.1% and 5.0%, respectively.
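The results above are reported as equal error rates (EER): the operating point at which the false-rejection rate on target trials equals the false-acceptance rate on non-target trials. As a minimal sketch of how such a figure can be approximated from detection scores, the Python function below sweeps a threshold over the observed scores. The function name `equal_error_rate` and the example scores are hypothetical and are not taken from the paper; the official NIST LRE scoring procedure may differ in detail.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    Illustrative sketch only; not the paper's or NIST's scoring tool.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = None
    best_eer = None
    for t in thresholds:
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
    return best_eer


# Hypothetical scores, for illustration only.
example_targets = [0.9, 0.8, 0.7, 0.3]
example_nontargets = [0.6, 0.4, 0.2, 0.1]
example_eer = equal_error_rate(example_targets, example_nontargets)
```

With a finite number of trials the two error rates rarely cross exactly, so the sketch returns the average of the two rates at the threshold where they are closest.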
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-01
Authors
-
Yan Yonghong
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences, Beijing
-
Suo Hongbin
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
-
Lu Ping
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
-
Li Ming
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
Related papers
- Effects of single-channel speech enhancement algorithms on Mandarin speech intelligibility (Applied Acoustics)
- Approximate Decision Function and Optimization for GMM-UBM Based Speaker Verification
- Using a Kind of Novel Phonotactic Information for SVM Based Speaker Recognition
- Robust Speaker Clustering Using Affinity Propagation
- An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns
- Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech
- A One-Pass Real-Time Decoder Using Memory-Efficient State Network
- Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval
- Automatic Singing Performance Evaluation for Untrained Singers
- Melody Track Selection Using Discriminative Language Model
- Automatic Language Identification with Discriminative Language Characterization Based on SVM
- A two-element-microphone-array-based speech recognition system in vehicle environment (Commemoration of the Japan-China Joint Conference on Acoustics 2007 (JCA2007))
- Speech Enhancement Using Improved Adaptive Null-Forming in Frequency Domain with Postfilter
- Effects of the Temporal Fine Structure in Different Frequency Bands on Mandarin Tone Perception
- Acoustic Feature Optimization Based on F-Ratio for Robust Speech Recognition
- A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features
- Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition
- Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation
- A Bayesian logistic regression approach to spoken language identification