Comparison of Discriminative Models for Lexicon Optimization for ASR of Agglutinative Language
Abstract
For automatic speech recognition (ASR) of agglutinative languages, the choice of lexical unit is not obvious. The morpheme unit is usually adopted to ensure sufficient coverage, but many morphemes are short, resulting in weak constraints and possible confusions. We have proposed a discriminative approach that selects lexical entries which directly contribute to ASR error reduction, considering not only linguistic constraints but also acoustic-phonetic confusability. It is based on an evaluation function for each word, defined by a set of features and their weights, which are optimized using the difference between the word error rates (WERs) of the morpheme-based model and those of the word-based model. In this paper, we investigate several discriminative models to realize this scheme. Specifically, we implement it with Support Vector Machines (SVM) and a Logistic Regression (LR) model as well as a simple perceptron. Experimental evaluations on Uyghur LVCSR show that SVM and LR are more robustly trained and that SVM yields the best performance with a large number of feature dimensions.
- 2012-07-12
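As a rough illustration of the evaluation function described in the abstract, the following Python sketch trains a simple perceptron over a linear score (features times weights) and selects an entry when its score is positive. The feature names (`bias`, `short_morpheme`), the toy samples, and the labeling rule (+1 when the word-based unit gave fewer errors than the morpheme-based one) are assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: feature names and toy data are hypothetical.

def score(weights, features):
    """Linear evaluation function: sum of weight * feature value."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def train_perceptron(samples, epochs=10, lr=1.0):
    """Train weights with the classic perceptron update.

    samples: list of (features, label) pairs; label is assumed to be +1 when
    the word-based unit gave fewer errors than the morpheme-based one, else -1.
    """
    weights = {}
    for _ in range(epochs):
        for features, label in samples:
            # Update only on a mistake (or when the score sits on the boundary).
            if label * score(weights, features) <= 0:
                for name, value in features.items():
                    weights[name] = weights.get(name, 0.0) + lr * label * value
    return weights

# Two toy candidates: one made of short morphemes, one not.
samples = [
    ({"bias": 1.0, "short_morpheme": 1.0}, +1),
    ({"bias": 1.0, "short_morpheme": 0.0}, -1),
]
weights = train_perceptron(samples)

# A candidate is selected as a lexical entry when its score is positive.
selected = [score(weights, features) > 0 for features, _ in samples]
```

The SVM and LR variants mentioned in the abstract would keep the same linear scoring form and differ only in the training objective.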
Authors
-
Tatsuya Kawahara
School of Informatics, Kyoto University
-
Mijit Ablimit
School of Informatics, Kyoto University
-
Askar Hamdulla
Institute of Information Engineering, Xinjiang University
Related papers
- Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data
- Joint Phrase Alignment and Extraction for Statistical Machine Translation
- Comparison of Discriminative Models for Lexicon Optimization for ASR of Agglutinative Language
- Partial and Synchronized Caption Generation to Enhance the Listening Comprehension Skills of Second Language Learners
- Classifier-based Data Selection for Lightly-Supervised Training of Acoustic Model for Lecture Transcription