Bayesian Context Clustering Using Cross Validation for Speech Recognition
Abstract
This paper proposes Bayesian context clustering using cross validation for hidden Markov model (HMM) based speech recognition. The Bayesian approach is a statistical technique for estimating reliable predictive distributions by treating model parameters as random variables. The variational Bayesian method, which is widely used as an efficient approximation of the Bayesian approach, has been applied to HMM-based speech recognition and has shown good performance. Moreover, the Bayesian approach can select an appropriate model structure while taking account of the amount of training data. Since prior distributions, which represent prior information about model parameters, affect both the estimation of posterior distributions and the selection of model structure (e.g., decision tree based context clustering), the determination of prior distributions is an important problem. However, it has not been thoroughly investigated in speech recognition, and existing techniques for determining prior distributions have not performed well. The proposed method can determine reliable prior distributions without any tuning parameters and select an appropriate model structure while taking account of the amount of training data. Continuous phoneme recognition experiments show that the proposed method achieves higher performance than the conventional methods.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2011-03-01
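The abstract's central idea, choosing a model structure by a Bayesian criterion whose prior is derived from the data through cross validation, can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a one-dimensional Gaussian with known variance instead of an HMM, an exact conjugate marginal likelihood instead of the variational Bayesian bound, and a simple interleaved K-fold partition. The function names `log_marginal`, `cv_score`, and `split_gain` are hypothetical and introduced only for illustration.

```python
import math

def log_marginal(xs, m0, v0, s2=1.0):
    """Log marginal likelihood of xs under x ~ N(mu, s2), with prior mu ~ N(m0, v0)."""
    n = len(xs)
    d = [x - m0 for x in xs]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    # Quadratic form d^T C^{-1} d with C = s2*I + v0*1*1^T (Sherman-Morrison).
    quad = (sum_d2 - v0 / (s2 + n * v0) * sum_d ** 2) / s2
    # log det C = n*log(s2) + log(1 + n*v0/s2)
    return -0.5 * (n * math.log(2 * math.pi * s2) + math.log(1 + n * v0 / s2) + quad)

def cv_score(xs, k=2, s2=1.0):
    """Sum of held-out log marginal likelihoods; each fold's prior comes from the other folds."""
    folds = [xs[i::k] for i in range(k)]
    total = 0.0
    for i, held_out in enumerate(folds):
        rest = [x for j, f in enumerate(folds) if j != i for x in f]
        if not held_out or not rest:
            return float("-inf")  # assumption: too little data to cross-validate
        m0 = sum(rest) / len(rest)   # prior mean estimated from the other folds
        v0 = s2 / len(rest)          # prior variance shrinks as the other folds grow
        total += log_marginal(held_out, m0, v0, s2)
    return total

def split_gain(left, right, k=2, s2=1.0):
    """Positive gain means the split is preferred over keeping one pooled node."""
    return cv_score(left, k, s2) + cv_score(right, k, s2) - cv_score(left + right, k, s2)

# Two well-separated groups favour a split; two similar groups do not.
far = split_gain([-5.1, -4.9, -5.0, -5.2], [5.0, 5.1, 4.9, 5.2])
near = split_gain([0.1, -0.1, 0.0, 0.2], [0.0, 0.1, -0.2, 0.1])
print(far > 0, near < 0)
```

In this sketch no prior hyperparameter is set by hand: the prior for each fold is computed from the remaining folds, and a node with too few samples to cross-validate scores negative infinity, so it is never split. That is one way to read "without any tuning parameters" and "taking account of the amount of training data", under the simplifying assumptions stated above.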
Authors
-
Zen Heiga
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Hashimoto Kei
Department Of Bioproductive Sciences Utsunomiya University
-
Tokuda K
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Tokuda Keiichi
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Hashimoto Kei
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Lee Akinobu
Department Of Computer Science Nagoya Institute Of Technology
-
Lee Akinobu
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Nankaku Yoshihiko
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Tokuda Keiichi
Department Of Computer Science Nagoya Institute Of Technology
-
Zen Heiga
Department Of Computer Science Nagoya Institute Of Technology
-
HASHIMOTO Kei
Department of Applied Biological Chemistry, Utsunomiya University
Related papers
- Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005(Speech and Hearing)
- Antioxidative Effects of Phenolic Acids on Lipid Peroxidation Induced by H₂O₂ in the Presence of Myoglobin
- Determination of Hydrogen Peroxide by High-Performance Liquid Chromatography with a Cation-Exchange Resin Gel Column and Electrochemical Detector
- Absorption and Metabolism of Quercetin in Caco-2 Cells
- Applying Sparse KPCA for Feature Extraction in Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- On the Use of Kernel PCA for Feature Extraction in Speech Recognition(Speech and Hearing)
- The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features
- Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution (Special Issue on Biometric Person Authentication)
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Improving Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics in Noisy Environments Using Multi-Template Models(Speech Recognition, Statistical Modeling for Speech Processing)
- Spectral Cosensitization in Organic Solar Cell with Mixed Film of Zinc Porphyrin and Merocyanine
- A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System
- Mixture Density Models Based on Mel-Cepstral Representation of Gaussian Process(Digital Signal Processing)
- A 16kb/s Wideband CELP-Based Speech Coder Using Mel-Generalized Cepstral Analysis
- Non-Audible Murmur (NAM) Recognition Exploiting Adaptation Techniques
- Development, Long-Term Operation and Portability of a Real-Environment Speech-Oriented Guidance System
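- A Study on Clustering Training Speakers in Unsupervised Speaker Adaptation Based on Sufficient Statistics Using Multiple Models (複数モデルを用いた十分統計量に基く教師なし話者適応における学習話者のクラス化の検討)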
- LMS-Based Algorithms with Multi-Band Decomposition of the Estimation Error Applied to System Identification (Special Section on Digital Signal Processing)
- Multi-Band Decomposition of the Linear Prediction Error Applied to Adaptive AR Spectral Estimation
- Inhibitory Effect of Arphamenine A on Intestinal Dipeptide Transport
- Adaptive AR Spectral Estimation Based on Wavelet Decomposition of the Linear Prediction Error
- A Covariance-Tying Technique for HMM-Based Speech Synthesis
- Effects of β-Lactoglobulin on the Tight-junctional Stability of Caco-2-SF Monolayer
- Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Suppression of the Menadione-Induced Cytotoxicity toward Hepa1c1c7 Murine Hepatoma by Quinone Reductase Inducers
- Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Continuous Speech Recognition Based on General Factor Dependent Acoustic Models(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Bayesian Context Clustering Using Cross Validation for Speech Recognition
- Reformulating the HMM as a Trajectory Model
- Speech recognition based on statistical models including multiple phonetic decision trees
- A Bayesian Framework Using Multiple Model Structures for Speech Recognition
- Speaker interpolation for HMM-based speech synthesis system
- Inhibitory Effect of Methyl Methanethiosulfinate on β-Glucuronidase Activity