Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition(Feature Extraction and Acoustic Modelings, <Special Section>Corpus-Based Speech Technologies)
Abstract
This paper investigates the effectiveness of the deterministic annealing EM (DAEM) algorithm in acoustic modeling for speaker and speech recognition. Although the EM algorithm is widely used to approximate maximum likelihood (ML) estimates, its results depend on the initial parameter values. The DAEM algorithm was proposed to reduce this dependence, and its effectiveness has been confirmed on small artificial tasks. In this paper, we apply the DAEM algorithm to practical speaker and speech recognition tasks: speaker recognition based on Gaussian mixture models (GMMs) and continuous speech recognition based on hidden Markov models (HMMs). Experimental results show that the DAEM algorithm can improve recognition performance compared with the standard EM algorithm using conventional initialization algorithms, especially in flat-start training for continuous speech recognition.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2005-03-01
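The annealing idea summarized in the abstract can be illustrated with a toy sketch. As DAEM is commonly formulated, the E-step posterior is raised to a power beta (an inverse temperature) and renormalized, and beta is increased step by step to 1, where the update coincides with standard EM. The code below is only a minimal illustration under strong assumptions that are not taken from the paper: one-dimensional data, two components with equal weights, a fixed shared variance (only the means are re-estimated), and a hand-picked beta schedule. The names `tempered_posteriors` and `daem_means` are hypothetical.

```python
import math


def tempered_posteriors(x, means, var, beta):
    """E-step for one sample: posterior proportional to N(x | mean_k, var) ** beta.

    Assumption: equal mixture weights and a shared variance, so both cancel
    out of the normalized posterior.
    """
    logs = [-beta * (x - m) ** 2 / (2.0 * var) for m in means]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def daem_means(data, init_means, var=1.0,
               betas=(0.02, 0.05, 0.1, 0.2, 0.5, 1.0), iters_per_beta=25):
    """Re-estimate component means while beta is raised toward 1.

    The schedule and iteration count are illustrative assumptions.
    """
    means = list(init_means)
    for beta in betas:  # low beta = high temperature = nearly uniform posteriors
        for _ in range(iters_per_beta):
            post = [tempered_posteriors(x, means, var, beta) for x in data]
            # M-step: posterior-weighted average of the data for each component
            for k in range(len(means)):
                occupancy = sum(p[k] for p in post)
                means[k] = sum(p[k] * x for p, x in zip(post, data)) / occupancy
    return means


# Toy data with two clusters; the start is nearly flat (both means close to 0).
data = [-5.5, -5.0, -4.5, 4.5, 5.0, 5.5]
means = daem_means(data, init_means=[-0.01, 0.01])
print(sorted(means))
```

On this toy data the two means should end up close to the cluster centers near -5 and 5, even though they start almost on top of each other. At small beta the means are pulled together, and they separate only once beta is large enough. The fixed variance is a deliberate simplification, and the sketch says nothing about how the paper configures GMM or HMM training.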
Authors
- Zen Heiga, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Tokuda Keiichi, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Kitamura Tadashi, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Miyajima Chiyomi, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Miyajima Chiyomi, Department of Media Science, Nagoya University
- Kitamura Tadashi, Department of Cardiothoracic Surgery, The University of Tokyo
- Nankaku Yoshihiko, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Tokuda Keiichi, Department of Computer Science, Nagoya Institute of Technology
- Itaya Yohei, Department of Computer Science and Engineering, Nagoya Institute of Technology
- Zen Heiga, Department of Computer Science, Nagoya Institute of Technology
- Kitamura Tadashi, Department of Cardiothoracic Surgery, Faculty of Medicine, University of Tokyo
Related papers
- Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005(Speech and Hearing)
- Effects of Nicorandil on Cardiovascular Events in Patients With Coronary Artery Disease in The Japanese Coronary Artery Disease (JCAD) Study
- Gender Differences in Patients With Coronary Artery Disease in Japan : The Japanese Coronary Artery Disease Study (The JCAD Study)
- Beta-Blocker Prescription Among Japanese Cardiologists and Its Effect on Various Outcomes
- Relationship Between Renal Dysfunction and Severity of Coronary Artery Disease in Japanese Patients
- PJ-038 Cystatin C Predicts Severity of Coronary Artery Disease Even in Patients without Chronic Kidney Disease (CKD)(PJ007,Kidney/Renal Circulation/CKD 1 (H),Poster Session (Japanese),The 73rd Annual Scientific Meeting of The Japanese Circulation Society)
- Applying Sparse KPCA for Feature Extraction in Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- On the Use of Kernel PCA for Feature Extraction in Speech Recognition(Speech and Hearing)
- The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features
- Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution (Special Issue on Biometric Person Authentication)
- Establishment of a method of anonymization of DNA samples in genetic research
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Clinical Experience with Cryopreserved Allografts for Aortic Infection
- A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System
- Plasma Cystatin C Concentration Reflects the Severity of Coronary Artery Disease in Patients Without Chronic Kidney Disease
- Mixture Density Models Based on Mel-Cepstral Representation of Gaussian Process(Digital Signal Processing)
- Pseudoaneurysm Developed after Aortic Root Homograft Implantation
- Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- LMS-Based Algorithms with Multi-Band Decomposition of the Estimation Error Applied to System Identification (Special Section on Digital Signal Processing)
- Multi-Band Decomposition of the Linear Prediction Error Applied to Adaptive AR Spectral Estimation
- Adaptive AR Spectral Estimation Based on Wavelet Decomposition of the Linear Prediction Error
- A Covariance-Tying Technique for HMM-Based Speech Synthesis
- Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech
- Prevalence of Vitreous Hemorrhage Following Coronary Revascularization in Patients With Diabetic Retinopathy
- Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition(Speech Enhancement, Multi-channel Acoustic Signal Processing)
- Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Continuous Speech Recognition Based on General Factor Dependent Acoustic Models(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Bayesian Context Clustering Using Cross Validation for Speech Recognition
- Physical Model-Based Indirect Measurement of Blood Flow and Pressures for Pulsatile Circulatory Assist-In Vitro Study
- Reformulating the HMM as a Trajectory Model
- Intensively Lowering Both Low-Density Lipoprotein Cholesterol and Blood Pressure Does Not Reduce Cardiovascular Risk in Japanese Coronary Artery Disease Patients
- On the Combined Use of Fuzzy Inference and Real-Time Models (in Japanese)
- Speech recognition based on statistical models including multiple phonetic decision trees
- On the use of two-mass vocal cord model in characterizing the stress speech (Speech)
- A Modeling Support Tool for a Global Human Model on the Internet
- The Development of a Physiological Simulation System for the Human Circulatory System Coupling Macro and Micro Models
- Report of the Committee on the classification and diagnostic criteria of diabetes mellitus : The Committee of the Japan Diabetes Society on the diagnostic criteria of diabetes mellitus
- Diagnostic Simulation Tool for a Circulatory System Model Based on Interpretive Structural Modeling
- How can a robot have consciousness?
- Animal-like Behavior Design of Small Robots by the Model of Subjective World and Behavior
- An Integrated Simulation Tool for Modeling the Human Circulatory System(Bioengineering)
- Animal-like behavior design of small robots by consciousness-based architecture
- A Case of Implantation of ICD over 30 Years after CABG for Coronary Arterial Lesions Due to Kawasaki Disease
- Moderate Prosthesis-Patient Mismatch May Be Negligible in Elderly Patients Undergoing Conventional Aortic Valve Replacement for Aortic Stenosis
- A Bayesian Framework Using Multiple Model Structures for Speech Recognition
- Neutrophil Elastase Inhibitor Sivelestat Attenuates Perioperative Inflammatory Response in Pediatric Heart Surgery With Cardiopulmonary Bypass:A Prospective Randomized Study
- Speaker interpolation for HMM-based speech synthesis system