A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
Abstract
A statistical speech synthesis system based on the hidden Markov model (HMM) was recently proposed. In this system, spectrum, excitation, and duration of speech are modeled simultaneously by context-dependent HMMs, and speech parameter vector sequences are generated from the HMMs themselves. This system defines a speech synthesis problem in a generative model framework and solves it based on the maximum likelihood (ML) criterion. However, there is an inconsistency: although state duration probability density functions (PDFs) are explicitly used in the synthesis part of the system, they have not been incorporated into its training part. This inconsistency can make the synthesized speech sound less natural. In this paper, we propose a statistical speech synthesis system based on a hidden semi-Markov model (HSMM), which can be viewed as an HMM with explicit state duration PDFs. The use of HSMMs can solve the above inconsistency because we can incorporate the state duration PDFs explicitly into both the synthesis and the training parts of the system. Subjective listening test results show that use of HSMMs improves the reported naturalness of synthesized speech.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2007-05-01
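The abstract's contrast between an HMM and an HSMM can be illustrated with a small sketch. In a plain HMM, the time spent in a state is governed only by its self-transition probability, which implies a geometric duration distribution whose most likely duration is always one frame. An HSMM instead attaches an explicit duration PDF to each state. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the self-transition probability 0.8, and the discretized Gaussian duration model (mean 5, variance 4) are all hypothetical choices made for this example, since the abstract does not specify the form of the duration PDFs.

```python
import math


def hmm_implicit_duration_pmf(self_transition_prob, max_duration):
    """Duration pmf implied by an HMM state with self-transition probability a.

    Staying d frames means d - 1 self-transitions followed by one exit:
    p(d) = a**(d - 1) * (1 - a), a geometric distribution (truncated here).
    """
    a = self_transition_prob
    return [a ** (d - 1) * (1.0 - a) for d in range(1, max_duration + 1)]


def hsmm_explicit_duration_pmf(mean, variance, max_duration):
    """Explicit state duration pmf (assumption: a Gaussian shape evaluated at
    integer durations 1..max_duration and renormalized to sum to one)."""
    weights = [
        math.exp(-0.5 * (d - mean) ** 2 / variance)
        for d in range(1, max_duration + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def mode(pmf):
    """Most likely duration in frames (durations start at 1)."""
    return 1 + max(range(len(pmf)), key=lambda i: pmf[i])


# Hypothetical parameters chosen only for illustration.
implicit = hmm_implicit_duration_pmf(0.8, 30)
explicit = hsmm_explicit_duration_pmf(5.0, 4.0, 30)

print(mode(implicit), mode(explicit))  # prints "1 5"
```

The geometric form peaks at a single frame regardless of the self-transition probability, whereas the explicit model can place its peak at any duration. This is one way to see why duration PDFs used at synthesis time are a different model from the one implied by HMM training.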
Authors
-
ZEN Heiga
Department of Computer Science and Engineering, Nagoya Institute of Technology
-
TOKUDA Keiichi
Department of Computer Science and Engineering, Nagoya Institute of Technology
-
Masuko T
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology:(present
-
Masuko Takashi
Cell Biology, Faculty of Pharmacy, Kinki University
-
Zen Heiga
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Tachibana Makoto
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology:(present
-
Masuko Takashi
Laboratory Of Cell Biology School Of Pharmaceutical Sciences Kinki University
-
Masuko Takashi
Departments Of Molecular Biology Pharmaceutical Institute Tohoku University
-
Masuko Takashi
Department Of Agricultural And Biological Chemistry College Of Bioresource Sciences Nihon University
-
Tokuda Keiichi
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
KITAMURA Tadashi
Department of Computer Science and Engineering, Nagoya Institute of Technology
-
Masuko Takashi
Tokyo Inst. Technol. Yokohama‐shi Jpn
-
MASUKO Takashi
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
KOBAYASHI Takao
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
Kobayashi Takao
Tokyo Inst. Technol. Yokohama‐shi Jpn
-
Tachibana Makoto
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Masuko Takashi
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Kobayashi T
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Kitamura Tadashi
Department Of Computer Science And Engineering Nagoya Institute Of Technology
-
Kitamura Tadashi
Department Of Cardiothoracic Surgery The University Of Tokyo
-
Kobayashi Takao
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Matsuyama Taiji
Department Of Pharmacy Shizuoka Kousei Hospital
-
Tokuda Keiichi
Department Of Computer Science Nagoya Institute Of Technology
-
Zen Heiga
Department Of Computer Science Nagoya Institute Of Technology
-
Kitamura Tadashi
Department Of Cardiothoracic Surgery Faculty Of Medicine University Of Tokyo
-
Kobayashi Takao
Department Of Obstetrics And Gynecology Hamamatsu University School Of Medicine
Related papers
- The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006
- Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005(Speech and Hearing)
- Effects of Nicorandil on Cardiovascular Events in Patients With Coronary Artery Disease in The Japanese Coronary Artery Disease (JCAD) Study
- Gender Differences in Patients With Coronary Artery Disease in Japan : The Japanese Coronary Artery Disease Study (The JCAD Study)
- Beta-Blocker Prescription Among Japanese Cardiologists and Its Effect on Various Outcomes
- Relationship Between Renal Dysfunction and Severity of Coronary Artery Disease in Japanese Patients
- PJ-038 Cystatin C Predicts Severity of Coronary Artery Disease Even in Patients without Chronic Kidney Disease (CKD)(PJ007,Kidney/Renal Circulation/CKD 1 (H),Poster Session (Japanese),The 73rd Annual Scientific Meeting of The Japanese Circulation Society)
- Enhancement of Veratridine-Induced Sodium Dynamics in NG108-15 Cells during Differentiation(Pharmacology)
- Antibody epitope peptides as potential inducers of IgG antibodies against CD98 oncoprotein
- Identification of cell proliferation-associated epitope on CD98 oncoprotein using phage display random peptide library
- Molecular Structural and Functional Characterization of Tumor Suppressive Anti-ErbB-2 Monoclonal Antibody by Phage Display System
- Immunohistochemical expression and pathogenesis of BLM in the human brain and visceral organs
- Phage Display Cloning and Characterization of Monoclonal Antibody Genes and Recombinant Fab Fragment against the CD98 Oncoprotein
- Colocalization of CP125/CD98 with Tropomyosin Isoforms at the Cell-Cell Adhesion Boundary
- Identification and Immunological Characterization of a Novel 40 - kDa Protein Linked to CD98 Antigen
- Applying Sparse KPCA for Feature Extraction in Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- On the Use of Kernel PCA for Feature Extraction in Speech Recognition(Speech and Hearing)
- A Style Control Technique for HMM-Based Expressive Speech Synthesis(Speech and Hearing)
- A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features(Speech Synthesis, Statistical Modeling for Speech Processing)
- Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing(Life-like Agent and its Communication)
- Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis(Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- Identification of Truncated Human Glutamate Transporter
- Partial Involvement of Group I Metabotropic Glutamate Receptors in the Neurotoxicity of 3-N-Oxalyl-L-2,3-diaminopropanoic Acid(L-β-ODAP)(Pharmacology)
- Characterization and In Vitro Cytotoxic Effect of Adriamycin-conjugated Monoclonal Antibody Prepared Against Breast Cancer Cell Line
- Characterization of A New Breast Cancer-Associated Antigen and Its Relationship to MUC1 and TAG-72 Antigens
- Characterization of Cell Surface Antigens Expressed in the HMA-1 Breast Cancer Cell Line
- Effects of Dexamethasone and Aminophylline on Survival of Jurkat and HL-60 Cells(Pharmacology)
- Malate dehydrogenases from nitrifying bacteria : purification and properties
- Ribulose-1,5-Bisphosphate Carboxylase/Oxygenase from a Nitrite-Oxidizing Chemoautotroph, Nitrobacter agilis ATCC 14123 : Purification and Properties
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features
- Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution (Special Issue on Biometric Person Authentication)
- Establishment of a method of anonymization of DNA samples in genetic research
- Application of Fluorescence Polarization Immunoassay for Determination of Methotrexate-Polyglutamates in Rheumatoid Arthritis Patients
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Clinical Experience with Cryopreserved Allografts for Aortic Infection
- An Enzyme Immunoassay for Cell Proliferation Using Monoclonal Antibodies Directed against a Cell Proliferation-Associated Antigen
- A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System
- Plasma Cystatin C Concentration Reflects the Severity of Coronary Artery Disease in Patients Without Chronic Kidney Disease
- Homotypic Adhesion through Carcinoembryonic Antigen Plays a Role in Hepatic Metastasis Development
- Mixture Density Models Based on Mel-Cepstral Representation of Gaussian Process(Digital Signal Processing)
- A 16kb/s Wideband CELP-Based Speech Coder Using Mel-Generalized Cepstral Analysis
- Development of Material Management System for Newspapers (Special Issue on New Generation Database Technologies)
- SUSCEPTIBILITY OF ANIMALS TO HEPATOCARCINOGENIC AROMATIC AMINES CORRELATES WITH THE INDUCTION OF THE CARCINOGEN ACTIVATION ENZYME (S) WITH THE AMINES
- Pseudoaneurysm Developed after Aortic Root Homograft Implantation
- Conidiomatal development of Pestalotiopsis guepinii and P. neglecta on leaves of Gardenia jasminoides
- Pycnidial development of Phyllosticta harai and Sphaeropsis sp.
- LMS-Based Algorithms with Multi-Band Decomposition of the Estimation Error Applied to System Identification (Special Section on Digital Signal Processing)
- Multi-Band Decomposition of the Linear Prediction Error Applied to Adaptive AR Spectral Estimation
- Robust F_0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency(Speech and Hearing)
- Intracellular Localization of UDP-Glucuronosyltransferase Expressed from the Transfected cDNA in Cultured Cells
- Adaptive AR Spectral Estimation Based on Wavelet Decomposition of the Linear Prediction Error
- A Covariance-Tying Technique for HMM-Based Speech Synthesis
- An autopsy case of cyclopia with 13 trisomy with special reference to histological abnormalities of the eyeball
- Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech
- Acrania : an autopsy case and review of the literature
- Prevalence of Vitreous Hemorrhage Following Coronary Revascularization in Patients With Diabetic Retinopathy
- Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Significance of integrin αvβ5 and erbB3 in enhanced cell migration and liver metastasis of colon carcinomas stimulated by hepatocyte-derived heregulin
- Dihydrofolate Reductase Gene Intronic 19-bp Deletion Polymorphisms in a Japanese Population
- A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
- HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
- Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Continuous Speech Recognition Based on General Factor Dependent Acoustic Models(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training(Speech and Hearing)
- A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM
- Bayesian Context Clustering Using Cross Validation for Speech Recognition
- Physical Model-Based Indirect Measurement of Blood Flow and Pressures for Pulsatile Circulatory Assist-In Vitro Study
- Reformulating the HMM as a Trajectory Model
- HMM-Based Voice Conversion Using Quantized F0 Context
- Intensively Lowering Both Low-Density Lipoprotein Cholesterol and Blood Pressure Does Not Reduce Cardiovascular Risk in Japanese Coronary Artery Disease Patients
- On the Combined Use of Fuzzy Inference and Real-Time Models
- Speech recognition based on statistical models including multiple phonetic decision trees
- Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM(Life-like Agent and its Communication)
- A Modeling Support Tool for a Global Human Model on the Internet
- The Development of a Physiological Simulation System for the Human Circulatory System Coupling Macro and Micro Models
- Report of the Committee on the classification and diagnostic criteria of diabetes mellitus : The Committee of the Japan Diabetes Society on the diagnostic criteria of diabetes mellitus
- Diagnostic Simulation Tool for a Circulatory System Model Based on Interpretive Structural Modeling
- How can a robot have consciousness?
- FOREWORD
- Animal-like Behavior Design of Small Robots by the Model of Subjective World and Behavior
- An Integrated Simulation Tool for Modeling the Human Circulatory System(Bioengineering)
- Animal-like behavior design of small robots by consciousness-based architecture
- A Case of Implantation of ICD over 30 Years after CABG for Coronary Arterial Lesions Due to Kawasaki Disease
- A context clustering technique for improvement of tone intelligibility of average-voice-based Thai speech synthesis (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Moderate Prosthesis-Patient Mismatch May Be Negligible in Elderly Patients Undergoing Conventional Aortic Valve Replacement for Aortic Stenosis
- A Bayesian Framework Using Multiple Model Structures for Speech Recognition
- Neutrophil Elastase Inhibitor Sivelestat Attenuates Perioperative Inflammatory Response in Pediatric Heart Surgery With Cardiopulmonary Bypass:A Prospective Randomized Study
- Speaker interpolation for HMM-based speech synthesis system