Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis
Abstract
In our previous study, we proposed the waveform interpolation (WI) approach to model the excitation signals for hidden Markov model (HMM)-based speech synthesis. This letter presents several techniques to improve excitation modeling within the WI framework. We propose zero-padding techniques in both the time domain and the frequency domain to reduce the spectral distortion inherent in the synthesized excitation signal. Furthermore, we apply non-negative matrix factorization (NMF) to obtain a low-dimensional representation of the excitation signals. A number of experiments, including a subjective listening test, show that the proposed method improves on conventional excitation modeling techniques.
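To make the NMF step concrete, the following is a minimal sketch of how a non-negative matrix can be reduced to a low-dimensional representation. It is not the authors' implementation: the abstract does not say which NMF variant, cost function, or data layout the letter uses. The sketch assumes the standard multiplicative-update rule for the squared-error cost, and the function name `nmf`, the matrix shapes, and the random toy data are all hypothetical stand-ins for real excitation features.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (features x frames) as W @ H.

    Uses multiplicative updates for the squared-error cost. This is an
    assumed variant; the abstract does not specify the one used.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Random non-negative initialization (hypothetical choice).
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplying by non-negative ratios keeps W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy stand-in for excitation features: 64-dimensional non-negative
# vectors for 100 frames, generated from 4 hidden basis vectors.
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 100))

W, H = nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch each column of `H` is a 4-dimensional code for one frame, and the columns of `W` act as the shared basis. Whether the letter models the basis, the codes, or both with the HMM is not stated in the abstract.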
Authors
-
KIM Nam
School of Electrical and Computer Engineering, Chungbuk National Univ.
-
KIM Nam
School of Electrical Engineering and the Institute of New Media and Communications, Seoul National University
-
KOO Hyun
School of Electrical Engineering and the Institute of New Media and Communications, Seoul National University
-
HONG Doo
School of Electrical Engineering and the Institute of New Media and Communications, Seoul National University
-
SUNG June
School of Electrical Engineering and the Institute of New Media and Communications, Seoul National University
-
KIM Nam
School of Architecture and Architectural Engineering, Korea University of Technology and Education
Related papers
- Two-Dimensional Electrophoretic Analysis of Radio Frequency Radiation-Exposed MCF7 Breast Cancer Cells
- Feature Compensation with Model-Based Estimation for Noise Masking(Speech and Hearing)
- Computationally Efficient Cepstral Domain Feature Compensation
- On Detecting Target Acoustic Signals Based on Non-negative Matrix Factorization
- Improved Frame Mode Selection for AMR-WB+ Based on Decision Tree
- Estimation of Phone Mismatch Penalty Matrices for Two-Stage Keyword Spotting
- Implementation of HMM-Based Human Activity Recognition Using Single Triaxial Accelerometer
- Speech Enhancement Based on Perceptually Comfortable Residual Noise(Multimedia Systems for Communications)
- Study of Prominence Detection Based on Various Phone-Specific Features
- Frame Splitting Scheme for Error-Robust Audio Streaming over Packet-Switching Networks
- Three-Dimensional Display System Based on Integral Imaging with Viewing Direction Control
- Depth Discrimination Enhanced Computational Integral Imaging Using Random Pattern Illumination
- Speech Enhancement Based on Data-Driven Residual Gain Estimation
- Analysis of the Cellular Stress Response in MCF10A Cells Exposed to Combined Radio Frequency Radiation
- Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database
- Spectral Magnitude Adjustment for MCLT-Based Acoustic Data Transmission
- Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis
- APPLICATION OF RELIABILITY-BASED SAFETY FACTORS TO MECHANISTIC-EMPIRICAL FLEXIBLE PAVEMENT DESIGN