Voice Conversion Using Low Dimensional Vector Mapping (Regular Section)
Abstract
In this paper, a new voice personality transformation algorithm is proposed that uses vocal tract characteristics and pitch period as feature parameters. The vocal tract transfer function is divided into time-invariant and time-varying parts. Conversion rules for the time-varying part are constructed from classified linear transformation matrices based on soft-clustering techniques for the LPC cepstrum expressed in KL (Karhunen-Loeve) coefficients. The excitation signal, which contains prosodic information, is transformed by the average pitch ratio. To improve naturalness, the transformation of the excitation signal is applied separately to voiced and unvoiced bands so that the overall spectral structure is preserved. Objective tests show that the distance between the LPC cepstrum of the target speaker and that of speech synthesized by the proposed method is about 70% smaller than the distance between the target speaker's LPC cepstrum and the source speaker's. In subjective listening tests, 60-70% of listeners identified the transformed speech as the target speaker's.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2002-07-25
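The abstract names three operations without giving formulas: a soft-clustered, classified linear mapping of low-dimensional feature vectors, pitch conversion by an average pitch ratio, and an objective score based on LPC cepstrum distance. The Python sketch below shows one plausible reading of each. It is an illustration under stated assumptions, not the authors' implementation: the softmax form of the membership weights, the Euclidean distance, the use of mean F0 values for the pitch ratio, and all function and parameter names (`soft_weights`, `convert_vector`, `scale_pitch`, `distance_reduction`, `sharpness`) are hypothetical. Input vectors are assumed to be already projected onto KL coefficients.

```python
import math


def soft_weights(x, centroids, sharpness=1.0):
    """Soft class memberships for vector x.

    Assumption: memberships are a softmax over negative squared
    Euclidean distances to class centroids. The abstract only says
    "soft-clustering", so this particular form is illustrative.
    """
    scores = [
        -sharpness * sum((a - c) ** 2 for a, c in zip(x, centroid))
        for centroid in centroids
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mat_vec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def convert_vector(x, centroids, matrices, sharpness=1.0):
    """Membership-weighted sum of per-class linear transforms of x.

    Assumption: one transformation matrix per class, blended by the
    soft memberships computed above.
    """
    weights = soft_weights(x, centroids, sharpness)
    out = [0.0] * len(matrices[0])
    for w, matrix in zip(weights, matrices):
        y = mat_vec(matrix, x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out


def scale_pitch(source_f0, source_mean_f0, target_mean_f0):
    """Scale an F0 contour by the ratio of average pitch values.

    Assumption: unvoiced frames are stored as 0.0, so they stay 0.0.
    """
    ratio = target_mean_f0 / source_mean_f0
    return [f * ratio for f in source_f0]


def distance_reduction(target, source, converted):
    """Relative reduction of the target distance after conversion.

    A value of 0.7 means the converted vector is 70% closer to the
    target than the unconverted source vector was.
    """
    d_before = math.dist(target, source)
    d_after = math.dist(target, converted)
    return 1.0 - d_after / d_before
```

With this definition, the "about 70%" figure in the abstract would correspond to `distance_reduction(...)` returning roughly 0.7 when averaged over test frames, assuming the paper's distance measure behaves like the Euclidean one used here.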
Authors
- Lee K-s, Department of Electronic Engineering, Konkuk University
- Lee Ki-seung, Department of Electronic Engineering, Konkuk University
- Youn Dae-hee, Department of Electronic Engineering, Yonsei University
- Youn Dae-hee, Department of Electrical and Electronic Engineering, Yonsei University
- Doh Won, Department of Electronic Engineering, Yonsei University
- Doh Won, Department of Chemistry, Kyungpook National University
Related papers
- An Efficient Active Noise Control Algorithm Based on the Lattice-Transversal Joint (LTJ) Filter Structure
- Effect of Equal Channel Angular Pressing on the Distribution of Reinforcements in the Discontinuous Metal Matrix Composites
- A GMM-Based Feature Selection Algorithm for Multi-Class Classification
- Duration Modeling Using Cumulative Duration Probability
- A Grammatical Structure of the FSN for the Recognition of Korean Price Sentences
- Speaker Adaptation Based on a Maximum Observation Probability Criterion
- A Very Low Complexity VSELP Speech Coder Using Regular Pulse Basis Vectors (Special Section of Papers Selected from ITC-CSCC'96)
- Performance Comparison of Single and Multi-Stage Algebraic Codebooks (Speech and Hearing)
- Performance Comparison of Single and Multi-Stage Algebraic Codebooks
- Voice Conversion Using Low Dimensional Vector Mapping (Regular Section)
- Adsorption Configuration Changes and Reactions of N_2O on V(110) between 80 and 200K
- Robust Recognition of Fast Speech (Speech and Hearing)
- Nonlinear Long-Term Prediction of Speech Signal (Regular Section)