Tone Recognition of Continuous Mandarin Speech Based on Tone Nucleus Model and Neural Network
Abstract
A method was developed for automatic recognition of syllable tone types in continuous Mandarin speech by integrating two techniques: tone nucleus modeling and a neural network classifier. Tone nucleus modeling treats a syllable F0 contour as consisting of three parts: onset course, tone nucleus, and offset course. The two courses are transitions from/to neighboring syllable F0 contours, while the tone nucleus is the intrinsic part of the F0 contour. By viewing only the tone nucleus, acoustic features less affected by neighboring syllables are obtained. When tone nucleus modeling is used, automatic detection of the tone nucleus becomes crucial, so an improvement was added to the original detection method. Distinctive acoustic features for tone types are not limited to F0 contours; other prosodic features, such as waveform power and syllable duration, are also useful for tone recognition. These heterogeneous features are rather difficult to handle simultaneously in hidden Markov models (HMMs), but are easy to handle in neural networks. We adopted a multi-layer perceptron (MLP) as the neural network. Tone recognition experiments were conducted for speaker-dependent and speaker-independent cases. To show the effect of integration, experiments were also conducted for two baselines: an HMM classifier with tone nucleus modeling, and an MLP classifier viewing the entire syllable instead of the tone nucleus. The integrated method achieved a tone recognition rate of 87.1% in the speaker-dependent case and 80.9% in the speaker-independent case, which was about a 10% relative error reduction compared to the baselines.
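The "relative error reduction" quoted in the abstract is measured against the baseline's error rate, not its recognition rate. The sketch below spells out that arithmetic. The function names are hypothetical, and because the abstract does not state the baseline rates, the back-computed values are only what "about 10%" would imply if taken as exactly 10%; they are not reported results.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline's errors removed by the new system.

    Both arguments are recognition rates in percent.
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc: float, rel_reduction: float) -> float:
    """Baseline recognition rate (percent) implied by a given relative error reduction."""
    new_err = 100.0 - new_acc
    return 100.0 - new_err / (1.0 - rel_reduction)


if __name__ == "__main__":
    # Rates reported in the abstract; the 0.10 reduction is an assumed exact value.
    for case, acc in [("speaker dependent", 87.1), ("speaker independent", 80.9)]:
        base = implied_baseline_accuracy(acc, 0.10)
        print(f"{case}: implied baseline of roughly {base:.1f}%")
```

Under that assumption, a 10% relative reduction corresponds to a gain of only about one to two percentage points in recognition rate, which is why the figure is stated relative to the error.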
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-06-01
Authors
-
Wang Xiao-dong
Department of Electronic Engineering, The University of Tokyo
-
HIROSE Keikichi
Department of Information and Communication Engineering, The University of Tokyo
-
ZHANG Jin-Song
Knowledge Creating Communication Research Center, NiCT-ATR
-
MINEMATSU Nobuaki
Department of Frontier Informatics, The University of Tokyo
-
Hirose Keikichi
Department of Applied Electronics, Science University of Tokyo
Related papers
- Tone Recognition of Continuous Mandarin Speech Based on Tone Nucleus Model and Neural Network
- Automatic alignment of a musical score to performed music
- A Scheme for Word Detection in Continuous Speech Using Likelihood Scores of Segments Modified by Their Context Within a Word