Robust F_0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency(Speech and Hearing)
Abstract
Borrowing the notion of instantaneous frequency developed in the context of time-frequency signal analysis, an instantaneous frequency amplitude spectrum (IFAS) is introduced for estimating the fundamental frequency of speech signals in both noiseless and adverse environments. We define a harmonicity measure as a quantity that indicates the degree of periodic regularity in the IFAS and that differs substantially between periodic signals and noise-like waveforms. The harmonicity measure is used to detect whether a fundamental frequency is present. We provide experimental examples to demonstrate the general applicability of the harmonicity measure and apply the proposed procedure to Japanese continuous speech signals. The results show that the proposed method outperforms conventional methods both with and without noise.
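The abstract does not give formulas for the IFAS or the harmonicity measure, so the sketch below illustrates only the underlying notion of instantaneous frequency, using one standard textbook definition: the rate of change of the phase of the analytic signal. It is not the authors' procedure. The function names `analytic_signal` and `instantaneous_frequency`, and the test tone parameters, are hypothetical choices made for illustration, and the naive O(N^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def analytic_signal(x):
    """Return the analytic signal of a real sequence using a naive O(N^2) DFT."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and the Nyquist bin when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue
        if k < n / 2:
            spectrum[k] *= 2
        else:
            spectrum[k] = 0
    # Inverse DFT gives the complex-valued analytic signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(x, fs):
    """Estimate instantaneous frequency in Hz from sample-to-sample phase steps."""
    z = analytic_signal(x)
    # The angle of z[t+1] * conj(z[t]) is the phase advance over one sample.
    return [
        cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2 * math.pi)
        for t in range(len(z) - 1)
    ]


if __name__ == "__main__":
    fs = 1600   # sampling rate in Hz (assumed for the demo)
    f0 = 100    # tone frequency in Hz (assumed for the demo)
    n = 256     # chosen so the window holds a whole number of periods
    tone = [math.cos(2 * math.pi * f0 * t / fs) for t in range(n)]
    estimate = instantaneous_frequency(tone, fs)
    print(min(estimate), max(estimate))  # both should be close to f0
```

For a single stationary tone the estimate stays near the tone frequency at every sample. Real speech contains many harmonics at once, which is presumably why the paper works with a spectrum of instantaneous frequencies rather than a single value per sample.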
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2004-12-01
Authors
-
MASUKO Takashi
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
KOBAYASHI Takao
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
ARIFIANTO Dhany
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
TANAKA Tomohiro
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Related papers
- A Style Control Technique for HMM-Based Expressive Speech Synthesis(Speech and Hearing)
- A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features(Speech Synthesis, Statistical Modeling for Speech Processing)
- Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing(Life-like Agent and its Communication)
- Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis(Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features
- Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution (Special Issue on Biometric Person Authentication)
- Mixture Density Models Based on Mel-Cepstral Representation of Gaussian Process(Digital Signal Processing)
- A 16kb/s Wideband CELP-Based Speech Coder Using Mel-Generalized Cepstral Analysis
- An autopsy case of cyclopia with 13 trisomy with special reference to histological abnormalities of the eyeball
- Acrania : an autopsy case and review of the literature
- A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
- HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
- Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training(Speech and Hearing)
- A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM
- HMM-Based Voice Conversion Using Quantized F0 Context
- Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM(Life-like Agent and its Communication)
- FOREWORD
- A context clustering technique for improvement of tone intelligibility of average-voice-based Thai speech synthesis (Speech) -- (International Workshop "Asian workshop on speech science and technology")