A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
Abstract
In this paper, we propose a rapid model adaptation technique for emotional speech recognition that enables us to extract paralinguistic information as well as the linguistic information contained in speech signals. The technique is based on style estimation and style adaptation using a multiple-regression HMM (MRHMM). In the MRHMM, the mean parameters of the output probability density function are controlled by a low-dimensional parameter vector, called a style vector, which corresponds to the set of explanatory variables of the multiple regression. The recognition process consists of two stages. In the first stage, the style vector that represents the emotional expression category and the intensity of its expressiveness for the input speech is estimated on a sentence-by-sentence basis. In the second stage, the acoustic models are adapted using the estimated style vector, and standard HMM-based speech recognition is then performed. We assess the performance of the proposed technique in the recognition of simulated emotional speech uttered by both professional narrators and non-professional speakers.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2010-01-01
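The two-stage process described in the abstract can be illustrated with a minimal sketch. The record above gives no formulas, so everything below is an assumption made for illustration: each state mean is taken to be a linear function of the style vector, mean_i = bias_i + reg_i · v, the frame-to-state alignment is assumed to be already known, and the style vector is estimated in closed form by weighted least squares. The names `estimate_style_vector`, `adapt_means`, `bias`, `reg`, and `precision` are hypothetical and are not taken from the paper.

```python
import numpy as np


def estimate_style_vector(obs, states, bias, reg, precision):
    """Stage 1 (sketch): estimate the style vector v for one utterance.

    Assumes mean_i = bias[i] + reg[i] @ v and a fixed frame-to-state
    alignment, so maximizing the Gaussian likelihood in v reduces to a
    weighted least-squares problem.

    obs:       (T, D) observation vectors
    states:    (T,)   state index assigned to each frame (assumed given)
    bias:      (N, D) bias term of each state's mean
    reg:       (N, D, L) regression matrix of each state
    precision: (N, D, D) inverse covariance of each state
    """
    L = reg.shape[2]
    A = np.zeros((L, L))
    c = np.zeros(L)
    for o_t, i in zip(obs, states):
        BtP = reg[i].T @ precision[i]      # (L, D)
        A += BtP @ reg[i]                  # accumulate normal-equation matrix
        c += BtP @ (o_t - bias[i])         # accumulate right-hand side
    return np.linalg.solve(A, c)


def adapt_means(bias, reg, v):
    """Stage 2 (sketch): shift every state mean using the estimated v."""
    return bias + reg @ v                  # (N, D, L) @ (L,) -> (N, D)
```

In this sketch the adapted means would then be plugged into an ordinary HMM recognizer for the second stage. In a real system the alignment itself would come from a decoding or forward-backward pass rather than being supplied directly.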
Authors
-
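Ijima Yusuke
Department of Information Processing, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology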
-
Masuko Takashi
Cell Biology, Faculty of Pharmacy, Kinki University
-
Tachibana Makoto
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology:(present
-
Masuko Takashi
Department Of Agricultural And Biological Chemistry College Of Bioresource Sciences Nihon University
-
NOSE Takashi
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
Tachibana Makoto
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Nose Takashi
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Kobayashi Takao
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
IJIMA Yusuke
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
Matsuyama Taiji
Department Of Pharmacy Shizuoka Kousei Hospital
-
Ijima Yusuke
Interdisciplinary Graduate School Of Science And Engineering Tokyo Institute Of Technology
-
Ijima Yusuke
NTT Cyber Space Laboratories, Nippon Telegraph and Telephone Corporation
-
Ijima Yusuke
NTT Cyber Space Laboratories, Nippon Telegraph and Telephone Corporation
Related papers
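- Speaking Style Classification of Spontaneous Speech Based on Multiple-Regression HMM (Pronunciation Evaluation, Recognition, Understanding, Dialogue, General)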
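- An Acoustic Model Training Method for Speech Recognition Using Style Estimation Based on Multiple-Regression HMM (Speech Recognition / Acoustic Models, 10th Spoken Language Symposium)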
- An Online Adaptation Technique for Acoustic Models Based on Style Estimation (Recognition, Understanding, Dialogue, General)
- Enhancement of Veratridine-Induced Sodium Dynamics in NG108-15 Cells during Differentiation(Pharmacology)
- Antibody epitope peptides as potential inducers of IgG antibodies against CD98 oncoprotein
- Identification of cell proliferation-associated epitope on CD98 oncoprotein using phage display random peptide library
- Molecular Structural and Functional Characterization of Tumor Suppressive Anti-ErbB-2 Monoclonal Antibody by Phage Display System
- Immunohistochemical expression and pathogenesis of BLM in the human brain and visceral organs
- Phage Display Cloning and Characterization of Monoclonal Antibody Genes and Recombinant Fab Fragment against the CD98 Oncoprotein
- Colocalization of CP125/CD98 with Tropomyosin Isoforms at the Cell-Cell Adhesion Boundary
- Identification and Immunological Characterization of a Novel 40-kDa Protein Linked to CD98 Antigen
- A Style Control Technique for HMM-Based Expressive Speech Synthesis(Speech and Hearing)
- A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features(Speech Synthesis, Statistical Modeling for Speech Processing)
- Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing(Life-like Agent and its Communication)
- Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis(Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- Identification of Truncated Human Glutamate Transporter
- Characterization and In Vitro Cytotoxic Effect of Adriamycin-conjugated Monoclonal Antibody Prepared Against Breast Cancer Cell Line
- Characterization of A New Breast Cancer-Associated Antigen and Its Relationship to MUC1 and TAG-72 Antigens
- Characterization of Cell Surface Antigens Expressed in the HMA-1 Breast Cancer Cell Line
- Effects of Dexamethasone and Aminophylline on Survival of Jurkat and HL-60 Cells(Pharmacology)
- Malate dehydrogenases from nitrifying bacteria : purification and properties
- Ribulose-1,5-Bisphosphate Carboxylase/Oxygenase from a Nitrite-Oxidizing Chemoautotroph, Nitrobacter agilis ATCC 14123 : Purification and Properties
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features
- Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution (Special Issue on Biometric Person Authentication)
- Application of Fluorescence Polarization Immunoassay for Determination of Methotrexate-Polyglutamates in Rheumatoid Arthritis Patients
- Homotypic Adhesion through Carcinoembryonic Antigen Plays a Role in Hepatic Metastasis Development
- SUSCEPTIBILITY OF ANIMALS TO HEPATOCARCINOGENIC AROMATIC AMINES CORRELATES WITH THE INDUCTION OF THE CARCINOGEN ACTIVATION ENZYME (S) WITH THE AMINES
- Robust F0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency(Speech and Hearing)
- Intracellular Localization of UDP-Glucuronosyltransferase Expressed from the Transfected cDNA in Cultured Cells
- Significance of integrin αvβ5 and erbB3 in enhanced cell migration and liver metastasis of colon carcinomas stimulated by hepatocyte-derived heregulin
- Dihydrofolate Reductase Gene Intronic 19-bp Deletion Polymorphisms in a Japanese Population
- A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
- HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
- Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training(Speech and Hearing)
- A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM
- HMM-Based Voice Conversion Using Quantized F0 Context
- Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM(Life-like Agent and its Communication)
- Analysis of the Relationship between Ease of Listening to Noise-Corrupted Speech and Acoustic Features (Speech)
- FOREWORD
- Analysis of the Relationship between Ease of Listening to Noise-Corrupted Speech and Acoustic Features
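- A context clustering technique for improvement of tone intelligibility of average-voice-based Thai speech synthesis (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- A Study on Automatic Assignment of Local Prosodic Contexts for Emphasized Speech Synthesis (General, Speech Perception and Production / Auditory Communication, General)
- Analysis of the Relationship between Ease of Listening to Noise-Corrupted Speech and Acoustic Features (General, Speech Perception and Production / Auditory Communication, General)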
- Speaker interpolation for HMM-based speech synthesis system
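- A Study on Multi-Class Local Prosodic Contexts for Diverse Prosody Generation (Organized Session "Toward Synthesis of Diverse Speech and Singing Voices", Speech, Language, Dialogue, General)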