Noise Robust Speech Recognition Using F_0 Contour Information (Special Section: Speech Dynamics by Ear, Eye, Mouth and Machine)
Abstract
This paper proposes a noise-robust speech recognition method that uses prosodic information. In Japanese, the fundamental frequency (F_0) contour represents phrase intonation and word accent, and it therefore conveys information about prosodic phrase and word boundaries. The paper first describes a noise-robust F_0 extraction method based on the Hough transform, which achieves high extraction rates in various noise environments. It then proposes a robust speech recognition method using multi-stream HMMs that model both segmental spectral information and F_0 contour information. Speaker-independent experiments are conducted on connected digits uttered by 11 male speakers under various noise types and SNR conditions. The recognition error rate is reduced in all noise conditions, and the largest absolute improvement in digit accuracy is about 4.5%. This improvement is achieved through robust digit boundary detection using the prosodic information.
- Published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2004-05-01
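The multi-stream HMM mentioned in the abstract models a spectral stream and an F_0 stream together. The following is a minimal sketch of one common way to combine two streams, assuming that each stream's output log-likelihood is scaled by a stream weight and the results are summed. The abstract does not state the weighting scheme, so the function names, the 1-D Gaussian stand-in distributions, and all numeric values below are hypothetical illustrations rather than the authors' implementation.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian, used here as a stand-in for a
    stream's state output distribution (an assumption for illustration)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def multistream_loglik(stream_logliks, stream_weights):
    """Weighted sum of per-stream log-likelihoods:
    log b(o) = sum_s w_s * log b_s(o_s)."""
    if len(stream_logliks) != len(stream_weights):
        raise ValueError("one weight per stream is required")
    return sum(w * ll for w, ll in zip(stream_weights, stream_logliks))


# Hypothetical observations and model parameters for a single HMM state.
spectral_ll = gaussian_logpdf(0.3, mean=0.0, var=1.0)  # spectral stream
f0_ll = gaussian_logpdf(-0.1, mean=0.0, var=0.5)       # F_0 stream
weights = (0.8, 0.2)                                   # assumed stream weights

combined = multistream_loglik((spectral_ll, f0_ll), weights)
print(combined)
```

In this formulation, raising the F_0 weight makes the decoder rely more on the prosodic stream; how the weights would be chosen for a given noise condition is not specified in the abstract.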
Authors
-
SEKI Takahiro
Department of Molecular and Pharmacological Neuroscience, Graduate School of Biomedical Sciences, Hi
-
FURUI Sadaoki
Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Instit
-
IWANO Koji
Department of Computer Science, Tokyo Institute of Technology
-
Seki Takahiro
Department Of Computer Science Tokyo Institute Of Technology:(present Address)ibm Global Service-jap
-
Seki Takahiro
Department Of Applied Physics Nagoya University
Related papers
- The C-Terminal Region of Serotonin Transporter Is Important for Its Trafficking and Glycosylation
- Fragmentation of Protein Kinase N (PKN) in the Hydrocephalic Rat Brain
- Antiepileptic Effects of Single and Repeated Oral Administrations of S-312-d, a Novel Calcium Channel Antagonist, on Tonic Convulsions in Spontaneously Epileptic Rats
- Electrophysiological Characterization of Nicotine-Induced Excitation of Dopaminergic Neurons in the Rat Substantia Nigra
- Perospirone, a Novel Antipsychotic Agent, Hyperpolarizes Rat Dorsal Raphe Neurons via 5-HT_ Receptor
- Gait-based Person Identification Robust against Speed Variation using CHLAC features and HMMs
- Congo Red, an Amyloid-Inhibiting Compound, Alleviates Various Types of Cellular Dysfunction Triggered by Mutant Protein Kinase Cγ That Causes Spinocerebellar Ataxia Type 14 (SCA14) by Inhibiting Oligomerization and Aggregation
- Fabrication of Monolayer Organized Films of Thermoresponsive Gel Microparticles Using a Two-Step Template Polymerization Method
- Systemic T Cell Large Granular Lymphocyte Lymphoma with Multifocal White Matter Degeneration in the Brain of a Japanese Domestic Cat
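- Photoalignment Control of Mesoporous Materials
- Aligning Mesoporous Materials with Light
- Color Patterning by Chemical Modification of Base-Amplifying Polymers
- PB19 Nanostructured Hybrid Films Composed of Lyotropic Liquid Crystalline Dyes and Silica (Topical Session: Frontiers of Liquid Crystal Property Measurement, 2005 Japanese Liquid Crystal Society Symposium)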
- Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation(Speech and Hearing)
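- Properties of Base-Amplifying Polymers and Their Application to Photoimaging Materials
- Novel Water-Developable Photoimaging Materials Composed of Base-Amplifying Polymers and Photobase Generators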
- Long-lasting Antiepileptic Effects of Levetiracetam against Epileptic Seizures in the Spontaneously Epileptic Rat (SER) : Differentiation of Levetiracetam from Conventional Antiepileptic Drugs
- Separation of Antiepileptogenic and Antiseizure Effects of Levetiracetam in the Spontaneously Epileptic Rat (SER)
- Robust Scene Extraction Using Multi-Stream HMMs for Baseball Broadcast(Image Processing and Video Processing)
- Automatic recognition of Indonesian declarative questions and statements using polynomial coefficients of the pitch contours
- Initial evaluation of the drivers' Japanese speech corpus in a car environment (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Structure and Characteristics of an Endo-β-1,4-glucanase, Isolated from Trametes hirsuta, with High Degradation to Crystalline Cellulose
- Accent analysis for Mandarin large vocabulary continuous speech recognition (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Evaluation of a Noise-Robust Multi-Stream Speaker Verification Method Using F_0 Information
- Topic Extraction Based on Continuous Speech Recognition in Broadcast News Speech
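- Highly Sensitive Photoinduced Mass Transfer in Liquid Crystalline Azobenzene Polymers
- Highly Sensitive Photoinduced Surface Relief Formation and Its Applications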
- Is bone density in the distal femur affected by use of cement and by femoral component design in total knee arthroplasty?
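- Seki Laboratory, Graduate School of Engineering, Nagoya University
- Report on the 2004 Japanese Liquid Crystal Society Conference and Symposium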
- Two-Dimensional Manipulation of Poly(3-dodecylthiophene) using Light-Driven Instant Mass Migration as a Molecular Conveyer
- Interfacial Organization and Photofunctions of Polymers
- Dynamic Photoresponsive Functions in Organized Layer Systems Comprised of Azobenzene-containing Polymers
- Photoalignment Control of Polymer Thin Films and Mesoporous Silica Based on Surface Transfer from Photoresponsive Monolayers
- Role of Hydrogen Bonding in Azobenzene-Urea Assemblies. Structural Evaluations of Multilayers on Solid Substrates
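- Report on the 2003 Liquid Crystal Society Summer School
- Humidity Responsiveness of the Packing State of Monolayers of Azobenzene Derivatives with Urea Head Groups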
- Enhanced colitis-associated colon carcinogenesis in a novel Apc mutant rat
- Bremazocine Recognizes the Difference in Four Amino Acid Residues to Discriminate Between a Nociceptin/Orphanin FQ Receptor and Opioid Receptors
- Reversible Photoswitching Liquid-phase Adsorption on Azobenzene Derivative-grafted Mesoporous Silica
- Smart Photoresponsive Polymer Systems Organized in Two Dimensions
- Elucidation of the Molecular Mechanism and Exploration of Novel Therapeutics for Spinocerebellar Ataxia Caused by Mutant Protein Kinase Cγ
- Endomorphin-1 Discriminates the μ-Opioid Receptor From the δ-and κ-Opioid Receptors by Recognizing the Difference in Multiple Regions
- Surface-mediated photoalignment of organic/inorganic nanohybrids(Global Innovation in Advanced Ceramics)
- In situ Polymerization of Liquid Crystalline Monomers within Photoaligned Mesoporous Silica Thin Film
- Possible Involvement of Descending Serotonergic Systems in Antinociception by Centrally Administered Elcatonin in Mice
- Recent advances in hydrogels in terms of fast stimuli responsiveness and superior mechanical performance
- Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System(Speech and Hearing)
- Language Models for Continuous Speech Recognition
- Dynamic Bayesian Network-Based Acoustic Models Incorporating Speaking Rate Effects(Speech and Hearing)
- Light-directed Dynamic Structure Formation and Alignment in Photoresponsive Thin Films
- Neural-network-based HMM adaptation for noisy speech recognition
- Charge Transport Anisotropy due to Interfacial Molecular Orientation in Polymeric Transistors with Controlled In-Plane Chain Orientation
- Speaker Verification Using MMAP Adaptation (Natural Language Understanding and Communication)
- Speaker Verification Using MMAP Adaptation (Speech)
- Subject Adaptation and Adaptive Training for Gait-based Person Identification
- Two-pass Approach for Recognizing Code-Switching Speech
- A versatile photochemical procedure to introduce a photoreactive molecular layer onto a polyimide film for liquid crystal alignment
- Long-Term Exposure of RN46A Cells Expressing Serotonin Transporter (SERT) to a cAMP Analog Up-regulates SERT Activity and Is Accompanied by Neural Differentiation of the Cells
- Speaker Verification Using MMAP Adaptation
- Effects of the Chemical Chaperone 4-Phenylbutylate on the Function of the Serotonin Transporter (SERT) Expressed in COS-7 Cells