LVCSR based on context-dependent syllable acoustic models (Speech) -- (International Workshop "Asian Workshop on Speech Science and Technology")
Abstract
We propose an effective and accurate inter-word context-dependent modeling method for large vocabulary continuous speech recognition (LVCSR). As is well known, intra-word context-dependent modeling can be realized by describing the context-dependent syllables in the dictionary. However, it usually suffers from limited accuracy because it does not model inter-syllable pronunciation variations. Our laboratory previously proposed the combined use of a linear lexicon and a tree-structured lexicon in a 1-best approximation search algorithm for LVCSR. For the linear lexicon, we only need to branch the head syllable according to its contexts, and the paths are merged at the second syllable. For the tree-structured lexicon, branches are made in a similar way. At the end node of a word, the language scores have to be compensated by considering the inter-word context, but the scores of contexts other than that of the best history are lost because of the merge at the second syllable. To solve this problem, we introduce the 'likelihood difference index'. We also investigate the effect of rescoring with the acoustic model (AM) and the language model (LM) in the 2nd pass. The proposed algorithms were evaluated on the JNAS and CSJ corpora. They yielded a remarkable improvement in recognition performance, and rescoring with the context-dependent syllable acoustic models in the 2nd pass achieved a further improvement even though the same acoustic models were used in the 1st pass.
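The dictionary-based expansion and the head-syllable branching described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which context is used, so the code assumes left-context-dependent syllables, and the `prev-cur` unit notation and the function names `intra_word_units` and `head_branches` are hypothetical. The 'likelihood difference index' is not defined in the abstract and is not modeled here.

```python
from typing import Dict, List


def intra_word_units(syllables: List[str]) -> List[str]:
    """Rewrite a word's syllables as left-context-dependent units.

    Assumption: context means the preceding syllable, written "prev-cur".
    The head syllable has no intra-word left context, so it stays
    context-independent in this intra-word-only expansion.
    """
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    units = [syllables[0]]
    for prev, cur in zip(syllables, syllables[1:]):
        units.append(f"{prev}-{cur}")
    return units


def head_branches(syllables: List[str], left_contexts: List[str]) -> Dict[str, List[str]]:
    """Branch only the head syllable on the preceding word's final syllable.

    Every branch shares the same units from the second syllable onward,
    which corresponds to the point where the paths merge.
    """
    tail = intra_word_units(syllables)[1:]
    return {ctx: [f"{ctx}-{syllables[0]}"] + tail for ctx in left_contexts}


if __name__ == "__main__":
    word = ["to", "yo", "ha", "shi"]
    print(intra_word_units(word))
    print(head_branches(word, ["sil", "ga"]))
```

In this sketch each inter-word context adds only one extra unit per word (the head syllable), which is why branching can stay cheap; the cost, as the abstract notes, is that per-context scores are no longer distinguishable after the merge.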
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-13
Authors
-
Nakagawa Seiichi
Department of Information and Computer Sciences, Toyohashi University of Technology
-
Zhang Jian
Department of Information and Computer Sciences, Toyohashi University of Technology
-
WANG Longbiao
Department of Information and Computer Sciences, Toyohashi University of Technology
Related papers
- Partial vs Full Coverage for Tandem Lesions in Culprit Vessel During Primary Coronary Intervention in Patients With Acute ST-Elevation Myocardial Infarction : The PERFECT-AMI Study
- Topic dependent language model based on on-line voting (Natural Language Understanding and Communication)
- A transitive translation for Indonesian-Japanese CLQA (Natural Language Processing)
- A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System(Natural Language Processing)
- Indonesian-Japanese Transitive Translation using English for CLIR
- Resting Energy Expenditure and Substrate Metabolism in Chinese Patients with Acute or Chronic Hepatitis B or Liver Cirrhosis
- SF-085-3 The Relationship of VEGF in the Serum with Invasion and Metastasis in Gastric Carcinoma
- The existence of CD11c^+ sentinel and F4/80^+ interstitial dendritic cells in dental pulp and their dynamics and functional properties
- A Pharmacokinetic Study of Intramuscular Administration of Bulleyaconitine A in Healthy Volunteers(Pharmacology)
- Papillary adenocarcinoma of the lung is a more advanced adenocarcinoma than bronchioloalveolar carcinoma that is composed of two distinct histological subtypes
- Immunohistochemical expression of 14-3-3 sigma protein in various histological subtypes of uterine cervical cancers
- Topic dependent language model based on on-line voting (Speech)
- Topic dependent language model based on clustering of noun word history
- Word and class dependency of N-gram language model (Spoken Language Information Processing)
- Word and class dependency of N-gram language model (Natural Language Understanding and Communication / 9th Spoken Language Symposium)
- Word and class dependency of N-gram language model (Speech / 9th Spoken Language Symposium)
- TEXT-INDEPENDENT SPEAKER IDENTIFICATION ON TIMIT DATABASE
- Lateral Current Crowding in Deep UV Light Emitting Diodes over Sapphire Substrates
- 324 nm Light Emitting Diodes With MilliWatt Powers : Semiconductors
- Stripe Geometry Ultraviolet Light Emitting Diodes at 305 Nanometers Using Quaternary AlInGaN Multiple Quantum Wells : Semiconductors
- Quaternary AlInGaN Multiple Quantum Wells for Ultraviolet Light Emitting Diodes : Semiconductors
- Text-Independent/Text-Prompted Speaker Recognition by Combining Speaker-Specific GMM with Speaker Adapted Syllable-Based HMM(Speaker Recognition, Statistical Modeling for Speech Processing)
- Increased Serum Glycated Albumin Level is Associated With the Presence and Severity of Coronary Artery Disease in Type 2 Diabetic Patients
- Orbital Period Study of the RS CVn-Type Binary WW Draconis
- Laparoscopic Management of Recurrent Adhesive Small-Bowel Obstruction : Long-Term Follow-Up
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- LVCSR based on context-dependent syllable acoustic models (Speech) -- (International Workshop "Asian Workshop on Speech Science and Technology")
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN (Speech) -- (International Workshop "Asian Workshop on Speech Science and Technology")
- Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition
- Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
- LVCSR based on context-dependent syllable acoustic models
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN
- Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task(Spoken Language Systems, Corpus-Based Speech Technologies)
- An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems(Spoken Language Systems, Corpus-Based Speech Technologies)
- Speaker Change Detection and Speaker Clustering Using VQ Distortion Measure
- Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs
- Succeeding Word Prediction for Speech Recognition Based on Stochastic Language Model
- Observer-Based Semi-Time-Optimal Control for Reactor Power of Boiling Water Reactors
- A Survey on Automatic Speech Recognition(Special Issue on the 2000 IEICE Excellent Paper Award)
- Inhibition by Adrenomedullin of the Adrenergic Neurogenic Response in Canine Mesenteric Arteries
- Relationship among Recognition Rate, Rejection Rate and False Alarm Rate in a Spoken Word Recognition System
- Pharmacokinetics and Mechanism of Intestinal Absorption of JBP485 in Rats
- Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions
- Distant Speech Recognition Using a Microphone Array Network
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Continuous Speech Recognition Using an On-Line Speaker Adaptation Method Based on Automatic Speaker Clustering (Special Issue on Speech Information Processing)
- TNF-α in Hypothalamic Paraventricular Nucleus Contributes to Sympathoexcitation in Heart Failure by Modulating AT1 Receptor and Neurotransmitters
- Suppression of N-Myc Downstream-Regulated Gene 2 Is Associated with Induction of Myc in Colorectal Cancer and Correlates Closely with Differentiation
- Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
- A310 Experimental Study on Gasification of Biomass Using a Pebble Bed Slagging Gasifier
- A Spoken Dialog System for Spontaneous Conversations Considering Response Timing and Response Type
- Surgical Treatment of Multivessel Lesions in Takayasu's Arteritis : Report of a Case
- Roles of Hypothalamic Subgroup Histamine and Orexin Neurons on Behavioral Responses to Sleep Deprivation Induced by the Treadmill Method in Adolescent Rats
- Biomechanical Study of Anterior Cervical Corpectomy and Step-Cut Grafting With Bioabsorbable Screws Fixation in Cadaveric Cervical Spine Model
- Novel Method for Simulating Optical Properties of Reflective Cholesteric Liquid Crystal Displays : Optical Properties of Condensed Matter
- Numerical Modeling of Nucleation and Growth of Inclusions in Molten Steel Based on Mean Processing Parameters
- Rapid and sensitive determination of vinorelbine in human plasma by liquid chromatography-tandem mass spectrometry and its pharmacokinetic application
- Variation in Chemical Composition and Antibacterial Activities of Essential Oils from Two Species of Houttuynia THUNB
- Class-Based N-Gram Language Model for New Words Using Out-of-Vocabulary to In-Vocabulary Similarity
- Catalytic removal of acetaldehyde in saliva by a Gluconobacter strain(MICROBIAL PHYSIOLOGY AND BIOTECHNOLOGY)
- Semi-Time-Optimal Control of Boiling Water Reactor.
- Catalytic removal of acetaldehyde in saliva by a Gluconobacter strain