Automatic Language Identification Using Sequential Information of Phonemes
Abstract
This paper describes approaches to language identification based on the sequential information of phonemes. These approaches assume that each language can be identified from its own phoneme structure, or phonotactics. To extract this phoneme structure, we use a phoneme classifier and a grammar for each language. The phoneme classifier for each language is implemented as a multi-layer perceptron trained on quasi-phonetic hand-labeled transcriptions. After the phoneme classifiers are trained, the grammar for each language is calculated as a set of transition probabilities for each phoneme pair. Because of the interest in automatic language identification for worldwide voice communication, we used telephone speech for this study. The data were drawn from the OGI (Oregon Graduate Institute)-TS (telephone speech) corpus, a standard corpus for this type of research. To investigate the basic issues of this approach, two languages, Japanese and English, were selected. The language classification algorithms are based on a Viterbi search constrained by a bigram grammar and by minimum and maximum durations. Using a phoneme classifier trained only on English phonemes, we achieved 81.1% accuracy. Using a phoneme classifier trained on Japanese phonemes, we achieved 79.3% accuracy. Using the English and Japanese phoneme classifiers together, we obtained our best result: 83.3%. Our results were comparable to those obtained by other methods, such as those based on hidden Markov models.
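The idea of a per-language grammar made of phoneme-pair transition probabilities can be illustrated with a minimal sketch. This is not the paper's implementation: it omits the multi-layer perceptron classifiers, the Viterbi search, and the duration constraints, and instead scores an already-decoded phoneme sequence directly. The function names (`train_bigram`, `log_likelihood`, `identify`), the add-alpha smoothing, the five-symbol alphabet, and the toy training sequences are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def train_bigram(sequences, alphabet, alpha=1.0):
    """Estimate P(next | prev) for every phoneme pair.

    Add-alpha smoothing is an assumption of this sketch; it keeps
    unseen pairs from getting zero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    probs = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + alpha * len(alphabet)
        probs[prev] = {nxt: (counts[prev][nxt] + alpha) / total
                       for nxt in alphabet}
    return probs

def log_likelihood(seq, probs):
    """Sum of log transition probabilities along the phoneme sequence."""
    return sum(math.log(probs[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))

def identify(seq, grammars):
    """Return the language whose bigram grammar scores the sequence highest."""
    return max(grammars, key=lambda lang: log_likelihood(seq, grammars[lang]))

# Hypothetical toy data: a tiny shared alphabet and made-up sequences.
# The "Japanese" examples alternate consonants and vowels; the "English"
# examples contain consonant clusters.
alphabet = ["k", "a", "s", "t", "i"]
training = {
    "Japanese": [list("kasati"), list("takisa")],
    "English": [list("staks"), list("asktist")],
}
grammars = {lang: train_bigram(seqs, alphabet) for lang, seqs in training.items()}

print(identify(list("saki"), grammars))
print(identify(list("ast"), grammars))
```

In this toy setup, a consonant-vowel sequence such as `saki` scores higher under the grammar trained on the alternating sequences, while a cluster such as `ast` scores higher under the other. A system like the one described above would apply such grammars as constraints during decoding rather than to a finished transcription.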
- Published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1995-06-25
Authors
-
Arai Takayuki
Department Of Chemistry Faculty Of Engineering Gunma University
-
Arai Takayuki
Department Of Electrical And Electronic Engineering Sophia University
Related papers
- Perception of speaker identity and its relation to the phonological features (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Inverse correlation of intelligibility of speech in reverberation with the amount of overlap-masking(ACOUSTICAL LETTER)
- Decreasing speaking-rate with steady-state suppression to improve speech intelligibility in reverberant environments
- The Effects of Speech-Rate Slowing for Improving Speech Intelligibility in Reverberant Environments (International Workshop Frontiers in Speech and Hearing Research)
- Suppressing steady-state portions of speech for improving intelligibility in various reverberant environments
- Effects of suppressing steady-state portions of speech on intelligibility in reverberant environments
- Synthesis and Properties of Fluorine-Containing Poly(arylenemethylene)s as New Heat Resistant Denatured Phenolic Resins
- Synthesis and Brain Regional Distribution of [^C]NPS 1506 in Mice and Rat: an N-Methyl-D-aspartate (NMDA) Receptor Antagonist(Medicinal Chemistry)
- Speech Processing for Hearing-Impaired Listeners Considering Threshold Elevation in the Critical Band with an Expanded Auditory Filter (International Workshop Frontiers in Speech and Hearing Research)
- Improving Speech Intelligibility for Elderly Listeners by Steady-State Suppression (International Workshop Frontiers in Speech and Hearing Research)
- Perception of long vowels in Japanese by Children
- Inactivation of Rat Cytochrome P450 2D Enzyme by a Further Metabolite of 4-Hydroxypropranolol, the Major and Active Metabolite of Propranolol
- Gel-type tongue for a physical model of the human vocal tract as an educational tool in acoustics of speech production
- Effects of linguistic contents on perceptual speaker identification : Comparison of familiar and unknown speaker identifications
- Comparison of consonant identification improvements by steady-state suppression via a loudspeaker system between with and without natural sounds from a talker in reverberation(Commemoration of the Japan-China Joint Conference on Acoustics 2
- The effect of pre-processing approach for improving speech intelligibility in a hall : Comparison between diotic and dichotic listening conditions
- Idiosyncrasy of nasal sounds in human speaker identification and their acoustic properties
- Visualization of Brain Activities of Single-Trial and Averaged Multiple-Trials MEG Data(Neuro, Fuzzy, GA)(Nonlinear Theory and its Applications)
- Digital pattern playback : Converting spectrograms to sound for educational purposes(introduction to the amazing world of sounds with demonstrations)
- Implementation of Steady-State Suppression Using a Digital Signal Processor for Real-Time Processing--Evaluation of the Processing in an Actual Hall (International Workshop Frontiers in Speech and Hearing Research)
- Steady-state suppression for improving syllable identification in reverberant environments : A case study in an elderly person
- Critical-band based frequency compression for digital hearing aids
- Effect of the Carbamoyl Group Attached to an Axial Ligand Portion of a Novel Bleomycin Model on a Dioxygen Activating Reaction
- Distal Effect of Amide and Amino Groups on the Oxygen Activation Ability and Rate of the Redox Reaction of Simplified Analogs of Bleomycin
- Demonstrations for education in acoustics in Japan(introduction to the amazing world of sounds with demonstrations)
- Comparing the characteristics of the plate and cylinder type vocal tract models
- Speech perception experiment using binaural integration of phonemic and prosodic information
- Analysis of spontaneous Japanese in a multi-language telephone-speech corpus
- Modulation cepstrum discriminating between speech and environmental noise
- Education system in acoustics of speech production using physical models of the human vocal tract(Applied Systems)
- Padding zero into steady-state portions of speech as a preprocess for improving intelligibility in reverberant environments
- Processing of consonant clusters by Japanese native speakers: Influence of English learning backgrounds
- Masking speech with its time-reversed signal
- Human language identification with reduced segmental information
- Effects of stimulus contents and speaker familiarity on perceptual speaker identification
- Sliding three-tube model as a simple educational tool for vowel production(introduction to the amazing world of sounds with demonstrations)
- Lung model and head-shaped model with visible vocal tract as educational tools in acoustics
- What Is Rhythm? Can We Capture Syllable Shapes From Intensity Contours? (International Workshop Frontiers in Speech and Hearing Research)
- Automatic Language Identification Using Sequential Information of Phonemes
- One-pot Synthesis of Permethylated α-CD-based Rotaxanes Having Alkylene Chain Axles and Their Structural Characteristics
- B2-3. Production variation of English schwa and Japanese listeners' perceptual assimilation pattern of English schwa (Summaries of Talks at the 26th General Meeting)
- Identification of English voiceless fricatives in multispeaker babble noise by native Japanese and English listeners: Influence of English proficiency