A Scheme for Word Detection in Continuous Speech Using Likelihood Scores of Segments Modified by Their Context Within a Word
Abstract
In conventional word-spotting methods for automatic recognition of continuous speech, individual frames or segments of the input speech are assigned labels and local likelihood scores solely on the basis of their own acoustic characteristics. In contrast, experiments on human speech perception conducted by the present authors and others show that human perception of words in connected speech is based not only on the acoustic characteristics of individual segments, but also on the acoustic and linguistic contexts in which these segments occur. In other words, humans do not perceive individual segments correctly unless the segments are accompanied by their context. These findings on human speech perception should be applied to automatic speech recognition in order to improve its performance. From this point of view, the present paper proposes a new scheme for detecting words in continuous speech, based on template matching, in which the likelihood of each segment of a word is determined not only by its own characteristics but also by the likelihood of its context within the framework of a word. This is accomplished by modifying the likelihood score of each segment by the likelihood score of its phonetic context, the latter representing the degree of similarity of the context to that of a candidate word in the lexicon. The higher the likelihood score of the context, the greater the enhancement given to the segmental likelihood score. The advantage of the proposed scheme over conventional schemes is demonstrated by an experiment on constructing a word lattice from connected Japanese speech uttered by a male speaker. The result indicates that the scheme is especially effective in giving correct recognition when two or more candidate words have almost equal raw segmental likelihood scores.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1995-06-25
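
The abstract does not give the formula used to modify segmental scores, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that each candidate word has already been aligned to the input and that each of its segments carries a raw likelihood score in the range 0 to 1. The context score (the mean raw score of the other segments in the word), the multiplicative enhancement rule, the weight `alpha`, and all function names are hypothetical choices made for illustration; they are not taken from the paper.

```python
from statistics import mean


def context_scores(raw):
    """For each segment, return the mean raw score of the OTHER segments
    in the same candidate word (an assumed stand-in for the paper's
    'likelihood score of the phonetic context')."""
    n = len(raw)
    if n < 2:
        # A single segment has no within-word context.
        return [0.0] * n
    total = sum(raw)
    return [(total - s) / (n - 1) for s in raw]


def modified_scores(raw, alpha=1.0):
    """Enhance each raw segmental score in proportion to its context score.
    The rule s * (1 + alpha * c) is an assumption: it only guarantees that
    a higher context score gives a larger enhancement."""
    ctx = context_scores(raw)
    return [s * (1.0 + alpha * c) for s, c in zip(raw, ctx)]


def word_score(raw, alpha=1.0):
    """Score a candidate word as the mean of its modified segmental scores."""
    return mean(modified_scores(raw, alpha))


# Two hypothetical candidates with (almost) equal mean raw scores.
candidate_a = [0.6, 0.6, 0.6]   # every segment moderately well matched
candidate_b = [0.9, 0.9, 0.0]   # two good segments, one complete mismatch

if __name__ == "__main__":
    print("raw means:     ", mean(candidate_a), mean(candidate_b))
    print("modified means:", word_score(candidate_a), word_score(candidate_b))
```

With these made-up numbers the two candidates cannot be separated by their raw mean scores, but the context-modified score ranks the uniformly matched candidate higher, because a mismatched segment lowers the context score of its neighbours. This mirrors, in toy form, the case the abstract highlights: candidates that are nearly tied on raw segmental likelihood.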
Authors
-
Ohno Sumio
Department of Applied Electronics, Science University of Tokyo
-
Hirose Keikichi
Department of Applied Electronics, Science University of Tokyo
-
Fujisaki Hiroya
Faculty of Engineering, The University of Tokyo
-
Fujisaki H
University of Tokyo, Tokyo, Japan
Related papers
- Tone Recognition of Continuous Mandarin Speech Based on Tone Nucleus Model and Neural Network
- Automatic alignment of a musical score to performed music
- A Scheme for Word Detection in Continuous Speech Using Likelihood Scores of Segments Modified by Their Context Within a Word
- Temporal organization of segmental features in Japanese disyllables
- The Second Joint Meeting of the Acoustical Society of America and the Acoustical Society of Japan, November 14-18, 1988, Honolulu