Automatic Recognition of Nasals

Abstract

A method for recognizing |m| and |n| in monosyllables and words in real time is reported. The method consists of three parts: segmentation of the nasal consonant, discrimination between |m| and |n|, and recognition of the following vowel. To extract the nasal part, five conditions are applied. First, comparing the output of a 300 c/s LPF with that of a 500–1600 c/s BPF excludes |e, a, o, u, w| from the nasals. Second, comparing the 500–1600 c/s and 2800–5000 c/s frequency ranges excludes |i, j|. Third, voiceless stops are easily excluded by comparing the output of the 300 c/s LPF with that of a 700 c/s HPF; this circuit is also used for excluding vowels. Fourth, to exclude voiced stops, fricatives and flaps, fundamental-frequency components are extracted after filtering the speech wave with a 700–3000 c/s BPF, and the parts where this output persists continuously are considered likely to be nasal. Fifth, the low-level parts of the original speech wave are excluded. A segment that satisfies these five conditions is judged to be a nasal consonant after the first 12 ms of its onset are discarded. With this method, much of the initial part of the nasal is often missed, but the boundary between the nasal consonant and the following vowel is marked clearly. To discriminate between |m| and |n|, the components at two frequencies just before the boundary between the nasal consonant and the following vowel are compared, which distinguishes |mi| from |ni| and |me| from |ne|. For nasals followed by |a, o, u|, the F_2 loci are used directly. Although |me| and |ne| in words are not discriminated satisfactorily because of differences between individual speakers, the other cases are recognized at a rate of over 80% for three male voices.
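The five segmentation conditions can be illustrated with a minimal Python sketch. It is not the authors' circuit: the abstract does not say which filter output must exceed which, so the comparison directions below are assumptions based on nasals having most of their energy at low frequencies. The `Frame` type, its field names, the 4 ms frame length and the `level_floor` threshold are likewise hypothetical, and the band levels are assumed to have been measured already.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """Per-frame measurements (hypothetical layout, not from the paper)."""
    lp300: float        # output level of the 300 c/s low-pass filter
    bp500_1600: float   # output level of the 500-1600 c/s band-pass filter
    bp2800_5000: float  # output level of the 2800-5000 c/s band
    hp700: float        # output level of the 700 c/s high-pass filter
    periodic: bool      # fundamental component found after the 700-3000 c/s BPF
    level: float        # overall level of the original speech wave


def is_nasal_like(f: Frame, level_floor: float) -> bool:
    # The direction of each comparison is an assumption of this sketch.
    return (
        f.lp300 > f.bp500_1600            # excludes |e, a, o, u, w|
        and f.bp500_1600 > f.bp2800_5000  # excludes |i, j|
        and f.lp300 > f.hp700             # excludes voiceless stops
        and f.periodic                    # excludes voiced stops, fricatives, flaps
        and f.level > level_floor         # excludes low-level parts
    )


def nasal_segments(
    frames: List[Frame],
    frame_ms: float = 4.0,       # assumed frame length
    level_floor: float = 0.1,    # assumed level threshold
    onset_skip_ms: float = 12.0, # onset discarded, as stated in the abstract
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of nasal segments."""
    skip = math.ceil(onset_skip_ms / frame_ms)
    segments: List[Tuple[int, int]] = []
    run_start: Optional[int] = None
    # A trailing None acts as a sentinel that closes a run at the end of input.
    for i, f in enumerate(list(frames) + [None]):
        ok = f is not None and is_nasal_like(f, level_floor)
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            start = run_start + skip  # leave out the first 12 ms of the run
            if start < i:
                segments.append((start, i))
            run_start = None
    return segments
```

Because the onset is trimmed and the end of each run is kept, the sketch reproduces the behaviour described in the abstract: the start of the nasal is lost, while the boundary with the following vowel is kept at the end index.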
Papers from Yamanashi University