Appearance feature extraction versus image transform-based approach for visual speech recognition
Abstract
In this paper we propose a new appearance-based system consisting of two stages: visual speech feature extraction and classification, followed by recognition of the extracted features. Together these stages form a complete lip-reading system. The system employs our Hyper Column Model (HCM) approach to extract and classify the visual features and uses a Hidden Markov Model (HMM) for recognition. This paper mainly addresses the first stage, i.e. feature extraction and classification. We investigate the performance of HCM on this task and compare it with the performance obtained when HCM is replaced by the Fast Discrete Cosine Transform (FDCT). Unlike FDCT, HCM can extract the full set of features without loss. The experiments also show that HCM is generally better than FDCT and provides a good distribution of the phonemes in the feature space for recognition purposes. For a fair comparison, two databases are used, each at three different resolutions. One of the two databases is designed to include shifted and scaled objects. Experiments reveal that HCM is capable of recovering from and dealing with such image restrictions, whereas the effectiveness of FDCT drops drastically, especially for new subjects.
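To make the transform-based baseline concrete, the following is a minimal sketch of DCT feature extraction from a grayscale mouth-region image. It is an illustration under assumptions, not the authors' implementation: the abstract does not say how coefficients are selected, so keeping the top-left low-frequency block is assumed here, and the function names `dct2` and `low_frequency_features` and the parameter `keep` are hypothetical. For readability the code uses a direct DCT-II rather than a fast algorithm; a fast DCT computes the same coefficients more efficiently.

```python
import math


def dct2(block):
    """Direct (not fast) orthonormal 2-D DCT-II of a list-of-lists image."""
    rows, cols = len(block), len(block[0])

    def basis(n, k, size):
        # Orthonormal DCT-II basis function of frequency k at sample n.
        scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
        return scale * math.cos(math.pi * (2 * n + 1) * k / (2 * size))

    coeffs = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for y in range(rows):
                for x in range(cols):
                    total += block[y][x] * basis(y, u, rows) * basis(x, v, cols)
            coeffs[u][v] = total
    return coeffs


def low_frequency_features(image, keep=3):
    """Flatten the top-left keep x keep DCT coefficients into a feature vector.

    Assumption: low-frequency truncation is one common way to build DCT
    features; the abstract does not state the selection rule actually used.
    """
    coeffs = dct2(image)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]
```

Discarding the remaining coefficients makes this kind of feature vector lossy. That is one plausible reading of the contrast the abstract draws with HCM, although the abstract itself does not explain where the loss in FDCT arises.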