Interactive Learning of Spoken Words and Their Meanings Through an Audio-Visual Interface
Abstract
This paper presents a new interactive learning method for spoken word acquisition through human-machine audio-visual interfaces. During the course of learning, the machine makes a decision about whether an orally input word is a word in the lexicon the machine has learned, using both speech and visual cues. Learning is carried out on-line, incrementally, based on a combination of active and unsupervised learning principles. If the machine judges with a high degree of confidence that its decision is correct, it learns the statistical models of the word and a corresponding image category as its meaning in an unsupervised way. Otherwise, it asks the user a question in an active way. The function used to estimate the degree of confidence is also learned adaptively on-line. Experimental results show that the combination of active and unsupervised learning principles enables the machine and the user to adapt to each other, which makes the learning process more efficient.
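The confidence-gated loop described above (learn unsupervised when confident, otherwise ask the user) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear cue combination, the `weight` and `threshold` parameters, and all function names are assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field

@dataclass
class Lexicon:
    # word -> list of (speech, visual) observations; stands in for the
    # statistical speech and image-category models learned per word
    words: dict = field(default_factory=dict)

def confidence(speech_score: float, visual_score: float, weight: float) -> float:
    """Combine speech and visual cues into one confidence estimate.
    The linear combination and `weight` are illustrative assumptions;
    in the paper this function is itself adapted on-line."""
    return weight * speech_score + (1.0 - weight) * visual_score

def learning_step(lexicon, word, speech_score, visual_score,
                  weight=0.5, threshold=0.8, ask_user=None):
    """One on-line, incremental step mixing unsupervised and active learning."""
    c = confidence(speech_score, visual_score, weight)
    if c >= threshold:
        # High confidence: accept the decision and learn unsupervised
        lexicon.words.setdefault(word, []).append((speech_score, visual_score))
        return "learned", c
    # Low confidence: active learning -- query the user before updating
    if ask_user is not None and ask_user(word):
        lexicon.words.setdefault(word, []).append((speech_score, visual_score))
        return "confirmed", c
    return "rejected", c
```

With a hypothetical threshold of 0.8, a high-scoring input is learned silently, while an ambiguous one triggers a question; the mutual adaptation reported in the experiments would come from updating `weight` (and the user's behavior) over many such steps.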
- A paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-02-01
Authors
Related Papers
- Learning, Generation and Recognition of Motions by Reference-Point-Dependent Probabilistic Models
- Interactive Learning of Spoken Words and Their Meanings Through an Audio-Visual Interface
- Situated Spoken Dialogue with Robots Using Active Learning
- Unsupervised Segmentation of Human Motion Data Using a Sticky Hierarchical Dirichlet Process-Hidden Markov Model and Minimal Description Length-Based Chunking Method for Imitation Learning