Topic Dependent Language Model based on On-Line Voting
Abstract
In this paper, we propose an alternative approach to a topic dependent language model (LM), where the topic is decided by voting in an unsupervised manner. Latent Semantic Analysis (LSA) is employed to reveal hidden (latent) relations among nouns in the context word sequence. To decide the topic of an event, a fixed size word history sequence (window) is observed, and voting is then carried out based on noun class occurrences weighted by a confidence measure. Experiments on the Wall Street Journal corpus and Mainichi Shimbun (Japanese newspaper) corpus show that our proposed method gives better perplexity than the comparative baselines, including a word-based/class-based n-gram LM, their interpolated LM, a cache-based LM, and the Latent Dirichlet Allocation (LDA)-based topic dependent LM.
- 2009-12-14
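The voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `vote_topic`, the noun-to-class table `noun_class`, and the per-noun `confidence` scores are all hypothetical stand-ins. In the paper the noun classes come from LSA and the confidence measure is defined by the authors; here both are simply hand-written dictionaries so that the windowed, weighted vote itself is visible.

```python
from collections import defaultdict


def vote_topic(history, noun_class, confidence, window=5):
    """Pick a topic by confidence-weighted voting over a fixed-size word window.

    history    -- list of words seen so far (most recent last)
    noun_class -- hypothetical mapping noun -> topic class (LSA-derived in the paper)
    confidence -- hypothetical mapping noun -> vote weight (assumed, not the paper's measure)
    window     -- number of most recent words to inspect (must be >= 1)
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    votes = defaultdict(float)
    for word in history[-window:]:
        topic = noun_class.get(word)
        if topic is None:
            continue  # words without a noun class cast no vote
        # Words missing from the confidence table default to a weight of 1.0.
        votes[topic] += confidence.get(word, 1.0)

    if not votes:
        return None  # no known noun in the window, so no topic is decided
    return max(votes, key=votes.get)


# Toy data, invented for illustration only.
noun_class = {
    "stock": "finance", "market": "finance", "bank": "finance",
    "game": "sports", "team": "sports",
}
confidence = {"stock": 0.9, "market": 0.8, "bank": 0.4, "game": 0.7, "team": 0.9}

history = "the team won the game as the stock market fell".split()
print(vote_topic(history, noun_class, confidence, window=5))
```

With a window of five words only "stock" and "market" fall inside the window, so the vote goes to the "finance" class. The selected topic would then choose which topic-specific LM to use for predicting the next word; that part is outside the scope of this sketch.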
Authors
- Welly Naptali, Department of Information and Computer Sciences, Toyohashi University of Technology
- Masatoshi Tsuchiya, Information and Media Center, Toyohashi University of Technology
- Seiichi Nakagawa, Department of Information and Computer Sciences, Toyohashi University of Technology