An Improved Classification Strategy for Filtering Relevant Tweets Using Bag-of-Word Classifiers (Preprint)
Abstract
This paper presents a framework for classifying tweets relevant to specific target sectors. Because of the length restriction imposed on an individual tweet, tweet classification faces challenges that are absent from most other short-text classification problems, let alone the classification of standard written text. As a result, bag-of-word classifiers, which have been used successfully for text classification in other domains, fail to achieve a similar level of accuracy when classifying tweets. We propose a collocation feature selection algorithm for tweet classification. We also propose a strategy, built on the selected collocation features, for identifying and removing confounding outliers from a training set. An evaluation on two real-world datasets shows that the proposed model yields better accuracy than the unigram model, the uni-bigram model, and a partially supervised topic model on two different classification tasks.

This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.21 (2013) No.3 (online).
- 2013-06-15
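For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of how unigram and uni-bigram bag-of-word features might be extracted from a tweet. It is not the authors' implementation and does not reproduce their collocation feature selection algorithm; the function names (`tokenize`, `unigram_features`, `uni_bigram_features`) and the whitespace tokenizer are assumptions made purely for illustration.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # Real tweet preprocessing (URLs, hashtags, mentions) is not handled.
    return text.lower().split()


def unigram_features(text):
    # Bag of single tokens with their counts.
    return Counter(tokenize(text))


def uni_bigram_features(text):
    # Bag of single tokens plus adjacent token pairs (bigrams).
    tokens = tokenize(text)
    features = Counter(tokens)
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features
```

A uni-bigram representation keeps every unigram and adds each adjacent word pair, so a three-token tweet yields three unigram features and two bigram features. Either feature dictionary could then be passed to any standard classifier.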
Authors
- Masayuki Iwai
  School of Science and Technology for Future Life, Tokyo Denki University,“Advanced Integrated
- Kaoru Sezaki
  Center for Spatial Information Science, The University of Tokyo
- Masayuki Iwai
  School of Science and Technology for Future, Department of Information Systems and Multimedia Design, Tokyo Denki University
Related papers
- A Flexible Modeling Engine Enabling Inter-service Management
- An Energy-Efficient Mobile Node Scheduling Scheme with Realistic Sensing Region
- A distributed system architecture for pedestrian flock detection with participatory sensing
- An Improved Classification Strategy for Filtering Relevant Tweets Using Bag-of-Word Classifiers (Preprint)
- An Online Method for Trajectory Simplification Under Uncertainty of GPS
- What Does the Chirping Tell Us? : Summarizing People's Opinion on Ongoing Events Using Tweets