An Improved Classification Strategy for Filtering Relevant Tweets Using Bag-of-Word Classifiers
Abstract
This paper presents a framework for classifying tweets relevant to specific target sectors. Because of the length restriction imposed on an individual tweet, tweet classification faces additional challenges that are absent from most other short-text classification problems, let alone from the classification of standard written text. As a result, bag-of-word classifiers, which have been successfully leveraged for text classification in other domains, fail to achieve a similar level of accuracy on tweets. We propose a collocation feature selection algorithm for tweet classification. We also propose a strategy, built on the selected collocation features, for identifying and removing confounding outliers from a training set. An evaluation on two real-world datasets shows that the proposed model yields better accuracy than the unigram model, the uni-bigram model, and a partially supervised topic model on two different classification tasks.
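For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of unigram and uni-bigram bag-of-words features. It is not the paper's collocation feature selection algorithm, which the abstract does not specify. The function names and the whitespace tokenizer are assumptions made for illustration only.

```python
from collections import Counter


def tokenize(tweet):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return tweet.lower().split()


def unigram_features(tweet):
    """Bag of words: count of each single token in the tweet."""
    return Counter(tokenize(tweet))


def uni_bigram_features(tweet):
    """Unigram counts plus counts of each pair of adjacent tokens."""
    tokens = tokenize(tweet)
    features = Counter(tokens)
    # zip(tokens, tokens[1:]) yields every adjacent token pair in order.
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


if __name__ == "__main__":
    example = "Heavy traffic on route 1"  # hypothetical tweet text
    print(unigram_features(example))
    print(uni_bigram_features(example))
```

Because a tweet is short, each tweet yields only a handful of such features, which is one way to picture the sparsity problem the abstract attributes to the length restriction.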
Authors
-
SEZAKI KAORU
Center for Spatial Information Science, The University of Tokyo
-
Sezaki Kaoru
Center for Spatial Information Science, The University of Tokyo
-
Khan Muhammad
Graduate School of Information Science and Technology, The University of Tokyo
-
Khan Muhammad
Graduate School of Information Science and Technology, Department of Information and Communication Engineering, The University of Tokyo
-
Iwai Masayuki
School of Science and Technology for Future, Department of Information Systems and Multimedia Design, Tokyo Denki University
-
Iwai Masayuki
School of Science and Technology for Future Life, Tokyo Denki University
Related papers
- A Flexible Modeling Engine Enabling Inter-service Management
- An Energy-Efficient Mobile Node Scheduling Scheme with Realistic Sensing Region
- A Two-Stage Simulated Annealing Logical Topology Reconfiguration in IP over WDM Networks(Internet)
- SB-10-8 A Fast Neighbour Discovery Simulated Annealing for Logical Topology Design in IP/WDM Networks
- B-6-36 Logical Topology Reconfiguration Trade-off in IP/WDM Optical Networks
- A Protocol for Policy-Based Session Control in Disruption Tolerant Sensor Networks(Ubiquitous Sensor Networks)
- Autonomous Configuration in Wireless Sensor Networks(Wide Band Systems)
- SDC: A Scalable Approach to Collect Data in Wireless Sensor Networks(Software Platform Technologies, Ubiquitous Networks)
- Symmetrical Routing and Wavelength Assignment for Two Regular-Topology All-Optical Networks
- Quick Data-Retrieving for U-APSD in IEEE802.11e WLAN Networks(Multi-dimensional Mobile Information Networks)
- An Improved Power Saving Mechanism for MAC Protocol in Ad Hoc Networks(Terrestrial Radio Communications)
- Towards robust localization in mobile sensor networks (Information Networks)
- B-21-16 Group Mobility Modeling in Mobile Ad Hoc Networks using Pedestrian Tracked Data
- ESMO : An Energy-Efficient Mobile Node Scheduling Scheme for Sound Sensing
- A-7-25 Security and Privacy issues on RFID-based Positioning System
- B-7-5 Robust Localization Mechanism in RFID-Based Reference Point Systems
- B-21-38 Routing Algorithm for Ad Hoc Networks using Mobility Prediction (B-21. Ad Hoc Networks, Communications 2)
- Mobility Model for Ad Hoc Networks based on Experimental Data
- Proposal and Evaluation of System for Haptics Collaboration, Vol.J86-B,No.2, pp.268-278
- FOREWORD
- A-16-2 A Survey on Haptic Interaction in 3D GIS
- Adjustment on End-to-End Delay Distortion
- B-20-49 Self-localization of Tags in RFID-Based Reference Point System
- Optimum Quantization Step Size for Integer Lossless Transform Coefficients
- Nonseparable 2D Lossless Transforms Based on Multiplier-Free Lossless WHT
- D-5-6 Generating Training Data Without Human Supervision for Classifying Emotions in Microblogs
- An Improved Classification Strategy for Filtering Relevant Tweets Using Bag-of-Word Classifiers