首页 | 本学科首页   官方微博 | 高级检索  
     


Exploiting effective features for chinese sentiment classification
Authors:Zhongwu Zhai  Hua Xu  Bada Kang  Peifa Jia
Affiliation:1. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China;2. Key Laboratory of Communication and Information Systems, Beijing Municipal Commission of Education, Beijing, China
Abstract:Features play a fundamental role in sentiment classification. How to effectively select different types of features to improve sentiment classification performance is the primary topic of this paper. Ngram features are commonly employed in text classification tasks; in this paper, sentiment-words, substrings, substring-groups, and key-substring-groups, which have never been considered in sentiment classification area before, are also extracted as features. The extracted features are then compared and analyzed. To demonstrate generality, we use two authoritative Chinese data sets in different domains to conduct our experiments. Our statistical analysis of the experimental results indicate the following: (1) different types of features possess different discriminative capabilities in Chinese sentiment classification; (2) character bigram features perform the best among the Ngram features; (3) substring-group features have greater potential to improve the performance of sentiment classification by combining substrings of different lengths; (4) sentiment words or phrases extracted from existing sentiment lexicons are not effective for sentiment classification; (5) effective features are usually at varying lengths rather than fixed lengths.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号