Research on Chinese Text Appraisive Classification in the Present Era of Big Data
Original title: 大数据时代中文文本褒贬倾向性分类研究
Cite this article: ZENG Fan-feng, ZHU Wan-shan, WANG Jing-zhong. Research on Chinese Text Appraisive Classification in the Present Era of Big Data [J]. Netinfo Security (信息网络安全), 2014(11): 30-35.
Authors: ZENG Fan-feng (曾凡锋), ZHU Wan-shan (朱万山), WANG Jing-zhong (王景中)
Affiliation: College of Information Engineering, North China University of Technology, Beijing 100144
Funding: Beijing Natural Science Foundation Key Project, Category B [KZ2010009008]; Scientific and Technological Achievements Transformation Project [PXM2013]; Beijing Innovation Team Program
Abstract: In the current era of big data, blogs and forums on the Internet generate massive amounts of subjective commentary that expresses people's varied sentiments and emotional tendencies. Classifying and processing this volume of commentary by manual methods alone is impractical, so efficiently mining the large body of online information that carries appraisive (positive or negative) opinions has become an urgent problem. Research on appraisive classification of Chinese text is one approach to this problem. The article introduces commonly used text feature selection algorithms and analyzes the shortcomings of the document frequency and mutual information algorithms. By comparing and studying the two algorithms, and by combining the relevance between text features and text categories with the occurrence probability of appraisive features, it proposes an improved text feature selection algorithm (MIDF). Experimental results show that MIDF is effective for appraisive classification of text.

Keywords: appraisive classification; text feature selection; appraisive feature extraction
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.
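The abstract names document frequency (DF) and mutual information (MI) as the two baseline feature selection algorithms but gives no formulas, and it does not define MIDF. The following is a minimal sketch of the two baselines only, not of the authors' MIDF algorithm. It assumes the textbook definitions: DF counts the documents containing a term, and MI is the pointwise estimate MI(t, c) = log(P(t, c) / (P(t) P(c))) computed from document counts. The function names and the toy corpus are hypothetical, and the corpus is assumed to be already tokenized (real Chinese text would need word segmentation first).

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def mutual_information(docs, labels, term, cls):
    """Pointwise MI between a term and a class, estimated from document counts.

    MI(t, c) = log(P(t, c) / (P(t) * P(c))). Returns -inf when the term
    never occurs in a document of the class.
    """
    n = len(docs)
    n_t = sum(1 for d in docs if term in d)
    n_c = sum(1 for y in labels if y == cls)
    n_tc = sum(1 for d, y in zip(docs, labels) if y == cls and term in d)
    if n_tc == 0:
        return float("-inf")
    return math.log((n_tc * n) / (n_t * n_c))


# Hypothetical toy corpus: two positive and two negative reviews.
docs = [
    ["great", "phone", "fast"],
    ["great", "screen"],
    ["bad", "phone", "slow"],
    ["bad", "battery", "rare_typo"],
]
labels = ["pos", "pos", "neg", "neg"]

df = document_frequency(docs)
for term, cls in [("great", "pos"), ("phone", "pos"), ("bad", "neg"), ("rare_typo", "neg")]:
    mi = mutual_information(docs, labels, term, cls)
    print(f"{term:10s} DF={df[term]}  MI({cls})={mi:.3f}")
```

On this toy data, DF gives the class-neutral term "phone" the same score as "great", because DF ignores class labels, while MI gives the one-off token "rare_typo" the same score as "bad", because the pointwise estimate does not reward frequency. These are commonly cited weaknesses of the two measures; the abstract does not say which shortcomings the authors analyze.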