
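A Hybrid Chinese Text Sentiment Classification Algorithm Framework Based on Semantic Comprehension and Machine Learning
Cite this article: 徐健锋, 许园, 许元辰, 张远健, 刘清. 基于语义理解和机器学习的混合的中文文本情感分类算法框架[J]. 计算机科学, 2015, 42(6): 61-66.
Authors: 徐健锋, 许园, 许元辰, 张远健, 刘清
Affiliations: 1. Software College, Nanchang University, Nanchang 330047
2. School of Information Engineering, Nanchang University, Nanchang 330031
Funding: This work was supported by the Ontology Learning and Granular Computing fund (61070139).
Abstract: Locating sentiment orientation quickly, accurately, and comprehensively in large volumes of Internet text is a major challenge in current big data technology. Text sentiment classification methods fall roughly into two categories: those based on semantic comprehension and those based on supervised machine learning. The strength of semantic comprehension is that it can classify sentiment in text from different domains, but it is easily affected by the varied sentence patterns and collocations of Chinese, so its classification accuracy is limited. Supervised machine learning can reach relatively high classification accuracy, but a classifier that performs well in one domain does not adapt to sentiment classification in a new domain. Building on feature dimensionality reduction of high-dimensional text with information gain, this paper combines optimized semantic comprehension with machine learning to design a new hybrid Chinese sentiment classification algorithm framework. Multiple groups of comparison experiments based on the framework verify that it achieves high and stable classification accuracy on text from different domains.

Keywords: sentiment classification; semantics; machine learning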

Hybrid Algorithm Framework for Sentiment Classification of Chinese Based on Semantic Comprehension and Machine Learning
XU Jian-feng, XU Yuan, XU Yuan-chen, ZHANG Yuan-jian and LIU Qing. Hybrid Algorithm Framework for Sentiment Classification of Chinese Based on Semantic Comprehension and Machine Learning[J]. Computer Science, 2015, 42(6): 61-66.
Authors: XU Jian-feng, XU Yuan, XU Yuan-chen, ZHANG Yuan-jian and LIU Qing
Affiliation: Software College, Nanchang University, Nanchang 330047, China; Software College, Nanchang University, Nanchang 330047, China; School of Information Engineering, Nanchang University, Nanchang 330031, China; Software College, Nanchang University, Nanchang 330047, China; and School of Information Engineering, Nanchang University, Nanchang 330031, China
Abstract: In the era of big data, identifying sentiment orientation in large volumes of Internet text quickly, accurately, and comprehensively is a major challenge. Sentiment classification methods for text fall roughly into two categories: semantic comprehension and supervised machine learning. The advantage of semantic comprehension is that it can classify text from different fields. However, its performance can be greatly affected by the variety of word collocations and sentence patterns. Supervised machine learning can achieve higher classification accuracy, but a classifier that performs well in one field may not be suitable for a new field. This paper proposes a new hybrid algorithm framework for Chinese sentiment classification that combines optimized semantic comprehension with machine learning, based on features extracted by information gain. Experimental results on two separate fields show that the framework achieves both high classification accuracy and satisfactory portability.
Keywords:Sentiment classification  Semantic  Machine learning
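
The abstract names information gain as the feature-reduction step but does not give the formulation used. As a minimal sketch, assuming the standard definition IG(t) = H(C) - P(t)H(C|t) - P(¬t)H(C|¬t) over binary term presence, the selection could look like the following. The function names and the toy reviews are hypothetical illustrations, not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Information gain of a term, based on its presence or absence.

    docs is a list of token sets, labels the matching class labels.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus: already-segmented reviews with sentiment labels.
docs = [{"good", "screen"}, {"good", "price"},
        {"bad", "screen"}, {"bad", "price"}]
labels = ["pos", "pos", "neg", "neg"]

print(select_features(docs, labels, 2))
```

In this toy corpus the sentiment words separate the classes perfectly while the topic words carry no class information, so only the former survive the cut. Real Chinese text would first need word segmentation, which this sketch leaves out.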
This article is indexed in 万方数据 (Wanfang Data) and other databases.