Found 19 similar documents (search time: 156 ms)
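An Empirical Study of Machine-Learning-Based Sentiment Classification of Chinese Microblogs   Total citations: 3, self-citations: 0, citations by others: 3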
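This empirical study of microblog sentiment classification uses three machine learning algorithms, three feature selection algorithms, and three feature-weighting methods. The experimental results show that support vector machines (SVM) and Naive Bayes each have advantages depending on the feature-weighting method, and that information gain (IG) performs clearly better than the other feature selection methods. Taking all three factors into account, combining SVM, IG, and TF-IDF (Term Frequency-Inverse Document Frequency) feature weights gives the best sentiment classification performance on microblogs. For the movie domain, the study compares how well classification models generalize between microblog comments and ordinary reviews; the results show that sentiment classification performance depends on the style of the reviews.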
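As a rough illustration of the TF-IDF feature weighting mentioned in this abstract, the following is a minimal sketch using only the Python standard library. The function name `tf_idf`, the toy documents, and the particular variant (relative term frequency times the natural log of inverse document frequency) are assumptions for illustration; the paper may use a different normalization.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["good", "movie", "good"],
    ["bad", "movie"],
    ["good", "plot"],
]
weights = tf_idf(docs)
```

A term that occurs in every document receives weight zero under this variant, which is why rarer terms such as "bad" end up weighted more heavily than "movie".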

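The authors propose a new feature selection method for sentiment classification. To identify the features that are distinctive of a particular class, the Z-score method is used to pick out the definite features, and the information gain (IG) method is used to obtain values for the terms that occur in the domain of those features. On this basis, the authors propose a new weighting scheme for sentiment classification. The proposed feature selection and classification methods are evaluated on two publicly available datasets using various text representations. With accuracy measured by 10-fold cross-validation, the proposed method performs at the same level as, and sometimes better than, SVM and Naive Bayes.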

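The vast number of viewer comments captures audiences' preferences about every aspect of film and television works and can support decisions about the production and promotion of TV dramas. An analysis model for online reviews of film and television dramas is proposed: comments are crawled from media platforms with Python; after data preprocessing and word segmentation, SVM and LSTM are applied to sentiment polarity analysis of short reviews and long reviews, respectively; and statistical and visualization methods are used to study review vocabulary, semantic network relationships, the evolution of sentiment tendency, textual content features, and the regional distribution of popularity. An empirical analysis of the police drama 《狂飙》 shows that the proposed model can reasonably reveal how overall sentiment and popularity evolve, and can uncover the content preferences, behavioral patterns, and regional characteristics of viewers who post reviews.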

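Power data security has become especially critical as power information networks connect to the Internet, and the data involved is growing ever larger and more complex. To analyze its security and extract features effectively, a feature-extraction-based SQL injection attack detection model is proposed. Syntactic features and behavioral features of SQL injection are extracted from Web access logs, yielding a syntactic feature matrix dataset and a behavioral feature matrix dataset. With miss rate and false alarm rate as evaluation metrics, K-means, Naive Bayes, SVM, and RF are each tested on both datasets. The experimental results show that the behavioral feature matrix detects SQL injection attacks better than the syntactic feature matrix. In addition, SVM and RF perform well, with low miss and false alarm rates, and the method can effectively detect SQL injection attacks.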
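The two evaluation metrics named in this abstract can be sketched as follows. The definitions used here are the conventional ones (miss rate = FN / (TP + FN), false alarm rate = FP / (FP + TN)) and are an assumption, since the abstract does not spell them out; the function name and the toy labels are hypothetical.

```python
def miss_and_false_alarm_rates(y_true, y_pred):
    """Compute (miss rate, false alarm rate) for binary labels.

    Label 1 means "attack", label 0 means "normal request".
    Assumed definitions: miss rate = FN / (TP + FN),
    false alarm rate = FP / (FP + TN).
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    # Guard against division by zero when a class is absent.
    miss_rate = fn / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return miss_rate, false_alarm_rate


# Toy ground truth and detector output (hypothetical data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
miss_rate, false_alarm_rate = miss_and_false_alarm_rates(y_true, y_pred)
```

In this toy run one of four attacks is missed and one of four normal requests is flagged, so both rates come out at 0.25.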

A Discriminant Analysis Model Based on ICA and Bayes*   Total citations: 1, self-citations: 0, citations by others: 1
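This paper briefly introduces the characteristics and shortcomings of the Bayes discriminant analysis model, summarizes the characteristics and current development of independent component analysis (ICA), and proposes a discriminant analysis model based on ICA and Bayes, the IBD model. The model first uses ICA to transform correlated data indicators into mutually independent ones, then applies Kalman filtering to filter out high-frequency data, effectively removing noise, and finally applies the Bayes method to perform discriminant analysis on the transformed data. Experimental results show that when the data are correlated, the IBD model outperforms the Bayes and Fisher discriminant analysis models.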

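To address fine-grained sentiment element extraction from product reviews, a cascaded model based on conditional random fields (CRFs) and support vector machines (SVM) is proposed. For identifying sentiment targets and sentiment words, syntactic and semantic information from the reviews is introduced into the CRFs model, further improving the robustness of the CRFs feature templates. In the SVM model, features such as the deep word senses of sentiment targets and sentiment words and the basic sentiment orientation of sentiment words are introduced to improve the traditional bag-of-words model, and fine-grained sentiment classification is performed on ⟨sentiment target, sentiment word⟩ pairs, yielding the key sentiment information in product reviews: (sentiment target, sentiment word, sentiment orientation) triples. Experiments show that the cascaded CRFs and SVM model improves the accuracy of both sentiment element extraction and sentiment classification.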

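Accurately mining the sentiment information in online course reviews is highly valuable for the healthy development of online courses. Most existing sentiment analysis research on Chinese online course reviews uses coarse-grained models that analyze the sentiment polarity of an entire review sentence and cannot accurately express the fine-grained sentiment toward each aspect mentioned in it. To this end, an aspect-level sentiment analysis model for Chinese online course reviews based on an efficient Transformer is proposed. First, the ALBERT pre-trained model is used to obtain dynamic character-vector encodings of the aspect and the context of the review text. Then, an efficient Transformer that accepts character vectors in parallel produces semantic representations of the aspect and the context separately. Finally, an interactive attention mechanism interactively learns the important parts of the aspect and the context, and their final representations are fed into a sentiment classification layer to predict the sentiment polarity of the review. Experimental results on a real dataset from the China MOOC website show that, compared with baseline models, the proposed model achieves accuracy above 80% at lower time cost.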

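Analyzing the sentiment in online course reviews is important for understanding changes in learners' states and for improving course quality. Based on the characteristics of course reviews, an activation-pooling enhanced BERT sentiment analysis model is proposed. A BERT pre-trained sentiment analysis model is built to encode the contextual semantics of words within clauses and the logical relationships between clauses in the review text; an activation function layer and a max-average pooling layer are designed to address the overfitting of the BERT model in course review sentiment analysis; and an added sentiment classification layer classifies online course reviews as positive or negative. Experimental results show that the accuracy and AUC of the activation-pooling enhanced BERT model improve by about 5.5% and 5.8%, respectively, over the original BERT model.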
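The abstract does not say how the max-average pooling layer combines its two parts. One plausible reading, sketched below in plain Python, is to take the element-wise maximum and the element-wise mean over the token vectors and concatenate them. The function name `max_mean_pool`, the concatenation, and the toy vectors are all assumptions, not the paper's confirmed design.

```python
def max_mean_pool(token_vectors):
    """Pool a sequence of equal-length token vectors into one vector.

    Assumed design: concatenate the per-dimension maximum with the
    per-dimension mean, giving a vector twice the hidden size.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    # Transpose so each tuple holds one dimension across all tokens.
    dims = list(zip(*token_vectors))
    max_part = [max(values) for values in dims]
    mean_part = [sum(values) / len(values) for values in dims]
    return max_part + mean_part


# Two tokens with a hidden size of 2 (hypothetical encoder outputs).
pooled = max_mean_pool([[1.0, -2.0], [3.0, 0.0]])
```

Max pooling keeps the strongest activation per dimension while mean pooling summarizes the whole sequence, so the concatenated vector carries both signals to the classification layer.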

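Taking online product review data in a competitive market as the research object, and from the perspective of supporting product design improvement, this work applies data mining methods and tools to big-data analysis of online reviews aimed at product design improvement. It focuses on two parts of the online review data mining process model: helpfulness modeling and sentiment analysis of feature evaluation values. An experiment on online review data for a smartphone product yields evaluation values for each product attribute, which are compared with the attributes of the next-generation product, validating the method.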

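A text-feature-based sentiment analysis model designed specifically for the hotel review domain is proposed. By building a sentiment lexicon dedicated to hotel reviews and incorporating the sentence patterns and grammatical characteristics of hotel reviews, it addresses problems that arise when general-purpose sentiment analysis models are applied to hotel reviews, such as incomplete sentiment matching and imprecise sentiment score computation. Experimental results show that the text-feature-based model achieves good classification performance on hotel review sentiment analysis.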

To meet the need for customised services in online communities, sentiment classification has been applied to unstructured online reviews to identify users' opinions on particular products. This article selects features for sentiment classification of Chinese online reviews using techniques that perform well in traditional text classification. First, adjectives, adverbs and verbs are identified as the potential text features containing sentiment information. Then, four statistical feature selection methods, namely document frequency (DF), information gain (IG), chi-squared statistic (CHI) and mutual information (MI), are adopted to select features. After that, Boolean weighting is applied to set feature weights and construct a vector space model. Finally, a support vector machine (SVM) classifier is employed to predict the sentiment polarity of online reviews. Comparative experiments are conducted on Chinese online hotel reviews. The results indicate that the highest accuracy is achieved by taking adjectives, adverbs and verbs together as features. In addition, the feature selection methods perform differently: DF performs best, followed by CHI, with IG last, whereas MI is not suitable for sentiment classification of Chinese online reviews. These conclusions can help improve the accuracy of sentiment classification and inform further research.
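Three of the feature selection statistics named above (DF, CHI and MI) can be computed from a single term-by-class contingency table over Boolean term presence. The sketch below uses the common textbook formulations, which are assumed here and may differ in detail from the paper's; the function names and the toy hotel reviews are hypothetical.

```python
import math


def contingency(docs, labels, term, cls):
    """Count documents by (term present?, in class?) as (a, b, c, d).

    a: in class with term      b: outside class with term
    c: in class without term   d: outside class without term
    """
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        has_term = term in doc
        in_class = label == cls
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d


def document_frequency(a, b, c, d):
    """DF: number of documents containing the term."""
    return a + b


def chi_squared(a, b, c, d):
    """CHI statistic for a 2x2 term/class table."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def mutual_information(a, b, c, d):
    """Pointwise MI between term and class (assumed variant)."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))


# Boolean representation: each review is a set of terms (toy data).
docs = [{"clean", "friendly"}, {"clean", "quiet"},
        {"dirty", "noisy"}, {"dirty", "clean"}]
labels = ["pos", "pos", "neg", "neg"]
table = contingency(docs, labels, "clean", "pos")
```

Ranking terms by any of these scores and keeping the top k gives the reduced vocabulary that the Boolean-weighted vector space model is then built on.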

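This paper proposes an ensemble learning framework for sentiment classification of Chinese online reviews. First, given the complexity and diversity of Chinese online reviews, part-of-speech combination patterns, frequent word sequence patterns, and order-preserving submatrix patterns are used as input features. Then, an information-gain-based random subspace algorithm is used to handle the large number of text features while improving the performance of the base classifiers. Finally, a base-classifier construction algorithm based on product attributes integrates the sentiment information of each attribute in the review text to determine the sentence-level sentiment orientation of the review. Experimental results demonstrate the effectiveness of the framework on Chinese online review sentiment classification; in particular, accuracy reaches 90.3% with the Logistic Regression classifier.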

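For sentiment analysis of review texts on the Internet, the latent Dirichlet allocation (LDA) model is introduced and a classification method is proposed. The method combines a sentiment lexicon with specified collocation patterns of sentiment units to extract sentiment information, including sentiment words and their preceding and following context. A topic model is used to discover the key features in the sentiment information, which are then incorporated into the sentiment vector space. Finally, machine learning classification algorithms are used to classify the sentiment of Chinese review texts. Experimental results show that the proposed method effectively reduces the dimensionality of the feature vectors and performs well on text sentiment classification.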

The field of sentiment analysis (SA) has grown in tandem with the use of social networking platforms to exchange opinions and ideas. Many people around the world share their views through social media such as Facebook and Twitter. The goal of opinion mining, commonly referred to as sentiment analysis, is to categorise and forecast opinions about a target. Text documents or sentences can be classified according to whether they express a positive or negative view of a given topic. Compared with sentiment analysis, text categorization may appear to be a simple process, but a number of challenges have prompted numerous studies in this area. A feature-selection-based classification algorithm that combines the firefly with Levy (FFL) and multilayer perceptron (MLP) techniques has been proposed as a way to automate sentiment analysis. In this study, the analysis of online product reviews is enhanced by integrating classification and feature selection. The firefly (FF) algorithm is used to extract features from online product reviews, and a multilayer perceptron (MLP) is used to classify sentiment. The experiment employs two datasets, and the results are assessed using a variety of criteria. Based on these tests, the FFL-MLP algorithm achieves the better classification performance for Canon (98% accuracy) and iPod (99% accuracy).

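With the rapid development of the Internet and information technology, the volume of user comments online keeps growing. Using computer technology to analyze the sentiment orientation of large-scale online text has great potential in applications such as government public opinion analysis and intelligent feedback on enterprise product evaluations. This paper focuses on how the choice of text features affects the accuracy of text sentiment orientation classification. The text features studied mainly include sentiment words, adjectives, adverbs, modal particles, and punctuation. Experimental results show that using sentiment words, adjectives, and adverbs as features works well for sentiment classification, and that adding modal particle and punctuation features on top of these effectively improves classification accuracy. The results can be applied to public opinion analysis, spam blog filtering, product review and recommendation, and film and television evaluation.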

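Word collocations with a strong positive or negative orientation are valuable for text sentiment analysis. This paper proposes a method that uses mixed linguistic information to determine the orientation of word collocations. The method first determines a probabilistic latent semantic model for each of six collocation patterns according to their characteristics, and then uses these semantic models to determine the sentiment orientation of collocations. Finally, for some collocations that contain sentiment words, rules are applied to revise the previously assigned orientation. Experiments on an automobile corpus show that the method outperforms methods based purely on the probabilistic latent semantic model or purely on rules.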

With the growing availability and popularity of online reviews, consumers' opinions on products and services are generated and spread over the Internet, and sentiment analysis has arisen to meet the needs of opinion seekers. Most prior studies are concerned with statistics-based methods for sentiment classification. These methods, however, comprehend text poorly at the semantic level, which results in low accuracy. We propose an ontology-based opinion-aware framework, EOSentiMiner, to conduct sentiment analysis of Chinese online reviews from a semantic perspective. EOSentiMiner uses the emotion space model to express the emotions of reviews, with sentiment words classified into two types: emotional words and evaluation words. The former comprise eight emotional classes, and the latter are divided into two opinion evaluation classes. An emotion ontology model is then built on HowNet to express emotion in a fuzzy way. Based on the emotion ontology, we evaluate factors that may affect sentiment classification, including features of products (services), emotion polarity and intensity, degree words, negative words, rhetoric and punctuation. Finally, sentiment calculation based on the emotion ontology is proposed from the sentence level to the document level. We conduct experiments on online reviews of cellphones and wedding photography. The results show that EOSentiMiner outperforms baseline methods in terms of accuracy. We also find that the forms of emotion expression and the connection relationships vary across review corpora from different domains.
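To make the idea of sentence-to-document sentiment calculation with degree and negative words concrete, here is a deliberately simplified sketch. It is not EOSentiMiner's method: the tiny lexicons, the multiplicative treatment of degree words, the sign flip for negation, and averaging sentences into a document score are all assumptions chosen for illustration.

```python
# Hypothetical mini-lexicons; a real system would draw on an ontology.
DEGREE_WORDS = {"very": 1.5, "slightly": 0.5}
NEGATION_WORDS = {"not", "never"}
SENTIMENT_WORDS = {"good": 1.0, "bad": -1.0, "excellent": 2.0}


def sentence_score(tokens):
    """Sum sentiment word scores, applying preceding modifiers.

    Degree words scale the next sentiment word; negation words flip
    its sign. Modifiers reset after each sentiment word.
    """
    score = 0.0
    scale = 1.0
    sign = 1.0
    for token in tokens:
        if token in NEGATION_WORDS:
            sign = -sign
        elif token in DEGREE_WORDS:
            scale *= DEGREE_WORDS[token]
        elif token in SENTIMENT_WORDS:
            score += sign * scale * SENTIMENT_WORDS[token]
            scale, sign = 1.0, 1.0
    return score


def document_score(sentences):
    """Average the sentence scores (assumed aggregation rule)."""
    if not sentences:
        return 0.0
    return sum(sentence_score(s) for s in sentences) / len(sentences)


review = [["not", "very", "good"], ["excellent"]]
score = document_score(review)
```

Here the first sentence scores -1.5 and the second 2.0, so the document average is 0.25; a framework like the one described would additionally weigh product features, rhetoric and punctuation.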

Finding product weaknesses in customer feedback can help manufacturers improve product quality and competitive strength. In recent years, more and more people have expressed their opinions about products online, and feedback on both a manufacturer's own products and its competitors' products can be easily collected. However, it is impossible for manufacturers to read every review to analyze the weaknesses of their products. Finding product weaknesses from online reviews is therefore a meaningful task. In this paper, we introduce an expert system, Weakness Finder, which helps manufacturers find product weaknesses from Chinese reviews using aspect-based sentiment analysis. An aspect is an attribute or component of a product; for example, price, germ removal and moisturizing are aspects of body wash products. For each aspect, Weakness Finder extracts features, groups explicit features using a morpheme-based method and a HowNet-based similarity measure, and identifies and groups implicit features with a collocation selection method. It then uses a sentence-based sentiment analysis method to determine the polarity of each aspect in a sentence. A product's weakness can be found because it is probably the aspect customers are least satisfied with in their reviews, or the aspect that fares worse than in reviews of competitors' products. Weakness Finder has been used to help a body wash manufacturer find its product weaknesses, and our experimental results demonstrate its good performance.

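Building on theories of consumer behavior analysis and discrete choice, this work performs feature-level sentiment analysis on user-generated content, extracts model features from both objective product data and subjective user-generated content, and uses supervised learning to train an MNL model that predicts a product's consumer surplus as the basis for search ranking. Product search systems for mobile phones, laptops, and digital cameras are implemented. Double-blind experiments show that the proposed product search model significantly improves search quality over the baseline algorithm.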
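The core of a multinomial logit (MNL) model is that the probability of choosing a product is the softmax of its utility. The sketch below assumes a linear utility (a dot product of hypothetical coefficients `beta` with each product's feature vector); the feature values and coefficients are made up, and the paper's consumer surplus computation is not reproduced here.

```python
import math


def mnl_choice_probabilities(features, beta):
    """Return MNL choice probabilities for a set of alternatives.

    Assumes utility V_i = sum_k beta_k * x_ik, and
    P_i = exp(V_i) / sum_j exp(V_j).
    """
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in features]
    # Subtract the maximum utility for numerical stability.
    shift = max(utilities)
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Three products with two features each, e.g. one objective spec score
# and one sentiment score from reviews (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
beta = [0.5, 0.5]
probs = mnl_choice_probabilities(features, beta)
```

Sorting products by predicted utility or choice probability gives one simple ranking signal; the ranking criterion in the paper itself is the predicted consumer surplus.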
