Similar literature
1.
People and companies selling goods or providing services have always desired to know what people think about their products. The number of opinions on the Web has significantly increased with the emergence of microblogs. In this paper we present a novel method for sentiment analysis of a text that allows the recognition of opinions in microblogs which are connected to a particular target or an entity. This method differs from other approaches in utilizing appraisal theory, which we employ for the analysis of microblog posts. The results of the experiments we performed on Twitter showed that our method improves sentiment classification and is feasible even for such specific content as presented on microblogs.

2.
Social networks such as Twitter are used by millions of people who express their opinions on a variety of topics. Consequently, these media are constantly being examined by sentiment analysis systems which aim at classifying the posts as positive or negative. Given the variety of topics discussed and the short length of the posts, the standard approach of using the words as features for machine learning algorithms results in sparse vectors. In this work, we propose using features derived from the ranking generated by an Information Retrieval System in response to a query consisting of the post that needs to be classified. Our system can be fully automatic, has only 24 features, and does not depend on expensive resources. Experiments on real datasets have shown that a classifier that relies solely on these features outperforms established baselines and can reach accuracies comparable to the state-of-the-art approaches which are more costly.

3.
Fang Ding, Wang Gang. 《计算机系统应用》 (Computer Systems & Applications), 2012, 21(7): 177-181, 248
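With the rapid development of Web 2.0, more and more users are willing to share their opinions and experiences on the Internet. As this review content grows rapidly, manual methods cannot cope with collecting and processing the massive volume of online information, so computer-based text sentiment classification has emerged, and one of its research priorities is improving classification accuracy. Ensemble learning is an effective way to improve classification accuracy and has outperformed single classifiers in many fields; accordingly, a text sentiment classification method based on ensemble learning is proposed. Experimental results show that the three commonly used ensemble methods, Bagging, Boosting and Random Subspace, all improve the accuracy of the base classifiers, and that across different base classifiers Random Subspace is statistically superior to Bagging and Boosting. These results further verify the effectiveness of ensemble learning for text sentiment classification.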
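To make two of the ensemble strategies named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy word-count base learner, a hypothetical six-review dataset (`DOCS`), and majority voting; Boosting is omitted for brevity. All function names are illustrative.

```python
import random
from collections import Counter

# Toy labelled reviews (hypothetical data, for illustration only).
DOCS = [
    ("great phone love it", "pos"),
    ("love the screen", "pos"),
    ("great battery", "pos"),
    ("terrible phone hate it", "neg"),
    ("hate the screen", "neg"),
    ("terrible battery", "neg"),
]

def train(docs, vocab=None):
    """Base learner: per-label word counts, optionally limited to `vocab`."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in docs:
        counts[label].update(w for w in text.split() if vocab is None or w in vocab)
    return counts

def predict(model, text):
    """Pick the label whose training words overlap the text most; ties go to 'pos'."""
    words = text.split()
    pos = sum(model["pos"][w] for w in words)
    neg = sum(model["neg"][w] for w in words)
    return "pos" if pos >= neg else "neg"

def bagging(docs, n_models, rng):
    """Bagging: each learner sees a bootstrap resample (drawn with replacement)."""
    return [train([rng.choice(docs) for _ in docs]) for _ in range(n_models)]

def random_subspace(docs, n_models, n_features, rng):
    """Random Subspace: each learner sees all documents but a random word subset."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    return [train(docs, set(rng.sample(vocab, n_features))) for _ in range(n_models)]

def vote(ensemble, text):
    """Combine the learners by majority vote."""
    return Counter(predict(m, text) for m in ensemble).most_common(1)[0][0]

rng = random.Random(0)
print(vote(bagging(DOCS, 11, rng), "love great"))
print(vote(random_subspace(DOCS, 11, 5, rng), "terrible screen"))
```

The two strategies differ only in what is randomised: Bagging perturbs the training documents, while Random Subspace perturbs the feature set. An odd number of learners avoids tied votes between the two labels.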

4.
The idiosyncrasy of the Web has, in the last few years, been altered by Web 2.0 technologies and applications and the advent of the so-called Social Web. While users were merely information consumers in the traditional Web, they play a much more active role in the Social Web since they are now also data providers. Mass participation in creating Web content has led many public and private organizations to focus their attention on analyzing this content in order to ascertain the general public’s opinions on a number of topics. Given the current Web size and growth rate, automated techniques are essential if practical and scalable solutions are to be obtained. Opinion mining is a highly active research field that comprises natural language processing, computational linguistics and text analysis techniques with the aim of extracting various kinds of added-value and informational elements from users’ opinions. However, current opinion mining approaches are hampered by a number of drawbacks such as the absence of semantic relations between concepts in feature search processes or the lack of advanced mathematical methods in sentiment analysis processes. In this paper we propose an innovative opinion mining methodology that takes advantage of new Semantic Web-guided solutions to enhance the results obtained with traditional natural language processing techniques and sentiment analysis processes. The main goals of the proposed methodology are: (1) to improve feature-based opinion mining by using ontologies at the feature selection stage, and (2) to provide a new vector analysis-based method for sentiment analysis. The methodology has been implemented and thoroughly tested in a real-world movie review-themed scenario, yielding very promising results when compared with other conventional approaches.

5.
Social media sites and applications, including Facebook, YouTube, Twitter and blogs, have become major attractions today. The huge amount of information from these media has become an attractive resource for organisations to monitor the opinions of users, and it is therefore receiving a lot of attention in the field of sentiment analysis. Early work on sentiment analysis approached this problem at the document level, where the overall sentiment was identified rather than the details of the sentiment. This research used aspect-based sentiment analysis on Twitter in order to perform a finer-grained analysis. A new hybrid sentiment classification for Twitter is proposed by embedding a feature selection method. A comparison of the classification accuracy obtained with the principal component analysis (PCA), latent semantic analysis (LSA) and random projection (RP) feature selection methods is presented in this paper. Furthermore, the hybrid sentiment classification was validated using Twitter datasets representing different domains, and the evaluation with different classification algorithms also demonstrated that the new hybrid approach produced meaningful results. The implementations showed that the new hybrid sentiment classification was able to improve accuracy over the existing baseline sentiment classification methods by 76.55, 71.62 and 74.24%, respectively.
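Of the three techniques compared in this abstract, random projection is the simplest to illustrate. The sketch below is a generic Gaussian random projection written with the standard library only; the dimensions, the seed and the bag-of-words vector are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random

def make_projection(n_features, n_components, rng):
    """Gaussian random matrix scaled by 1/sqrt(n_components)."""
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
            for _ in range(n_components)]

def project(vector, matrix):
    """Map a high-dimensional feature vector to n_components dimensions."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

rng = random.Random(42)
matrix = make_projection(n_features=1000, n_components=50, rng=rng)

# Sparse bag-of-words vector for one tweet (hypothetical term indices).
tweet_vector = [0.0] * 1000
for idx in (3, 17, 256):
    tweet_vector[idx] = 1.0

reduced = project(tweet_vector, matrix)
print(len(reduced))  # 50
```

Unlike PCA or LSA, the projection matrix is drawn once without looking at the data, which makes the reduction cheap for sparse tweet vectors.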

6.
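To obtain more discriminative visual features and improve sentiment classification, a visual sentiment analysis model that fuses multi-level features with dual attention is constructed. A convolutional neural network extracts multi-channel, multi-level features from an image; a spatial attention mechanism assigns spatial attention weights to the multi-channel low-level features, and a channel attention mechanism assigns channel attention weights to the multi-channel high-level features, strengthening the feature representation at each level. The strengthened high-level and low-level features are then fused into discriminative features used to train the sentiment classifier. Comparative experiments on three real-world datasets, Twitter Ⅰ, Twitter Ⅱ and EmotionROI, show that the model reaches classification accuracies of 79.83%, 78.25% and 49.34%, respectively, effectively improving visual sentiment analysis for social media.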

7.
Twitter is a prominent platform that offers a quick and effective way to analyze users’ perceptions of activities on social media. Many researchers and industry experts turn to Twitter sentiment analysis to identify stakeholder groups. Sentiment analysis requires advanced approaches, including data sentiment analysis and various machine learning tools. This work assesses sentiment analysis in multiple fields in real time using Naive Bayes and Support Vector Machine (SVM) classifiers. The paper focuses on analysing established sentiment techniques on tweet behaviour datasets from various spheres such as healthcare and behaviour estimation. In addition, the results explore and validate statistical machine learning classifiers, reporting the accuracy percentages attained for positive, negative and neutral tweets. We used a Twitter Application Programming Interface (API) account and programmed the sentiment analysis approach in Python to computationally measure users’ perceptions; the approach extracts a massive number of tweets and provides market value to the Twitter account proprietor. To distinguish the results in terms of performance evaluation, an error analysis investigates features relevant to various stakeholders, including social media analytics researchers, Natural Language Processing (NLP) developers, engineering managers and experts involved in decision-making.

8.
Information & Management, 2016, 53(8): 987-996
Social media is a major platform for opinion sharing. In order to better understand and exploit opinions on social media, we aim to classify users with opposite opinions on a topic for decision support. Rather than mining text content, we introduce a link-based classification model, named global consistency maximization (GCM) that partitions a social network into two classes of users with opposite opinions. Experiments on a Twitter data set show that: (1) our global approach achieves higher accuracy than two baseline approaches and (2) link-based classifiers are more robust to small training samples if selected properly.

9.
Web opinion feeds have become one of the most popular information sources users consult before buying products or contracting services. Negative opinions about a product can have a high impact on its sales figures. As a consequence, companies are more and more concerned about how to integrate opinion data in their business intelligence models so that they can predict sales figures or define new strategic goals. After analysing the requirements of this new application, this paper proposes a multidimensional data model to integrate sentiment data extracted from opinion posts in a traditional corporate data warehouse. Then, a new sentiment data extraction method that applies semantic annotation as a means to facilitate the integration of both types of data is presented. In this method, Wikipedia is used as the main knowledge resource, together with some well-known lexicons of opinion words and other corporate data and metadata stores describing the company products, such as technical specifications and user manuals. The resulting information system allows users to perform new analysis tasks by using the traditional OLAP-based data warehouse operators. We have developed a case study over a set of real opinions about digital devices which are offered by a wholesale dealer. Over this case study, the quality of the extracted sentiment data is evaluated, and some query examples that illustrate the potential uses of the integrated model are provided.

10.
After the outbreak of COVID-19, the global economy entered a deep freeze. This observation is supported by the Volatility Index (VIX), which reflects the market risk expected by investors. In the current study, we predicted the VIX using variables obtained from the sentiment analysis of data on Twitter posts related to the keyword “COVID-19,” using a model integrating the bidirectional long short-term memory (BiLSTM), autoregressive integrated moving average (ARIMA) algorithm, and generalized autoregressive conditional heteroskedasticity (GARCH) model. The Linguistic Inquiry and Word Count (LIWC) program and Valence Aware Dictionary for Sentiment Reasoning (VADER) model were utilized as sentiment analysis methods. The results revealed that during COVID-19, the proposed integrated model, which was trained on both the Twitter sentiment values and historical VIX values, presented better results in forecasting the VIX in time-series regression and direction prediction than those of the other existing models.

11.
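With the development of user-centred Web 2.0, social networking platforms have permeated every aspect of life with remarkable influence, and sentiment analysis of social network content has become a hot research topic. Online social networking sites such as Twitter and Sina Weibo have attracted large numbers of users, whose interactions generate a great deal of information about the social relationships among them, and these social relationships are widely used in sentiment analysis of social networks. Incorporating social relationships into social network sentiment...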

12.
In this article, we address the issue of how emotional stability affects social relationships in Twitter. In particular, we focus our study on users’ communicative interactions, identified by the symbol “@.” We collected a corpus of about 200,000 Twitter posts, and we annotated it with our personality recognition system. This system exploits linguistic features, such as punctuation and emoticons, and statistical features, such as follower count and retweeted posts. We tested the system on a data set annotated with personality models produced by human subjects and against software for the analysis of Twitter data. Social network analysis shows that, whereas secure users have more mutual connections, neurotic users post more than secure ones and have the tendency to build longer chains of interacting users. Clustering coefficient analysis reveals that, whereas secure users tend to build stronger networks, neurotic users have difficulty in belonging to a stable community; hence, they seek new contacts in online social networks.
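The clustering coefficient mentioned in this abstract has a standard definition that is easy to show in code. The sketch below assumes an undirected graph built from "@" mention pairs; the user names and the `MENTIONS` list are hypothetical, and the paper may have used a directed or weighted variant.

```python
from itertools import combinations

def build_graph(mentions):
    """Undirected graph from (author, mentioned_user) pairs."""
    adj = {}
    for a, b in mentions:
        if a == b:
            continue  # ignore self-mentions
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def local_clustering(adj, user):
    """Fraction of a user's neighbour pairs that are themselves connected."""
    neighbours = adj.get(user, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean local clustering coefficient over all users in the graph."""
    return sum(local_clustering(adj, u) for u in adj) / len(adj)

# Hypothetical mention pairs: a triangle (ann, bob, cat) plus one outsider.
MENTIONS = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("ann", "dan")]
graph = build_graph(MENTIONS)
print(local_clustering(graph, "bob"), average_clustering(graph))
```

A user whose contacts all talk to each other scores 1.0, which matches the abstract's notion of a "stronger" network; a user with scattered, unconnected contacts scores near 0.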

13.
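Orientation Analysis of Online Public Opinion Based on Sentiment Lexicon Expansion   Total citations: 7, self-citations: 0, citations by others: 7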
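With the arrival of the Web 2.0 era, the Internet has gradually become one of the main carriers of public opinion, and discovering online public opinion and mining netizens' views and orientations have become new research hotspots. However, there is not yet a public opinion analysis system that effectively reflects netizens' overall attitude toward hot events or topics. Given that netizens' comments on a topic are short and numerous, this paper uses two resources, HowNet and NTUSD, to expand an existing sentiment lexicon and builds a new sentiment lexicon with degrees of orientation. Based on the expanded lexicon, a semi-automatic online public opinion analysis system is developed. The system provides users with more detailed and accurate analysis of comment orientation.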
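The idea of expanding a lexicon and scoring comments with graded orientation can be sketched as follows. This is an illustration under stated assumptions, not the paper's system: the entries are hypothetical English words rather than actual HowNet or NTUSD entries, tokens are split on whitespace (Chinese text would need word segmentation first), and the `margin` threshold is arbitrary.

```python
# Hypothetical English entries for illustration; these are not actual
# HowNet or NTUSD entries. Sign = orientation, magnitude = degree.
BASE_LEXICON = {"good": 1.0, "bad": -1.0}
RESOURCE_A = {"excellent": 2.0, "good": 0.5}
RESOURCE_B = {"awful": -2.0, "disappointing": -1.5}

def expand_lexicon(base, *resources):
    """Add words from extra resources, keeping existing entries unchanged."""
    expanded = dict(base)
    for resource in resources:
        for word, strength in resource.items():
            expanded.setdefault(word, strength)
    return expanded

def score_comment(comment, lexicon):
    """Sum the signed strengths of the lexicon words found in a comment."""
    return sum(lexicon.get(token, 0.0) for token in comment.lower().split())

def orientation(score, margin=0.5):
    """Turn a signed score into a coarse orientation label."""
    if score > margin:
        return "positive"
    if score < -margin:
        return "negative"
    return "neutral"

LEXICON = expand_lexicon(BASE_LEXICON, RESOURCE_A, RESOURCE_B)
for comment in ["Excellent but bad timing", "awful service", "good and bad"]:
    print(comment, "->", orientation(score_comment(comment, LEXICON)))
```

Keeping a signed strength per word, instead of a plain positive/negative flag, is what lets short comments be ranked by how strongly they lean.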

14.
This study addresses the problem of Chinese microblog opinion retrieval, which aims to retrieve opinionated Chinese microblog posts relevant to a target specified by a user query. Existing studies have shown that lexicon-based approaches employ online public sentiment resources to rank sentiment words relying on the document features. However, this approach could not be effectively applied to microblogs that have typical user-generated content with valuable contextual information: “user–user” interpersonal interactions and “user–post/comment” intrapersonal interactions. This contextual information is very helpful in estimating the strength of sentiment words more accurately. In this study, we integrate the social contextual relationships among users, posts/comments, and sentiment words into a mutual reinforcement model and propose a unified three-layer heterogeneous graph, on which a random walk sentiment word weighting algorithm is presented to measure the strength of opinion of the sentiment words. Furthermore, the weights of sentiment words are incorporated into a lexicon-based model for Chinese microblog opinion retrieval. Comparative experiments are conducted on a Chinese microblog corpus, and the results show that our proposed mutual reinforcement model achieves significant improvement over previous methods.

15.
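Cross-domain sentiment classification aims to use source-domain data rich in sentiment labels to analyse the sentiment polarity of target-domain data that lacks labels. To this end, this paper proposes a cross-domain aspect-level sentiment classification model based on adversarial distribution alignment, which learns semantic associations through interactive attention between aspect terms and context and learns shared feature representations with a domain classifier based on a gradient reversal layer. Adversarial training is used to widen the alignment boundary of the domain distributions, effectively alleviating misclassification caused by ambiguous features. On the Seme...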

16.
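With the development of the Internet, Web forums have become a new platform for information sharing and group collaboration among Web users. Web forums have accumulated a vast amount of knowledge and have thus become a valuable resource for data mining on the Internet. Applications built on Web forums are often affected by low-quality posts (spam posts). To address spam post filtering on Web forums, the CJTM and CAJTM models, based on latent Dirichlet allocation, are proposed. The CJTM and CAJTM models use the text content of forum posts, the reply links between posts, and author information; compared with traditional classification methods and rule-based methods, they require neither a training set nor a rule set. Experiments on real Web forum data show good results.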

17.
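Research on Cross-Domain Orientation Analysis Based on a Random Walk Model   Total citations: 2, self-citations: 1, citations by others: 1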
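In recent years, researchers have made progress in cross-domain orientation (sentiment) analysis. However, existing methods and systems usually analyse the orientation of target-domain texts using only labelled texts or only labelled sentiment words, and lack a unified model framework that integrates all the knowledge shared between texts and sentiment words. A cross-domain orientation analysis method based on a random walk model is proposed. The model uses all the relations between texts and words in both the source and target domains to let texts and words reinforce each other, with the aim of integrating text-text relations, word-word relations and text-word relations into one complete theoretical framework. Experimental results show that the proposed algorithm substantially improves the accuracy of cross-domain orientation analysis.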
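The mutual reinforcement between texts and words described in this abstract can be illustrated with a much simpler stand-in: scores are averaged back and forth across a text-word graph, which corresponds to taking the expected neighbour score under one random-walk step. This sketch covers only text-word relations (not text-text or word-word relations), and the documents, labels and function name are all hypothetical; it is not the paper's algorithm.

```python
def propagate(docs, seed_scores, iterations=20):
    """Spread sentiment from labelled source texts to words and target texts.

    docs: {doc_id: [words]}; seed_scores: {doc_id: +1.0 or -1.0} for the
    labelled source-domain texts, whose scores stay clamped.
    """
    doc_scores = {d: seed_scores.get(d, 0.0) for d in docs}
    word_docs = {}
    for d, words in docs.items():
        for w in set(words):
            word_docs.setdefault(w, []).append(d)

    word_scores = {}
    for _ in range(iterations):
        # Texts reinforce words: a word takes the mean score of its texts.
        word_scores = {w: sum(doc_scores[d] for d in ds) / len(ds)
                       for w, ds in word_docs.items()}
        # Words reinforce texts: an unlabelled text takes the mean of its words.
        for d, words in docs.items():
            unique = set(words)
            if d in seed_scores or not unique:
                continue
            doc_scores[d] = sum(word_scores[w] for w in unique) / len(unique)
    return doc_scores, word_scores

# Hypothetical data: s* are labelled source-domain (movie) texts,
# t* are unlabelled target-domain (electronics) texts.
DOCS = {
    "s1": ["great", "plot"],
    "s2": ["boring", "plot"],
    "t1": ["great", "battery"],
    "t2": ["boring", "battery"],
}
doc_scores, word_scores = propagate(DOCS, {"s1": 1.0, "s2": -1.0})
print(doc_scores["t1"], doc_scores["t2"])
```

Words shared by both domains ("great", "boring") carry the labels across, while domain-specific words such as "battery" end up with whatever score their texts give them.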

18.
Song Shuangyong, Li Qiudan, Lu Dongyuan. 《计算机科学》 (Computer Science), 2012, 39(105): 226-228, 260
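Microblogging is an emerging online platform for information exchange that has attracted a growing number of users in recent years. The brevity of its messages and the diversity of its distribution channels have made microblogs an important way for netizens to browse information about hot events and express personal views. Analysing and monitoring the sentiment contained in microblog content reveals how much attention the public pays to a particular hot event and how its sentiment changes, which helps in assessing and tracking how the event develops. A microblog-oriented sentiment analysis method for hot events is therefore proposed. The method first automatically mines the multiple aspects of a hot event that users focus on, then performs sentiment analysis and sentiment trend monitoring for each aspect, and finally implements a prototype system that visualises sentiment trends for hot events. A case study verifies the effectiveness of microblog information for sentiment analysis and monitoring of online hot events.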

19.
Nowadays, Big Data, a large volume of both structured and unstructured data, is generated from social media. Social media are powerful marketing tools, and social big data can offer business insights. The major challenge facing social big data is finding efficient techniques to collect a large volume of social data and extract insights from it. Sentiment analysis of social big data can provide business insights by extracting public opinions. Traditional analytic platforms need to be scaled up to analyze a large volume of social big data. Social data are by nature short and generally not written with proper grammar, so it is difficult to achieve highly reliable results in sentiment analysis. Acquiring effective training data is a challenge, although learning-based approaches are good for sentiment classification; manual labeling of training data is time- and labor-consuming. In this paper, a sentiment analysis system on a Big Data analytics platform is proposed to provide valuable information by analyzing large-scale social data in an efficient and timely manner; it is implemented using a MapReduce framework and Hadoop distributed storage (HDFS). The proposed system consists of four modules: data collection, data cleaning and preprocessing, class labeling, and sentiment classification. The system enables high-performance sentiment classification by combining the effortless setup of a lexicon-based classifier with a learning-based classifier. Twitter stream data is used for system evaluation because Twitter is a widespread social medium and a good source of information, offering snapshots of moods and feelings as well as up-to-date events. The evaluation results show that this system achieves a promising accuracy of 84.2%. Moreover, the system is able to scale up to analyze large-scale data, with processing time decreasing as more nodes are added to the cluster.
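The lexicon-based class labeling step of this abstract fits the MapReduce pattern naturally. The sketch below simulates the map, shuffle and reduce phases in a single Python process; it does not use Hadoop or any Hadoop API, and the lexicon and tweets are hypothetical.

```python
from collections import defaultdict

# Hypothetical word polarities, for illustration only.
LEXICON = {"love": 1, "great": 1, "worst": -1, "hate": -1}

def mapper(tweet):
    """Map: label one tweet with the lexicon and emit (label, 1)."""
    score = sum(LEXICON.get(word, 0) for word in tweet.lower().split())
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    yield label, 1

def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Reduce: count the tweets that received each label."""
    return key, sum(values)

def run_job(tweets):
    """Run the three phases in order on an in-memory list of tweets."""
    pairs = [pair for tweet in tweets for pair in mapper(tweet)]
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())

TWEETS = ["I love this phone", "worst service ever",
          "love love it", "the bus is late"]
print(run_job(TWEETS))
```

Because each tweet is labelled independently in the map phase, the work can be split across nodes, which is consistent with the abstract's observation that adding nodes reduces processing time.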

20.
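Objective: Automatic rumor detection is crucial. Many rumor detection methods exist, but they have two limitations: 1) they consider only text content and ignore auxiliary multimodal information that can help identify rumors; 2) they focus only on time-series models that capture the temporal features of rumor events and do not adequately study the local and global information of events. To overcome these limitations, make effective use of multimodal post information and combine multiple encoding strategies to build a representation of each news event, this paper proposes a novel social media rumor detection method based on a multimodal multi-level event network. Method: A multimodal post embedding layer uses both textual and visual content; the multimodal post embeddings are fed into a multi-level event encoding network that jointly applies several encoding strategies to describe event features in a coarse-to-fine manner. Results: Extensive experiments on the Twitter and Pheme datasets show that the proposed multimodal multi-level event network improves accuracy by more than 4% over existing methods such as SVM-TS (support vector machine-time structure), CNN (convolutional neural network), GRU (gated recurrent unit), CallAtRumors and MKEMN (multimodal knowledge-aware event memory network). Conclusion: The proposed rumor detection model models the global, temporal and local information of each event and improves rumor detection performance.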
