Similar literature
Found 20 similar documents.
1.
Twitter is a vibrant platform that offers a quick and effective way to analyze users' perceptions of activities on social media. Many researchers and industry experts have turned their attention to Twitter sentiment analysis to identify stakeholder groups. Sentiment analysis requires advanced approaches, including the adoption of data sentiment analysis methods and various machine learning tools. This work assesses sentiment analysis in multiple fields that affect people, in real time, using Naive Bayes and Support Vector Machine (SVM) classifiers. The paper focuses on analysing prominent sentiment techniques on tweet behaviour datasets from various domains such as healthcare and behaviour estimation. In addition, the results explore and validate statistical machine learning classifiers, reporting the accuracy percentages attained for positive, negative and neutral tweets. We obtained a Twitter Application Programming Interface (API) account and implemented the sentiment analysis approach in Python to compute a measure of users' perceptions; the approach extracts a massive number of tweets and provides market value to the Twitter account owner. To differentiate the results in terms of performance evaluation, an error analysis investigates the features relevant to various stakeholders, including social media analytics researchers, Natural Language Processing (NLP) developers, engineering managers and experts involved in decision-making.

2.
Twitter messages are increasingly used to determine consumer sentiment towards a brand. The existing literature on Twitter sentiment analysis uses various feature sets and methods, many of which are adapted from more traditional text classification problems. In this research, we introduce an approach to supervised feature reduction using n-grams and statistical analysis to develop a Twitter-specific lexicon for sentiment analysis. We augment this reduced Twitter-specific lexicon with brand-specific terms for brand-related tweets. We show that the reduced lexicon set, while significantly smaller (only 187 features), reduces modeling complexity, maintains a high degree of coverage over our Twitter corpus, and yields improved sentiment classification accuracy. To demonstrate the effectiveness of the devised Twitter-specific lexicon compared to a traditional sentiment lexicon, we develop comparable sentiment classification models using SVM. We show that the Twitter-specific lexicon is significantly more effective in terms of classification recall and accuracy metrics. We then develop sentiment classification models using the Twitter-specific lexicon and the DAN2 machine learning approach, which has demonstrated success in other text classification problems. We show that DAN2 produces more accurate sentiment classification results than SVM while using the same Twitter-specific lexicon.
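
The supervised n-gram feature reduction described in this abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors used, so the chi-square ranking below is an assumption, and the function names and the four toy tweets are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=2):
    """Set of all 1..max_n-grams in a lowercased, whitespace-tokenized tweet."""
    tokens = text.lower().split()
    feats = set()
    for n in range(1, max_n + 1):
        feats.update(ngrams(tokens, n))
    return feats


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of a 2x2 table.

    n11: positive tweets with the feature, n10: negative tweets with it,
    n01: positive tweets without it,       n00: negative tweets without it.
    """
    total = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0
    return total * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(labeled_tweets, top_k):
    """Rank n-gram features by chi-square against the binary label; keep top_k."""
    pos_total = sum(1 for _, y in labeled_tweets if y == 1)
    neg_total = len(labeled_tweets) - pos_total
    pos_df, neg_df = Counter(), Counter()
    for text, y in labeled_tweets:
        # Document frequency: each feature counts once per tweet.
        (pos_df if y == 1 else neg_df).update(extract_features(text))
    scores = {}
    for f in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[f], neg_df[f]
        scores[f] = chi_square(n11, n10, pos_total - n11, neg_total - n10)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))[:top_k]


# Hypothetical toy corpus: label 1 = positive, 0 = negative.
tweets = [
    ("love this phone", 1),
    ("love the camera", 1),
    ("hate this phone", 0),
    ("hate the battery", 0),
]
lexicon = select_features(tweets, top_k=2)
```

On this toy corpus the two retained features are the words that separate the classes perfectly, while neutral words such as "this" and "phone" score zero and are dropped. A real lexicon would of course be built from a much larger labeled corpus.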

3.
In emergencies, Twitter is an important platform for gaining situational awareness in real time. Information about Twitter users' locations is therefore fundamental to understanding the effects of a disaster. But location extraction is a challenging task: most Twitter users do not share their locations in their tweets. Various methods have been proposed for location extraction, drawing on fields such as statistics and machine learning. This study is a sample study that uses geo-tagged tweets to demonstrate the importance of location in disaster management by considering three cases. In our study, tweets are obtained using the "earthquake" keyword to determine the location of Twitter users. Tweets are evaluated using the Latent Dirichlet Allocation (LDA) topic model and sentiment analysis through machine learning classification algorithms including Multinomial and Gaussian Naïve Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, Extra Trees, Neural Network, k Nearest Neighbor (kNN), Stochastic Gradient Descent (SGD), and Adaptive Boosting (AdaBoost). In total, 10 different machine learning algorithms are applied to sentiment analysis of location-specific disaster-related tweets, aiming at a fast and correct response in a disaster situation. In addition, the effectiveness of each algorithm is evaluated in order to identify the most suitable machine learning algorithm. Moreover, topic extraction via LDA is provided to help understand the situation after a disaster. The results obtained from the three cases indicate that the Multinomial Naïve Bayes and Extra Trees algorithms give the best results, with an F-measure above 80%. The study aims to support a quick response to earthquakes by applying the aforementioned techniques.
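
For readers unfamiliar with the F-measure reported here, the sketch below computes precision, recall and the balanced F1 score from raw counts. It assumes the abstract's "F-measure" means F1 (the abstract does not say), and the counts are hypothetical, not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one classifier on one sentiment class.
precision, recall, f1 = precision_recall_f1(tp=85, fp=15, fn=20)
```

With these made-up counts the F1 score comes out slightly above 0.8, which is the kind of value the abstract describes as the best result.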

4.
Sentiment analysis focuses on identifying and classifying the sentiments expressed in text messages and reviews. Social networks like Twitter, Facebook, and Instagram generate heaps of data filled with sentiments, and the analysis of such data is very fruitful when trying to improve the quality of both products and services. Classic machine learning techniques have a limited capability to efficiently analyze such large amounts of data and produce precise results; they are thus supported by deep learning models to achieve higher accuracy. This study proposes a combined convolutional neural network and long short-term memory (CNN-LSTM) deep network for performing sentiment analysis on Twitter datasets. The performance of the proposed model is compared with that of machine learning classifiers, including the support vector classifier, random forest (RF), stochastic gradient descent (SGD), logistic regression, a voting classifier (VC) of RF and SGD, and state-of-the-art classifier models. Furthermore, two feature extraction methods (term frequency-inverse document frequency and word2vec) are investigated to determine their impact on prediction accuracy. Three datasets (US airline sentiments, women's e-commerce clothing reviews, and hate speech) are used to evaluate the performance of the proposed model. Experimental results demonstrate that the CNN-LSTM achieves higher accuracy than the other classifiers.

5.
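LSTM-based sentiment analysis of product reviews   Cited by: 1 (self-citations: 0, citations by others: 1)
With the development of e-commerce, a large volume of product review text has been generated. Given the short-text nature of product reviews, sentiment classification methods based on sentiment lexicons rely heavily on sentiment database resources, while machine learning methods require complex manual feature design and extraction. This paper proposes using a Long Short-Term Memory (LSTM) text classification algorithm for sentiment orientation analysis. First, Word2vec and word segmentation are used to convert short review texts into word vectors that a computer can process; these are fed into an LSTM network, with Dropout added to prevent overfitting, to obtain the final classification model. Experiments show that, in deep-learning-based sentiment orientation analysis of product reviews, exploiting the LSTM network's distinctive short-term memory characteristic achieves very good sentiment classification results, with accuracy above 99%.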

6.
Twitter has emerged as a platform whose users produce new data every day that can be used for various purposes. People express their unique ideas and views on multiple topics, thus providing vast knowledge. Sentiment analysis is critical from corporate and political perspectives as it can impact decision-making. Since the spread of COVID-19, detecting the sentiment of COVID-19-related tweets has become an important challenge so that people's opinions can be tracked. The purpose of this research is to detect people's sentiment regarding this problem with limited data, which can be challenging considering the various textual characteristics that must be analyzed. Hence, this research presents a deep learning-based model that leverages the advantages of random minority oversampling combined with class label analysis to achieve the best results for sentiment analysis. This research specifically focuses on using class label analysis to deal with the multiclass problem by combining class labels with a similar overall sentiment. This can be particularly helpful when dealing with smaller datasets. Furthermore, our proposed model integrates various preprocessing steps with random minority oversampling and various deep learning algorithms, including standard and bidirectional deep learning algorithms. This research explores several algorithms and their impact on sentiment analysis tasks and concludes that bidirectional neural networks do not provide any advantage over standard neural networks, as standard neural networks provide slightly better results than their bidirectional counterparts. The experimental results validate that our model offers excellent results, with a validation accuracy of 92.5% and an F1 measure of 0.92.
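
The two data-preparation ideas named in this abstract, merging class labels with a similar overall sentiment and random minority oversampling, can be sketched as follows. The label names, the merge mapping and the sample texts are hypothetical; the abstract does not list the actual classes.

```python
import random
from collections import Counter

# Hypothetical mapping: fine-grained labels with a similar overall
# sentiment are merged into one coarse class.
MERGE = {
    "extremely positive": "positive",
    "positive": "positive",
    "neutral": "neutral",
    "negative": "negative",
    "extremely negative": "negative",
}


def merge_labels(samples):
    """Replace each fine-grained label by its merged class."""
    return [(text, MERGE[label]) for text, label in samples]


def random_oversample(samples, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in samples:
        by_class.setdefault(label, []).append((text, label))
    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced


raw = [
    ("great news", "extremely positive"),
    ("good", "positive"),
    ("fine", "positive"),
    ("ok", "neutral"),
    ("bad", "negative"),
    ("terrible", "extremely negative"),
]
balanced = random_oversample(merge_labels(raw))
class_counts = Counter(label for _, label in balanced)
```

After merging, the toy data has three classes of sizes 3, 1 and 2, and oversampling brings each to 3. In practice oversampling is normally applied to the training split only, so that duplicated samples do not leak into validation data.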

7.
Deniz Kılınç 《Software》2019,49(9):1352-1364
There are many data sources that produce large volumes of data. The nature of Big Data requires new distributed processing approaches to extract valuable information. Real-time sentiment analysis is one of the most demanding research areas and requires powerful Big Data analytics tools such as Spark. Prior literature surveys have shown that, although there are many conventional sentiment analysis studies, only a few works realize sentiment analysis in real time. One major point that affects the quality of real-time sentiment analysis is the trustworthiness of the generated data. More precisely, it is a valuable research question to determine whether the owner that generates the sentiment is genuine or not. Since data generated by fake personalities may decrease the accuracy of the outcome, a smart/intelligent service that can identify the source of data is one of the key points in the analysis. In this context, we include a fake account detection service in the proposed framework. Both the sentiment analysis and fake account detection systems are trained and tested using the Naïve Bayes model from Apache Spark's machine learning library. The developed system consists of four integrated software components, i.e., (i) a machine learning and streaming service for sentiment prediction, (ii) a Twitter streaming service to retrieve tweets, (iii) a Twitter fake account detection service to assess the owner of the retrieved tweet, and (iv) a real-time reporting and dashboard component to visualize the results of sentiment analysis. The sentiment classification performances of the system in offline and real-time modes are 86.77% and 80.93%, respectively.

8.
9.
卢天兰  陈荔 《计算机应用研究》2021,38(5):1409-1415,1427
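Aspect-based sentiment analysis aims to determine the sentiment polarity of target aspect terms in a sentence, but few studies have examined how dependencies between neighbouring aspect terms in a sentence affect sentiment classification. On this basis, an aspect-level sentiment classification model is proposed that combines an attention-based bidirectional LSTM with a multi-hop end-to-end memory network. First, the sequence learning ability of the Bi-LSTM is used together with an attention mechanism to obtain semantic vector representations; then a multi-hop memory network models the correlations between the target aspect term and the other aspect terms in the sentence to build deep sentiment classification feature vectors, which are fed into a softmax function to obtain the final sentiment polarity classification. The model is evaluated on the restaurant and laptop datasets from the SemEval 2014 task and on a public Twitter dataset, and classification accuracy improves on all three datasets. The experimental results show that the model is effective for aspect-level sentiment classification and that taking inter-aspect dependencies into account benefits sentiment classification.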

10.
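With the rapid growth in the number of Weibo (microblog) users, the emotions and opinions carried in microblog posts have an increasing influence on society; in particular, negative emotions concerning the personal safety of the public may affect social stability, so microblog sentiment analysis is of great significance. Microblog sentiment analysis covers the acquisition of microblog corpora, corpus preprocessing, and sentiment analysis methods; commonly used methods are based on sentiment lexicons, machine learning, and deep learning. With the widespread use of attention mechanisms in NLP, many researchers have begun to integrate attention mechanisms into deep learning models for sentiment analysis, which has greatly improved accuracy. The BERT model proposed by Google is also essentially based on the attention mechanism, and it has achieved breakthrough progress in sentiment analysis.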

11.
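With the growing popularity of social networks, sentiment analysis of Twitter text has become a research hotspot in recent years. The sentiment orientation contained in Twitter text is important for mining user needs and predicting major events. However, because Twitter texts are short and user behaviour is arbitrary, and because most existing sentiment classification methods rely on hand-crafted text features that struggle to capture the deep semantic features implicit in text, it is difficult to improve sentiment classification performance. This paper proposes a Twitter text sentiment classification model based on a convolutional neural network (CNN). The model initializes the word vectors with the word2vec method and uses a CNN to learn deep semantic information in the text, thereby mining the sentiment orientation of Twitter text. Experimental results show that the model achieves a recall of 82.3%, a significant improvement in classification performance over traditional classification methods.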

12.
贾川  方睿  浦东  康刚 《中文信息学报》2019,33(9):123-128
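At present, deep neural network models have achieved good results in text sentiment analysis, but for aspect-related fine-grained sentiment analysis tasks the performance of existing methods still needs improvement. This paper proposes a method for fine-grained sentiment analysis based on a recurrent entity network. Predefined aspect category information is embedded in the network; extended internal memory chains are used to extract the sentiment features related to each aspect category; and dynamic memory units control the long-range dependencies of aspect-related sentiment information. Then, for a given single aspect category, an attention mechanism extracts that category's sentiment features from the internal memory chains for classification. On the Sentihood dataset, the proposed method achieves an improvement of nearly 1 percentage point over the currently most accurate method, and the model also converges faster.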

13.
The emergence of Web 2.0 has drastically altered the way users perceive the Internet, by improving information sharing, collaboration and interoperability. Micro-blogging is one of the most popular Web 2.0 applications, and related services, like Twitter, have evolved into a practical means for sharing opinions on almost all aspects of everyday life. Consequently, micro-blogging web sites have become rich data sources for opinion mining and sentiment analysis. In this setting, text-based sentiment classifiers often prove inefficient, since tweets typically do not consist of representative and syntactically consistent words, due to the imposed character limit. This paper proposes the deployment of original ontology-based techniques towards a more efficient sentiment analysis of Twitter posts. The novelty of the proposed approach is that posts are not simply characterized by a sentiment score, as is the case with machine learning-based classifiers, but instead receive a sentiment grade for each distinct notion in the post. Overall, our proposed architecture results in a more detailed analysis of post opinions regarding a specific topic.

14.
罗浩然  杨青 《计算机应用》2022,42(4):1099-1107
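Sentiment analysis, a subfield of natural language processing (NLP), has progressed from analysis based on sentiment lexicons to machine learning and then deep learning. To address the low accuracy of general-purpose deep learning models used as text classifiers on online review text from a specific domain, the overfitting that occurs during training, and the low coverage and heavy compilation workload of sentiment lexicons, a sentiment analysis model based on a sentiment lexicon and a stacked residual bidirectional long short-term memory (Bi-LSTM) network is proposed. First, the sentiment words in the lexicon are designed to cover the specialized vocabulary of the "educational robot" research field, compensating for the Bi-LSTM model's lack of accuracy on such text. Then, Bi-LSTM and SnowNLP are used to reduce the size of the lexicon that must be compiled. The "memory gate" and "forget gate" structure of the long short-term memory (LSTM) network fully accounts for the relationships between preceding and following words in a review while selectively forgetting some already-analysed words at the right time, thereby avoiding the exploding gradient problem during backpropagation. Introducing the stacked residual Bi-LSTM not only deepens the model to 8 layers but also lets the residual connections avoid the "degradation" problem caused by stacking LSTMs. Finally, the score weights of the two parts are set and adjusted appropriately, the total score is normalized to the interval [0, 1] with the Sigmoid activation function, and the intervals [0, 0.5] and (0.5, 1] are taken to represent negative and positive sentiment respectively, completing the sentiment classification. Experimental results show that, on the "educational robot" review dataset, the proposed model improves sentiment classification accuracy by about 4.5 percentage points over the standard LSTM model and by about 2.0 percentage points over BERT. In summary, the proposed model generalizes sentiment classification methods based on sentiment lexicons and deep learning models; by modifying the sentiment words in the lexicon and appropriately adjusting the structure and depth of the deep learning model, it can be applied to precise sentiment analysis of shopping reviews for all kinds of products on e-commerce platforms, helping companies understand consumers' shopping psychology and market demand while also giving consumers a reference standard for product quality.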
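
The final scoring step described in this abstract (weighting two partial scores, squashing the total with a sigmoid, and splitting at 0.5) can be sketched as below. The weights, the raw score scale and the function names are assumptions for illustration; only the sigmoid normalization and the [0, 0.5] / (0.5, 1] split come from the abstract.

```python
import math


def sigmoid(x):
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(lexicon_score, model_score, w_lexicon=0.5, w_model=0.5):
    """Combine the lexicon and network scores and return (probability, label).

    The weights are hypothetical defaults; the abstract only says they
    are set and adjusted appropriately.
    """
    total = w_lexicon * lexicon_score + w_model * model_score
    p = sigmoid(total)
    # [0, 0.5] is negative, (0.5, 1] is positive, as in the abstract.
    label = "positive" if p > 0.5 else "negative"
    return p, label


p_pos, label_pos = classify(lexicon_score=2.0, model_score=1.0)
p_neg, label_neg = classify(lexicon_score=-2.0, model_score=-1.0)
p_mid, label_mid = classify(lexicon_score=0.0, model_score=0.0)
```

Note that a total score of exactly zero maps to 0.5 and therefore falls into the negative interval under the abstract's split.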

15.
Millions of people are connecting and exchanging information on social media platforms, where interpersonal interactions are constantly being shared. However, due to inaccurate or misleading information about the COVID-19 pandemic, social media platforms became the scene of tense debates between believers and doubters. Healthcare professionals and public health agencies also use social media to inform the public about COVID-19 news and updates. However, they occasionally have trouble managing massive pandemic-related rumors and frauds. One reason is that people share and engage regardless of the information source, assuming the content is unquestionably true. On Twitter, some users use words and phrases literally to convey their views or opinions. Other users, however, choose idioms or proverbs that are implicit and indirect to make a stronger impression on the audience or perhaps to catch their attention. Idioms and proverbs are figurative expressions with a thematically coherent totality that cannot be understood literally. Although more than 10% of tweets contain idioms or slang, most sentiment analysis research focuses on improving the accuracy of various classification algorithms, and little attention has been paid to deciphering the hidden sentiments of the idioms expressed in tweets. This paper proposes a novel data expansion strategy for categorizing tweets concerning COVID-19. The benefits of the suggested method are as follows: 1) no transformer fine-tuning is necessary, 2) the technique solves the fundamental challenge of the manual data labeling process by automating the construction and annotation of the sentiment lexicon, and 3) the method minimizes the error rate in annotating the lexicon and drastically improves the accuracy of tweet sentiment classification.

16.
Big data analytics is one of the fastest-growing research areas and is used in most information technology industries. Big data is created on social websites like Facebook, WhatsApp, Twitter, etc. Opinions about products, persons, initiatives, political issues, research achievements, and entertainment are discussed on social websites. A single data analytics method cannot be applied to all social websites since the data formats differ. Several approaches, techniques, and tools have been used for big data analytics, opinion mining, or sentiment analysis, but the accuracy is yet to be improved. The proposed work performs sentiment analysis on Twitter data for clothing products using Simulated Annealing incorporated with the Multiclass Support Vector Machine (SA-MSVM) approach. SA-MSVM is a hybrid heuristic approach for selecting and classifying text-based sentiment words, following the Natural Language Processing (NLP) process applied to tweets extracted from the Twitter dataset. A simulated annealing algorithm searches for relevant features and selects and identifies the sentiment terms that customers use to criticize products. SA-MSVM is implemented and tested in MATLAB, and the results are verified. The results indicate that SA-MSVM has more potential in sentiment analysis and classification than the existing Support Vector Machine (SVM) approach. SA-MSVM obtained 96.34% accuracy in classifying product reviews compared with the existing systems.

17.
With the increasing usage of drugs to remedy different diseases, drug safety has become crucial over the past few years. Often medicine from several companies is offered for a single disease, involving the same or similar substances with slightly different formulae. Such diversification is both helpful and dangerous, as such medicine may prove more effective or cause side effects in different patients. Despite clinical trials, side effects are reported when the medicine is used by the general public, and several such experiences are shared on social media platforms. A system capable of analyzing such reviews could be very helpful in assisting healthcare professionals and companies in evaluating the safety of drugs after they have been marketed. Sentiment analysis of drug reviews has great potential for providing valuable insights into these cases. Therefore, this study proposes an approach to analyze drug safety reviews using lexicon-based and deep learning techniques. A dataset acquired from 'Drugs.Com', containing reviews of drug-related side effects and reactions, is used for the experiments. A lexicon-based approach, TextBlob, is used to extract the positive, negative or neutral sentiment from the review text. Review classification is achieved using a novel hybrid deep learning model combining convolutional neural networks and a long short-term memory network (CNN-LSTM). The CNN is used at the first level to extract the appropriate features, while the LSTM is used at the second level. Several well-known machine learning models, including logistic regression, random forest, decision tree, and AdaBoost, are evaluated using term frequency-inverse document frequency (TF-IDF), bag of words (BoW), a feature union (TF-IDF + BoW), and lexicon-based methods. Performance analysis against machine learning models, long short-term memory and convolutional neural network models, and state-of-the-art approaches indicates that the proposed CNN-LSTM model shows superior performance, with an accuracy of 0.96. We also performed a statistical significance t-test to show the significance of the proposed CNN-LSTM model in comparison with other approaches.

18.
A major problem in monitoring the online reputation of companies, brands, and other entities is that entity names are often ambiguous (apple may refer to the company, the fruit, the singer, etc.). The problem is particularly hard in microblogging services such as Twitter, where texts are very short and there is little context to disambiguate. In this paper we address the filtering task of determining, out of a set of tweets that contain a company name, which ones do refer to the company. Our approach relies on the identification of filter keywords: those whose presence in a tweet reliably confirms (positive keywords) or rules out (negative keywords) that the tweet refers to the company. We describe an algorithm to extract filter keywords that does not use any previously annotated data about the target company. The algorithm allows 58% of the tweets to be classified with 75% accuracy, and those can be used to feed a machine learning algorithm to obtain a complete classification of all tweets with an overall accuracy of 73%. In comparison, a 10-fold validation of the same machine learning algorithm provides an accuracy of 85%, i.e., our unsupervised algorithm has a 14% loss with respect to its supervised counterpart. Our study also shows that (i) filter keywords for Twitter do not directly derive from the public information about the company on the Web: a manual selection of keywords from relevant web sources only covers 15% of the tweets with 86% accuracy; (ii) filter keywords can indeed be a productive way of classifying tweets: the five best possible keywords cover, on average, 28% of the tweets for a company in our test collection.
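
The filter-keyword idea in this abstract can be illustrated with a short sketch. The keyword sets and tweets are hypothetical, and the rule for tweets that match both or neither set (leave them undecided for a downstream classifier) is an assumption consistent with, but not spelled out in, the abstract. The last lines check the reported 14% loss as a relative drop from 85% to 73% accuracy.

```python
def classify_with_filter_keywords(tweet, positive_keywords, negative_keywords):
    """Return 'related', 'unrelated', or None when the keywords do not decide."""
    tokens = set(tweet.lower().split())
    has_pos = bool(tokens & positive_keywords)
    has_neg = bool(tokens & negative_keywords)
    if has_pos and not has_neg:
        return "related"
    if has_neg and not has_pos:
        return "unrelated"
    return None  # undecided: left to a downstream machine learning classifier


# Hypothetical filter keywords for the ambiguous name "apple".
POSITIVE = {"iphone", "ipad", "macbook"}
NEGATIVE = {"pie", "juice", "fruit"}

results = [
    classify_with_filter_keywords(t, POSITIVE, NEGATIVE)
    for t in ("new apple iphone launch", "apple pie recipe", "i like apple")
]

# Relative loss of the unsupervised pipeline (73%) against the
# supervised 10-fold baseline (85%), as reported in the abstract.
relative_loss = (0.85 - 0.73) / 0.85
```

The relative loss evaluates to roughly 0.141, which matches the 14% figure in the abstract; the absolute gap is 12 percentage points.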

19.
The current educational disruption caused by the COVID-19 pandemic has fuelled a plethora of investments in, and the use of, educational technologies for Emergency Remote Learning (ERL). Despite the significance of online learning for ERL across most educational institutions, there are widely mixed perceptions about online learning during this pandemic. This study therefore aims to examine public perception of online learning for ERL during COVID-19. The study sample included 31,009 English-language tweets extracted and cleaned using the Twitter API, Python libraries and NVivo, from 10 March 2020 to 25 July 2020, using the keywords: COVID-19, Corona, e-learning, online learning, distance learning. The collected tweets were analysed using word frequencies of unigrams and bigrams, sentiment analysis, topic modelling, sentiment labelling, and cluster and trend analysis. The results identified more positive and negative sentiments within the dataset and identified topics. Further, the identified topics, which are learning support, COVID-19, online learning, schools, distance learning, e-learning, students, and education, were clustered among each other. The number of daily COVID-19-related cases had a weak linear relationship with the number of online learning tweets due to the low number of tweets during the vacation period from April to June 2020. The number of tweets increased during the early weeks of July 2020 as a result of the increasing number of mixed reactions to the reopening of schools. The study findings and recommendations underscore the need for educational systems, government agencies, and other stakeholders to practically implement online learning measures and strategies for ERL in the quest to reopen schools.
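
The unigram and bigram word-frequency step mentioned in this abstract can be sketched with the Python standard library alone. The tokenizer and the three sample tweets are hypothetical; the study's own cleaning pipeline (Twitter API, Python libraries and NVivo) is not described in enough detail to reproduce.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens, keeping inner hyphens and
    apostrophes so that 'e-learning' and 'covid-19' stay whole."""
    return re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())


def ngram_counts(texts, n):
    """Count n-grams (as tuples of tokens) over a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical sample tweets.
tweets = [
    "Online learning during COVID-19 is hard",
    "online learning works for me",
    "E-learning and online learning",
]
unigrams = ngram_counts(tweets, 1)
bigrams = ngram_counts(tweets, 2)
top_bigram = bigrams.most_common(1)[0]
```

Here `top_bigram` is the pair ("online", "learning") with a count of 3, since it occurs once in each sample tweet.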

20.
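Aspect-level sentiment analysis is a fine-grained sentiment classification task. To address the inability of traditional neural network models to construct aspect sentiment features accurately, a long short-term memory neural network model incorporating multiple attention mechanisms and aspect context (LSTM-MATT-AC) is proposed. Different types of attention mechanisms are added at different positions of a bidirectional long short-term memory (LSTM) network, making full use of the advantages of multiple attention mechanisms so that the model can attend to the sentiment information of a specific aspect in a sentence from different perspectives, compensating for the shortcomings of a single attention mechanism. In addition, aspect-context semantic information encoded independently by the bidirectional LSTM is fused in to obtain deeper sentiment features and effectively identify the sentiment polarity of a specific aspect. Finally, experiments on the SemEval2014 Task4 and Twitter datasets verify the effectiveness of the different attention mechanisms and of independent context processing for aspect-level sentiment analysis models. The experimental results show that the model achieves accuracies of 80.6%, 75.1% and 71.1% on the Restaurant, Laptop and Twitter datasets respectively, a further improvement in accuracy over previous neural-network-based sentiment analysis models.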
