Twitter has emerged as a platform whose users produce new data every day, and this data can be used for many purposes. People express their ideas and views on multiple topics, providing a vast body of knowledge. Sentiment analysis is critical from corporate and political perspectives because it can influence decision-making. Since the spread of COVID-19, detecting the sentiment of COVID-19-related tweets has become an important challenge so that people's opinions can be tracked. The purpose of this research is to detect public sentiment on this topic with limited data, which is challenging given the various textual characteristics that must be analyzed. To this end, the research presents a deep learning-based model that combines random minority oversampling with class label analysis to improve sentiment analysis results. It focuses specifically on using class label analysis to handle the multiclass problem by merging class labels that share a similar overall sentiment, which can be particularly helpful for smaller datasets. The proposed model integrates various preprocessing steps with random minority oversampling and several deep learning algorithms, both standard and bidirectional. The research explores these algorithms and their impact on sentiment analysis tasks and concludes that bidirectional neural networks offer no advantage over standard neural networks, which give slightly better results than their bidirectional counterparts. The experimental results show that the model achieves a validation accuracy of 92.5% and an F1 measure of 0.92.
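
As an illustration only, the two data-side ideas named in this abstract (merging class labels with a similar overall sentiment, then random minority oversampling) could be sketched as follows. The five label names and the `MERGE` mapping are assumptions made for the example; the abstract does not say which labels were merged, and this is not the authors' implementation.

```python
import random
from collections import Counter

# Hypothetical mapping that folds five fine-grained labels into three.
# The abstract does not list the actual labels; these are placeholders.
MERGE = {
    "extremely positive": "positive",
    "positive": "positive",
    "neutral": "neutral",
    "negative": "negative",
    "extremely negative": "negative",
}


def merge_labels(labels, mapping=MERGE):
    """Replace each label with its merged class; unknown labels pass through."""
    return [mapping.get(label, label) for label in labels]


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


texts = ["a", "b", "c", "d", "e"]
labels = ["positive", "extremely positive", "negative", "neutral", "positive"]
merged = merge_labels(labels)
balanced_texts, balanced_labels = random_oversample(texts, merged)
print(Counter(balanced_labels))
```

Oversampling would normally be applied to the training split only, so that duplicated samples do not leak into validation data.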

The educational disruption caused by the COVID-19 pandemic has fuelled a plethora of investments in, and use of, educational technologies for Emergency Remote Learning (ERL). Despite the significance of online learning for ERL across most educational institutions, perceptions of online learning during the pandemic are widely mixed. This study therefore examines public perception of online learning for ERL during COVID-19. The study sample comprised 31,009 English-language tweets, extracted and cleaned using the Twitter API, Python libraries and NVivo, posted from 10 March 2020 to 25 July 2020 and matching the keywords COVID-19, Corona, e-learning, online learning and distance learning. The collected tweets were analysed using word frequencies of unigrams and bigrams, sentiment analysis, topic modelling, sentiment labelling, cluster analysis and trend analysis. The results identified more positive and negative sentiments within the dataset and the identified topics. Further, the identified topics (learning support, COVID-19, online learning, schools, distance learning, e-learning, students and education) clustered with one another. The number of daily COVID-19 cases had a weak linear relationship with the number of online learning tweets, owing to the low number of tweets during the vacation period from April to June 2020. The number of tweets increased during the early weeks of July 2020 as a result of the growing number of mixed reactions to the reopening of schools. The study findings and recommendations underscore the need for educational systems, government agencies and other stakeholders to put online learning measures and strategies for ERL into practice as schools reopen.
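
The unigram and bigram word-frequency step mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the tokenizer and the sample tweets are assumptions for the example and do not reproduce the study's cleaning pipeline.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters, digits, apostrophes and hyphens."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


def ngram_counts(tweets, n):
    """Count n-grams (as tuples of tokens) across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        # zip over n shifted copies of the token list yields the n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Made-up sample tweets, for illustration only.
sample = ["Online learning is hard", "online learning works for me"]
unigrams = ngram_counts(sample, 1)
bigrams = ngram_counts(sample, 2)
print(bigrams.most_common(1))
```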

Twitter messages are increasingly used to determine consumer sentiment towards a brand. The existing literature on Twitter sentiment analysis uses various feature sets and methods, many of which are adapted from more traditional text classification problems. In this research, we introduce an approach to supervised feature reduction using n-grams and statistical analysis to develop a Twitter-specific lexicon for sentiment analysis. We augment this reduced Twitter-specific lexicon with brand-specific terms for brand-related tweets. We show that the reduced lexicon set, while significantly smaller (only 187 features), reduces modeling complexity, maintains a high degree of coverage over our Twitter corpus, and yields improved sentiment classification accuracy. To demonstrate the effectiveness of the devised Twitter-specific lexicon compared to a traditional sentiment lexicon, we develop comparable sentiment classification models using SVM. We show that the Twitter-specific lexicon is significantly more effective in terms of classification recall and accuracy metrics. We then develop sentiment classification models using the Twitter-specific lexicon and the DAN2 machine learning approach, which has demonstrated success in other text classification problems. We show that DAN2 produces more accurate sentiment classification results than SVM while using the same Twitter-specific lexicon.

Deniz Kılınç 《Software》2019,49(9):1352-1364
Many data sources produce large volumes of data. The nature of Big Data requires new distributed processing approaches to extract valuable information. Real-time sentiment analysis is one of the most demanding research areas and requires powerful Big Data analytics tools such as Spark. Prior literature surveys have shown that, although there are many conventional sentiment analysis studies, only a few works perform sentiment analysis in real time. One major factor affecting the quality of real-time sentiment analysis is confidence in the generated data. More precisely, it is a valuable research question whether the account that generates a sentiment is genuine. Since data generated by fake personalities may decrease the accuracy of the outcome, an intelligent service that can identify the source of data is a key part of the analysis. In this context, we include a fake account detection service in the proposed framework. Both the sentiment analysis and the fake account detection systems are trained and tested using the Naïve Bayes model from Apache Spark's machine learning library. The developed system consists of four integrated software components, i.e., (i) a machine learning and streaming service for sentiment prediction, (ii) a Twitter streaming service to retrieve tweets, (iii) a Twitter fake account detection service to assess the owner of the retrieved tweet, and (iv) a real-time reporting and dashboard component to visualize the results of sentiment analysis. The sentiment classification performance of the system is 86.77% in offline mode and 80.93% in real-time mode.
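
To make the classifier mentioned above concrete, here is a small multinomial Naïve Bayes with Laplace smoothing written in plain Python. It is a sketch of the general technique only: the paper uses the Naïve Bayes model from Apache Spark's machine learning library, and the class name `TinyNaiveBayes`, the whitespace tokenization and the training sentences below are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior plus smoothed log likelihood of each known word
            score = math.log(n_docs / total_docs)
            for word in doc.lower().split():
                if word in self.vocab:
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
docs = ["good great fun", "great happy", "bad awful", "awful sad bad"]
labels = ["pos", "pos", "neg", "neg"]
model = TinyNaiveBayes().fit(docs, labels)
print(model.predict("great fun"), model.predict("bad sad"))
```

Words never seen in training are skipped at prediction time, which is one common convention among several.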

Social media sites and applications, including Facebook, YouTube, Twitter and blogs, have become major attractions today. The huge amount of information from this medium has become an attractive resource for organisations to monitor the opinions of users, and it is therefore receiving a lot of attention in the field of sentiment analysis. Early work on sentiment analysis approached the problem at document level, identifying the overall sentiment rather than its details. This research applies aspect-based sentiment analysis to Twitter in order to perform a finer-grained analysis. A new hybrid sentiment classification for Twitter is proposed by embedding a feature selection method. A comparison of the classification accuracy obtained with the principal component analysis (PCA), latent semantic analysis (LSA) and random projection (RP) feature selection methods is presented in this paper. Furthermore, the hybrid sentiment classification was validated using Twitter datasets representing different domains, and the evaluation with different classification algorithms also demonstrated that the new hybrid approach produced meaningful results. The implementations showed that the new hybrid sentiment classification was able to improve the accuracy performance from the existing baseline sentiment classification methods by 76.55, 71.62 and 74.24%, respectively.

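Aspect-level sentiment analysis is a fine-grained sentiment classification task. To address the inability of traditional neural network models to accurately construct aspect-specific sentiment features, a long short-term memory neural network model that fuses multiple attention mechanisms and aspect context (LSTM-MATT-AC) is proposed. Different types of attention mechanisms are added at different positions of a bidirectional long short-term memory (LSTM) network, exploiting the advantages of multiple attention mechanisms so that the model can attend to the sentiment information of a specific aspect in a sentence from different perspectives, which compensates for the shortcomings of a single attention mechanism. In addition, the semantic information of the aspect context, encoded independently by the bidirectional LSTM, is fused in to obtain deeper sentiment features and effectively identify the sentiment polarity of a specific aspect. Finally, experiments on the SemEval2014 Task4 and Twitter datasets verify the effectiveness of the different attention mechanisms and of the independent context processing for aspect-level sentiment analysis models. The experimental results show that the model achieves accuracies of 80.6%, 75.1% and 71.1% on the Restaurant, Laptop and Twitter domain datasets, respectively, a further improvement in accuracy over previous neural network-based sentiment analysis models.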

Twitter is a vibrant platform that offers a quick and effective way to analyze users' perceptions of activities on social media. Many researchers and industry experts have turned to Twitter sentiment analysis to understand stakeholder groups. Sentiment analysis requires advanced approaches, including the adoption of data sentiment analysis and various machine learning tools. This work assesses sentiment analysis in multiple fields in real time using Naive Bayes and Support Vector Machine (SVM). The paper focuses on analysing well-known sentiment techniques on tweet behaviour datasets from various spheres such as healthcare and behaviour estimation. In addition, the results explore and validate the statistical machine learning classifiers, reporting the accuracy percentages attained for positive, negative and neutral tweets. In this work, we used a Twitter Application Programming Interface (API) account and programmed in Python a sentiment analysis approach that computationally measures users' perceptions, extracts a massive number of tweets and provides market value to the Twitter account proprietor. To distinguish the results in terms of performance evaluation, an error analysis investigates the features relevant to various stakeholders, comprising social media analytics researchers, Natural Language Processing (NLP) developers, engineering managers and experts involved in decision-making.

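With the growing popularity of social networks, sentiment analysis of Twitter text has become a research hotspot in recent years. The sentiment orientation contained in Twitter text is important for mining user needs and for predicting major events. However, Twitter texts are short and user behaviour is casual, and most existing sentiment classification methods rely on hand-crafted text features, which makes it hard to mine the deep semantic features implicit in the text and therefore hard to improve sentiment classification performance. This paper proposes a Twitter text sentiment classification model based on a convolutional neural network. The model uses word2vec to initialize the word vectors of the text and a CNN to learn the deep semantic information in the text, thereby mining the sentiment orientation of Twitter text. The experimental results show that the model achieves a recall of 82.3%, a significant improvement in classification performance over traditional classification methods.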

Millions of people connect and exchange information on social media platforms, where interpersonal interactions are constantly shared. However, because of inaccurate or misleading information about the COVID-19 pandemic, social media platforms became the scene of tense debates between believers and doubters. Healthcare professionals and public health agencies also use social media to inform the public about COVID-19 news and updates, but they occasionally have trouble managing the mass of pandemic-related rumors and frauds. One reason is that people share and engage regardless of the information source, assuming the content is unquestionably true. On Twitter, some users use words and phrases literally to convey their views or opinions. Other users choose idioms or proverbs, which are implicit and indirect, to make a stronger impression on the audience or to catch its attention. Idioms and proverbs are figurative expressions with a thematically coherent totality that cannot be understood literally. Although more than 10% of tweets contain idioms or slang, most sentiment analysis research focuses on improving the accuracy of various classification algorithms, and little attention has been paid to deciphering the hidden sentiments of the idioms expressed in tweets. This paper proposes a novel data expansion strategy for categorizing tweets concerning COVID-19. The benefits of the suggested method are as follows: 1) no transformer fine-tuning is necessary, 2) the technique solves the fundamental challenge of the manual data labeling process by automating the construction and annotation of the sentiment lexicon, 3) the method minimizes the error rate in annotating the lexicon and drastically improves the accuracy of tweet sentiment classification.

COVID-19 has become one of the most severe diseases of recent years. As it majorly affects the livelihood of people across the world, it is essential for administrators and healthcare professionals to be aware of the views of the community so as to monitor the severity of the spread of the outbreak. Public opinions are shared in enormous numbers on microblogging media such as Twitter, which is considered one of the popular sources for collecting public opinion on any topic, such as politics, sports or entertainment. This work presents a combination of an Intensity Based Emotion Classification Convolution Neural Network (IBEC-CNN) model and Non-negative Matrix Factorization (NMF) for detecting and analyzing the different topics discussed in COVID-19 tweets as well as the intensity of the emotional content of those tweets. The topics are identified using NMF, and the emotions are classified using the pretrained IBEC-CNN based on predefined intensity scores. The research aimed to identify the emotions in Indian tweets related to COVID-19 and to produce a list of topics discussed by users during the COVID-19 pandemic. Using the Twitter Application Programming Interface (Twitter API), a large number of COVID-19 tweets were retrieved between January and July 2020. The extracted tweets are analyzed for the emotions fear, joy, sadness and trust with the proposed pretrained IBEC-CNN model. Each classified tweet is given an intensity score from 1 to 3, with 1 denoting low, 2 moderate and 3 high intensity of the emotion. NMF is employed to identify the topics in the tweets and the themes of those topics. The emotion analysis of COVID-19 tweets found that positive tweets outnumber negative tweets during the period considered and that negative tweets related to COVID-19 make up less than 5%. Also, more than 75% of the negative tweets expressing sadness and fear are of low intensity. A qualitative analysis was also conducted, and the detected topics are grouped into themes such as economic impacts, case reports, treatments, entertainment and vaccination. The results show that issues related to the pandemic are expressed with different emotions on Twitter, which helps in interpreting public insights during the pandemic; these results are beneficial for planning the dissemination of factual health statistics to build people's trust. The performance comparison shows that the proposed IBEC-CNN model outperforms the conventional models and achieves 83.71% accuracy. The percentage of COVID-19 tweets discussing each topic varies from 7.45% to 26.43% across the topics Economy, Statistics on Cases, Government/Politics, Entertainment, Lockdown, Treatments and Virtual Events. The fewest tweets discussed politics/government, whereas the most discussed topic was treatments.

Web-blogging sites such as Twitter and Facebook are heavily influenced by emotions, sentiments, and data in the modern era. Twitter, a widely used microblogging site where individuals share their thoughts in the form of tweets, has become a major source for sentiment analysis. In recent years, there has been a significant increase in demand for sentiment analysis to identify and classify opinions or expressions in text or tweets. Opinions or expressions of people about a particular topic, situation, person, or product can be identified from sentences and divided into three categories: positive for good, negative for bad, and neutral for mixed or confusing opinions. The process of analyzing changes in sentiment and the combination of these categories is known as "sentiment analysis." In this study, sentiment analysis was performed on a dataset of 90,000 tweets using both deep learning and machine learning methods. The deep learning-based long short-term memory (LSTM) model performed better than the machine learning approaches: LSTM achieved 87% accuracy, and the support vector machine (SVM) classifier achieved slightly worse results at 86%. The study also tested binary classes of positive and negative, where LSTM and SVM both achieved 90% accuracy.

Social media, especially Twitter, is now one of the most popular platforms where people can freely express their opinions. However, it is difficult to extract important summary information from the many millions of tweets sent every hour. In this work we propose a new concept, sentimental causal rules, and techniques for extracting sentimental causal rules from textual data sources such as Twitter, which combine sentiment analysis and causal rule discovery. Sentiment analysis refers to the task of extracting public sentiment from textual data. The value of sentiment analysis lies in its ability to reflect popularly voiced perceptions that are stated in natural language. Causal rules, on the other hand, indicate associations between different concepts in a context where one (or several) concept(s) cause(s) the other(s). We believe that sentimental causal rules are an effective summarization mechanism that combines causal relations among different aspects extracted from textual data with the sentiment embedded in these causal relationships. To show the effectiveness of sentimental causal rules, we conducted experiments on Twitter data collected on the Kurdish political issue in Turkey, which has been the subject of heated public debate for many years. Our experiments on Twitter data show that sentimental causal rule discovery is an effective method to summarize information about important aspects of an issue on Twitter, which may further be used by politicians for better policy making.

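Most current visual sentiment analysis methods for images build visual sentiment feature representations mainly from the image as a whole, yet local regions that contain objects often convey sentiment more prominently. To address the neglect of local-region sentiment representation in visual sentiment analysis, a visual sentiment analysis method that embeds both whole-image features and local object features is proposed. The method combines the whole image and local regions to mine the sentiment representation in an image: it first uses an object detection model to locate the local regions that contain objects, then extracts sentiment features of those regions with a deep neural network, and finally uses the deep features extracted from the whole image together with the local-region features to jointly train an image sentiment classifier and predict the sentiment polarity of the image. The experimental results show that the proposed method achieves sentiment classification accuracies of 75.81% and 78.90% on the real-world datasets Twitter Ⅰ and Twitter Ⅱ, respectively, higher than methods that analyze sentiment from whole-image features alone or from local-region features alone.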

In emergencies, Twitter is an important platform for gaining situational awareness as events unfold. Information about Twitter users' locations is therefore fundamental to understanding the effects of a disaster. Location extraction is a challenging task, however, because most Twitter users do not share their locations in their tweets. Different methods have been proposed for location extraction, drawing on fields such as statistics and machine learning. This sample study uses geo-tagged tweets to demonstrate the importance of location in disaster management by considering three cases. Tweets are obtained using the "earthquake" keyword to determine the location of Twitter users. The tweets are evaluated using the Latent Dirichlet Allocation (LDA) topic model and sentiment analysis with machine learning classification algorithms including Multinomial and Gaussian Naïve Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, Extra Trees, Neural Network, k Nearest Neighbor (kNN), Stochastic Gradient Descent (SGD), and Adaptive Boosting (AdaBoost). Thus, 10 different machine learning algorithms are applied to sentiment analysis of location-specific, disaster-related tweets, aiming at a fast and correct response in a disaster situation. In addition, the effectiveness of each algorithm is evaluated in order to identify the right machine learning algorithm. Moreover, topic extraction via LDA is provided to help comprehend the situation after a disaster. The results from the three cases indicate that the Multinomial Naïve Bayes and Extra Trees algorithms give the best results, with an F-measure above 80%. The study aims to support a quick response to earthquakes by applying the aforementioned techniques.
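
For readers unfamiliar with the F-measure reported above, the following sketch computes precision, recall and their harmonic mean (F1) for one class. The function name and the toy labels are assumptions for illustration; the abstract does not state how the study averaged scores across classes.

```python
def f_measure(true_labels, predicted_labels, positive):
    """F1 score for the class `positive`: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: one true positive, one false positive, one false negative.
truth = ["pos", "pos", "neg", "neg"]
guess = ["pos", "neg", "pos", "neg"]
score = f_measure(truth, guess, positive="pos")
print(score)
```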

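Novel coronavirus pneumonia (COVID-19) is an acute infectious pneumonia caused by the novel coronavirus; it is highly contagious, and the population is generally susceptible. Predicting the number of COVID-19 infections therefore not only helps the state make scientific decisions in the face of the epidemic but also helps integrate epidemic-prevention resources in a timely manner. This paper proposes a hybrid SEIR-ARIMA model, built from the traditional SEIR epidemic dynamics model and the autoregressive integrated moving average (ARIMA) model, to predict and analyze the COVID-19 epidemic in different periods and locations. The experimental results show that prediction based on the hybrid SEIR-ARIMA model performs better than the methods commonly used for COVID-19 prediction: logistic regression (Logistic), the long short-term memory (LSTM) neural network, the SEIR model and the ARIMA model. To check whether the improvement truly stems from combining SEIR with ARIMA, the paper also implements hybrid SEIR-Logistic and SEIR-LSTM models; the comparison shows that SEIR-ARIMA achieves better prediction results than both. The analysis of the COVID-19 trend based on the hybrid SEIR-ARIMA model is therefore relatively reliable, supports scientific decision-making by the state in the face of the epidemic, and has good application value for preventing other types of infectious diseases in China in the future.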
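
The SEIR component named in this abstract is the standard four-compartment model (susceptible, exposed, infectious, removed). A minimal forward-Euler simulation is sketched below to show its structure. All parameter values are arbitrary placeholders, the ARIMA half of the hybrid is not shown, and nothing here reflects how the paper actually couples the two models.

```python
def seir_simulate(s, e, i, r, beta, sigma, gamma, days, dt=1.0):
    """Forward-Euler integration of the standard SEIR equations:
        dS/dt = -beta * S * I / N
        dE/dt =  beta * S * I / N - sigma * E
        dI/dt =  sigma * E - gamma * I
        dR/dt =  gamma * I
    Returns a list of (S, E, I, R) tuples, one per step plus the start."""
    n = s + e + i + r  # total population, constant in this model
    history = [(s, e, i, r)]
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        history.append((s, e, i, r))
    return history


# Placeholder parameters, chosen only to make the example run.
history = seir_simulate(s=990, e=5, i=5, r=0,
                        beta=0.5, sigma=0.2, gamma=0.1, days=30)
print(history[-1])
```

In a hybrid of the kind described, a statistical model such as ARIMA could for instance be fitted to the residuals between observed counts and the SEIR curve, but that coupling is an assumption here rather than something the abstract specifies.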

Empirical mode decomposition (EMD) has been successfully applied to adaptively decompose economic and financial time series for forecasting purposes. Recently, variational mode decomposition (VMD) has been proposed as an alternative to EMD that easily separates tones of similar frequencies in data where EMD fails. The purpose of this study is to present a new time series forecasting model that integrates VMD and a general regression neural network (GRNN). The performance of the proposed model is evaluated by comparing the forecasting results of VMD-GRNN with three competing prediction models, namely the EMD-GRNN model, feedforward neural networks (FFNN), and the autoregressive moving average (ARMA) process. The West Texas Intermediate (WTI), Canadian/US exchange rate (CANUS), US industrial production (IP) and Chicago Board Options Exchange NASDAQ 100 Volatility Index (VIX) time series are used for the experiments. Based on the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean squared error (RMSE), the forecasting results demonstrate the superiority of the VMD-based method over the three competing prediction approaches. The practical analysis results suggest that VMD is an effective and promising technique for the analysis and prediction of economic and financial time series.
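
The three error measures used in this comparison have standard definitions, sketched below in plain Python. The sample series is made up for illustration, and MAPE is expressed in percent, which is a common convention but an assumption with respect to this paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mae(actual, forecast), mape(actual, forecast), rmse(actual, forecast))
```

Lower values are better for all three; RMSE penalizes large individual errors more heavily than MAE.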

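To obtain more discriminative visual features and improve sentiment classification, a visual sentiment analysis model that fuses dual-attention multi-layer features is constructed. A convolutional neural network extracts multi-channel, multi-level features from the image; a spatial attention mechanism assigns spatial attention weights to the multi-channel low-level features, and a channel attention mechanism assigns channel attention weights to the multi-channel high-level features, strengthening the feature representation at each level. The strengthened high-level and low-level features are then fused to form discriminative features for training a sentiment classifier. Comparative experiments on three real-world datasets, Twitter Ⅰ, Twitter Ⅱ and EmotionROI, show that the model achieves classification accuracies of 79.83%, 78.25% and 49.34%, respectively, effectively improving visual sentiment analysis for social media.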

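Traditional aspect-level sentiment analysis methods pay little attention to the interaction between aspect entities and their surrounding context, which leads to low sentiment classification accuracy. To extract text features effectively, an aspect-level sentiment analysis model, the intra&inter multi-head attention network (IIMAN), is proposed; it uses multi-head attention to learn the relationship between aspect entities and their context and thereby improves sentiment polarity prediction. The model first uses pretrained BERT to convert the input sentence into word vectors; it then learns the relationships between the aspect entity and the context, and within the context itself, through the intra multi-head attention and inter multi-head attention of the attention network; finally, it completes sentiment polarity classification through a point-wise convolution transformation layer, an aspect-entity-oriented attention layer and an output layer. Experiments on three public aspect-level sentiment analysis datasets, Twitter, laptop and restaurant, show that IIMAN further improves accuracy and F1 score over other baseline models and effectively improves sentiment polarity classification.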

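Aspect-specific sentiment classification is a fine-grained task in the field of sentiment analysis. Deep neural networks can better extract context features and aspect features, and the attention mechanism can assign weights according to the differing importance of context features and aspect features. Focusing on extracting context and aspect features and on better fusing the context and aspect vectors, a deep neural network with hybrid extraction and multi-layer attention is proposed. Because both Bi-LSTM and CNN are notably effective at feature extraction, a model that merges the two networks is introduced. Finally, the model is validated on the classic Laptop, Restaurant and Twitter datasets and shows better classification performance than the baseline models.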

Big data analytics is one of the fastest-growing research areas used in most information technology industries. Big data is created on social websites such as Facebook, WhatsApp and Twitter, where opinions about products, persons, initiatives, political issues, research achievements and entertainment are discussed. A single data analytics method cannot be applied across social websites because their data formats differ. Several approaches, techniques and tools have been used for big data analytics, opinion mining and sentiment analysis, but accuracy still needs to be improved. The proposed work performs sentiment analysis on Twitter data about clothing products using Simulated Annealing combined with a Multiclass Support Vector Machine (SA-MSVM). SA-MSVM is a hybrid heuristic approach for selecting and classifying text-based sentiment words, following a Natural Language Processing (NLP) process applied to tweets extracted from the Twitter dataset. A simulated annealing algorithm searches for relevant features and selects and identifies the sentiment terms in customers' criticism. SA-MSVM is implemented and tested in MATLAB, and the results are verified. The results indicate that SA-MSVM has more potential in sentiment analysis and classification than the existing Support Vector Machine (SVM) approach. SA-MSVM obtained 96.34% accuracy in classifying product reviews compared with the existing systems.
