Similar literature
20 similar articles found (search time: 31 ms)
1.
Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of the languages used on social media make it very challenging to identify high-value topics. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics from tweets posted on Twitter over a period of time. We observe that each of the ranking methods tested has its own strengths and weaknesses, and that our proposed ‘Joint’ ranking method is able to take advantage of the strengths of the individual ranking methods. This ‘Joint’ ranking method, coupled with an unsupervised topic clustering model, is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers to gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for the future development of expert and intelligent systems.

2.
Sentiment information about social media posts is increasingly considered an important resource for customer segmentation, market understanding, and tackling other socio-economic issues. However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains a challenging task for practitioners. Some slang words are available in existing sentiment lexicons, but because new slang keeps being generated alongside emerging memes, a dedicated lexicon will be useful for researchers and practitioners. To this end, we propose to build a slang sentiment dictionary to aid sentiment analysis. It is laborious and time-consuming to collect a comprehensive list of slang words and label their sentiment polarity. We present an approach that leverages web resources to construct a Slang Sentiment Dictionary (SlangSD) that is easy to expand. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly built slang sentiment word dictionary, for sentiment classification, and provide examples demonstrating its ease of use with a sentiment analysis system.

3.
People express their opinions about things like products, celebrities and services using social media channels. The analysis of this textual content for sentiment is a gold mine for marketing experts as well as for research in the humanities; automatic sentiment analysis is therefore a popular area of applied artificial intelligence. The chief objective of this paper is to investigate automatic sentiment analysis of social media content over various text sources and languages. The comparative findings of the investigation may give useful insights to artificial intelligence researchers who develop sentiment analyzers for a new textual source. To achieve this, we describe supervised machine learning based systems that perform sentiment analysis, and we comparatively evaluate them on seven publicly available English and Hungarian databases, which contain text documents taken from Twitter and product review sites. We discuss the differences among these text genres and languages in terms of document- and target-level sentiment analysis.

4.
Crisis events such as terrorist attacks are extensively commented upon on social media platforms such as Twitter. For this reason, social media content posted during emergency events is increasingly being used by news media and in social studies to characterize the public’s reaction to those events. This is typically achieved either by having journalists select ‘representative’ tweets to show, or by using a classifier trained on prior human-annotated tweets to provide a sentiment/emotion breakdown for the event. However, social media users, journalists and annotators do not exist in isolation; they each have their own context and world view. In this paper, we ask the question, ‘to what extent do local and international biases affect the sentiments expressed on social media and the way that social media content is interpreted by annotators?’ In particular, we perform a multi-lingual study spanning two events and three languages. We show that there are marked disparities between the emotions expressed by users in different languages for an event. For instance, during the 2016 Paris attack, there were 16% more negative comments written in English than in French, even though the event took place on French soil. Furthermore, we observed that sentiment biases also affect annotators from those regions, which can negatively impact the accuracy of social media labelling efforts. This highlights the need to consider the sentiment biases of users in different countries, both when analysing events through the lens of social media and when using social media as a data source for training automatic classification models.

5.
Li Zuhe, Fan Yangyu, Jiang Bin, Lei Tao, Liu Weihua 《Multimedia Tools and Applications》2019,78(6):6939-6967

Social media sentiment analysis (also known as opinion mining), which aims to extract people’s opinions, attitudes and emotions from social networks, has become a research hotspot. Conventional sentiment analysis concentrates primarily on textual content. However, multimedia sentiment analysis has begun to receive attention since visual content such as images and videos is becoming a new medium for self-expression in social networks. In order to provide a reference for researchers in this active area, we give an overview of the topic and describe the algorithms of sentiment analysis and opinion mining for social multimedia. After a brief review of textual sentiment analysis for social media, we present a comprehensive survey of visual sentiment analysis on the basis of a thorough investigation of the existing literature. We further summarize existing studies on multimodal sentiment analysis, which combines multiple media channels. Finally, we summarize the existing benchmark datasets in this area and discuss future research trends and potential directions for multimedia sentiment analysis. This survey covers 100 articles published during 2008–2018 and categorizes existing studies according to the approaches they adopt.


6.
Sentiment analysis for social media and online documents has been a burgeoning area in text mining for the last decade. However, email sentiment analysis has not been studied and examined thoroughly, even though email is one of the most ubiquitous means of communication. In this research, a hybrid sentiment analysis framework for email data is proposed. It uses the term frequency-inverse document frequency (TF-IDF) term weighting model for feature extraction, and k-means labeling combined with a support vector machine (SVM) classifier for sentiment classification. Empirical results indicate that the proposed framework yields comparatively better classification results than other combinations.
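The first two stages named in this abstract (TF-IDF weighting, then k-means labeling) can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the `tf * log(N / df)` weighting variant, the choice of the first `k` documents as initial centroids, and the function names `tfidf_vectors` and `kmeans_labels` are all assumptions made for illustration. The SVM stage is omitted; in practice the cluster labels produced here would serve as training labels for a classifier.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting.

    Assumption: whitespace tokenization and this particular TF-IDF variant;
    the abstract does not specify either.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append([
            (counts[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ])
    return vocab, vectors


def kmeans_labels(vectors, k=2, iterations=10):
    """Plain k-means; the first k vectors seed the centroids (an assumption
    made so that the result is deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    emails = [
        "great helpful reply",
        "late rude reply",
        "great helpful service",
        "late rude service",
    ]
    _, vecs = tfidf_vectors(emails)
    print(kmeans_labels(vecs, k=2))
```

Seeding the centroids from the first documents keeps the sketch reproducible; a real pipeline would more likely use randomized or k-means++ initialization.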

7.
Due to the advancement of technology and globalization, it has become much easier for people around the world to express their opinions through social media platforms. Harvesting opinions through sentiment analysis from people with different backgrounds and from different cultures via social media platforms can help modern organizations, including corporations and governments, to understand customers, make decisions, and develop strategies. However, the multiple languages used on many social media platforms make it difficult to perform sentiment analysis with acceptable levels of accuracy and consistency. In this paper, we propose a bilingual approach to conducting sentiment analysis on both Chinese and English social media to obtain more objective and consistent opinions. Instead of processing English and Chinese comments separately, our approach treats review comments as a stream of text containing both Chinese and English words. That stream of text is segmented by our segment model and trimmed using stop word lists that include both Chinese and English words. The stem words are then converted into feature vectors, to which two interchangeable natural language models, SVM and N-Gram, are applied. Finally, we perform a case study, applying our proposed approach to analyzing movie reviews obtained from social media. Our experiment shows that our proposed approach has a high level of accuracy and is more effective than the existing learning-based approaches.
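The preprocessing described in this abstract (one mixed Chinese-English stream, segmentation, bilingual stop-word trimming, n-gram features) might look roughly like the sketch below. The abstract does not describe the authors' segment model, so a regular expression that emits English words and single Chinese characters stands in for it. The stop-word set and the function names are likewise hypothetical.

```python
import re

# Illustrative bilingual stop words only; the paper's actual lists are not given.
STOP_WORDS = {"the", "is", "a", "的", "了", "很"}

# Stand-in for the paper's segment model (an assumption): runs of ASCII
# letters become one token, and each CJK ideograph becomes its own token.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|[\u4e00-\u9fff]")


def tokenize_mixed(text):
    """Split a mixed Chinese-English string into lower-cased tokens."""
    return [m.group(0).lower() for m in TOKEN_PATTERN.finditer(text)]


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the bilingual stop-word set."""
    return [tok for tok in tokens if tok not in stop_words]


def ngrams(tokens, n=2):
    """Return the list of consecutive n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    review = "The movie 很好看 is great"
    tokens = remove_stop_words(tokenize_mixed(review))
    print(tokens)
    print(ngrams(tokens, n=2))
```

Character-level splitting is the simplest possible treatment of Chinese text; a dictionary-based or statistical word segmenter would normally replace `tokenize_mixed` before the features are passed to a classifier.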

8.
Multilingual text processing is useful because the information content found in different languages is complementary, both regarding facts and opinions. While Information Extraction and other text mining software can, in principle, be developed for many languages, most text analysis tools have only been applied to small sets of languages because the development effort per language is large. Self-training tools obviously alleviate the problem, but even the effort of providing training data and of manually tuning the results is usually considerable. In this paper, we gather insights from various multilingual system developers on how to minimise the effort of developing natural language processing applications for many languages. We also explain the main guidelines underlying our own effort to develop complex text mining software for tens of languages. While these guidelines (most of all, extreme simplicity) can be very restrictive and limiting, we believe we have shown the feasibility of the approach through the development of the Europe Media Monitor (EMM) family of applications (http://emm.newsbrief.eu/overview.html). EMM is a set of complex media monitoring tools that process and analyse up to 100,000 online news articles per day in between twenty and fifty languages. We will also touch upon the kind of language resources that would make it easier for all to develop highly multilingual text mining applications. We will argue that, to achieve this, the most needed resources would be freely available, simple, parallel and uniform multilingual dictionaries, corpora and software tools.

9.
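The rapid development of the Internet and e-commerce has made the web a public platform for communication and exchange. The large volume of online reviews that consumers generate on online platforms has had a broad impact and attracted active attention from experts and scholars, and research on sentiment analysis based on online reviews has developed steadily. In view of this, the paper focuses on sentiment analysis methods based on online reviews and their applications. Building on an overview of these, it analyzes and reflects on the problems in existing research and points out possible future research directions and topics.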

10.
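With the growing demand for multilingual information on the Internet, cross-lingual word embeddings have become an important basic tool and have been successfully applied in natural language processing fields such as machine translation, information retrieval, and text sentiment analysis. Cross-lingual word embeddings are a natural extension of monolingual word embeddings: the cross-lingual representation of words maps different languages into a shared low-dimensional vector space and transfers knowledge between languages, so that word meaning can be captured accurately in a multilingual setting. In recent years, research on cross-lingual word embedding models has been fruitful, and researchers have proposed many methods for generating cross-lingual word embeddings. Through a review of the literature on existing cross-lingual word embedding models, this paper surveys the recent development of cross-lingual word embedding models, methods, and techniques. According to the training method, the approaches are divided into three classes: supervised, unsupervised, and semi-supervised learning. The principles and representative studies of each class are summarized and compared in detail. Finally, the paper outlines the evaluation and applications of cross-lingual word embeddings and analyzes the challenges they face and future directions of development.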

11.
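To address the insufficiently comprehensive measurement of public sentiment in existing stock market prediction research, a stock market prediction model based on social sentiment analysis is proposed. The model first performs sentiment analysis on social media data using a securities sentiment quantification method based on a heterogeneous graph model, obtaining a quantified sentiment time series. It then models the sentiment series and the market index series with a self-organizing neural network model in order to predict the stock index. Experimental results on a dataset of Chinese social media and stock market data show that, compared with a BP (Back Propagation) neural network, the proposed model improves prediction error and accuracy by 15% and 12% respectively, and predicts the stock index better.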

12.
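Shi Xiayang, Zhang Fengyuan, Yuan Jiaqi, Huang Min 《计算机应用》 (Journal of Computer Applications) 2022,42(11):3379-3385
Offensive speech can have a serious adverse effect on social stability. However, automatic detection of offensive speech currently concentrates on a few high-resource languages, and the lack of sufficient annotated offensive-speech corpora for low-resource languages makes detection difficult. To address this, a cross-lingual unsupervised offensiveness transfer detection method is proposed. First, a multilingual BERT (mBERT) model learns offensive features on a high-resource English dataset, yielding a source model. Then, by analyzing the linguistic similarity between English and Danish, Arabic, Turkish, and Greek, the source model is transferred to these four low-resource languages to achieve automatic detection of offensive speech in them. Experimental results show that, compared with four methods, namely BERT, linear regression (LR), support vector machine (SVM), and multilayer perceptron (MLP), the proposed method improves both the accuracy and the F1 score of offensive speech detection in Danish, Arabic, Turkish, and Greek by nearly 2 percentage points, approaching current supervised detection. This shows that combining cross-lingual model transfer learning with transfer detection can achieve unsupervised offensiveness detection for low-resource languages.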

13.
IT vendors routinely use social media such as YouTube not only to disseminate their IT product information, but also to acquire customer input efficiently as part of their market research strategies. Customer responses that appear in social media, however, are typically unstructured; thus, a fairly large data set is needed for meaningful analysis. Although identifying customers’ value structures and attitudes may be useful for developing targeted or niche markets, the unstructured and volume-heavy nature of customer data prohibits efficient and economical extraction of such information. Automatic extraction of customer information would be valuable in determining value structure and strength. This paper proposes an intelligent method of estimating causality between user profiles, value structures, and attitudes based on the replies and published content managed by open social network systems such as YouTube. To show the feasibility of the idea proposed in this paper, information richness and agility are used as underlying concepts to create performance measures based on media/information richness theory. The resulting deep sentiment analysis proves to be superior to legacy sentiment analysis tools for estimation of causality among the focal parameters.

14.
15.
The field of sentiment analysis (SA) has grown in tandem with the use of social networking platforms to exchange opinions and ideas. Many people around the world share their views and ideas through social media such as Facebook and Twitter. The goal of opinion mining, commonly referred to as sentiment analysis, is to categorise and predict the opinion expressed about a target. Text documents or sentences can be classified according to whether they express a positive or negative view of a given topic. Compared with sentiment analysis, text categorization may appear to be a simple process, but a number of challenges have prompted numerous studies in this area. A feature selection-based classification algorithm that combines the firefly-with-Lévy (FFL) technique with a multilayer perceptron (MLP) has been proposed as a way to automate sentiment analysis. In this study, the analysis of online product reviews is enhanced by integrating classification and feature selection. The firefly (FF) algorithm was used to extract features from online product reviews, and a multilayer perceptron (MLP) was used to classify sentiment. The experiment employs two datasets, and the results are assessed using a variety of criteria. Based on these tests, it is possible to conclude that the FFL-MLP algorithm achieves the better classification performance on the Canon (98% accuracy) and iPod (99% accuracy) datasets.

16.
Social media such as forums, blogs and microblogs have been increasingly used for public information sharing and opinion exchange. They have changed the way online communities interact and have led to a new trend of engagement for online retailers, especially on microblogging websites such as Twitter. In this study, we investigated the impact of online retailers' engagement with online brand communities on users' perception of brand image and service. Firstly, we analysed the overall sentiment trends of different brands and the patterns of engagement between companies and customers using tweets collected from the popular social media platform Twitter. Then, we studied how different types of engagement affect customer sentiment. Our analysis shows that engagement has an effect on sentiments associated with the brand image, perception and customer service of the online retailers. Our findings indicate that the level, length, type and attitude of retailers' engagement with social media users have a significant impact on users' sentiments. Based on our results, we derived several important managerial and practical implications.

17.
Sentiment analysis is the natural language processing task dealing with sentiment detection and classification from texts. In recent years, due to the growth in the quantity and fast spread of user-generated content online, and the impact such information has on events, people and companies worldwide, this task has been addressed by an important body of research in the field. Although different methods have been proposed for distinct types of text, the research community has concentrated less on developing methods for languages other than English. In this context, the present work studies the possibility of employing machine translation systems and supervised methods to build models able to detect and classify sentiment in languages for which fewer or no resources are available for this task compared to English, stressing the impact of translation quality on sentiment classification performance. Our extensive evaluation scenarios show that machine translation systems are approaching a good level of maturity and that they can, in combination with appropriate machine learning algorithms and carefully chosen features, be used to build sentiment analysis systems that achieve performance comparable to that obtained for English.

18.
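As Web resources grow ever richer, people need cross-lingual knowledge sharing and information retrieval. A multilingual ontology can be used to describe knowledge of related domains in different languages and to overcome the barriers created by different cultures and languages. Existing methods for building multilingual ontologies are analyzed and compared, and a method for building a multilingual ontology based on a core concept set is proposed. It describes the concepts of a domain using an ontology that is independent of any particular language, together with synonym sets of definitions and vocabulary drawn from different natural languages. An ontology built with this method has good extensibility, expressiveness, and reasoning capability, and the method is particularly suitable for creating large ontologies in distributed environments.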

19.
Language Resources and Evaluation - This paper describes the development of a multilingual, manually annotated dataset for three under-resourced Dravidian languages generated from social media...

20.
Effective crisis management has long relied on both the formal and informal response communities. Social media platforms such as Twitter increase the participation of the informal response community in crisis response. Yet, challenges remain in realizing the formal and informal response communities as a cooperative work system. We demonstrate a supportive technology that recognizes the existing capabilities of the informal response community to identify needs (seeker behavior) and provide resources (supplier behavior), using their own terminology. To facilitate awareness and the articulation of work in the formal response community, we present a technology that can bridge the differences in terminology and understanding of the task between the formal and informal response communities. This technology includes our previous work using domain-independent features of conversation to identify indications of coordination within the informal response community. In addition, it includes a domain-dependent analysis of message content (drawing from the ontology of the formal response community and patterns of language usage concerning the transfer of property) to annotate social media messages. The resulting repository of annotated messages is accessible through our social media analysis tool, Twitris. It allows recipients in the formal response community to sort on resource needs and availability along various dimensions including geography and time. Thus, computation indexes the original social media content and enables complex querying to identify contents, players, and locations. Evaluation of the computed annotations for seeker-supplier behavior with human judgment shows fair to moderate agreement. 
In addition to the potential benefits to the formal emergency response community regarding awareness of the observations and activities of the informal response community, the analysis serves as a point of reference for evaluating more computationally intensive efforts and characterizing the patterns of language behavior during a crisis.
