Similar literature (20 results)
1.
Due to the advancement of technology and globalization, it has become much easier for people around the world to express their opinions through social media platforms. Harvesting opinions through sentiment analysis from people with different backgrounds and from different cultures via social media platforms can help modern organizations, including corporations and governments, understand customers, make decisions, and develop strategies. However, the multiple languages used on many social media platforms make it difficult to perform sentiment analysis with acceptable levels of accuracy and consistency. In this paper, we propose a bilingual approach to conducting sentiment analysis on both Chinese and English social media to obtain more objective and consistent opinions. Instead of processing English and Chinese comments separately, our approach treats review comments as a stream of text containing both Chinese and English words. That stream of text is then segmented by our segment model and trimmed using stop word lists that include both Chinese and English words. The stem words are then converted into feature vectors, to which two interchangeable natural language models, SVM and N-gram, are applied. Finally, we perform a case study, applying our proposed approach to the analysis of movie reviews obtained from social media. Our experiment shows that our proposed approach has a high level of accuracy and is more effective than the existing learning-based approaches.
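The pipeline described in this abstract (segment a mixed-language stream, drop stop words, build feature vectors) can be sketched roughly as follows. This is only an illustration under stated assumptions: the stop word set is hypothetical, and splitting Chinese text into single characters is a placeholder for the authors' segment model, which the abstract does not describe. The classifier stage (SVM or N-gram) is left out.

```python
import re
from collections import Counter

# Hypothetical bilingual stop word list (illustrative, not the authors' list).
STOP_WORDS = {"the", "is", "a", "and", "的", "了", "是", "很"}

# One token per English word or per single CJK character. Character-level
# splitting is an assumption standing in for a real Chinese segmenter.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|[\u4e00-\u9fff]")


def tokenize(text):
    """Split a mixed Chinese/English string into lower-cased tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(text)]


def remove_stop_words(tokens):
    """Drop tokens that appear in the bilingual stop word list."""
    return [tok for tok in tokens if tok not in STOP_WORDS]


def to_feature_vector(tokens, vocabulary):
    """Bag-of-words counts, one entry per vocabulary term, in vocabulary order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


review = "The movie is great, 剧情很好"
tokens = remove_stop_words(tokenize(review))
vocabulary = sorted(set(tokens))
vector = to_feature_vector(tokens, vocabulary)
```

The resulting vectors could then be passed to whichever classifier is chosen; treating both languages as one token stream is what lets a single vocabulary serve both.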

2.
IT vendors routinely use social media such as YouTube not only to disseminate their IT product information, but also to acquire customer input efficiently as part of their market research strategies. Customer responses that appear in social media, however, are typically unstructured; thus, a fairly large data set is needed for meaningful analysis. Although identifying customers’ value structures and attitudes may be useful for developing targeted or niche markets, the unstructured and volume-heavy nature of customer data prohibits efficient and economical extraction of such information. Automatic extraction of customer information would be valuable in determining value structure and strength. This paper proposes an intelligent method of estimating causality between user profiles, value structures, and attitudes based on the replies and published content managed by open social network systems such as YouTube. To show the feasibility of the idea proposed in this paper, information richness and agility are used as underlying concepts to create performance measures based on media/information richness theory. The resulting deep sentiment analysis proves to be superior to legacy sentiment analysis tools for estimation of causality among the focal parameters.

3.
Supervised learning has attracted much attention in recent years. As a consequence, many of the state-of-the-art algorithms are domain dependent, as they require a labeled training corpus to learn the domain features. Building such labeled corpora is a cumbersome task in itself. However, for text sentiment detection, SentiWordNet (SWN) may be used. It is a vocabulary in which terms are arranged in synonym groups called synsets. This research makes use of SentiWordNet and treats it as the labeled corpus for training. A sentiment dictionary, SentiMI, builds upon the mutual information calculated from these terms. A complete framework is developed by using feature selection and extracting mutual information, from SentiMI, for the selected features. Training, testing, and evaluation of the proposed framework are conducted on a large dataset of 50,000 movie reviews. A notable performance improvement of 7% in accuracy, 14% in specificity, and 8% in F-measure is achieved by the proposed framework as compared to the baseline SentiWordNet classifier. Comparison with the state-of-the-art classifiers is also performed on the widely used Cornell Movie Review dataset, which also demonstrates the effectiveness of the proposed approach.
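The abstract does not say how SentiMI computes mutual information, so the following is only a sketch of the standard pointwise mutual information (PMI) between a term and a sentiment class, estimated from co-occurrence counts. The function name and the count-based estimator are assumptions made for illustration.

```python
import math


def pointwise_mutual_information(n_term_and_class, n_term, n_class, n_total):
    """PMI(t, c) = log2( P(t, c) / (P(t) * P(c)) ), estimated from raw counts.

    n_term_and_class: items containing term t that are labeled with class c
    n_term:           items containing term t
    n_class:          items labeled with class c
    n_total:          all items
    """
    if min(n_term_and_class, n_term, n_class, n_total) <= 0:
        # PMI is undefined for zero counts; smoothing would be needed in practice.
        raise ValueError("all counts must be positive")
    p_joint = n_term_and_class / n_total
    p_term = n_term / n_total
    p_class = n_class / n_total
    return math.log2(p_joint / (p_term * p_class))
```

A score of zero means the term and the class occur together exactly as often as independence predicts; positive scores mark terms that are over-represented in that class, which is what makes them useful as sentiment features.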

4.
Information & Management, 2016, 53(8): 987-996
Social media is a major platform for opinion sharing. In order to better understand and exploit opinions on social media, we aim to classify users with opposite opinions on a topic for decision support. Rather than mining text content, we introduce a link-based classification model, named global consistency maximization (GCM), that partitions a social network into two classes of users with opposite opinions. Experiments on a Twitter data set show that: (1) our global approach achieves higher accuracy than two baseline approaches and (2) link-based classifiers are more robust to small training samples if selected properly.

5.
SAMAR is a system for subjectivity and sentiment analysis (SSA) for Arabic social media genres. Arabic is a morphologically rich language, which presents significant complexities for standard approaches to building SSA systems designed for the English language. Apart from the difficulties presented by processing social media genres, the Arabic language inherently has a high number of variable word forms, leading to data sparsity. In this context, we address the following four pertinent issues: how to best represent lexical information; whether standard features used for English are useful for Arabic; how to handle Arabic dialects; and whether genre-specific features have a measurable impact on performance. Our results show that using either lemma or lexeme information is helpful, as well as using the two part-of-speech tagsets (RTS and ERTS). However, the results show that we need individualized solutions for each genre and task, but that lemmatization and the ERTS POS tagset are present in a majority of the settings.

6.
This paper presents a framework for collecting and analysing large-volume social media content. The real-time analytics framework comprises semantic annotation, Linked Open Data, semantic search, and dynamic result aggregation components. In addition, exploratory search and sense-making are supported through information visualisation interfaces, such as co-occurrence matrices, term clouds, treemaps, and choropleths. There is also an interactive semantic search interface (Prospector), where users can save, refine, and analyse the results of semantic search queries over time. Practical use of the framework is exemplified through three case studies: a general scenario analysing tweets from UK politicians and the public’s response to them in the run-up to the 2015 UK general election; an investigation of attitudes towards climate change expressed by these politicians and the public, via their engagement with environmental topics; and an analysis of public tweets leading up to the UK’s referendum on leaving the EU (Brexit) in 2016. The paper also presents a brief evaluation and discussion of some of the key text analysis components, which are specifically adapted to the domain and task, and demonstrates the scalability and efficiency of our toolkit in the case studies.

7.
Sentiment analysis techniques are increasingly used to grasp reactions from social media users to unexpected and potentially stressful social events. This paper argues that, alongside assessments of the affective valence of social media content as negative or positive, there is a need for a deeper understanding of the context in which reactions are expressed and the specific functions that users' emotional states may reflect. To demonstrate this, we present a qualitative analysis of affective expressions on Twitter collected in Germany during the 2011 EHEC food contamination incident, based on a coding scheme developed from Skinner et al.'s (2003) coping classification framework. Affective expressions of coping were found to be diverse not only in terms of valence but also in the adaptive functions they served: beyond the positive or negative tone, some people perceived the outbreak as a threat, while others perceived it as a challenge to cope with. We discuss how this qualitative sentiment analysis can allow a better understanding of the way the overall situation is perceived (threat or challenge) and the resources that individuals experience having to cope with emerging demands.

8.
Social media websites such as Facebook and Twitter have changed the way people communicate and make decisions. Accordingly, many companies are keen to use these media to improve their reputation. In this paper, a reputation management system is proposed that measures the reputation of a given company using social media data, particularly tweets from Twitter. Given the name of the company and its related tweets, the system determines whether a given tweet has a negative or a positive impact on the company's reputation or product. The proposed method is based on an N-gram learning approach, which consists of two steps: a training step and a test step. In the training step, we consider four profiles for each company: positive, negative, neutral, and irrelevant. 80% of the available tweets are then used to build the companies' profiles. Each profile contains the terms that have appeared in the tweets of each company, together with the terms' frequencies. In the test step, which is performed on the remaining 20% of the tweets in the dataset, each tweet is compared with all of the built profiles, based on a distance criterion, to examine how the given tweet affects a company's reputation. Evaluation of the proposed method indicates that it achieves better recall and precision than previous methods such as neural network and Bayesian methods.
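The profile-and-distance idea in this abstract can be sketched as below. This is a minimal illustration, not the authors' method: the abstract does not name the distance criterion, so cosine distance is an assumption, whitespace unigrams stand in for the N-grams, and the training tweets are made up.

```python
import math
from collections import Counter


def build_profiles(labelled_tweets):
    """Map each label to a Counter of term frequencies over its training tweets."""
    profiles = {}
    for text, label in labelled_tweets:
        profiles.setdefault(label, Counter()).update(text.lower().split())
    return profiles


def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def classify(text, profiles):
    """Return the label whose profile is closest to the tweet."""
    terms = Counter(text.lower().split())
    return min(profiles, key=lambda label: cosine_distance(terms, profiles[label]))


# Hypothetical training tweets, for illustration only.
train = [
    ("great service and great phone", "positive"),
    ("terrible battery and poor support", "negative"),
    ("store opens at nine", "neutral"),
]
profiles = build_profiles(train)
```

With these profiles, `classify("great phone", profiles)` picks the positive profile because it is the only one sharing terms with the tweet. In the setting the abstract describes, one such set of profiles would be built per company from the 80% training portion.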

9.
Deniz Kılınç. Software, 2019, 49(9): 1352-1364
There are many data sources that produce large volumes of data. The nature of Big Data requires new distributed processing approaches to extract valuable information. Real-time sentiment analysis is one of the most demanding research areas and requires powerful Big Data analytics tools such as Spark. Prior literature surveys have shown that, although there are many conventional sentiment analysis studies, only a few works perform sentiment analysis in real time. One major factor that affects the quality of real-time sentiment analysis is the trustworthiness of the generated data. Put more clearly, it is a valuable research question to determine whether the account owner who generates the sentiment is genuine or not. Since data generated by fake personalities may decrease the accuracy of the outcome, a smart/intelligent service that can identify the source of data is one of the key points in the analysis. In this context, we add a fake account detection service to the proposed framework. Both the sentiment analysis and fake account detection systems are trained and tested using the Naïve Bayes model from Apache Spark's machine learning library. The developed system consists of four integrated software components, i.e., (i) a machine learning and streaming service for sentiment prediction, (ii) a Twitter streaming service to retrieve tweets, (iii) a Twitter fake account detection service to assess the owner of the retrieved tweet, and (iv) a real-time reporting and dashboard component to visualize the results of sentiment analysis. The sentiment classification performances of the system in offline and real-time modes are 86.77% and 80.93%, respectively.
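To show what a Naïve Bayes sentiment classifier does, here is a small pure-Python multinomial version with Laplace smoothing. It is a sketch for illustration only: it is not Apache Spark's implementation and does not use Spark's API, the class and parameter names are assumptions, and the four training sentences are made up.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing strength
        self.class_counts = Counter()           # documents seen per class
        self.term_counts = defaultdict(Counter) # term frequencies per class
        self.vocabulary = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.term_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_terms = sum(self.term_counts[label].values())
            denom = total_terms + self.alpha * len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                # Smoothed log likelihood of the token under this class.
                score += math.log((self.term_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, for illustration only.
model = MultinomialNaiveBayes().fit(
    ["love this film", "great acting love it", "hate this film", "boring and awful"],
    ["pos", "pos", "neg", "neg"],
)
```

Working in log space avoids numeric underflow on longer texts, and the smoothing term keeps a single unseen word from zeroing out a class.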

10.
Smog disasters are becoming more and more frequent and may have severe consequences for the environment and public health, especially in urban areas. Social media, as a real-time urban data source, has become an increasingly effective channel for observing people's reactions to smog-related health hazards. It can be used to capture possible smog-related public health disasters in their early stages. We propose a predictive analytic approach that utilizes both social media and physical sensor data to forecast the next day's smog-related health hazard. First, we model smog-related health hazards and smog severity by mining raw microblogging text and network information diffusion data. Second, we develop an artificial neural network (ANN)-based model to forecast the smog-related health hazard from the current health hazard and smog severity observations. We compare the performance of the approach with that of alternative machine learning methods. To the best of our knowledge, we are the first to integrate social media and physical sensor data for smog-related health hazard forecasting. The empirical findings can help researchers better understand the non-linear relationships between the current smog observations and the next day's health hazard. In addition, this forecasting approach can provide decision support for smog-related health hazard management through functions such as early warning.

11.
Very often, correlation analysis of behavioral patterns between social network sites and the society suggests that people's behaviors in social network sites are independent of external influences. Recently, some research works have demonstrated that these assumptions are not always true. The work presented in this paper shows an approach to identifying the possible associations between social network sites and the society. It utilizes the D-Miner service framework, into which different social network analysis tools can be plugged and used. The framework is supported by multiple agents, which include crawlers for different social network sites, schedulers to dispatch user requests, and analysis engines with different analytical algorithms. Two new agents have been developed for association identification: a crawler agent collects incidents in the society, and an association agent identifies which social media messages are correlated with the corresponding incidents. The identified associations can be applied to the evaluation of correlation analysis, such as tracing information propagation between social network sites and the society, and identifying whether or not the correlations of behavioral patterns between social network sites and the society have been dominated by those incidents. The new agents have been tested with satisfactory results in identifying the number of connections that support the association between social network sites and the society.

12.
The 12-month discussion surrounding a regional university campus quickly evolved from a suggestion of independence, to a plan, to the ultimate closure of the university. This unique series of events at the University of South Florida Polytechnic (USFP) allows for an investigation of how various forms of media were used during this significant event, which affected college students' education and immediate future. A campus-wide survey was combined with social and online media monitoring to assess the topics, authors, and methods used in prominent discussions during and preceding the closure of USFP. Although social media played a crucial role, the most common format was Twitter, and it was used almost exclusively by members of the media itself. Students instead relied on traditional sources to gather information. Additionally, students expressed their opinions using classic methods, such as petitions, forgoing more modern Twitter or Facebook campaigns. It is incorrect to automatically assume that a younger demographic authors or uses social media technology. Whereas social media use could expand even more over the next decade, identifying authorship remains critical, as it is unclear how frequently social media is viewed as an official method of public discussion, especially when politics and higher education collide.

13.
Human emotion expressed in social media plays an increasingly important role in shaping policies and decisions. However, the process by which emotion produces influence in online social media networks is relatively unknown. Previous works focus largely on sentiment classification and polarity identification but do not adequately consider the way emotion affects user influence. This research developed a novel framework, a theory-based model, and a proof-of-concept system for dissecting emotion and user influence in social media networks. The system models emotion-triggered influence and facilitates analysis of emotion-influence causality in the context of U.S. border security (using 5,327,813 tweets posted by 1,303,477 users). Motivated by a theory of emotion spread, the model was integrated in an influence-computation method, called the interaction modeling (IM) approach, which was compared with a benchmark using a user centrality (UC) approach based on social positions. IM was found to have identified influential users who are more broadly related to U.S. cultural issues. Influential users tended to express intense emotions of fear, anger, disgust, and sadness. The emotion trust distinguishes influential users from others, whereas anger and fear contributed significantly to causing user influence. The research contributes to incorporating human emotion into the data-information-knowledge-wisdom model of knowledge management and to providing new information systems artifacts and new causality findings for emotion-influence analysis.

14.
YouTube (owned by Google Inc.) is arguably among the most popular social media platforms, used by millions across the globe. It provides an ever-growing, unique, and rich source of content, which presents new opportunities and challenges for information discovery and analysis. It is pertinent to explore and understand a topic via YouTube content to discover interesting information about public opinions and sentiments. This paper presents an integrated framework to facilitate the acquisition, storage, management, processing, and visualization of relevant content, with the objective of assisting in such analysis. It not only collects a significant portion of the content relevant to a given topic in a short time but also offers tools for visual exploratory analysis of (i) temporal evolution, (ii) vocabulary network, (iii) authors' relative popularity and influence, (iv) categories, and (v) user communities and influencers. The utility and effectiveness of the framework are demonstrated through content analysis of a famous YouTube entertainment topic, “Gangnam Style”.

15.
This study extends brand relationship theory to the context of the microblogging platform Twitter. The authors investigate the impact of Twitter trust on users’ intentions to continue using the platform and to “follow” brands that are hosted on Twitter (the trust transfer phenomenon). They also explore the role of perceived self-Twitter personality match in strengthening trust towards the Twitter brand. A cross-cultural American–Ukrainian sample makes it possible to identify potential culture-based differences in brand personality and brand trust concepts. The results show that the positive effect of trust in Twitter on its users’ patronage intentions is robust across two cultures with diverse history and ideology. An important novel finding is the influence of trust in Twitter on patronage intentions towards the businesses hosted on Twitter. However, this relationship reaches statistical significance only in the Ukrainian sample, signaling potential differences in the trust transfer processes in different cultures. The study confirms the role of similarity in personality traits between Twitter users and the Twitter brand in engendering trust in Twitter. The salience of different personality traits in the “personality match – Twitter trust” link for different cultures suggests important implications for global marketers.

16.
The quality of the interpretation of the sentiment in online buzz on social media and in online news can determine the predictability of financial markets and cause huge gains or losses. That is why a number of researchers have lately turned their full attention to the different aspects of this problem. However, to the best of our knowledge, there is no well-rounded theoretical and technical framework for approaching the problem. We believe the lack of clarity on the topic is due to its interdisciplinary nature, which involves at its core both behavioral-economic topics and artificial intelligence. We dive deeper into the interdisciplinary nature and contribute to the formation of a clear frame of discussion. We review the related works on market prediction based on online text mining and produce a picture of the generic components that they all have. Furthermore, we compare each system with the rest and identify their main differentiating factors. Our comparative analysis of the systems extends to the theoretical and technical foundations behind each. This work should help the research community to structure this emerging field and identify the exact aspects that require further research and are of special significance.

17.
18.
In this paper we present a methodology to analyze and visualize streams of social media messages and apply it to a case in which Twitter is used as a backchannel, i.e., as a communication medium through which participants follow an event in the real world as it unfolds. Unlike other methods based on social networks or theories of information diffusion, we do not assume proximity or a pre-existing social structure to model content generation and diffusion by distributed users; instead, we refer to concepts and theories from discourse psychology and conversational analysis to track online interaction and discover how people collectively make sense of novel events through micro-blogging. In particular, the proposed methodology extracts concept maps from Twitter streams and uses a mix of sentiment and topological metrics computed over the extracted concept maps to build visual devices and display the conversational flow, represented as a trajectory through time of automatically extracted topics. We evaluated the proposed method on data collected from Twitter users’ reactions to the March 2015 Apple Keynote, during which the company announced the official launch of several new products.

19.
This study examined the effectiveness of three social media-based recruitment channels for sampling rural adolescent populations for online health research. At present, there is no consensus on the optimal social media-based vehicle for recruiting adolescents, owing to limited research. This exploratory study compared Facebook ads, Twitter, and QR code postcards at three different but demographically similar rural high schools. The results showed that QR codes had the highest response percentage and the lowest cost per recruited participant, whereas Twitter had the lowest response percentage and Facebook had the highest cost per recruited participant. Although this is the first time QR codes have been examined in this context, they seemed to show potential for online health research. The findings are interpreted from a variety of theoretical and conceptual frameworks. Applications of each recruitment channel are discussed and suggestions are provided for future research.
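The two metrics compared in this abstract, response percentage and cost per recruited participant, are simple ratios. The sketch below only shows the arithmetic; the function and parameter names are assumptions, and the abstract reports no figures, so none are reproduced here.

```python
def response_percentage(responses, people_reached):
    """Share of the people reached by a channel who responded, in percent."""
    if people_reached <= 0:
        raise ValueError("people_reached must be positive")
    return 100.0 * responses / people_reached


def cost_per_participant(total_cost, participants):
    """Total spend on a channel divided by the participants it recruited."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return total_cost / participants
```

As a purely hypothetical example, a channel that reaches 200 students and yields 25 responses has a response percentage of 12.5, and if it cost 50 currency units in total, the cost per recruited participant is 2.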

20.
We designed and applied interactive visualisation techniques for investigating how social networks are embedded in time and space, using data collected from smartphone logs. Our interest in spatial aspects of social networks is that they may reveal associations between participants missed by simply making contact through smartphone devices. Four linked and co-ordinated views of spatial, temporal, individual and social network aspects of the data, along with demographic and attitudinal variables, helped add context to the behaviours we observed. Using these techniques, we were able to characterise spatial and temporal aspects of participants’ social networks and suggest explanations for some of them. This provides some validation of our techniques. Unexpected deficiencies in the data that became apparent prompted us to evaluate the dataset in more detail. Contrary to what we expected, we found significant gaps in participant records, particularly in terms of location, a poorly connected sample of participants and asymmetries in reciprocal call logs. Although the data captured are of high quality, deficiencies such as these remain and are likely to have a significant impact on interpretations relating to spatial aspects of the social network. We argue that appropriately designed interactive visualisation techniques, afforded by our flexible prototyping approach, are effective in identifying and characterising data inconsistencies. Such deficiencies are likely to exist in other similar datasets, and although the visual approaches we discuss for identifying data problems may not be scalable, the categories of problems we identify may be used to inform attempts to systematically account for errors in larger smartphone datasets.

