Similar literature
1.
As a new form of social media, microblogging provides a platform on which users can share their feelings and ideas on particular topics. Bursty topics on microblogs arise from emerging issues that quickly attract followers and attention online, and they provide a unique opportunity to gauge the relation between expressed public sentiment and hot topics. This paper presents a Social Sentiment Sensor (SSS) system for Sina Weibo that detects daily hot topics and analyzes the sentiment distributions toward these topics. SSS includes two main techniques: hot topic detection and topic-oriented sentiment analysis. Hot topic detection aims to find the most popular topics online in three steps: topic detection, topic clustering, and topic popularity ranking. We extract topics from hashtags using a hashtag filtering model, because hashtags cover almost all topics. We then cluster the topics that describe the same issue and rank the topic clusters by popularity to obtain the final hot topics. Topic-oriented sentiment analysis aims to analyze public opinion toward the hot topics. After retrieving the topic-related messages, we recognize the sentiment of each message using a state-of-the-art SVM (Support Vector Machine) sentiment classifier. We then summarize the sentiments for each hot topic to obtain its sentiment distribution. Based on this framework and these algorithms, SSS provides a real-time visualization system for monitoring social sentiment, offering the public a new and timely perspective on the dynamics of social topics.
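The last two stages described in this abstract, popularity ranking and per-topic sentiment summarization, can be illustrated with a minimal sketch. It is not the SSS implementation: the function name `rank_topics`, the use of raw message counts as the popularity score, and the assumption that sentiment labels arrive from an upstream classifier are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def rank_topics(messages, top_k=3):
    """Rank topics by message count and summarise sentiment per topic.

    `messages` is an iterable of (topic, sentiment_label) pairs. The labels
    are assumed to come from an upstream classifier (hypothetical input).
    """
    by_topic = defaultdict(Counter)
    for topic, label in messages:
        by_topic[topic][label] += 1

    # Popularity is taken to be the raw message count, an assumption that
    # stands in for whatever popularity score a real system would use.
    # Ties are broken alphabetically so the ordering is deterministic.
    ranked = sorted(by_topic.items(),
                    key=lambda kv: (-sum(kv[1].values()), kv[0]))

    result = []
    for topic, counts in ranked[:top_k]:
        total = sum(counts.values())
        # Normalise label counts into a sentiment distribution.
        distribution = {label: n / total for label, n in counts.items()}
        result.append((topic, total, distribution))
    return result
```

Calling `rank_topics([("a", "pos"), ("a", "neg"), ("a", "pos"), ("b", "neg")])` places topic `a` first with three messages and a distribution of two thirds positive, one third negative.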

2.
User communities in social networks are usually identified from the explicit structural connections between users. While such communities can reveal important information about their members, such as family or friendship ties and geographical proximity, they do not necessarily bring together like-minded users who share the same interests. Researchers have therefore explored the topical similarity of social content to build communities of like-minded users. In this article, following the topic-based approaches, we identify communities of users who share similar topical interests with similar temporal behavior. More specifically, we tackle the problem of identifying temporal (diachronic) topic-based communities, i.e., communities of users who have a similar temporal inclination toward emerging topics. To do so, we use multivariate time series analysis to model the contributions of each user toward emerging topics. Our modeling is completely agnostic to the underlying topic detection method. We extract topics of interest with seminal topic detection methods: one graph-based and two latent Dirichlet allocation-based methods. Through experiments on Twitter data, we demonstrate the effectiveness of the proposed temporal topic-based community detection method in news recommendation, user prediction, and document timestamp prediction, compared with nontemporal as well as state-of-the-art temporal approaches.

3.
Community question answering (CQA) has recently become a popular form of social media in which users can post questions on any topic of interest and get answers from enthusiasts. The variation of topics in questions and answers indicates how users' interests change over time. Detecting hot topics and analyzing the trend of a specific topic can help users focus on the most popular products or events and track how they change. In this paper, we present a hot topic detection and trend analysis system that captures hot topics in a CQA system and tracks their evolution over time. Our system consists of hot term extraction, question clustering, and trend analysis. Experimental results on datasets from Yahoo! Answers show that our system can discover meaningful hot topics. We also show that trend graphs can accurately capture the evolution of topics over time.

4.
Existing studies on hierarchy construction mainly focus on text corpora and indiscriminately mix numerous topics, which increases the likelihood of knowledge acquisition bottlenecks and misconceptions. To address these problems and provide a comprehensive, in-depth representation of domain-specific topics, we propose a novel topic hierarchy construction method with real-time update. The method combines heterogeneous evidence from multiple sources, including folksonomy and encyclopedia, separately in both initial topic hierarchy construction and topic hierarchy improvement. Results of comprehensive experiments indicate that the proposed method significantly outperforms state-of-the-art methods (t-test, p-value < 0.0001); in particular, recall improves by 20.4% to 38.7%.

5.
Microblogging is a popular and open platform for discovering and sharing the latest news about social issues and daily life. Because microblog streams are updated so quickly, an effective tool for monitoring them is urgently needed. Emerging topic tracking is one such tool; it reveals which new events are currently attracting the most attention online. However, because microblog feeds change fast, are noisy, and are short, emerging topic tracking must address two challenges. One is detecting emerging topics early, long before they become hot; the other is effectively monitoring evolving topics over time. In this study, we propose a novel emerging topic tracking method that aligns emerging word detection from a temporal perspective with coherent topic mining from a spatial perspective. Specifically, we first design a metric that estimates word novelty and fading based on locally weighted linear regression (LWLR), which highlights the novelty of words expressing an emerging topic and suppresses the novelty of words expressing an existing topic. We then track emerging topics by leveraging topic novelty and fading probabilities, which are learned by formulating and solving an optimization problem. We evaluate our method on a microblog stream containing over one million feeds. Experimental results show that the proposed method performs promisingly, in both effectiveness and efficiency, at detecting emerging topics and tracking topic evolution over time.
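To make the LWLR idea concrete, the sketch below fits a locally weighted line to a word's frequency series and returns its slope around a chosen time point. This is only an illustration of locally weighted linear regression, not the novelty and fading metric from the paper: the Gaussian kernel, the `bandwidth` parameter, and the reading of a positive slope as "rising" and a negative slope as "fading" are assumptions.

```python
import math


def lwlr_slope(ts, ys, t0, bandwidth=2.0):
    """Slope of a locally weighted linear fit of ys against ts, centred at t0.

    ts: time points; ys: word frequencies at those points (hypothetical data).
    """
    # Gaussian kernel weights (an assumed choice): points near t0 count most.
    ws = [math.exp(-((t - t0) ** 2) / (2 * bandwidth ** 2)) for t in ts]
    sw = sum(ws)

    # Weighted means of time and frequency.
    mean_t = sum(w * t for w, t in zip(ws, ts)) / sw
    mean_y = sum(w * y for w, y in zip(ws, ys)) / sw

    # Weighted least-squares slope = weighted covariance / weighted variance.
    cov = sum(w * (t - mean_t) * (y - mean_y) for w, t, y in zip(ws, ts, ys))
    var = sum(w * (t - mean_t) ** 2 for w, t in zip(ws, ts))
    return cov / var if var > 0 else 0.0
```

For a word whose frequency stays flat and then climbs, the slope evaluated at the most recent time point is larger than the slope evaluated at the start of the series, which is the kind of signal a novelty score could build on.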

6.
Twitter has become one of the most popular social media platforms, widely used for discussion and information dissemination on all kinds of topics. As a result, both business and academia have researched methods to identify the topics being discussed on Twitter. These methods can be employed in a number of applications, including emergency management, advertising, and corporate/government communication. However, deriving topics from this short-text-based and highly dynamic environment remains a huge challenge. Most current methods use the content of tweets as the only source for topic derivation. Recently, tweet interactions have been considered as a way to improve the quality of topic derivation. In this paper, we propose a method that considers both content and interactions, together with a temporal aspect, to further improve the quality of topic derivation. The impact of the temporal aspect in user/tweet interactions is analyzed on several Twitter datasets. The proposed method incorporates time when it clusters tweets and identifies representative terms for each topic. Experimental results show that including the temporal aspect in the interactions significantly improves the quality of topic derivation compared with existing baseline methods.

7.
Evolution of news topics based on the LDA model   Total citations: 1, self-citations: 0, citations by others: 1
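Research on news topics and their evolution helps people quickly understand and access news content. This paper proposes a method for mining how news topics change over time, realizing topic evolution through topic extraction and topic association. First, LDA (Latent Dirichlet Allocation) is applied to the document collections of different time periods to extract topics automatically, with the number of topics allowed to vary across periods; topics are then associated by computing the distribution distance between any two topics in adjacent time periods. Experimental results show that the method can describe not only how the same topic evolves over time but also how topic content changes over time, reflecting the many-to-many evolutionary relationships among topics (or subtopics).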
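The topic association step, computing a distribution distance between every pair of topics in adjacent time periods, can be sketched as follows. The abstract does not say which distance is used, so Jensen-Shannon divergence is an assumption here, as are the `threshold` value and the representation of a topic as a word-to-probability dictionary. Because every pair under the threshold is kept, one topic can link to several successors and vice versa, which matches the many-to-many relationships the abstract mentions.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions.

    p and q are dicts mapping words to probabilities; missing words count as 0.
    The result lies in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    vocab = set(p) | set(q)
    mix = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mix(a):
        # KL(a || mix); terms with a[w] == 0 contribute nothing.
        return sum(a[w] * math.log2(a[w] / mix[w]) for w in a if a[w] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def link_topics(prev_topics, next_topics, threshold=0.5):
    """Return (i, j, distance) for each topic pair closer than `threshold`.

    prev_topics / next_topics: lists of topic-word distributions from two
    adjacent time periods (hypothetical inputs, e.g. from separate LDA runs).
    """
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            d = js_divergence(p, q)
            if d < threshold:
                links.append((i, j, d))
    return links
```

The topic lists may have different lengths, so the number of topics is free to vary between periods.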

8.
Understanding urban dynamics and large-scale human mobility will play a vital role in building smart cities and sustainable urbanization. Existing research in this domain mainly focuses on a single data source (e.g., GPS data or CDR data). In this study, we collect big, heterogeneous data and investigate the relationship between spatiotemporal topics found in geo-tagged tweets and GPS traces from smartphones. We apply Latent Dirichlet Allocation-based topic modeling to geo-tagged tweets to extract and classify the topics. The topics extracted from tweets and the temporal population distribution from GPS traces are then used jointly to model urban dynamics and human crowd flow. The experimental results and validations demonstrate the efficiency of our approach and suggest that fusing cross-domain data for urban dynamics modeling is more practical than previously thought.

9.
This paper addresses the problem of semantics-based temporal expert finding, that is, identifying a person with given expertise for different time periods. Many real-world applications, such as matching reviewers to papers and finding hot topics in newswire articles, need to consider time dynamics. Intuitively, there will be different reviewers and reporters for different topics during different time periods. Traditional approaches used graph-based link structure with keyword-based matching and ignored semantic information, while topic modeling considered semantics-based information but did not simultaneously consider conference influence (richer text semantics and relationships between authors) and time information. Consequently, they fail to find appropriate experts for different time periods. We propose a novel Temporal-Expert-Topic (TET) approach based on Semantics and Temporal Information based Expert Search (STMS) for temporal expert finding, which simultaneously models conference influence and time information. As a result, the occurrence and correlations of topics (semantically related probabilistic clusters of words) change over time, while the meaning of a particular topic remains almost unchanged. Using Bayes' theorem, we can obtain topically related experts for different time periods and show how experts' interests and relationships change over time. Experimental results on a scientific literature dataset show that the proposed generalized time topic modeling approach significantly outperforms the non-generalized time topic modeling approaches, because it captures conference influence and time information simultaneously.

10.
User comments, a large class of online short texts, are becoming increasingly prevalent with the development of online communication. These short texts typically co-occur with normal documents that are usually longer. For example, multiple user comments may follow one news article, or multiple reader reviews may follow one blog post. The co-occurrence structure inherent in such text corpora is important for efficient learning of topics, but it is rarely captured by conventional topic models. To capture this structure, we propose a topic model for co-occurring documents, referred to as COTM. In COTM, we assume there are two sets of topics, formal topics and informal topics: formal topics can appear in both normal documents and short texts, whereas informal topics can appear only in short texts. Each normal document has a probability distribution over the set of formal topics. Each short text is composed of two topics: one from the set of formal topics, whose selection is governed by the topic probabilities of the corresponding normal document, and one from the set of informal topics. We also develop an online algorithm for COTM to deal with large-scale corpora. Extensive experiments on real-world datasets demonstrate that COTM and its online algorithm outperform state-of-the-art methods by discovering more prominent, coherent, and comprehensive topics.

11.
Exploring the spatial and semantical knowledge in social media messages gives us an opportunity to better understand the mobility and activity of users, which can be leveraged to improve the service quality of online applications such as recommender systems. In this paper, we investigate the problem of spatial and semantical label inference, where the challenges come from three aspects: diverse heterogeneous information, uncertainty of individual mobility, and large-scale sparse data. We address these challenges by exploring two types of data fusion: the fusion of heterogeneous social networks and the fusion of heterogeneous features. We build a 4-dimensional tensor, called the spatial–temporal semantical tensor (STST), to model individual mobility and activity by fusing two heterogeneous social networks, a social media network and a location-based social network (LBSN). To address the challenges arising from diverse heterogeneous information and the uncertainty of individual mobility, we construct three types of heterogeneous features and fuse them with STST by exploring their interdependency relationships. In particular, a spatial tendency feature is constructed to constrain the inference of individual mobility and reduce the uncertainty. To deal with large-scale sparse data, we propose a parallel contextual tensor factorization (PCTF) to factorize STST concurrently. Finally, we integrate these components into an inference framework called spatial and semantical label inference (SSLI). The results of extensive experiments on real and synthetic datasets verify the effectiveness and efficiency of SSLI.

12.
In this paper, a novel probabilistic topic model is proposed for mining activities from complex video surveillance scenes. To handle the temporal nature of video data, we devise a dynamical causal topic model (DCTM) that can detect latent topics and the causal interactions between them. The model assumes that all temporal relationships between latent topics at neighboring time steps follow a noisy-OR distribution. The parameter of the noisy-OR distribution is estimated by a data-driven approach based on the nonparametric Granger causality statistic. Furthermore, to analyze convergence during model learning, the Kullback-Leibler divergence between the prior and posterior distributions is calculated. Finally, using the causality matrix learned by DCTM, the total causal influence of each topic is measured. We evaluate the proposed model through experiments on several challenging datasets and demonstrate that it can identify high-influence activities in crowded scenes.
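For readers unfamiliar with the noisy-OR distribution this abstract relies on, the sketch below shows its standard form: a topic at the next time step stays inactive only if every active topic at the current step independently fails to trigger it. It illustrates the generic noisy-OR formula, not DCTM itself; the function name, the `weights` dictionary, and the optional `leak` term are hypothetical.

```python
def noisy_or(active_parents, weights, leak=0.0):
    """P(child is active) under a noisy-OR model.

    active_parents: parents (e.g. topics at time t) that are currently active.
    weights: dict mapping each parent to its probability of activating the
             child on its own (assumed given; the paper estimates them).
    leak: probability that the child activates with no active parent.
    """
    # Start from the probability that the leak does not fire, then multiply
    # in the probability that each active parent fails to trigger the child.
    p_off = 1.0 - leak
    for parent in active_parents:
        p_off *= 1.0 - weights[parent]
    return 1.0 - p_off
```

With two active parents that each trigger the child with probability 0.5 and no leak, the child is active with probability 1 - 0.5 × 0.5 = 0.75.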

13.
Event-related topics in social networking services often epitomize heated social issues, which makes analyzing their evolution patterns significant. In this paper, we present a comprehensive survey of posts about "ransomware" on Sina Weibo, a well-known Chinese social networking service similar to Twitter. The keyword corresponds to a global ransomware attack in May 2017, on which our example event-related topics are based. We collect text data from Sina Weibo and vectorize each post before using a dynamic topic model to discover the event-related topics. The results of the topic model are readily explainable and help us understand the evolution of those topics more thoroughly.

14.
Patterns found in digital trace data are increasingly used as evidence of social phenomena. Still, the role of digital services as mediators of social reality, rather than as mirrors of it, has been neglected. We identify characteristics of this mediation process by analyzing Twitter messages referring to politics during the campaign for the 2013 German federal election and comparing the resulting image of political reality with established measurements of political reality. We do so by examining the relationship between temporal dynamics in politically relevant Twitter messages and crucial campaign events, by comparing dominant topics in politically relevant tweets with topics prominent in surveys and in television news, and by comparing mention shares of political actors with their election results.

15.
贺瑞芳, 王浩成, 刘宏宇, 王博. 《软件学报》 (Journal of Software), 2023, 34(11): 5162-5178
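Social media topic detection aims to mine latent topic information from large collections of short posts. The task is challenging because posts are short and informally expressed, and because user interactions on social media are complex and diverse. Previous work considers only the textual content of posts, or additionally models social context in a homogeneous setting, ignoring the heterogeneity of social networks. However, different kinds of user interaction, such as reposting and commenting, may imply different behavior patterns and interest preferences, reflecting different levels of attention to and understanding of a topic. Moreover, different users influence the development and evolution of the same topic differently: authoritative users who lead a community play a more important role in topic inference than ordinary users. This paper therefore proposes a new multi-view topic model (MVTM) that infers more complete and coherent topics by encoding the heterogeneous social context in microblog conversation networks. First, an attributed multiplex heterogeneous conversation network is built from the interactions between users and decomposed into multiple views with different interaction semantics. Next, taking into account the importance of different interaction types and different users, view-specific embeddings are obtained with neighbor-level and interaction-level attention mechanisms. Finally, a multi-view-driven neural variational inference method is designed to capture the deep correlations between views and adaptively balance their consistency and independence, producing more coherent topics. Experimental results on a three-month Sina Weibo dataset demonstrate the effectiveness of the proposed method.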

16.
Social business intelligence combines corporate data with user-generated content (UGC) to make decision-makers aware of the trends perceived from the environment. A key role in the analysis of textual UGC is played by topics, meant as specific concepts of interest within a subject area. To enable aggregation of topics at different levels, a topic hierarchy has to be defined. Some attempts have been made to address the peculiarities of topic hierarchies, but no comprehensive solution has been found so far. The approach we propose for modeling topic hierarchies in ROLAP systems is called meta-stars. Its basic idea is to couple meta-modeling with navigation tables and dimension tables: navigation tables support hierarchy instances with different lengths and with non-leaf facts, and allow different roll-up semantics to be explicitly annotated; meta-modeling accommodates hierarchy heterogeneity and dynamics; dimension tables are easily integrated with standard business hierarchies. After outlining a reference architecture for social business intelligence and describing the meta-star approach, we formalize its querying expressiveness and give a cost model for the main query execution plans. We then evaluate meta-stars by presenting experimental results on query performance and disk space.

17.
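To address the problems of diversity and tag quality in person tag recommendation, this paper proposes a person tag recommendation method that combines personalization and diversity. The method uses a topic model to model the accounts a user follows and applies cluster analysis to group accounts with similar posts into the same cluster. It then removes redundant tags within each cluster and selects representative tags. Finally, it merges and ranks the tags from different clusters to obtain the top-K tags to recommend to the user. Experimental results show that, compared with existing recommendation methods, the method reflects users' interests while significantly improving both the quality of the recommended tags and the diversity of the recommendations.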

18.
Social media platforms such as Twitter are becoming increasingly mainstream and provide valuable user-generated information through the publishing and sharing of content. Identifying interesting and useful content in large text streams is a crucial issue in social media because many users struggle with information overload. Retweeting, as a forwarding function, plays an important role in information propagation, and the retweet count simply reflects a tweet's popularity. However, the main reasons for retweeting may be limited to personal interest and satisfaction. In this paper, we use topic identification as a proxy for understanding a large number of tweets and score the interestingness of an individual tweet based on its latent topics. Our assumption is that fascinating topics generate content that may be of interest to a wide audience. We propose a novel topic model called Trend Sensitive-Latent Dirichlet Allocation (TS-LDA) that can efficiently extract latent topics from content by modeling temporal trends on Twitter over time. Experimental results on real-world Twitter data demonstrate that the proposed method outperforms several baseline methods.

19.
Nowadays, spatial and temporal data play an important role in social networks. These data are distributed across several heterogeneous data sources. These peculiarities make geographic information retrieval a non-trivial task, considering that spatial data are often unstructured and built by different collaborative communities in social networks. The problem arises when user queries are posed at different levels of semantic granularity, which is typical in social communities whose users have different levels of expertise. In this paper, a novel approach is presented, based on three query-matching layers driven by ontologies over the heterogeneous data sources. A query contextualization technique is proposed for addressing the available heterogeneous data sources, including social networks. It contextualizes a query so that, if one data source does not contain a relevant result, other sources either provide an answer or, in the best case, each adds a relevant answer to the result set. The approach is a collaborative learning system based on the experience level of users in different domains. Retrieval draws on three domains, temporal, geographical, and social, which are involved in the user-content context. The work is oriented towards defining GIScience collaborative learning for geographic information retrieval using social networks, the web, and geodatabases.

20.