20 similar documents found.
1.
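Whether topic popularity prediction methods developed for high self-presentation social media such as Weibo also apply to low self-presentation platforms such as Zhihu remains to be tested. To address this, an indicator system is constructed along the dimensions of opinion leaders and group characteristics, and topic popularity prediction models are built and compared using 11 machine learning regression algorithms from four families: tree-based models, kernel methods, linear models, and neural networks. The results show that, unlike on high self-presentation platforms, linear models perform better, with elastic net regression, which performs feature selection, achieving the best results; group size has little influence on topic popularity, whereas topic type has a large influence.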
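As a rough illustration of why elastic net regression performs feature selection, the sketch below fits an elastic net by coordinate descent in plain Python. It is not the study's model or data: the toy matrix `X`, the target `y`, the penalty settings `lam` and `l1_ratio`, and the function names are all assumptions made for illustration, and the objective is the common form (1/2n)·||y − Xb||² + lam·(l1_ratio·||b||₁ + (1 − l1_ratio)/2·||b||₂²) without an intercept.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam=0.1, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for the elastic net objective (no intercept term)."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            # The L1 part zeroes weak features; the L2 part shrinks the rest.
            coef[j] = soft_threshold(rho, lam * l1_ratio) / (scale + lam * (1 - l1_ratio))
    return coef


# Hypothetical toy data: the target depends only on the first feature.
X = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.1], [4.0, -0.1]]
y = [2.0, 4.0, 6.0, 8.0]
coef = elastic_net(X, y)
```

On this toy data the coefficient of the uninformative second feature is driven exactly to zero, while the first stays slightly below 2 because of the shrinkage penalty.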
2.
3.
Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of languages used on social media are making it very challenging for high-value topics to be identified. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics through analysis of Twitter's tweets from a period of time. It is observed that each of the ranking methods tested has its strengths and weaknesses, and that our proposed ‘Joint’ ranking method is able to take advantage of the strengths of the ranking methods. This ‘Joint’ ranking method coupled with an unsupervised topic clustering model is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers to gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for future development of expert and intelligent systems.
4.
Lu Zhao Lin Yu-Ru Huang Xiaoxia Xiong Naixue Fang Zhijun 《Multimedia Tools and Applications》2017,76(8):10855-10879
Multimedia Tools and Applications - Nowadays, microblogging has become popular, with hundreds of millions of short messages being posted and shared every minute on a variety of topics in social...
5.
Topic models are generative probabilistic models which have been applied to information retrieval to automatically organize and provide structure to a text corpus. Topic models discover topics in the corpus, which represent real world concepts by frequently co-occurring words. Recently, researchers found topics to be effective tools for structuring various software artifacts, such as source code, requirements documents, and bug reports. This research also hypothesized that using topics to describe the evolution of software repositories could be useful for maintenance and understanding tasks. However, research has yet to determine whether these automatically discovered topic evolutions describe the evolution of source code in a way that is relevant or meaningful to project stakeholders, and thus it is not clear whether topic models are a suitable tool for this task. In this paper, we take a first step towards evaluating topic models in the analysis of software evolution by performing a detailed manual analysis on the source code histories of two well-known and well-documented systems, JHotDraw and jEdit. We define and compute various metrics on the discovered topic evolutions and manually investigate how and why the metrics evolve over time. We find that the large majority (87%–89%) of topic evolutions correspond well with actual code change activities by developers. We are thus encouraged to use topic models as tools for studying the evolution of a software system.
6.
7.
8.
Xue Feng Wang Jianwei Qian Shengsheng Zhang Tianzhu Liu Xueliang Xu Changsheng 《Multimedia Tools and Applications》2019,78(1):141-160
Multimedia Tools and Applications - In this paper, we proposed a novel multi-modal max-margin supervised topic model (MMSTM) for social event analysis by jointly learning the representation...
9.
In this paper, we aim to reveal event topics and track the evolutionary trends of social events, and we propose a novel probabilistic topic model for this purpose. The Multi-modal Multi-layered Topic Classification Model (tm_MMC) for Social Event Analysis can reveal visual and non-visual topics by jointly modeling textual and visual information while simultaneously learning and predicting multi-layered category labels. To track the evolutionary trends of the topics online, tm_MMC uses topic intensity and heritability to incrementally build an up-to-date model. To evaluate the effectiveness of our model, we run experiments on a collected dataset and compare the results with those of other traditional models. The results demonstrate the effectiveness and advantages of our model against several state-of-the-art methods.
10.
11.
Golnoosh Farnadi Geetha Sitaraman Shanu Sushmita Fabio Celli Michal Kosinski David Stillwell Sergio Davalos Marie-Francine Moens Martine De Cock 《User Modeling and User-Adapted Interaction》2016,26(2-3):109-142
A variety of approaches have been recently proposed to automatically infer users’ personality from their user generated content in social media. Approaches differ in terms of the machine learning algorithms and the feature sets used, type of utilized footprint, and the social media environment used to collect the data. In this paper, we perform a comparative analysis of state-of-the-art computational personality recognition methods on a varied set of social media ground truth data from Facebook, Twitter and YouTube. We answer three questions: (1) Should personality prediction be treated as a multi-label prediction task (i.e., all personality traits of a given user are predicted at once), or should each trait be identified separately? (2) Which predictive features work well across different on-line environments? and (3) What is the decay in accuracy when porting models trained in one social media environment to another?
12.
Abutiheen Zinah Abdulridha Mohammed Enas Ali Hussein Mohsin Hasan 《International Journal of Speech Technology》2022,25(3):659-666
International Journal of Speech Technology - Social media has allowed all individuals, organizations, and businesses to share their opinions, ideas, and inclinations with others. These opinions...
13.
Yasmin AlNoamany Michele C. Weigle Michael L. Nelson 《International Journal on Digital Libraries》2016,17(3):239-256
An emerging trend in social media is for users to create and publish “stories”, or curated lists of Web resources, with the purpose of creating a particular narrative of interest to the user. While some stories on the Web are automatically generated, such as Facebook’s “Year in Review”, one of the most popular storytelling services is “Storify”, which provides users with curation tools to select, arrange, and annotate stories with content from social media and the Web at large. We would like to use tools, such as Storify, to present (semi-)automatically created summaries of archival collections. To support automatic story creation, we need to better understand as a baseline the structural characteristics of popular (i.e., receiving the most views) human-generated stories. We investigated 14,568 stories from Storify, comprising 1,251,160 individual resources, and found that popular stories (i.e., top 25% of views normalized by time available on the Web) have the following characteristics: 2/28/1950 elements (min/median/max), a median of 12 multimedia resources (e.g., images, video), 38% receive continuing edits, and 11% of their elements are missing from the live Web. We also checked the population of Archive-It collections (3109 collections comprising 305,522 seed URIs) for better understanding the characteristics of the collections that we intend to summarize. We found that the resources in human-generated stories are different from the resources in Archive-It collections. In summarizing a collection, we can only choose from what is archived (e.g., twitter.com is popular in Storify, but rare in Archive-It). However, some other characteristics of human-generated stories will be applicable, such as the number of resources.
14.
Fafalios Pavlos Iosifidis Vasileios Stefanidis Kostas Ntoutsi Eirini 《International Journal on Digital Libraries》2020,21(1):5-17
International Journal on Digital Libraries - How did the popularity of the Greek Prime Minister evolve in 2015? How did the predominant sentiment about him vary during that period? Were there any...
15.
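To address the problems that topic models do not fully account for sentiment polarity information and that the decay factor is set in a single fixed way, an OBTM-based method for bullet-comment (danmaku) topic evolution that incorporates sentiment polarity and an influence function is proposed. A word2vec word-embedding model based on improved negative sampling is proposed to label the sentiment polarity of bullet-comment words; an influence function is designed to reflect the degree of historical influence of text topics across discrete time; and the sentiment polarity features and the influence function are used to improve the OBTM model for analyzing bullet-comment topic evolution. Experimental results show that the improved OBTM effectively improves topic evolution results and can extend the application of bullet comments to the evolution of topic sentiment polarity.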
16.
17.
Nowadays, due to the rapid growth of digital technologies, huge volumes of image data are created and shared on social media sites. User-provided tags attached to each social image are widely recognized as a bridge to fill the semantic gap between low-level image features and high-level concepts. Hence, a combination of images along with their corresponding tags is useful for intelligent retrieval systems, which are designed to gain high-level understanding from images and facilitate semantic search. However, user-provided tags in practice are usually incomplete and noisy, which may degrade the retrieval performance. To tackle this problem, we present a novel retrieval framework that automatically associates the visual content with textual tags and enables effective image search. To this end, we first propose a probabilistic topic model learned on social images to discover latent topics from the co-occurrence of tags and image features. Moreover, our topic model is built by exploiting expert knowledge about the correlation between tags and visual content and the relationship among image features, which is formulated in terms of spatial location and color distribution. The discovered topics then help to predict missing tags of an unseen image as well as those of images that are only partially labeled in the database. These predicted tags can greatly facilitate the reliable measure of semantic similarity between the query and database images. Therefore, we further present a scoring scheme to estimate the similarity by fusing textual tags and visual representation. Extensive experiments conducted on three benchmark datasets show that our topic model provides accurate annotation despite the noise and incompleteness of tags. Using our generalized scoring scheme, which is particularly advantageous to many types of queries, the proposed approach also outperforms state-of-the-art approaches in terms of retrieval accuracy.
18.
Accurately representing the quantity and characteristics of users’ interest in certain topics is an important problem facing topic evolution researchers, particularly as it applies to modern online environments. Search engines can provide information retrieval for a specified topic from archived data, but fail to reflect changes in interest toward the topic over time in a structured way. This paper reviews notable research on topic evolution based on the probabilistic topic model from multiple aspects over the past decade. First, we introduce notations, terminology, and the basic topic model explored in the survey, then we summarize three categories of topic evolution based on the probabilistic topic model: the discrete time topic evolution model, the continuous time topic evolution model, and the online topic evolution model. Next, we describe applications of the topic evolution model and attempt to summarize model generalization performance evaluation and topic evolution evaluation methods, as well as providing comparative experimental results for different models. To conclude the review, we pose some open questions and discuss possible future research directions.
19.
Min Yang Yuzhi Liang Wei Zhao Wei Xu Jia Zhu Qiang Qu 《Multimedia Tools and Applications》2018,77(3):3171-3187
Keyphrase extraction from social media is a crucial and challenging task. Previous studies usually focus on extracting keyphrases that provide the summary of a corpus. However, they do not take users’ specific needs into consideration. In this paper, we propose a novel three-stage model to learn a keyphrase set that represents or is related to a particular topic. Firstly, a phrase mining algorithm is applied to segment the documents into human-interpretable phrases. Secondly, we propose a weakly supervised model to extract candidate keyphrases, which uses a few pre-specified seed keyphrases to guide the model. The model consequently makes the extracted keyphrases more specific and related to the seed keyphrases (which reflect the user’s needs). Finally, to further identify the implicitly related phrases, the PMI-IR algorithm is employed to obtain the synonyms of the extracted candidate keyphrases. We conducted experiments on two publicly available datasets from news and Twitter. The experimental results demonstrate that our approach outperforms the state-of-the-art baselines and has the potential to extract high-quality task-oriented keyphrases.
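The pointwise mutual information (PMI) score underlying PMI-IR can be sketched as follows. PMI-IR estimates the required probabilities from information-retrieval query hit counts; this minimal sketch instead assumes document-level co-occurrence counts in a tiny hypothetical corpus, so the data, the function name `pmi_scores`, and the base-2 logarithm are illustrative assumptions rather than the paper's pipeline.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return PMI (base 2) for every word pair that co-occurs in at least one document.

    Probabilities are estimated as document frequencies divided by corpus size.
    Pair keys are alphabetically ordered tuples.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for doc in documents:
        words = sorted(set(doc))
        word_df.update(words)
        pair_df.update(combinations(words, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        p_ab = n_ab / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical toy corpus of tokenized posts.
docs = [
    ["flu", "vaccine", "clinic"],
    ["flu", "vaccine"],
    ["football", "match"],
    ["football", "clinic"],
]
scores = pmi_scores(docs)
```

A positive score, as for `("flu", "vaccine")` here, means the two terms co-occur more often than independence would predict; a score of zero, as for `("clinic", "flu")`, means they co-occur exactly as often as chance.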
20.
Leveraging social media networks for classification (total citations: 1; self-citations: 0; citations by others: 1)
Social media has reshaped the way in which people interact with each other. The rapid development of participatory web and social networking sites like YouTube, Twitter, and Facebook, also brings about many data mining opportunities and novel challenges. In particular, we focus on classification tasks with user interaction information in a social network. Networks in social media are heterogeneous, consisting of various relations. Since the relation-type information may not be available in social media, most existing approaches treat these inhomogeneous connections homogeneously, leading to an unsatisfactory classification performance. In order to handle the network heterogeneity, we propose the concept of social dimension to represent actors' latent affiliations, and develop a classification framework based on that. The proposed framework, SocioDim, first extracts social dimensions based on the network structure to accurately capture prominent interaction patterns between actors, then learns a discriminative classifier to select relevant social dimensions. SocioDim, by differentiating different types of network connections, outperforms existing representative methods of classification in social media, and offers a simple yet effective approach to integrating two types of seemingly orthogonal information: the network of actors and their attributes.