Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
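Social networks have become the main medium for information propagation and diffusion in the real world. Modeling and predicting popular information on them has broad application scenarios and commercial value, for example in information propagation mining, advertisement recommendation, and user behavior analysis. Current research mainly models features and time series, but does not consider the role that users' social circles play in information propagation. This paper proposes SCAP (Social Circle and Attention based Popularity Prediction), a popularity prediction model based on social circles and an attention mechanism. It first defines social circles, extracts features from users' historical text sequences with an autoencoder, and clusters users into social circles to obtain social circle features. Then, for a newly published text message, it extracts text, user, and temporal features through a long short-term memory (LSTM) network and embedding layers, and uses an attention mechanism to capture how strongly each social circle influences the message, yielding social circle attention features. Finally, the text, user, temporal, and social circle attention features are fused and passed through two fully connected layers to predict the popularity of the social message. Experiments on four datasets, including Twitter, Weibo, and Douban, show that SCAP generally outperforms several baseline models: across the datasets, mean squared error (MSE) decreases by 0.017, 0.022, 0.021, and 0.031, and F1 score increases by 0.034, 0.021, 0.034, and 0.025, so the model predicts the popularity of social information fairly accurately. The paper also examines how experimental parameters affect the model, such as the number of historical texts per user, the number of social circles, and the length of the time series, and finally verifies that each input feature and the attention mechanism contribute to prediction performance. On the Twitter dataset, introducing social circles and the attention mechanism reduces MSE by 0.065 and 0.019, respectively.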
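The attention step in this abstract, weighting each social circle by its relevance to a new post, can be illustrated with a small sketch. The abstract does not say how the scores are computed, so the dot-product scoring below is an assumption, and the names `circle_attention`, `post_vec`, and `circle_vecs` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def circle_attention(post_vec, circle_vecs):
    """Weight each social-circle vector by its relevance to a post.

    Assumption: relevance is a plain dot product between the post
    vector and each circle vector; the abstract does not specify the
    scoring function that SCAP uses.
    """
    scores = [sum(p * c for p, c in zip(post_vec, circle))
              for circle in circle_vecs]
    weights = softmax(scores)
    dim = len(circle_vecs[0])
    # Weighted sum of circle vectors: the "circle attention feature".
    attended = [sum(w * circle[i] for w, circle in zip(weights, circle_vecs))
                for i in range(dim)]
    return weights, attended
```

In a full model the attended vector would be concatenated with the text, user, and temporal features before the fully connected layers; that fusion step is omitted here.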

2.
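Retweeting is an information propagation mechanism provided by microblogs: users can repost interesting posts published by the accounts they follow to their own page and share them with their followers. It is the most important function for information propagation in microblog networks. For the different types of connection relationships in a microblog network, the authors first extract relevant features, such as homophily, micro-network structure, geographic distance, and user gender, to identify the type of each connection. They then fit the coefficients of these features with a log-linear model and use the coefficients to analyze the underlying causes of microblog users' retweeting behavior.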

3.
4.
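To address popularity prediction for streaming media, a prediction model based on video features and historical data is proposed. First, the popularity level of a video is predicted with the K-nearest neighbor (KNN) algorithm from the video's features and its influence in social networks. Then, based on the predicted popularity level, an autoregressive moving average (ARMA) model is used to predict the video's view count. Finally, the model is evaluated on data crawled from Douban Movie and Sina Weibo. The results show that, compared with a naive Bayes classifier and the ARMA model alone, the proposed model achieves clearly higher recall and reduces root mean square error (RMSE) by about 20%.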
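The first stage of this pipeline, classifying a video's popularity level with KNN, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Euclidean distance, the feature layout, and the function name `knn_predict` are assumptions, and the ARMA stage is not shown.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a popularity level by majority vote of the k nearest items.

    train: list of (feature_vector, label) pairs.
    Assumption: Euclidean distance over numeric features; the abstract
    does not state which distance or features the authors used.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A call such as `knn_predict(train, [0.2, 0.1])` returns the label shared by most of the three closest training videos. Choosing an odd `k` avoids ties when there are only two popularity levels.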

5.
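With the rapid development and wide adoption of microblogs, microblog community discovery has become an emerging research topic. Discovering network communities helps operators understand network structure and user characteristics and provide personalized services. Most existing community mining research focuses only on network structure and ignores node content. This paper considers both and proposes a microblog community discovery method based on user topic similarity and network topology. It first extracts user topics from microblog text, then combines them with the link relationships between users and performs similarity-based clustering to obtain the community structure. Experiments on real datasets show that the proposed method not only discovers latent communities but also identifies community topics.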

6.
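Predicting message popularity in social networks is important for applications such as information recommendation and viral marketing. This paper proposes a popularity prediction method based on propagation simulation: it first uses a maximum entropy model to learn and predict the probability that a user retweets a message, then uses the independent cascade model to simulate the message's propagation over a real social network, and thereby predicts the message's popularity. The advantage of the method is that it makes fuller use of the social network's structure and of user feature information. Experiments on a Twitter dataset show that the proposed method achieves higher accuracy and stability than the baseline methods.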
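The simulation half of this method can be sketched with a standard independent cascade loop. In this sketch `retweet_prob` stands in for the learned maximum entropy model and is simply a caller-supplied function; the graph layout (`followers` as a dict of lists) and all names are assumptions made for illustration.

```python
import random


def simulate_cascade(followers, retweet_prob, seed_user, rng):
    """Run one independent-cascade simulation; return the users reached."""
    activated = {seed_user}
    frontier = [seed_user]
    while frontier:
        next_frontier = []
        for user in frontier:
            for follower in followers.get(user, []):
                if follower in activated:
                    continue
                # Each newly activated user gets exactly one chance to
                # activate each of its not-yet-activated followers.
                if rng.random() < retweet_prob(user, follower):
                    activated.add(follower)
                    next_frontier.append(follower)
        frontier = next_frontier
    return activated


def predict_popularity(followers, retweet_prob, seed_user, runs=1000, seed=0):
    """Estimate the expected retweet count as the mean over many runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        reached = simulate_cascade(followers, retweet_prob, seed_user, rng)
        total += len(reached) - 1  # exclude the original author
    return total / runs
```

Averaging over many runs is what turns the stochastic cascade into a popularity estimate; the number of runs trades accuracy against simulation cost.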

7.
Liu Bo, Ni Zeyang, Luo Junzhou, Cao Jiuxin, Ni Xudong, Liu Benyuan, Fu Xinwen. World Wide Web, 2019, 22(6): 2953-2975

Social networking websites with microblogging functionality, such as Twitter or Sina Weibo, have emerged as popular platforms for discovering real-time information on the Web. Like most Internet services, these websites have become the targets of spam campaigns, which contaminate Web contents and damage user experiences. Spam campaigns have become a great threat to social network services. In this paper, we investigate crowd-retweeting spam in Sina Weibo, the counterpart of Twitter in China. We carefully analyze the characteristics of crowd-retweeting spammers in terms of their profile features, social relationships and retweeting behaviors. We find that although these spammers are likely to connect more closely than legitimate users, the underlying social connections of crowd-retweeting campaigns are different from those of other existing spam campaigns because of the unique features of retweets that are spread in a cascade. Based on these findings, we propose retweeting-aware link-based ranking algorithms to infer more suspicious accounts by using identified spammers as seeds. Our evaluation results show that our algorithms are more effective than other link-based strategies.


8.
With a framework based on the heuristic–systematic model of information processing, this study examined the effects of both content and contextual factors on the popularity of microblogging posts. The popularity of posts was operationalized as the number of re-tweets and comments received by posts, which are users’ behavioral outcomes after processing information. The data of the study were 10,000 posts randomly drawn from a popular microblogging site in China. Content factors were found to outperform contextual ones in accounting for the variance in post popularity, which suggests that the systematic strategy dominates users’ information processing in comparison with the heuristic strategy. Our findings imply that re-tweeting and commenting are distinct types of microblogging behaviors. Re-tweeting aims to disseminate information, in which source credibility (e.g., users’ authoritativeness) and posts’ informativeness play important roles, whereas commenting emphasizes social interaction and conversation, in which users’ experience and posts’ topics are more important.

9.
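Microblog sites, as a popular form of social media, provide users with rich information and services but also bring information overload. Using the microblog network to recommend valuable information to users, and so ease information overload, is increasingly important. Based on characteristics of microblog networks such as their directedness and the casual nature of follow relationships, a microblog recommendation method based on non-negative multiple matrix factorization is proposed. It jointly considers multi-source information: follow relationships between users, retweet relationships between users and posts, and membership relationships between posts and topics. Content recommendation experiments on a Sina Weibo dataset show that the method effectively exploits the multi-dimensional information in the microblog network and significantly improves recommendation accuracy. The method mines both the topics of microblog content and the associations between users, and can be extended to recommending friends and topics to users.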

10.
With the rapid growth of social network applications, more and more people are participating in social networks. Privacy protection in online social networks has become an important issue. The illegal disclosure or improper use of users’ private information can lead to unacceptable or unexpected consequences in people’s lives. In this paper, we focus on authentic popularity disclosure in online social networks. To protect users’ privacy, the social networks need to be anonymized. However, existing anonymization algorithms for social networks may lead to nontrivial utility loss, because the anonymization process changes the social network’s structure. The social network’s utility, such as retrieving data files, reading data files, and sharing data files among different users, decreases as a result. It is therefore a challenge to develop an effective anonymization algorithm that protects the privacy of users’ authentic popularity in online social networks without decreasing their utility. In this paper, we first design a hierarchical authorization and capability delegation (HACD) model. Based on this model, we propose a novel utility-based popularity anonymization (UPA) scheme, which integrates proxy re-encryption with keyword search techniques, to tackle this issue. We demonstrate that the proposed scheme not only protects users’ authentic popularity privacy but also keeps the full utility of the social network. Extensive experiments on large real-world online social networks confirm the efficacy and efficiency of our scheme.

11.
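In recent years, social network data mining has become a major focus of data mining in physical cyberspace, with encouraging results in user behavior analysis, interest identification, product recommendation, and related areas. With the arrival of commercial opportunities in social networks, many malicious users and malicious behaviors have appeared, greatly affecting the results of data mining. In response, a malicious user identification method based on user behavior feature analysis is proposed. The method applies principal component analysis (PCA) to microblog user behavior data and ranks the weights of the feature dimensions. Selecting the first six principal component features effectively identifies malicious users, and new features fitted from the principal component features further improve identification performance. Experimental results show that the method ranks microblog user features effectively and identifies malicious users in the microblog social network well, providing a good data cleaning technique for other areas of social network data mining.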

12.
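Measurement study of microblog networks   Total citations: 9, self-citations: 0, citations by others: 9
With continued breakthroughs in mobile communication and Web technology, online social networks represented by microblogs have developed widely in China, and more and more people use microblogs to distribute information and spread opinion. To understand intrinsic properties of China's microblog networks, such as topological and user behavior characteristics, the authors carried out active measurement of Sina Weibo, the largest microblog system in China, and, drawing on existing measurement results for online social networks, analyzed and compared Sina Weibo's network topology and user behavior characteristics. The main findings are: 1) the Sina Weibo network has the small-world property; 2) its in-degree distribution follows a power law, while its out-degree distribution behaves as a piecewise power-law function; 3) compared with similar social networks, its in-degree and out-degree are not correlated; 4) it is an assortative network; 5) users' posting times show clear daily and weekly patterns; 6) the number of posts per user follows a Weibull distribution; 7) retweeting and commenting on posts are strongly correlated, and the probability of a post being retweeted is higher than that of being commented on. These measurements and findings help in designing mathematical and computational models that match the structure of China's microblog networks, and they provide an important basis for monitoring, guiding, and controlling microblog public opinion.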

13.
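With the wide popularity of the Internet, social networks represented by microblogs have produced large amounts of data, and mining useful information from these data has become an important research direction. Based on the characteristics of microblog text, this paper proposes filtering out noisy microblog posts with a joint classifier and then performing topic discovery with the LDA model. The joint classifier combines three models, naive Bayes, support vector machine, and decision tree, through a simple voting mechanism. In experiments the joint classifier reaches 87% accuracy, which indicates that this classification approach is feasible and effective.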
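The simple voting mechanism that combines the three classifiers can be sketched as a per-sample majority vote. The sketch assumes each classifier has already produced a list of labels (1 for a noisy post, 0 for a clean one); the label encoding and the name `majority_vote` are illustrative assumptions, and training the three underlying models is not shown.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by simple majority voting.

    predictions: one list of labels per classifier, all the same length.
    With three classifiers and two classes a tie cannot occur.
    """
    voted = []
    for labels in zip(*predictions):
        # labels holds every classifier's vote for one sample
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted
```

For example, if the naive Bayes, SVM, and decision tree outputs for one post are 1, 1, and 0, the joint classifier marks the post as noisy.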

14.
Location prediction is a crucial need for location-aware services and applications. Given an object’s recent movement and a future time, the goal of location prediction is to predict the location of the object at the future time specified. Different from traditional location prediction using motion function, some research works have elaborated on mining movement behavior from historical trajectories for location prediction. Without loss of generality, given a set of trajectories of an object, prior works on mining movement behaviors will first extract regions of popularity, in which the object frequently appears, and then discover the sequential relationships among regions. However, the quality of the frequent regions extracted affects the accuracy of the location prediction. Furthermore, trajectory data has both spatial and temporal information. To further enhance the accuracy of location prediction, one could utilize not only spatial information but also temporal information to predict the locations of objects. In this paper, we propose a framework QS-STT (standing for QuadSection clustering and Spatial-Temporal Trajectory model) to capture the movement behaviors of objects for location prediction. Specifically, we have developed QuadSection clustering to extract a reasonable and near-optimal set of frequent regions. Then, based on the set of frequent regions, we propose a spatial-temporal trajectory model to explore the object’s movement behavior as a probabilistic suffix tree with both spatial and temporal information of movements. Note that STT is not only able to discover sequential relationships among regions but also derives the corresponding probabilities of time, indicating when the object appears in each region. Based on STT, we further propose an algorithm to traverse STT for location prediction. 
By enhancing the quality of the frequent regions extracted and exploring both the spatial and temporal information of STT, the accuracy of location prediction in QS-STT is improved. QS-STT is designed for individual location prediction. To verify the effectiveness of QS-STT for location prediction under different spatial densities, we have conducted experiments on four types of real trajectory datasets with different speeds. The experimental results show that our proposed QS-STT is able to capture both spatial and temporal patterns of movement behaviors and that, by exploring QS-STT, our proposed prediction algorithm outperforms existing works.

15.
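A survey of microblog information diffusion prediction research   Total citations: 1, self-citations: 1, citations by others: 0
Li Yang (李洋), Chen Yiheng (陈毅恒), Liu Ting (刘挺). Journal of Software (《软件学报》), 2016, 27(2): 247-263
Microblogs have gradually become an important social medium through which people obtain and share information, profoundly influencing and changing how information spreads. This paper surveys research on microblog information diffusion prediction, which is significant for public opinion monitoring, microblog marketing, and personalized recommendation. It first outlines the microblog information diffusion process and, by introducing qualitative research on it, reveals its characteristics. It then introduces related prediction research from three perspectives, information-centered, user-centered, and information-and-user-centered, whose main research tasks are respectively microblog information popularity prediction, user propagation behavior prediction, and microblog information propagation path prediction. It next introduces public data resources available for this research. Finally, it discusses open problems and challenges in microblog information diffusion prediction.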

16.
Microblogging services allow users to publish their thoughts, activities, and interests in the form of text streams and to share them with others in a social network. A user’s text stream in a microblogging service is temporally composed of the posts the user has written or republished from other socially connected users. In this context, most research on microblogging services has focused on the social graph or on topic extraction from the text streams. In particular, several studies have attempted to discover a user’s topics of interest from a text stream, since the topics play a crucial role in user search, friend recommendation, and contextual advertisement. Yet they have not fully addressed the unique properties of the stream. In this paper, we study the problem of detecting the topics of long-term steady interest to a user from a text stream, considering its dynamic and social characteristics, and propose a graph-based topic extraction model. Extensive experiments on a real-world dataset investigate the effects of the proposed approach, and the proposed model is shown to perform better than the existing alternatives.

17.
In this paper, we focus on the problem of community detection on Sina Weibo, the most popular microblogging system in China. By characterizing the structure and content of microgroups (communities) on Sina Weibo in detail, we observe that, unlike ordinary social networks, most microgroups have negative degree assortativity coefficients. In addition, we find that users from the same microgroup tend to share some common attributes (e.g., followers, tags) and interests extracted from their published posts. Inspired by these findings, we propose a unified method to remodel the network for microgroup detection while maintaining the information of link structure and user content. First, link direction is accounted for by assigning greater weight values to more surprising links, while content similarity is measured by the Jaccard coefficient of common features and by interest similarity based on the Latent Dirichlet Allocation model. Then, both link direction and content similarity between two users are uniformly converted to the edge weight of a new remodeled network, which is undirected and weighted. Finally, multiple frequently used community detection algorithms that support weighted networks can be employed. Extensive experiments on real-world social networks show that link structure and user content play almost equally important roles in microgroup detection on Sina Weibo. Our method outperforms the traditional methods with an average accuracy improvement of up to 39%, and the number of unrecognized users decreases by about 75%.
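The Jaccard coefficient used for content similarity, and the idea of folding link and content information into one edge weight, can be sketched as follows. The abstract does not give the conversion formula, so the linear mix controlled by `alpha` is purely an assumption for illustration, as are the names `edge_weight` and `link_score`; the LDA-based interest similarity is left out.

```python
def jaccard(a, b):
    """Jaccard coefficient: shared items divided by all distinct items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty feature sets share nothing
    return len(a & b) / len(union)


def edge_weight(link_score, features_u, features_v, alpha=0.5):
    """Combine a link-based score and content similarity into one weight.

    Assumption: a linear mix with parameter alpha. The paper's actual
    conversion from link direction and content similarity to an edge
    weight is not described in the abstract.
    """
    return alpha * link_score + (1 - alpha) * jaccard(features_u, features_v)
```

The resulting weights define an undirected weighted graph, which is the form that common weighted community detection algorithms expect.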

18.
Determining user geolocation from social media data is essential in various location-based applications, from improved transportation and supply management, through personalized services and targeted marketing, to better overall user experiences. Previous methods rely on the similarity of user posting content and neighboring nodes for user geolocation, and they suffer from two problems: (1) position-agnostic network representation learning, which limits prediction accuracy; and (2) noisy and unstable user relation fusion due to the flat graph embedding methods employed. This work presents Hierarchical Graph Neural Networks (HGNN), a novel methodology for location-aware collaborative user-aspect data fusion and location prediction. It incorporates the geographical location information of users and the clustering effect of regions, and can capture topological relations while preserving their relative positions. By encoding the structure and features of regions with hierarchical graph learning, HGNN can largely alleviate the problem of noisy and unstable signal fusion. We further design a relation mechanism to bridge connections between individual users and clusters, which not only leverages the information of isolated nodes that are useless in previous methods but also captures the relations between unlabeled nodes and labeled subgraphs. Furthermore, we introduce a robust statistics method to interpret the behavior of our model by identifying the importance of data samples when predicting the locations of users. It provides meaningful explanations of the model's behaviors and outputs, overcoming the drawback of previous approaches that treat user geolocation as “black-box” modeling and lack interpretability. Comprehensive evaluations on real-world Twitter datasets verify the proposed model’s superior performance and its ability to interpret the user geolocation results.

19.
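Using Sina Weibo as the research platform, microblog user data are randomly collected as the study sample, a social network is constructed from co-link relationships, and cluster analysis is used to analyze the sample's followed-friend network groups, the network's internal substructures, and individual roles. The characteristics of followed accounts and the associations among them are then mined from users' friend data, and suggestions are offered for improving followed-friend recommendation and information push for microblog users.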

20.
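Li Guanchen (李冠辰). Software (《软件》), 2013, (12): 127-131
In recent years, social networks led by microblogs have grown rapidly. These platforms contain a large amount of valuable information and resources, such as users' opinions on current events and their views on life and interpersonal relationships. Because microblog data are huge and hard to obtain, how to mine social network data effectively has been a focus of data mining research in the past two years. This work designs and implements a Hadoop-based parallel social network mining system comprising a distributed database, a parallel crawler, parallel data processing, and a set of parallel data mining algorithms. It can effectively acquire, analyze, and mine massive social network data, supporting tasks such as community analysis, user behavior analysis, user classification, and microblog classification.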
