首页 | 本学科首页   官方微博 | 高级检索  
     


Accuracy estimation of link-based similarity measures and its application
Authors:Yinglong Zhang  Cuiping Li  Chengwang Xie  Hong Chen
Affiliation:1. School of Software, East China Jiaotong University, Nanchang 330045, China2. Key Laboratory of Data Engineering and Knowledge Engineering of Ministry of Education,and Department of Computer Science, Renmin University of China, Beijing 100872, China3. Intelligent Optimization and Information Processing Laboratory, East China Jiaotong University,Nanchang 330013, China
Abstract:Link-based similarity measures play a significant role in many graph based applications. Consequently, measuring node similarity in a graph is a fundamental problem of graph datamining. Personalized pagerank (PPR) and simrank (SR) have emerged as the most popular and influential link-based similarity measures. Recently, a novel link-based similarity measure, penetrating rank (P-Rank), which enriches SR, was proposed. In practice, PPR, SR and P-Rank scores are calculated by iterative methods. As the number of iterations increases so does the overhead of the calculation. The ideal solution is that computing similarity within the minimum number of iterations is sufficient to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. Therefore, we focus on designing an accurate and tight upper bounds for PPR, SR, and P-Rank in the paper. Our upper bounds are designed based on the following intuition: the smaller the difference between the two consecutive iteration steps is, the smaller the difference between the theoretical and iterative similarity scores becomes. Furthermore, we demonstrate the effectiveness of our upper bounds in the scenario of top-k similar nodes queries, where our upper bounds helps accelerate the speed of the query. We also run a comprehensive set of experiments on real world data sets to verify the effectiveness and efficiency of our upper bounds.
Keywords:personalized PageRank  SimRank  P-Rank  upper bound  
本文献已被 SpringerLink 等数据库收录!
点击此处可从《Frontiers of Computer Science》浏览原始摘要信息
点击此处可从《Frontiers of Computer Science》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号