Similarity search based on semantic features of bibliographic information network
Cite this article: QIU Qingyu, LI Jing, QUAN Bing, TONG Chao, ZHANG Lijun, ZHANG Haixian. Similarity search based on semantic features of bibliographic information network[J]. Journal of Computer Applications (计算机应用), 2018, 38(5): 1327-1333. DOI: 10.11772/j.issn.1001-9081.2017112623
Authors: QIU Qingyu (邱庆羽), LI Jing (李婧), QUAN Bing (全兵), TONG Chao (童超), ZHANG Lijun (张利君), ZHANG Haixian (张海仙)
Affiliation: 1. School of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China; 2. China Mobile (Suzhou) Software Technology Company Limited, Suzhou, Jiangsu 215000, China; 3. Chengdu Ruibeiyingte Information Technology Company Limited, Chengdu, Sichuan 610041, China
Funding: Ministry of Education-China Mobile Research Fund (MCM20160307); Sichuan Province Science and Technology Innovation Seedling Project and International Cooperation Project of the Chengdu Science and Technology Bureau (2016-GH02-00048-HZ, 2015-GH02-00041-HZ).
Received: 2017-11-06
Revised: 2017-12-07

Abstract: A bibliographic information network is a typical heterogeneous information network, and similarity search over it is an active research topic in graph mining. However, existing methods mainly rely on meta-paths or meta-structures and do not consider the semantic features of the nodes themselves, which biases the search results. To address this, a vector-based semantic feature extraction method for bibliographic information networks was proposed, and a vector-based node similarity measure called VSim was designed and implemented. In addition, VPSim (Similarity computation Based on Vector and meta Path), a similarity search algorithm based on semantic features, was designed by incorporating meta-paths. To improve the execution efficiency of the algorithm, a pruning strategy tailored to the characteristics of bibliographic network data was designed. Experiments on real-world data verified that VSim is suitable for searching entities with similar semantic features, and that VPSim is effective, efficient, and highly scalable.

Keywords: bibliographic information network; similarity search; graph mining; meta-path; semantic features
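
The abstract does not give the formulas behind VSim or VPSim, so the following is only a minimal sketch of the general idea of mixing a vector-based score with a meta-path score. It assumes cosine similarity as a stand-in for the vector part, the well-known PathSim measure over a symmetric Author-Paper-Author meta-path as the structural part, and a weighted average with a hypothetical weight `alpha` to combine them. All function names, the toy data, and the combination rule are illustrative assumptions, not the authors' method; the pruning strategy is not modeled.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def path_count(adj, x, y):
    """Count instances of the symmetric meta-path x -> middle -> y.

    adj[x] maps each middle node (e.g. a paper) to the number of edges from x.
    """
    return sum(c * adj[y].get(m, 0) for m, c in adj[x].items())


def pathsim(adj, x, y):
    """PathSim: 2 * |paths(x, y)| / (|paths(x, x)| + |paths(y, y)|)."""
    denom = path_count(adj, x, x) + path_count(adj, y, y)
    return 2.0 * path_count(adj, x, y) / denom if denom else 0.0


def combined_similarity(adj, features, x, y, alpha=0.5):
    """Assumed combination rule: weighted average of the two scores."""
    vec_score = cosine_similarity(features[x], features[y])
    path_score = pathsim(adj, x, y)
    return alpha * vec_score + (1.0 - alpha) * path_score


def top_k(adj, features, query, k=2, alpha=0.5):
    """Rank all other nodes by combined similarity to the query node."""
    scored = [
        (other, combined_similarity(adj, features, query, other, alpha))
        for other in adj
        if other != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (hypothetical): authors linked to the papers they wrote,
# plus a two-dimensional semantic feature vector per author.
adj = {
    "a1": {"p1": 1, "p2": 1},
    "a2": {"p2": 1, "p3": 1},
    "a3": {"p4": 1},
}
features = {
    "a1": [1.0, 0.0],
    "a2": [1.0, 0.0],
    "a3": [0.0, 1.0],
}

ranking = top_k(adj, features, "a1", k=2)
print(ranking)
```

In this toy network, `a1` and `a2` share one paper and have identical feature vectors, so `a2` ranks above `a3`, which shares neither structure nor features with `a1`.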