Similar documents
 Found 20 similar documents; search took 31 ms
1.
岳峰  王含茹  张馨悦  王刚 《计算机应用研究》2020,37(12):3552-3556,3564
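Existing academic paper recommendation methods do not fully exploit the heterogeneous relationships among entities in research social networks, and most focus on the accuracy of predicted ratings while ignoring the order of user preferences. To address these problems, a list-wise learning-to-rank recommendation method based on heterogeneous network analysis is proposed. Heterogeneous network analysis is first used to explore the relationships among entities in the research social network. The information obtained is then incorporated into a list-wise learning-to-rank framework to optimize the ranked list of recommended papers, which yields the final list of academic papers recommended to researchers. Experiments on a dataset from the research social network 科研之友 (ScholarMate) show that the proposed method achieves better results than other traditional recommendation methods, verifying its effectiveness.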

2.
杨丹  陈默  王刚  孙良旭 《计算机科学》2017,44(5):189-192, 205
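As entity search becomes a new trend in information retrieval, entity recommendation has become a hot research topic in both industry and academia. Heterogeneous entities in a heterogeneous information space are interrelated, so cross-type entity recommendation is essential. In addition, heterogeneous entities carry temporal information and evolve continuously over time, and users want recommendations of the entities that are most relevant in time. A time-aware cross-type entity recommendation framework, T-ERe, is proposed. It uses the rich associations among heterogeneous entities together with query logs to perform cross-type entity recommendation. T-ERe takes into account the temporal information of entities and the temporal context of queries, and recommends to users entities of multiple types that are most relevant in time. Experimental results on real datasets demonstrate the feasibility and effectiveness of T-ERe.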

3.
杨丹  陈默  王刚  孙良旭 《计算机科学》2017,44(3):215-219
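Most existing entity resolution techniques run offline, in a non-real-time manner, on static datasets, and usually require large amounts of time and system resources on big datasets. For heterogeneous entities in a heterogeneous information space, which carry temporal information and evolve continuously, time-aware query-time entity resolution and data fusion is increasingly becoming a trend for guaranteeing data quality and meeting user needs. For entity search in a heterogeneous information space using keyword queries with temporal context, a time-aware query-time entity resolution and data fusion method, TQ-ER, is proposed to provide users with accurate entity profiles, together with an iterative time-aware algorithm for generating entity candidate sets. TQ-ER makes full use of the temporal context of the query and the temporal information of entities to select the minimum amount of entity data needed to answer a given query correctly, and performs resolution and data fusion on that data. Extensive experiments on real datasets demonstrate the effectiveness and correctness of TQ-ER.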

4.
The need for academic researchers to retrieve patents and research papers is increasing, because applying for patents is now considered an important research activity. However, retrieving patents using keywords is a laborious task for researchers, because the terms used in patents for the purpose of enlarging the scope of the claims are generally more abstract than those used in research papers. Therefore, we have constructed a framework that facilitates patent retrieval for researchers, and have integrated research papers and patents by analysing the citation relationships between them. We obtained cited research papers in patents using two steps: (1) detection of sentences containing bibliographic information, and (2) extraction of bibliographic information from those sentences. To investigate the effectiveness of our method, we conducted two experiments. In the experiment involving Step 1, we prepared 42,073 sentences, among which a human subject manually identified 1,476 sentences containing citations of papers. For Step 2, we prepared 3,000 sentences, in which the titles, authors, and other bibliographic information were manually identified. We obtained a precision of 91.6% and a recall of 86.9% in Step 1, and a precision of 86.2% and a recall of 85.1% in Step 2. Finally, we constructed an information retrieval system that provided two methods of retrieving research papers and patents. One method was retrieval by query, and the other was retrieval via the citation relationships between research papers and patents.

5.
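The emergence of large-scale, interconnected, multi-typed heterogeneous information networks, such as social networks and bibliographic networks, poses many challenges for similarity search, and the similarity measure is one of the key problems. Existing similarity measures designed for homogeneous networks do not consider the different semantics of the multiple paths in a network. This paper proposes a new meta-path-based similarity measure that can search for objects of the same type in heterogeneous networks. A meta-path is a path consisting of a sequence of relations defined between different object types, and it provides a common basis for similarity search engines in networks. Experiments on real datasets show that, compared with unordered similarity measures, the proposed method supports fast path-similarity queries and can be widely applied to social networks and e-commerce.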
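
The abstract above does not give the formula of its measure. As a rough illustration of what a meta-path-based similarity between same-typed objects can look like, the sketch below uses PathSim, a well-known measure of this kind. Choosing PathSim is an assumption made here and is not necessarily the paper's method. The toy `author_venue` counts and the function names are hypothetical.

```python
def commuting_matrix(counts):
    """Return M = W * W^T for a path-count matrix W.

    Rows of W are objects of the queried type (e.g. authors); columns are
    the objects at the middle of a symmetric meta-path (e.g. venues).
    M[x][y] is then the number of meta-path instances between x and y.
    """
    n = len(counts)
    return [
        [sum(a * b for a, b in zip(counts[i], counts[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m, x, y):
    """PathSim score: 2 * M[x][y] / (M[x][x] + M[y][y]), or 0.0 if undefined."""
    denom = m[x][x] + m[y][y]
    return 2 * m[x][y] / denom if denom else 0.0


# Hypothetical toy data: number of papers each author published in each venue.
author_venue = [
    [2, 1, 0],  # author 0
    [4, 2, 0],  # author 1: same venue profile as author 0, but more papers
    [0, 0, 3],  # author 2: publishes in a different venue only
]

m = commuting_matrix(author_venue)
print(pathsim(m, 0, 1))  # 0.8
print(pathsim(m, 0, 2))  # 0.0
```

Because the score is normalised by each object's self-path count, authors 0 and 1 come out as similar even though their publication volumes differ, while author 2 shares no meta-path instance with author 0 and scores zero.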

6.

Heterogeneous information networks, which consist of multi-typed vertices representing objects and multi-typed edges representing relations between objects, are ubiquitous in the real world. In this paper, we study the problem of entity matching for heterogeneous information networks based on distributed network embedding and a multi-layer perceptron with a highway network, and we propose a new method named DEM, short for Deep Entity Matching. In contrast to traditional entity matching methods, DEM utilizes the multi-layer perceptron with a highway network to explore hidden relations and improve matching performance. Importantly, we incorporate DEM with the network embedding methodology, enabling highly efficient computing in a vectorized manner. DEM’s generic modeling of both the network structure and the entity attributes enables it to model various heterogeneous information networks flexibly. To illustrate its functionality, we apply the DEM algorithm to two real-world entity matching applications: user linkage in the social network analysis scenario, which predicts the same or matched users across different social platforms, and record linkage, which predicts the same or matched records across different citation networks. Extensive experiments on real-world datasets demonstrate DEM’s effectiveness and rationality.

7.
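Mining valuable and stable communities in data networks is of great value for acquiring and recommending network information and for predicting network evolution. Existing heterogeneous network clustering methods have difficulty integrating the heterogeneous information in a network effectively within a single dimension. To address this, a heterogeneous network clustering method based on graph-regularized non-negative matrix factorization is proposed. By adding graph regularization terms, the internal connection relationships of the center-type subspace and the attribute-type subspaces are introduced into the non-negative matrix factorization model as constraints, so as to find a compact embedding of the high-dimensional data in a low-dimensional space, which removes part of the noise between heterogeneous nodes. At the same time, the consensus matrix that reflects the latent structure shared by the different sub-networks is optimized, so that heterogeneous information is integrated effectively and its integrity is preserved as far as possible during dimensionality reduction, improving the accuracy of heterogeneous network clustering. Experimental results on real-world datasets also verify the effectiveness of the method.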

8.
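In traditional crime queries, the query condition is textual information and the result is a ranked list of documents, which cannot show the relationships among results. Based on heterogeneous information networks, counterfeit-currency crime data are reorganized in the form of an information network to build a counterfeit-currency crime information network. Name disambiguation is used to establish the relationships among suspects in the network, and learning to rank is used to study node relevance in the network. A counterfeit-currency crime information analysis system is designed and implemented. It addresses the querying of counterfeit-currency crime data by taking entity objects as query items and returning network graphs as query results.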

9.
The link prediction problem in complex networks has received a substantial amount of attention in the field of social network analysis. Although initial studies considered only a static snapshot of a network, the importance of the temporal dimension has since been recognized and explored. In recent times, multi-domain relationships between node pairs embedded in real networks have been exploited to boost link prediction performance. In this paper, we combine multi-domain topological features with the temporal dimension, and propose a robust and efficient feature set called TMLP (Time-aware Multi-relational Link Prediction) for link prediction in dynamic heterogeneous networks. It combines the dynamics of graph topology and the history of interactions at the dyadic level, and exploits a time-series model in the feature extraction process. Several experiments on two networks prepared from the DBLP bibliographic dataset show that the proposed framework significantly outperforms existing methods in predicting future links. It also demonstrates the necessity of combining heterogeneous information with the temporal dynamics of graph topology and dyadic history in order to predict future links. Empirical results show that the proposed feature set is robust against longitudinal bias.

10.
Feature change detection and matching methods in data updating   Cited by: 4 (self-citations: 0, citations by others: 4)
吴建华  傅仲良 《计算机应用》2008,28(6):1612-1615
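When feature classes lack association relationships between corresponding (same-name) entities, spatial analysis is used to automatically identify the corresponding entity of the current feature and the change information between them. For querying the candidate matching set of the current feature, a spatial query method based on user-defined spatial topological relations is designed, which narrows the spatial query range, reduces the number of queries, and improves the efficiency of spatial analysis. For determining the corresponding entity of the current feature, a weight-based similarity model for spatial features is proposed. With this model, features under complex spatial relationships are matched effectively, improving matching accuracy.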

11.
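Searching for a name in a system returns all documents (such as papers and web pages) by every author with that name, which seriously degrades the user experience; name disambiguation can improve retrieval precision. This paper therefore proposes a name disambiguation method based on heterogeneous network representation learning. First, a heterogeneous network of papers is constructed for each ambiguous name. Then heterogeneous network representation learning is combined with semantic representation learning based on word vectors to obtain a representation vector for each paper node in the network. Finally, a clustering approach that combines density-based clustering with noise and rule matching assigns the papers to different author entities. The method performs well on the OAG-WhoIsWho competition dataset, and the results verify its effectiveness.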

12.
杨丹  陈默  申德荣 《计算机科学》2017,44(2):112-116
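Entities and associations in a heterogeneous information space generally carry temporal information, and multiple temporal versions of entity data coexist. Traditional entity integration ignores temporal information and does not support integration along the time dimension. A time-aware entity integration framework for heterogeneous information spaces, T-EI, is proposed. It aggregates facts from large amounts of heterogeneous entity data into clean, complete entity profiles with temporal information, and thereby supports time-aware entity search. T-EI uses the temporal information of entities and associations in a time-aware entity resolution algorithm, and takes data currency into account in a time-aware data fusion algorithm. Experimental results on real datasets demonstrate the feasibility and effectiveness of T-EI.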

13.
Automatic image tagging assigns semantic keywords, called tags, to images, which significantly facilitates image search and organization. Most present image tagging approaches are constrained by the model learned from the training dataset, and moreover they do not exploit other types of web resources (e.g., web text documents). In this paper, we propose a search-based image tagging algorithm (CTSTag), in which the result tags are derived from web search results. Specifically, it assigns the query image a more comprehensive tag set derived from both web images and web text documents. First, a content-based image search technology is used to retrieve a set of visually similar images, which are ranked by their semantic consistency values. Then, a set of relevant tags is derived from these top-ranked images as the initial tag set. Second, a text-based search is used to retrieve other relevant web resources by using the initial tag set as the query. After the denoising process, the initial tag set is expanded with other tags mined from the text-based search result. Then, a probability flow measure method is proposed to estimate the probabilities of the expanded tags. Finally, all the tags are refined using the Random Walk with Restart (RWR) method and the top ones are assigned to the query images. Experiments on the NUS-WIDE dataset show not only the performance of the proposed algorithm but also the advantage of image retrieval and organization based on the result tags.
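
The final refinement step in the abstract above relies on Random Walk with Restart (RWR). The following is a minimal sketch of generic RWR over a small tag graph, assuming a symmetric tag-affinity matrix and a fixed restart probability. The weights, the `restart` value, and the function name are illustrative assumptions, not values or code from the paper.

```python
def random_walk_with_restart(weights, seed, restart=0.15, iters=100):
    """Iterate r <- (1 - restart) * P r + restart * seed.

    `weights[i][j]` is the affinity between tags i and j; P is `weights`
    with every column normalised to sum to 1. `seed` is the initial
    probability of each tag and should itself sum to 1.
    """
    n = len(weights)
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [weights[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    r = list(seed)
    for _ in range(iters):
        r = [
            (1 - restart) * sum(trans[i][j] * r[j] for j in range(n))
            + restart * seed[i]
            for i in range(n)
        ]
    return r


# Hypothetical tag graph: tag 0 -- tag 1 -- tag 2 (a simple chain).
tag_affinity = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
initial = [1.0, 0.0, 0.0]  # all initial confidence on tag 0

scores = random_walk_with_restart(tag_affinity, initial)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

The restart term keeps pulling probability mass back to the initial tag estimates, so tags that are both well connected and close to the initial set end up ranked highest.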

14.
QUERY ROUTING IN A PEER-TO-PEER SEMANTIC LINK NETWORK   Cited by: 9 (self-citations: 0, citations by others: 9)
Hai Zhuge, Jie Liu, Liang Feng, Xiaoping Sun, Chao He 《Computational Intelligence》2005,21(2):197-216
A semantic link peer-to-peer (P2P) network specifies and manages semantic relationships between peers' data schemas and can be used as the semantic layer of a scalable Knowledge Grid. The proposed approach consists of an automatic semantic link discovery method, a tool for building and maintaining P2P semantic link networks (P2PSLNs), a semantic-based peer similarity measurement for efficient query routing, and the schema mapping algorithms for query reformulation and heterogeneous data integration. The proposed approach has three important aspects. First, it uses semantic links to enrich the relationships between peers' data schemas. Second, it considers not only nodes but also the XML structure in measuring the similarity between schemas to efficiently and accurately forward queries to relevant peers. Third, it copes with semantic and structural heterogeneity and data inconsistency so that peers can exchange and translate heterogeneous information within a uniform view.

15.
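Existing agricultural knowledge graphs do not describe entities and relations related to pest and disease control in sufficient detail. Taking the construction of an apple pest and disease knowledge graph as an example, this paper studies how to build fine-grained agricultural knowledge graphs. Entity types and relation types of apple pest and disease knowledge are defined at a fine granularity, giving 19 entity categories and 22 entity relations, on the basis of which the apple pest and disease knowledge graph dataset AppleKG is annotated and built. The APD-CA model is used to recognize named entities in the apple pest and disease domain, and the ED-ARE model is used to extract entity relations. Experimental results show that the models achieve F1 scores of 93.08% and 94.73% on the named entity recognition and relation extraction subtasks, respectively. The knowledge graph is stored and visualized with the Neo4j database, and the paper discusses how a fine-grained apple pest and disease knowledge graph can provide underlying technical support for downstream tasks such as precise pest and disease information queries and intelligent assisted diagnosis.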

16.
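Bibliographic information networks are typical heterogeneous information networks, and similarity search over them is a hot research topic in graph mining. However, existing methods mainly rely on meta-paths or meta-structures and do not consider the semantic features of the nodes themselves, which biases the search results. To address this, a vector-based semantic feature extraction method is proposed for bibliographic information networks, and a vector-based node similarity measure, VSim, is designed and implemented. In addition, a similarity search algorithm based on semantic features, VPSim, is designed by combining VSim with meta-paths, and a pruning strategy tailored to the characteristics of bibliographic network data is designed to improve efficiency. Experiments on real data verify the suitability of VSim for finding entities with similar semantic features, as well as the effectiveness, high efficiency, and high scalability of VPSim.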

17.
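Existing query analysis methods usually treat entity resolution as an offline preprocessing step that cleans the whole dataset. As data volumes keep growing, however, this computationally expensive offline cleaning mode can hardly meet the needs of real-time analysis applications. For aggregate queries over duplicated charging operation records, a method that combines approximate aggregate query processing with entity resolution is proposed. First, samples are collected with a block-based sampling strategy. Then entity resolution is applied to the collected samples to identify duplicate entities. Finally, an unbiased estimate of the aggregate result is reconstructed from the entity resolution results. The method avoids the time cost of resolving all entities and returns query results that meet user requirements by resolving only a small amount of sample data. Experimental results on real and synthetic datasets verify the efficiency and reliability of the proposed method.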

18.
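In recent years, the volume of spatio-textual data, which carries both location and text information, has grown rapidly. Social data in social networks and transaction data in the mobile Internet are important sources of spatio-textual data, which are massive, heterogeneous, and multi-dimensional. Spatial keyword query techniques built on spatio-textual data have been widely studied and applied: given a query location (expressed as longitude and latitude) and a set of query keywords, they return the spatial objects that are closest to the query location and highly relevant to the query keywords. This paper surveys query techniques for spatio-textual data, covering query processing modes, index structures, semantic approximate queries, road-network-based queries, route planning queries, social-network-based queries, and queries under influence constraints.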

19.
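A method and system for representing and searching dynamic data association networks is designed and implemented. When there are many data entities and the associations are complex, it helps users obtain the associations surrounding an entity and expand them dynamically through guided interaction. When several entities that may be related are known, a distributed minimum connected graph algorithm is used to search for their association network. Application examples show that the method and system work well in practice. Application systems using the method have been deployed in engineering projects including smart city, safe city, and metropolitan Internet of Things projects.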

20.
We present Wiser, a new semantic search engine for expert finding in academia. Our system is unsupervised and it jointly combines classical language modeling techniques, based on textual evidence, with the Wikipedia Knowledge Graph, via entity linking. Wiser indexes each academic author through a novel profiling technique which models her expertise with a small, labeled and weighted graph drawn from Wikipedia. Nodes in this graph are the Wikipedia entities mentioned in the author’s publications, whereas the weighted edges express the semantic relatedness among these entities computed via textual and graph-based relatedness functions. Every node is also labeled with a relevance score which models the pertinence of the corresponding entity to the author’s expertise, and is computed by means of a proper random-walk calculation over that graph; and with a latent vector representation which is learned via entity and other kinds of structural embeddings derived from Wikipedia. At query time, experts are retrieved by combining classic document-centric approaches, which exploit the occurrences of query terms in the author’s documents, with a novel set of profile-centric scoring strategies, which compute the semantic relatedness between the author’s expertise and the query topic via the above graph-based profiles. The effectiveness of our system is established over a large-scale experimental test on a standard dataset for this task. We show that Wiser achieves better performance than all the other competitors, thus proving the effectiveness of modeling the author’s profile via our “semantic” graph of entities. Finally, we comment on the use of Wiser for indexing and profiling the whole research community within the University of Pisa, and its application to technology transfer in our University.
