Similar literature
1.
Query expansion methods have been extensively studied in information retrieval. This paper proposes a query expansion method, HQE, which combines ontology-based collaborative filtering with neural networks to improve query expansion. In the HQE method, ontology-based collaborative filtering analyzes semantic relationships in order to find similar users, and radial basis function (RBF) networks are used to acquire the most relevant web documents and their corresponding terms from these similar users’ queries. The method can improve precision and requires users to provide less query information at the outset than traditional collaborative filtering methods.

2.
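In recent years, the volume of spatio-textual data, which carries both location and text information, has grown rapidly. Social data from social networks and transaction data from the mobile Internet are important sources of spatio-textual data, and these data are massive, heterogeneous, and multidimensional. Spatial keyword query techniques built on spatio-textual data have been widely studied and applied: given a query location (expressed as longitude and latitude) and a set of query keywords, they return the spatial objects that are nearest to the query location and highly relevant to the query keywords. This paper surveys query techniques for spatio-textual data, covering query processing modes, index structures, semantic approximate queries, road-network-based queries, route planning queries, social-network-based queries, and queries under influence constraints.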
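
The spatial keyword query described in this abstract can be illustrated with a minimal sketch. It is not taken from the surveyed paper: the scoring formula (a weighted sum of proximity and Jaccard keyword overlap), the parameters `alpha` and `max_km`, and the record layout are all assumptions made for illustration, and the linear scan stands in for the index structures the survey discusses.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def text_relevance(query_keywords, object_keywords):
    """Jaccard overlap between the query keywords and an object's keywords."""
    q, o = set(query_keywords), set(object_keywords)
    return len(q & o) / len(q | o) if q | o else 0.0

def top_k_spatial_keyword(objects, lat, lon, keywords, k=3, alpha=0.5, max_km=50.0):
    """Rank objects by a weighted mix of proximity and keyword relevance.

    alpha and max_km are hypothetical tuning knobs: alpha balances distance
    against text relevance, max_km is the distance at which proximity hits 0.
    """
    scored = []
    for obj in objects:
        rel = text_relevance(keywords, obj["keywords"])
        if rel == 0.0:
            continue  # skip objects that share no keyword with the query
        d = haversine_km(lat, lon, obj["lat"], obj["lon"])
        proximity = max(0.0, 1.0 - d / max_km)
        scored.append((alpha * proximity + (1 - alpha) * rel, obj["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

A production system would replace the linear scan with a hybrid spatial-textual index; the sketch only shows what the query is asked to return.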

3.
Large volumes of geospatial data are increasingly made available through dynamic networks, such as ad hoc networks. Consequently, new query propagation approaches that take into account the geospatial aspects of data are needed. Existing query propagation approaches use various criteria to select relevant sources. In addition, many approaches rely on existing semantic mappings between sources; however, in ad hoc networks, sources move autonomously and in a dynamic fashion. Our goal is rather to reduce the number of sources that must be accessed to answer a query and, therefore, to reduce the volume of semantic mappings that need to be computed to answer the query. In this paper, we propose three real-time query propagation strategies to address this issue. These strategies reproduce the behavior of members of a social network. The strategies are designed to be part of a real-time semantic interoperability framework for ad hoc networks of geospatial databases. The strategies have been formalized with the Lightweight Coordination Calculus (LCC), which supports distributed interactions based on social norms and constraints. The implementation and testing of the strategies show that they complement each other to provide optimal query answers.

4.
An adaptive learning automata-based ranking function discovery algorithm   (Cited by: 1; self-citations: 0; citations by others: 1)
Due to the massive amount of heterogeneous information on the web, insufficient and vague user queries, and the use of the same query by different users for different aims, the information retrieval process deals with a huge amount of uncertainty and doubt. Under such circumstances, designing an efficient retrieval function and ranking algorithm that provides the most relevant results is of the greatest importance. In this paper, a learning automata-based ranking function discovery algorithm that combines different sources of information is proposed. In this method, the learning automaton is used to adjust the portion of the final ranking that is assigned to each source of evidence based on user feedback. All sources of information are first given the same importance. The proportion of a given source increases if the documents provided by this source are reviewed by the user, and decreases otherwise. As the proposed algorithm proceeds, the probability of appearance of each source in the final ranking becomes proportional to its relevance to the user queries. Several simulation experiments are conducted on well-known data collections and query types to show the performance of the proposed algorithm. The obtained results demonstrate that the proposed algorithm outperforms several existing methods in terms of precision at position n, mean average precision, and normalized discounted cumulative gain.
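
The feedback loop described in this abstract can be sketched with a textbook linear reward-penalty learning automaton. This is only an illustration under assumptions: the abstract does not give the update rule, so the scheme, the learning rates `a` and `b`, and the source names used in any call are hypothetical rather than the authors' algorithm.

```python
def init_weights(sources):
    """Start with every source of evidence equally important."""
    n = len(sources)
    return {s: 1.0 / n for s in sources}

def reward(weights, chosen, a=0.1):
    """Shift probability mass toward `chosen` (its documents were reviewed)."""
    return {
        s: w + a * (1.0 - w) if s == chosen else (1.0 - a) * w
        for s, w in weights.items()
    }

def penalize(weights, chosen, b=0.05):
    """Shift probability mass away from `chosen` (its documents were ignored).

    Needs at least two sources so the removed mass has somewhere to go.
    """
    n = len(weights)
    return {
        s: (1.0 - b) * w if s == chosen else b / (n - 1) + (1.0 - b) * w
        for s, w in weights.items()
    }

def update(weights, source, reviewed, a=0.1, b=0.05):
    """Apply one round of user feedback for documents supplied by `source`."""
    return reward(weights, source, a) if reviewed else penalize(weights, source, b)
```

Both updates keep the weights summing to 1, so they can be read directly as each source's share of the final ranking.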

5.
In recent years, spatial data infrastructures (SDIs) have gained great popularity as a solution to facilitate interoperable access to geospatial data offered by different agencies. In order to enhance the data retrieval process, current infrastructures usually offer a catalog service. Nevertheless, such catalog services still have important limitations that make it difficult for users to find the geospatial data that they are interested in. Current catalog drawbacks include the use of a single record to describe all the feature types offered by a service, the lack of formal means to describe the semantics of the underlying data, and the lack of an effective ranking metric to organize the results retrieved from a query. Aiming to overcome these limitations, this article proposes SESDI (Semantically-Enabled Spatial Data Infrastructures), which is a framework that reuses techniques of classic information retrieval to improve geographic data retrieval in an SDI. Moreover, the framework proposes several ranking metrics to solve spatial, semantic, temporal and multidimensional queries.

6.
杨丹  陈默  王刚  孙良旭 《计算机科学》2017,44(5):189-192, 205
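As entity search becomes a new trend in information retrieval, entity recommendation has become a hot research problem in both industry and academia. Heterogeneous entities in a heterogeneous information space are interrelated, so cross-type entity recommendation is essential. In addition, heterogeneous entities carry temporal information and evolve continuously over time, and users want recommendations of the entities that are most relevant in time. This paper proposes T-ERe, a time-aware cross-type entity recommendation framework that exploits the rich relationships among heterogeneous entities together with query logs to recommend entities across types. T-ERe takes into account the temporal information of entities and the temporal context of queries, and recommends to users the entities of multiple types that are most relevant in time. Experimental results on real datasets demonstrate the feasibility and effectiveness of T-ERe.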

7.
The architectural choices underlying Linked Data have led to a compendium of data sources which contain both duplicated and fragmented information on a large number of domains. One way to enable non-expert users to access this data compendium is to provide keyword search frameworks that can capitalize on the inherent characteristics of Linked Data. Developing such systems is challenging for three main reasons. First, resources across different datasets or even within the same dataset can be homonyms. Second, different datasets employ heterogeneous schemas and each one may only contain a part of the answer for a certain user query. Finally, constructing a federated formal query from keywords across different datasets requires exploiting links between the different datasets on both the schema and instance levels. We present Sina, a scalable keyword search system that can answer user queries by transforming user-supplied keywords or natural-language queries into conjunctive SPARQL queries over a set of interlinked data sources. Sina uses a hidden Markov model to determine the most suitable resources for a user-supplied query from different datasets. Moreover, our framework is able to construct federated queries by using the disambiguated resources and leveraging the link structure underlying the datasets to query. We evaluate Sina over three different datasets. We can answer 25 queries from QALD-1 correctly. Moreover, we perform as well as the best question answering system from the QALD-3 competition by answering 32 questions correctly while also being able to answer queries on distributed sources. We study the runtime of Sina in its mono-core and parallel implementations and draw preliminary conclusions on the scalability of keyword search on Linked Data.
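
To make the hidden Markov model step more concrete, the sketch below runs standard Viterbi decoding over a keyword sequence, treating candidate resources as hidden states. It is a generic illustration, not Sina's implementation: the abstract does not specify the state space or the probabilities, so the function name and all inputs are assumptions.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a keyword sequence.

    start_p[s]    : probability of starting in state s
    trans_p[r][s] : probability of moving from state r to state s
    emit_p[s][o]  : probability that state s emits keyword o (0 if absent)
    """
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            p, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s].get(obs, 0.0), back)
        best.append(layer)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path
```

In a keyword search setting, each observation would be a query keyword and each state a candidate resource, so the decoded path assigns one disambiguated resource to each keyword.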

8.
A spatial query interface has been designed and implemented in the object-oriented paradigm for heterogeneous data sets. The object-oriented approach presented is shown to be highly suitable for querying multiple heterogeneous sources of spatial data. The spatial query model takes into consideration two common components of spatial data: spatial location and attributes. Spatial location allows users to specify an area or a region of interest, also known as a spatial range query. The spatial query also allows users to query spatial orientation and relationships (geometric and topological) among other spatial data within the selected area or region. Queries on the properties and values of attributes provide more detailed non-spatial characteristics of spatial data. A query model specific to spatial data involves exploitation of both spatial and attribute components. This paper presents a conceptual spatial query model of heterogeneous data sets based on the object-oriented data model used in the geospatial information distribution system (GIDS).

9.
Influence is a complex and subtle force that governs social dynamics and user behaviors. Understanding how users influence each other can benefit various applications, e.g., viral marketing, recommendation, and information retrieval. While prior work has mainly focused on the qualitative aspect, in this article we present our research on quantitatively learning influence between users in heterogeneous networks. We propose a generative graphical model which leverages both heterogeneous link information and the textual content associated with each user in the network to mine topic-level influence strength. Based on the learned direct influence, we further study the influence propagation and aggregation mechanisms, namely conservative and non-conservative propagation, to derive the indirect influence. We apply the discovered influence to user behavior prediction in four different genres of social networks: Twitter, Digg, Renren, and Citation. Qualitatively, our approach can discover some interesting influence patterns from these heterogeneous networks. Quantitatively, the learned influence strength greatly improves the accuracy of user behavior prediction.

10.
11.
As social media and e-commerce on the Internet continue to grow, opinions have become one of the most important sources of information for users to base their future decisions on. Unfortunately, the large quantity of opinions makes it difficult for an individual to comprehend and evaluate them all in a reasonable amount of time. Users have to read a large number of opinions on different entities before making any decision. Recently, a new retrieval task in information retrieval known as Opinion-Based Entity Ranking (OpER) has emerged. OpER directly ranks relevant entities based on how well opinions on them match a user's preferences, which are given in the form of queries. With such a capability, users do not need to read the large number of opinions available for the entities. Previous research on OpER does not take into account the importance and subjectivity of query keywords in individual opinions of an entity. Entity relevance scores are computed primarily on the basis of query keyword occurrences, treating all opinions of an entity as a single field of text. Intuitively, entities that have positive judgments and strong relevance to the query keywords should be ranked higher than entities that have poor relevance and negative judgments. This paper outlines several ranking features and develops an intuitive framework for OpER in which entities are ranked according to how well individual opinions of entities match the user's query keywords. As a useful ranking model may be constructed from many ranking features, we apply a learning-to-rank approach based on genetic programming (GP) to combine features in order to develop an effective retrieval model for the OpER task. The proposed approach is evaluated on two collections and is found to be significantly more effective than the standard OpER approach.

12.
In many of the problems that can be found nowadays, information is scattered across different heterogeneous data sources. Most natural language interfaces focus on a very specific part of the problem (e.g. an interface to a relational database, or an interface to an ontology). However, from the point of view of users, it does not matter where the information is stored; they just want to get the knowledge in an integrated, transparent, efficient, effective, and pleasant way. To solve this problem, this article proposes a generic multi-agent conversational architecture that follows the divide and conquer philosophy and considers two different types of agents. Expert agents are specialized in accessing different knowledge sources, and decision agents coordinate them to provide a coherent final answer to the user. This architecture has been used to design and implement SmartSeller, a specific system which includes a Virtual Assistant to answer general questions and a Bookseller to query a book database. A detailed comparison with other relevant systems shows that our proposal improves on several key features presented in the paper.

13.
Content-based cross-media retrieval is a new type of multimedia retrieval in which the media types of the query examples and the returned results can be different. In order to learn the semantic correlations among multimedia objects of different modalities, the heterogeneous multimedia objects are analyzed in the form of a multimedia document (MMD), which is a set of multimedia objects that are of different media types but carry the same semantics. We first construct an MMD semi-semantic graph (MMDSSG) by jointly analyzing the heterogeneous multimedia data. After that, a cross-media indexing space (CMIS) is constructed. For each query, the optimal dimension of the CMIS is automatically determined and the cross-media retrieval is performed on a per-query basis. By doing this, the most appropriate retrieval approach for each query is selected, i.e. different search methods are used for different queries. The query-dependent search methods make cross-media retrieval performance not only accurate but also stable. We also propose different learning methods of relevance feedback (RF) to improve the performance. The experimental results are encouraging and validate the proposed methods.

14.
One key property of the Semantic Web is its support for interoperability. Recent research in this area focuses on the integration of multiple data sources to facilitate tasks such as ontology learning, user query expansion and context recognition. The growing popularity of such mashups and the rising number of Web APIs supporting links between heterogeneous data providers call for intelligent methods to spare remote resources and minimize delays imposed by queries to external data sources. This paper suggests a cost and utility model for optimizing such queries by leveraging optimal stopping theory from business economics: applications are modeled as decision makers that look for optimal answer sets. Queries to remote resources cause additional cost but retrieve valuable information which improves the estimation of the answer set’s utility. Optimal stopping optimizes the trade-off between query cost and answer utility, yielding optimal query strategies for remote resources. These strategies are compared to conventional approaches in an extensive evaluation based on real-world response times taken from seven popular Web services.
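
The cost versus utility trade-off in this abstract can be illustrated with a classic search-style stopping rule: keep querying while the expected improvement from one more query exceeds its cost. The sketch is a textbook formulation chosen for illustration, not the paper's model; the function names, the empirical `utility_samples` distribution, and the fixed per-query cost are all assumptions.

```python
def expected_gain(best_so_far, utility_samples):
    """Expected improvement over the current best answer from one more query,
    estimated from a sample of previously observed answer utilities."""
    return sum(max(u - best_so_far, 0.0) for u in utility_samples) / len(utility_samples)

def should_stop(best_so_far, utility_samples, query_cost):
    """Stop once one more query is not expected to pay for itself."""
    return expected_gain(best_so_far, utility_samples) <= query_cost

def query_until_stop(responses, utility_samples, query_cost):
    """Consume remote responses in order until the stopping rule fires.

    `responses` stands in for the utilities returned by successive remote
    queries. Returns (best utility, total cost spent, number of queries).
    """
    best, spent, n_queries = 0.0, 0.0, 0
    for utility in responses:
        n_queries += 1
        spent += query_cost
        best = max(best, utility)
        if should_stop(best, utility_samples, query_cost):
            break
    return best, spent, n_queries
```

With independent draws from a known distribution and the option to keep the best answer seen so far, this one-step look-ahead matches the reservation-value rule from search theory; real Web services with varying costs and response times would need a richer model.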

15.
《Computers in Industry》2014,65(6):937-951
Passage retrieval is usually defined as the task of searching for passages which may contain the answer to a given query. While these approaches are very efficient when dealing with texts, when applied to log files (i.e. semi-structured data containing both numerical and symbolic information) they usually provide irrelevant or useless results. Nevertheless, one appealing way of improving the results could be to consider query expansions that aim at automatically or semi-automatically adding information to the query to improve the reliability and accuracy of the returned results. In this paper, we present a new approach for enhancing the relevancy of queries during passage retrieval in log files. It is based on two relevance feedback steps. In the first one, we determine the explicit relevance feedback by identifying the context of the requested information within a learning process. The second step is a new kind of pseudo relevance feedback. Based on a novel term weighting measure, it aims at assigning a weight to terms according to their relatedness to queries. This measure, called TRQ (Term Relatedness to Query), is used to identify the most relevant expansion terms. The main advantage of our approach is that it can be applied to both log files and documents from general domains. Experiments conducted on real data from logs and documents show that our query expansion protocol enables retrieval of relevant passages.

16.
郑冬冬  崔志明 《计算机应用》2006,26(9):2024-2027
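More and more information is hidden behind Web query interfaces, so finding the data source interfaces most relevant to a user query is becoming increasingly important. This paper proposes a Deep Web query interface selection algorithm that relies entirely on the features of query interfaces. Given a large number of heterogeneous Deep Web data sources, the goal is to select the set of query interfaces most relevant to the user query. By observing the features of real query interfaces, correlations among the predicates on query interfaces were found. Based on this finding, a data source selection algorithm built on a co-occurring predicate relevance model is designed to select the set of query interfaces most relevant to the user query.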

17.
In heterogeneous networks, different modalities coexist. For example, video sources of a certain length usually contain abundant time-varying audiovisual data. From the users’ perspective, different video segments trigger different kinds of emotions. In order to better interact with users in heterogeneous networks and improve their user experience, affective video content analysis to predict users’ emotions is essential. Academically, users’ emotions can be evaluated by arousal and valence values and by fear degree, which provides an approach to quantify the prediction accuracy of the reaction of the audience and users towards videos. In this paper, we propose a multimodal data fusion method that integrates visual and audio data in order to perform affective video content analysis. Specifically, to align the visual and audio data, temporal attention filters are proposed to obtain the time-span features of entire video segments. Then, using a two-branch network structure, the matched visual and audio features are integrated in a common space. Finally, the fused audiovisual feature is employed for the regression and classification subtasks in order to measure the emotional responses of users. Simulation results show that the proposed method can accurately predict the subjective feelings of users towards the video contents, which provides a way to predict users’ preferences and recommend videos according to their own demand.

18.
谭光兴  刘臻晖 《计算机科学》2015,42(12):275-277, 306
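Image retrieval is one of the important research topics in image-sharing social networks. Traditional image retrieval methods usually match the keywords entered by the user against the textual descriptions of images. Because text is ambiguous and describing images in text is very difficult, the retrieval results have low accuracy. To improve accuracy, an image retrieval method based on learning to rank is proposed. Each image is described by multiple feature descriptors; when the user's input is an image, retrieval is performed by comparing the similarity between the query image and the images in the image library. Two learning methods, support vector machines and association rules, are used to learn the weight combination of the feature descriptors, and corresponding learning algorithms are proposed. Experiments show that the proposed learning-based image retrieval method is more accurate than related image retrieval methods. In addition, when support vector machines and association rules are applied to learn the classification function, the results obtained are correlated, because both algorithms learn the weights of the image descriptors from the same data instances.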

19.
Query processing in Web information integration systems   (Cited by: 1; self-citations: 0; citations by others: 1)
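To effectively support unified query processing over heterogeneous data sources on the Web, an ontology-based heterogeneous data source integration system model, OBIISM, is proposed. Ontologies are introduced to resolve heterogeneity among the data sources at the semantic level, and a two-level query rewriting process translates the query submitted by the user into queries against the data sources, providing a semantically unified interface for querying heterogeneous data sources.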

20.
This research examined the effects of three different data base formats on the information retrieval performance of users. Spatial, tabular, and verbal forms of two data base domains (airline and thesaurus) were constructed, along with questions that required users to search through the data base to determine the correct response. Three types of questions, compatible with the forms of the data bases, were designed: spatial, tabular, and verbal. The data indicate that users' responses to the questions are faster and more accurate when the format of the information in the data base matches the type of information needed to answer the question. Although the importance of matching data base format to query type may seem obvious, it would appear that the designers of most current data base systems have not taken this into account.
