Recently, Reverse k Nearest Neighbors (RkNN) queries, which return every object that has the query among its k nearest neighbors, have been extensively studied in the database research community. However, the RkNN query cannot retrieve spatio-textual objects, which are described by a spatial location and a set of keywords. Researchers therefore proposed the RSTkNN query to find these objects, taking both spatial and textual similarity into consideration. However, the RSTkNN query can neither control the size of the answer set nor sort the answers by their degree of influence on the query. In this paper, we propose a new problem, the Ranked Reverse Boolean Spatial Keyword Nearest Neighbors query (Ranked-RBSKNN query), which considers both spatial similarity and textual relevance and returns the t answers with the highest degree of influence. We propose a separate index and a hybrid index to process such queries efficiently. Experimental results on different real-world and synthetic datasets show that our approaches achieve better performance.
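The plain RkNN definition used in the abstract above can be illustrated with a brute-force sketch. This is only an illustration of the definition, not the indexed method of the paper: the function name `reverse_knn`, the 2D point tuples, and the Euclidean distance are all assumptions, and keyword filtering and ranking by influence are left out.

```python
from math import dist


def reverse_knn(objects, query, k):
    """Return indices of objects that have `query` among their k nearest neighbors.

    Brute force, O(n^2): for each object, count how many other objects lie
    strictly closer to it than the query does (hypothetical helper).
    """
    result = []
    for i, obj in enumerate(objects):
        d_query = dist(obj, query)
        closer = sum(
            1
            for j, other in enumerate(objects)
            if j != i and dist(obj, other) < d_query
        )
        # The query is one of obj's k nearest neighbors if fewer than k
        # other objects are strictly closer to obj than the query.
        if closer < k:
            result.append(i)
    return result


points = [(0, 0), (1, 0), (5, 0), (6, 0)]
print(reverse_knn(points, (2, 0), k=1))
```

With k = 1 only the point (1, 0) qualifies here, because every other point has a closer neighbor than the query.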

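Approximate querying of activity trajectories is the process of retrieving, from a set of trajectories annotated with keyword information, the activity trajectories that are closest to a set of query points and satisfy the keyword requirements of those query points. Because GAT (Grid index for Activity Trajectories) cannot query massive sets of activity trajectories, GAT is extended into GATH (GAT on Hadoop), an approximate query technique suited to massive activity trajectories. Compared with GAT, GATH uses two new index structures for pruning: its grid index performs spatial pruning starting from the bottom-level cells, in line with the characteristics of massive data, and its inverted index performs keyword-based pruning. Experimental results confirm that, compared with GAT, GATH effectively shortens index construction time and improves pruning efficiency.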

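With the spread of online map applications, map-based retrieval of spatial objects has become an important and widely used tool, and the technology is fairly mature. People often run queries for definite target points on a map: for example, when a user submits the keyword "coffee shop", the map application marks all coffee shops on the map, and the user can obtain details about a coffee shop through further interaction. In real life, however, there is another kind of need: for example, a user wants to find a region that contains objects of three types, "coffee shop", "school", and "hotel". Such a query is called an uncertain region retrieval query. Existing research on map applications cannot solve the uncertain region retrieval problem. Using rectangle pruning and top-k recommendation, however, several candidate regions can be returned to the user based on the submitted keywords.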

With the rapid development of the Internet, the WWW (World Wide Web), mobile computing, and GPS (Global Positioning System) services, location-based services such as Web GIS (Geographical Information System) portals are becoming more and more popular. Spatial keyword queries over GIS spatial data are receiving more attention than ever from both the academic and industrial communities. In general, a spatial keyword query, which contains spatial location information and keywords, locates a set of spatial objects that satisfy the location condition and the keyword query semantics. Researchers have proposed many solutions to various spatial keyword queries, such as the top-K keyword query, the reverse kNN keyword query, the moving object keyword query, and the collective keyword query. In this paper, we propose a density-based spatial keyword query, which locates a set of spatial objects that not only satisfy the query's textual and distance conditions but also have a high density in their area. We use the collective keyword query semantics to find, in a dense area, a group of spatial objects whose keywords collectively match the query keywords. To process the density-based spatial keyword query efficiently, we use an IR-tree as the base data structure to index spatial objects and their text contents, and we define a cost function over the IR-tree index nodes to approximate the density information of areas. We design a heuristic algorithm that efficiently prunes regions according to both distance and region density when processing a query over the IR-tree index. Experimental results on datasets show that our method achieves the desired results with high performance.

Aggregate keyword search on large relational databases   Cited by: 2 (self-citations: 1; citations by others: 1)
Keyword search has been recently extended to relational databases to retrieve information from text-rich attributes. However, all the existing methods focus on finding individual tuples matching a set of query keywords from one table or the join of multiple tables. In this paper, we motivate a novel problem of aggregate keyword search: finding minimal group-bys covering a set of query keywords well, which is useful in many applications. We develop two interesting approaches to tackle the problem. We further extend our methods to allow partial matches and matches using a keyword ontology. An extensive empirical evaluation using both real data sets and synthetic data sets is reported to verify the effectiveness of aggregate keyword search and the efficiency of our methods.

Keyword search is the most popular technique for querying large tree-structured datasets, often of unknown structure, on the web. Recent keyword search approaches return lowest common ancestors (LCAs) of the keyword matches, ranked by their relevance to the keyword query. A major challenge for a ranking approach is the efficiency of its algorithms as the number of keywords and the size and complexity of the data increase. To face this challenge, most of the known approaches restrict their ranking to a subset of the LCAs (e.g., SLCAs, ELCAs), missing relevant results. In this work, we design novel top-k-size stack-based algorithms on tree-structured data. Our algorithms implement ranking semantics for keyword queries based on the concept of LCA size. Similar to metric selection in information retrieval, LCA size reflects the proximity of keyword matches in the data tree. This semantics does not rank a predefined subset of LCAs and, through a layered presentation of results, it demonstrates improved effectiveness compared to previous relevant approaches. To address performance challenges, our algorithms exploit a lattice of the partitions of the keyword set, which enables linear-time performance. This result is obtained without the support of auxiliary precomputed data structures. An extensive experimental study on various large datasets confirms the theoretical analysis. The results show that, in contrast to other approaches, our algorithms scale smoothly when the size of the dataset and the number of keywords increase.
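As a minimal sketch of the basic building block mentioned above, the lowest common ancestor of a set of keyword matches can be computed from node labels. The sketch assumes Dewey-style labels (tuples giving the path of child positions from the root), which the abstract does not specify; the function name `lca` is hypothetical, and the paper's LCA-size ranking and lattice-based algorithms are not reproduced here.

```python
def lca(labels):
    """Return the Dewey label of the lowest common ancestor of the given nodes.

    Each label is a tuple of child positions along the path from the root,
    so the LCA is the longest common prefix of all labels (assumed encoding).
    """
    prefix = []
    for parts in zip(*labels):  # zip stops at the shortest label
        if all(p == parts[0] for p in parts):
            prefix.append(parts[0])
        else:
            break
    return tuple(prefix)


# Two keyword matches located under the same subtree (1, 2)
print(lca([(1, 2, 1), (1, 2, 3, 4)]))
```

Under this encoding, an ancestor's label is always a prefix of its descendants' labels, which is why the longest common prefix gives the LCA.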

It is widely recognized that the integration of information retrieval (IR) and database (DB) techniques provides users with a broad range of high quality services. Along this direction, IR-styled m-keyword query processing over a relational database in an RDBMS framework has been well studied. It finds all hidden interconnected tuple structures, for example connected trees that contain keywords and are interconnected by sequences of primary/foreign key relationships among tuples. A new challenging issue is how to monitor events that are implicitly interrelated over an open-ended relational data stream for a user-given m-keyword query. Such a relational data stream is a sequence of tuple insertion/deletion operations. The difficulty of the problem is related to the number of costly joins to be processed over time when tuples are inserted and/or deleted. Such cost is mainly affected by three parameters, namely, the number of keywords, the maximum size of interconnected tuple structures, and the complexity of the database schema when it is viewed as a schema graph. In this paper, we propose new approaches. First, we propose a novel algorithm to efficiently determine all the joins that need to be processed for answering an m-keyword query. Second, we propose a new demand-driven approach to process such a query over a high speed relational data stream. We show that we can achieve high efficiency by significantly reducing the number of intermediate results when processing joins over a relational data stream. The proposed new techniques allow us to achieve high scalability in terms of both query plan generation and query plan execution. We conducted extensive experimental studies using synthetic data and real data to simulate a relational data stream. Our approach significantly outperforms existing algorithms.

We introduce a new cryptographic primitive, called proxy re-encryption with keyword search, which is motivated by the following scenario in email systems: Charlie sends an encrypted email, which contains some keywords, such as “urgent”, to Alice under Alice’s public key, and Alice delegates her decryption rights to Bob via her mail server. The desired situations are: (1) Bob can decrypt mails delegated from Alice by using only his private key, (2) Bob’s mail gateway, with a trapdoor from Bob, can test whether the email delegated from Alice contains some keywords, such as “urgent”, (3) Alice and Bob do not wish to give the mail server or mail gateway the access to the content of emails. The function of proxy re-encryption with keyword search (PRES) is the combination of proxy re-encryption (PRE) and public key encryption with keyword search (PEKS). However, a PRES scheme cannot be obtained by directly combining those two schemes, since the resulting scheme is no longer proven secure in our security model. In this paper, a concrete construction is proposed, which is proven secure in the random oracle model, based on the modified Decisional Bilinear Diffie-Hellman assumption.

To guarantee the security and privacy of sensitive data, attribute-based keyword search (ABKS) enables data owners to upload their encrypted data to cloud servers and to authorize intended data users to retrieve it. ABKS also outsources the heavy search work to cloud servers, which makes it well suited to mobile computing environments. However, because cloud servers can both generate keyword ciphertexts and run the search algorithm, most existing ABKS schemes are vulnerable to keyword guessing attacks. In this paper, we show that the fundamental reason existing ABKS schemes cannot resist keyword guessing attacks is that any entity can generate a keyword ciphertext. To solve this problem, in the keyword ciphertext generation phase we use the data owner's private key to sign the keyword before generating the keyword ciphertext. As a result, no other entity can forge a keyword ciphertext, and the scheme resists keyword guessing attacks. We give the formal definition and security model of attribute-based keyword search secure against keyword guessing attack (ABKS-SKGA). Furthermore, we provide an ABKS-SKGA scheme, which is proved secure against chosen-plaintext attack (CPA). Performance analysis shows that the proposed scheme is practical.

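A keyword query algorithm for XML (extensible markup language) based on grouping by result type is proposed. The entropy weighting method is used to determine result types, the nodes of the XML document are then divided into virtual groups, and a corresponding query algorithm is given on the basis of these virtual groups. This not only ensures that result information is complete but also avoids losing meaningful results and returning meaningless ones. Experimental results show that, compared with SLCA and MLCEA, the proposed algorithm offers some improvement in query quality, efficiency, and stability.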

Processing keyword search on XML: a survey   Cited by: 1 (self-citations: 0; citations by others: 1)
Ziyang Liu, Yi Chen. World Wide Web, 2011, 14(5-6): 671-707
Keyword search is a user-friendly approach for users to retrieve information from XML data. Since an XML document can have a large size and contain a lot of information, an XML keyword search result should be a fragment of an XML document dynamically constructed at query time, which is achievable due to the structuredness of XML. Processing keyword searches on XML has several challenges, e.g., what are the elements in the XML document that are relevant to the query? How to generate the results efficiently and rank the results meaningfully? How to present the results to the user in a way such that the user can quickly find the desired information? In this survey, we review the papers in the literature that attempted to address these problems. We divide the existing approaches into several classes based on the problem they tackled, and perform a comprehensive analysis of these works.

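To address the problem of ranking results in keyword query systems over relational databases, a new ranking method is proposed. The method combines query relevance with structural weight. Each tuple is treated as a virtual document; an information retrieval (IR) style scoring scheme is applied to tuples, using normalized term frequency and normalized inverse document frequency to express how relevant a tuple is to the query, while a structural weight is applied to the result as a whole to reflect its semantic strength. Compared with earlier ranking methods that consider only structural weight, this method is more effective at placing results that are highly relevant to the query near the top. Experimental results show that the ranking method combined with query relevance ranks results effectively.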
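The idea of treating each tuple as a virtual document and scoring it with term frequency and inverse document frequency can be sketched as follows. The abstract does not give the paper's normalization formulas or its structural weight, so the length-normalized TF and the `log(1 + N/df)` IDF below are assumptions chosen for illustration, and `score_tuples` is a hypothetical name.

```python
import math
from collections import Counter


def score_tuples(tuples, query_terms):
    """Score each tuple (given as a text string) against the query terms.

    Assumed weighting: tf = count / tuple length, idf = log(1 + N / df).
    The structural weight described in the abstract is not modeled.
    """
    docs = [t.lower().split() for t in tuples]
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency: count each term once per tuple

    scores = []
    for doc in docs:
        counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            if counts[term]:
                tf = counts[term] / len(doc)
                idf = math.log(1 + n / df[term])
                score += tf * idf
        scores.append(score)
    return scores


rows = [
    "xml keyword search",
    "relational database keyword query",
    "spatial index",
]
print(score_tuples(rows, ["keyword", "xml"]))
```

In this example the first tuple scores highest because it matches both query terms, one of which is rare across the tuples; the third tuple matches nothing and scores zero.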

Intense regulatory focus on secure retention of electronic records has led to a need to ensure that records are trustworthy, i.e., able to provide irrefutable proof and accurate details of past events. In this paper, we analyze the requirements for a trustworthy index to support keyword-based search queries. We argue that trustworthy index entries must be durable: the index must be updated when new documents arrive, and not periodically deleted and rebuilt. To this end, we propose a scheme for efficiently updating an inverted index, based on judicious merging of the posting lists of terms. Through extensive simulations and experiments with two real world data sets and workloads, we demonstrate that the scheme achieves online update speed while maintaining good query performance. We also present and evaluate jump indexes, a novel trustworthy and efficient index for join operations on posting lists for multi-keyword queries. Jump indexes support insert, lookup and range queries in time logarithmic in the number of indexed documents.
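A minimal sketch of the setting described above: an inverted index that is only ever appended to as documents arrive, with a multi-keyword query answered by joining (intersecting) sorted posting lists. It is not the paper's posting-list merging scheme or its jump index; the class `InvertedIndex`, the function `intersect`, and the whitespace tokenization are assumptions made for illustration.

```python
from collections import defaultdict


def intersect(a, b):
    """Intersect two sorted posting lists with a two-pointer scan."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


class InvertedIndex:
    """Append-only inverted index: entries are added, never rewritten."""

    def __init__(self):
        self.postings = defaultdict(list)
        self.next_id = 0

    def add(self, text):
        doc_id = self.next_id
        self.next_id += 1
        for term in set(text.lower().split()):
            # Document ids only grow, so each posting list stays sorted.
            self.postings[term].append(doc_id)
        return doc_id

    def search(self, terms):
        """Return ids of documents containing every term."""
        lists = [self.postings.get(t, []) for t in terms]
        if not lists:
            return []
        result = lists[0]
        for other in lists[1:]:
            result = intersect(result, other)
        return result


index = InvertedIndex()
index.add("secure records index")
index.add("keyword index")
index.add("secure keyword index")
print(index.search(["secure", "index"]))
```

The linear scan in `intersect` is the simplest join; a structure with logarithmic lookups, as the abstract describes for jump indexes, would let the join skip over long runs of non-matching entries.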

XML keyword query algorithm based on the effective lowest common ancestor   Cited by: 1 (self-citations: 0; citations by others: 1)
Zheng Honghui (郑弘晖), Guo Hong (郭红). Journal of Computer Applications (《计算机应用》), 2010, 30(3): 825-830
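For the problem of keyword search over XML documents, invalid query results are considered from two aspects: equivalence of element tag content and equivalence of element structural similarity. The concept of the effective lowest common ancestor (FLCA) is introduced, and on this basis the concept of the compact effective lowest common ancestor (CFLCA) is proposed. Based on the defined query result set, a query algorithm based on an equivalent-pattern-value index (BEPVA) is proposed. Finally, the method is compared with CVLCA and SLCA; the results show that it brings considerable improvements in query quality and query efficiency.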

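When XML documents containing sensitive information are transmitted or exchanged over a network, users need to run restricted queries; how to improve query efficiency while keeping sensitive information secure has long been a research focus in the security field. Taking an instance information tree with access permissions as the main structure, a compression method is used that first extracts the backbone information and then applies it back to the instance information tree to store special nodes. This lays the foundation for secure and efficient XML keyword queries, and the use of an extended Dewey encoding makes secure querying more convenient. Experimental results show that this compression-based secure query approach reduces the storage burden and improves query efficiency.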

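To solve the problem that LCA (Lowest Common Ancestor) based XML keyword queries lose semantics, an XML keyword query technique based on Natural Language Generation (NLG) is proposed. The content planning stage of NLG is applied to XML documents to produce a set of message sentences for the user's query; filtering this set makes semantics-based XML keyword querying possible and can also greatly improve query efficiency.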

The fast development of GPS-equipped devices has led to the widespread use of spatial keyword querying in location-based services. Existing spatial keyword query methodologies mainly focus on spatial and textual similarity, while ignoring the semantic understanding of keywords in spatial Web objects and queries. To address this issue, this paper studies the problem of semantics-based spatial keyword querying. It seeks to return the k objects most similar to the query, taking into account not only their spatial and textual properties but also the coherence of their semantic meanings. To achieve that, we propose novel indexing structures that integrate spatial, textual, and semantic information in a hierarchical manner, so as to prune the search space effectively during query processing. Extensive experiments are carried out to evaluate them and compare them with baseline algorithms.

Domain-specific Web search with keyword spices   Cited by: 4 (self-citations: 0; citations by others: 4)
Domain-specific Web search engines are effective tools for reducing the difficulty experienced when acquiring information from the Web. Existing methods for building domain-specific Web search engines require human expertise or specific facilities. However, we can build a domain-specific search engine simply by adding domain-specific keywords, called "keyword spices," to the user's input query and forwarding it to a general-purpose Web search engine. Keyword spices can be effectively discovered from Web documents using machine learning technologies. The paper describes domain-specific Web search engines that use keyword spices for locating recipes, restaurants, and used cars.
