期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A flabellate overlay network for multi-attribute search

Ruixuan LiAuthor Vitae Wei Song^{Author Vitae} 《Journal of Parallel and Distributed Computing》2011,71(3):407-423

Peer-to-peer (P2P) technology provides a popular way of distributing resources, sharing, and locating in a large-scale distributed environment. However, most of the current existing P2P systems only support queries over a single resource attribute, such as file name. The current multiple resource attribute search methods often encounter high maintenance cost and lack of resilience to the highly dynamic environment of P2P networks. In this paper, we propose a Flabellate overlAy Network (FAN), a scalable and structured underlying P2P overlay supporting resource queries over multi-dimensional attributes. In FAN, the resources are mapped into a multi-dimensional Cartesian space based on the consistent hash values of the resource attributes. The mapping space is divided into non-overlapping and continuous subspaces based on the peer’s distance. This paper presents strategies for managing the extended adjacent subspaces, which is crucial to network maintenance and resource search in FAN. The algorithms of a basic resource search and range query over FAN are also presented in this paper. To alleviate the load of the hot nodes, a virtual replica network (VRN) consisting of the nodes with the same replicates is proposed for replicating popular resources adaptively. The queries can be forwarded from the heavily loaded nodes to the lightly loaded ones through VRN. Theoretical analysis and experimental results show that FAN has a higher routing efficiency and lower network maintenance cost over the existing multi-attribute search methods. Also, VRN efficiently balances the network load and reduces the querying delay in FAN while invoking a relatively low overhead. 相似文献

2.

Multidimensional B-trees for associative searching in database systems

Peter Scheuermann Mohamed Ouksel 《Information Systems》1982,7(2):123-137

A new method for multiple attribute indexing, the Multidimensional B-Tree (MBDT), is developed. This method is well suited for dynamic databases, since it handles several types of associative queries efficiently and requires low-cost maintenance. Algorithms and search strategies for exact match, partial match, and range queries are presented and statistical procedures are given to estimate the average and worst case retrieval times. The applicability of our organization to practical databases is discussed and analytical tradeoffs with regard to index organizations based on k-d trees are established. 相似文献

3.

Efficient Processing of Metric Skyline Queries

Lei Chen Xiang Lian 《Knowledge and Data Engineering, IEEE Transactions on》2009,21(3):351-365

Skyline query is of great importance in many applications, such as multi-criteria decision making and business planning. In particular, a skyline point is a data object in the database whose attribute vector is not dominated by that of any other objects. Previous methods to retrieve skyline points usually assume static data objects in the database (i.e. their attribute vectors are fixed), whereas several recent work focus on skyline queries with dynamic attributes. In this paper, we propose a novel variant of skyline queries, namely metric skyline, whose dynamic attributes are defined in the metric space (i.e. not limited to the Euclidean space). We illustrate an efficient and effective pruning mechanism to answer metric skyline queries through a metric index. Most importantly, we formalize the query performance of the metric skyline query in terms of the pruning power, by a cost model, in light of which we construct an optimized metric index aiming to maximize the pruning power of metric skyline queries. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed pruning techniques as well as the constructed index in answering metric skyline queries. 相似文献

4.

一种用于多属性范围查询的聚簇方法

下载免费PDF全文

马慧吴凌坤《计算机工程》2011,37(19):41-43,46

为提高多属性区域的查询效率,在物理层重新安排记录排列顺序,以减少查询访问磁盘块数。在此基础上,构造数学模型,将待查询记录按属性值映射至多维坐标空间中的点,以求解一个线性序,使空间中相距越远的点在线性序中也相距越远,并提出一种适用于多属性范围查询的聚簇方法。实验结果表明,与光谱算法及传统聚簇算法相比,该方法查询性能更优。相似文献

5.

Voronoi-based range and continuous range query processing in mobile databases

Kefeng Xuan Geng Zhao David Taniar Wenny Rahayu Maytham Safar Bala Srinivasan 《Journal of Computer and System Sciences》2011,77(4):637-651

With the wide availability of mobile devices (smart phones, iPhones, etc.), mobile location-based queries are increasingly in demand. One of the most frequent queries is range search which returns objects of interest within a pre-defined area. Most of the existing methods are based on the road network expansion method – expanding all nodes (intersections and objects) and computing the distance of each node to the query point. Since road networks are extremely complex, node expansion approaches are inefficient. In this paper, we propose a method, Voronoi Range Search (VRS) based on the Voronoi diagram, to process range search queries efficiently and accurately by partitioning the road networks to some special polygons. Then we further propose Voronoi Continuous Range (VCR) to satisfy the requirement for continuous range search queries (moving queries) based on VRS. Our empirical experiments show that VRS and VCR surpass all their rivals for both static and moving queries. 相似文献

6.

SA：一种有利于多属性范围查询的多维聚簇方法

吴凌坤汤庸王鹏舒然《计算机科学》2009,36(6):133-137

一般来说.外存访问的数据文件中针对多属性的区域查询有两个改进其效率的方向.一个是在其上建立索引,另一个是在物理层按照某种规律重新安排记录.探讨如何通过第二种方法来提高范围查询的效率,即通过多维聚簇的方式得到数据文件中更好的记录的存储顺序.首先,细致分析了该问题,并针对该问题构造了一个数学模型,然后通过引入光谱算法(SA)的思想为解决该NP难问题提供了一种多项式时间内的近似解.最后通过实验来验证了该方法在矩形区域查询和单维范围查询方面的有效性. 相似文献

7.

CAPS: A cloud-assisted approach to handle spikes in peer-to-peer web search

Harisankar Haridas Janakiram Dharanipragada 《Peer-to-Peer Networking and Applications》2016,9(1):193-208

In search engines, popular news events cause huge spikes in the queries related to the events. In this work, we consider the stability issues caused by these spikes in a peer-to-peer web search engine formed using voluntary resources shared by peers. The requirement of providing top-ranking results from a dynamic index distinguishes web search from other classes of peer-to-peer search like object lookup and file search. This makes the traditional methods of load reduction based on caching and replication, proposed for peer-to-peer object lookup and file search, insufficient for containing interest spikes in peer-to-peer web search. We propose transient use of public cloud to maintain the stability of peer-to-peer web search engines during interest spikes. To the best of our knowledge, this is the first work proposing transient use of public cloud to handle spikes in peer-to-peer search. In the proposed architecture, CAPS, the responsibility of handling spiking queries is dynamically offloaded to public clouds during the spike period. The peer bandwidth to be used for transfer of relevant index from peers to cloud is decided considering the impact on other applications in the peer as well as the requirements of the search application. We show that transient use of public cloud can be performed without major adverse impact on the desirable properties of peer-to-peer search like privacy and decentralized control. Experimental evaluation under realistic settings show that cloud-assistance can be used effectively to handle spikes. 相似文献

8.

An efficient quantum search engine on unsorted database

Songfeng Lu Yingyu Zhang Fang Liu 《Quantum Information Processing》2013,12(10):3265-3274

We consider the problem of finding one or more desired items out of an unsorted database. Patel has shown that if the database permits quantum queries, then mere digitization is sufficient for efficient search for one desired item. The algorithm, called factorized quantum search algorithm, presented by him can locate the desired item in an unsorted database using O( $log_4N$ ) queries to factorized oracles. But the algorithm requires that all the attribute values must be distinct from each other. In this paper, we discuss how to make a database satisfy the requirements, and present a quantum search engine based on the algorithm. Our goal is achieved by introducing auxiliary files for the attribute values that are not distinct, and converting every complex query request into a sequence of calls to factorized quantum search algorithm. The query complexity of our algorithm is O( $log_4N$ ) for most cases. 相似文献

9.

Implications of the file names and user requested queries on Gnutella performance

Surendar Chandra William Acosta 《Peer-to-Peer Networking and Applications》2012,5(1):14-29

The Gnutella file sharing system allows a large number of peers to share their local files. However, it does not coordinate the way by which these shared objects are named or how they are searched by other users; such decisions are made independently by each peer. In this work, we investigate the practical performance implications of this design. We collected the shared filenames and user generated queries over a three-year period. We show the mismatch between these naming mechanisms. We show the fundamental limitations of Gnutella performance that cannot be addressed by improvements in overlays or by varying the search mechanisms alone. Based on our observations, we describe two practical approaches to improve Gnutella performance. We describe a mechanism to build the file term synopsis using the observed popularity of queries routed through the ultrapeer. We also describe a query transformation mechanism that improves the success rates for failed queries. 相似文献

10.

Dynamic maintenance of data distribution for selectivity estimation 总被引：3，自引：0，他引：3

Kyu-Young Whang Ph.D. Sang-Wook Kim Gio Wiederhold Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1994,3(1):29-51

We propose a new dynamic method for multidimensional selectivity estimation for range queries that works accurately independent of data distribution. Good estimation of selectivity is important for query optimization and physical database design. Our method employs the multilevel grid file (MLGF) for accurate estimation of multidimensional data distribution. The MLGF is a dynamic, hierarchical, balanced, multidimensional file structure that gracefully adapts to nonuniform and correlated distributions. We show that the MLGF directory naturally represents a multidimensional data distribution. We then extend it for further refinement and present the selectivity estimation method based on the MLGF. Extensive experiments have been performed to test the accuracy of selectivity estimation. The results show that estimation errors are very small independent of distributions, even with correlated and/or highly skewed ones. Finally, we analyze the cause of errors in estimation and investigate the effects of various parameters on the accuracy of estimation. 相似文献

11.

Algorithms for BD trees

Sivarama P. Dandamudi Paul G. Sorenson 《Software》1986,16(12):1077-1096

Database applications often require a sophisticated class of storage structures in order to answer different types of queries efficiently. This often dictates that the file should be organized on multiple keys. Several storage structures have been proposed to satisfy these needs. Most are generalizations of the storage structures used for managing one-dimensional data. Recently, a new storage structure, called the BD tree, was proposed to manage multidimensional data. This structure has good dynamic characteristics. This paper presents algorithms for the BD tree to perform insertion, deletion, and to answer exact match, partial match and range queries. In addition, some experimental evidence is presented that suggests that BD trees have good dynamic characteristics. 相似文献

12.

面向大型数据集合的关键分类查找算法

许晓峰金澈清高明周傲英《计算机研究与发展》2009,46(Z2)

Top-k查询是Web和多媒体搜索、决策支持、分布式系统等众多领域中最重要的查询之一,它返回数据集合中k个最关键的元组.大型数据集合往往包含一系列分类型属性,获取对目标属性影响最大的k个分类型属性值对于许多应用中也非常重要.研究了这个问题,正式定义了k-AKC和PKC两种查询,并设计相应的查询处理算法.实验结果表明,改良算法PKCQ+具有较佳的有效性和高效性. 相似文献

13.

DiCAS: An Efficient Distributed Caching Mechanism for P2P Systems 总被引：2，自引：0，他引：2

Chen Wang Li Xiao Yunhao Liu Pei Zheng 《Parallel and Distributed Systems, IEEE Transactions on》2006,17(10):1097-1109

Peer-to-peer networks are widely criticized for their inefficient flooding search mechanism. Distributed Hash Table (DHT) algorithms have been proposed to improve the search efficiency by mapping the index of a file to a unique peer based on predefined hash functions. However, the tight coupling between indices and hosting peers incurs high maintenance cost in a highly dynamic network. To properly balance the tradeoff between the costs of indexing and searching, we propose the distributed caching and adaptive search (DiCAS) algorithm, where indices are passively cached in a group of peers based on a predefined hash function. Guided by the same function, adaptive search selectively forwards queries to "matched” peers with a high probability of caching the desired indices. The search cost is reduced due to shrunk searching space. Different from the DHT solutions, distributed caching loosely maps the index of a file to a group of peers in a passive fashion, which saves the cost of updating indices. Our simulation study shows that the DiCAS protocol can significantly reduce the network search traffic with the help of small cache space contributed by each individual peer. 相似文献

14.

Optimization of distributed tree queries

Clement T. Yu Z. Meral Ozsoyoglu K. Lam 《Journal of Computer and System Sciences》1984,29(3):409-445

In this paper the problem of finding an optimum strategy of semi joins for solving tree queries is studied under the objective of total time minimization. Tree queries that are conjunctions of equi-join clauses such that any two relations in the query have at most one attribute in common are considered. This class of tree queries is a superset of classes of tree queries, such as chain queries and simple queries, that have been studied for semi-join optimization in the literature. An algorithm based on dynamic programming to find the optimum semi-join strategy for a given query is presented. The search space for finding the optimum is reduced by eliminating strategies that can never be the optimum. This is accomplished by utilizing a set of properties that a potentially optimum strategy should satisfy. 相似文献

15.

基于自定义模板的通用报表设计 总被引：1，自引：0，他引：1

金百东《计算机时代》2009,(10):38-39

提出了一种实现通用报表的新思想：将Excel文件转化生成自定义模板文件,模板文件包括报表的静态及动态两部分内容,静态部分主要封装了表格结构信息,动态部分主要封装了SQL语句以及与静态部分相关联信息,模板文件可以存入服务器数据库中;客户端通过通用打印组件完成报表的输出。相似文献

16.

A framework for efficient spatial web object retrieval

Dingming Wu Gao Cong Christian S. Jensen 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(6):797-822

相似文献

17.

Building and Querying a P2P Virtual World 总被引：2，自引：0，他引：2

Egemen Tanin Aaron Harwood Hanan Samet Deepa Nayar Sarana Nutanong 《GeoInformatica》2006,10(1):91-116

Peer-to-Peer (P2P) systems are known to provide excellent scalability in a networked environment. One peer is introduced to the system by each participant. However current P2P applications can only provide file sharing and other forms of relatively simple data communications, and, in this paper, we demonstrate how this limitation can be bridged by indexing and querying a 3D virtual-world on a dynamic distributed network. We present an algorithm for 3D range queries as well as an algorithm for nearest neighbor queries. We also show how to build such a complex application from the ground level of a P2P routing algorithm. 相似文献

18.

On processing continuous frequent K-N-match queries for dynamic data over networked data sources 总被引：1，自引：1，他引：0

Shih-Chuan Chiu Jiun-Long Huang Jen-He Huang 《Knowledge and Information Systems》2012,31(3):547-579

Similarity search is one of the critical issues in many applications. When using all attributes of objects to determine their similarity, most prior similarity search algorithms are easily influenced by a few attributes with high dissimilarity. The frequent k-n-match query is proposed to overcome the above problem. However, the prior algorithm to process frequent k-n-match queries is designed for static data, whose attributes are fixed, and is not suitable for dynamic data. Thus, we propose in this paper two schemes to process continuous frequent k-n-match queries over dynamic data. First, the concept of safe region is proposed and four formulae are devised to compute safe regions. Then, scheme CFKNMatchAD-C is developed to speed up the process of continuous frequent k-n-match queries by utilizing safe regions to avoid unnecessary query re-evaluations. To reduce the amount of data transmitted by networked data sources, scheme CFKNMatchAD-C also uses safe regions to eliminate transmissions of unnecessary data updates which will not affect the results of queries. Moreover, for large-scale environments, we further propose scheme CFKNMatchAD-D by extending scheme CFKMatchAD-C to employ multiple servers to process continuous frequent k-n-match queries. Experimental results show that scheme CFKNMatchAD-C and scheme CFKNMatchAD-D outperform the prior algorithm in terms of average response time and the amount of produced network traffic. 相似文献

19.

Schema-aware XPath filtering on XML document streams

Daewook Lee Joonho Kwon Weidong Yang Hyoseop Shin Jae-min Kwak Sukho Lee 《Journal of Intelligent Manufacturing》2009,20(3):273-282

The XML stream filtering is gaining widespread attention from the research community in recent years. There have been many efforts to improve the performance of the XML filtering system by utilizing XML schema information. In this paper, we design and implement an XML stream filtering system, SFilter, which uses DTD or XML schema information for improving the performance. We propose the simplification and two kinds of optimization, one is static and the other is dynamic optimization. The Simplification and static optimization transform the XPath queries to make automata as an index structure for the filtering. The dynamic optimization are done in runtime at the filtering time. We developed five kinds of static optimization and two kinds of dynamic optimization. We present the novel filtering algorithm for the resulting transformed XPath queries and runtime optimizing. The experimental result shows that our system filters the XML streams efficiently. 相似文献

20.

Beyond search: Retrieving complete tuples from a text-database

Alexander Löser Christoph Nagel Stephan Pieper Christoph Boden 《Information Systems Frontiers》2013,15(3):311-329

A common task of Web users is querying structured information from Web pages. For realizing this interesting scenario we propose a novel query processor for systematically discovering instances of semantic relations in Web search results and joining these relation instances into complex result tuples with conjunctive queries. Our query processor transforms a structured user query into keyword queries that are submitted to a search engine, forwards search results to a relation extractor, and then combines relations into complex result tuples. The processor automatically learns discriminative and effective keywords for different types of semantic relations. Thereby, our query processor leverages the index of a search engine to query potentially billions of pages. Unfortunately, relation extractors may fail to return a relation for a result tuple. Moreover, user defined data sources may not return at least k complete result tuples. Therefore we propose an adaptive routing model based on information theory for retrieving missing attributes of incomplete result tuples. The model determines the most promising next incomplete tuple and attribute type for returning any-k complete result tuples at any point during the query execution process. We report a thorough experimental evaluation over multiple relation extractors. Our query processor returns complete result tuples while processing only very few Web pages. 相似文献