1.

Efficient processing of top-k queries: selective NRA algorithms

Jing Yuan Guangzhong Sun Tao Luo Defu Lian Guoliang Chen 《Journal of Intelligent Information Systems》2012,39(3):687-710

Efficient processing of top-k queries has drawn increasing attention from both industry and academia due to its varied applications. Lower access cost is a crucial concern in top-k query processing. Typically, when answering a top-k query, there exist two types of accesses: sorted access and random access. In some scenarios, the latter is not supported by the data source. Fagin et al. proposed the No Random Access (NRA) algorithm (Fagin et al., J Comput Syst Sci 66:614–656, 2003) for this situation. In this paper, we motivate our work by a key observation of the NRA algorithm: the number of accesses could be further reduced by selectively (instead of in parallel) performing sorted accesses to different lists of the dataset. Based on this insight, we propose a Selective NRA (SNRA) algorithm aiming to cut down the unnecessary access cost. Later, we optimize the SNRA algorithm in terms of runtime cost and present the SNRA-opt algorithm. Furthermore, we address the problem of instance optimality theoretically and turn SNRA (and SNRA-opt) into instance optimal algorithms, termed Hybrid-SNRA (HSNRA) and HSNRA-opt. Extensive experimental results show that our algorithms perform significantly fewer sorted accesses than NRA (and its state-of-the-art variations). In terms of runtime cost, the proposed SNRA-opt and HSNRA-opt algorithms are two orders of magnitude faster than the NRA algorithm. In addition, we discuss the parameter selection problem of the SNRA algorithms, both theoretically and experimentally.
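
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a simplified illustration of NRA-style processing with parallel sorted access, not the SNRA algorithm of the paper. It assumes summation as the aggregation function, non-negative scores, and a score of 0 for an object absent from a list; the function name `nra_top_k` and the input layout are hypothetical.

```python
def nra_top_k(lists, k):
    """Top-k by summed score using only sorted access (simplified NRA sketch).

    lists: one list per attribute of (object, score) pairs, sorted by score
    descending, with scores >= 0 (assumption). An object missing from a list
    is treated as scoring 0 there (assumption).
    Returns (top-k objects, number of sorted accesses performed).
    """
    m = len(lists)
    seen = {}          # object -> {list index: score read from that list}
    accesses = 0
    depth = 0
    max_depth = max(len(lst) for lst in lists)
    while depth < max_depth:
        # One round of parallel sorted access: read the next entry of every list.
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj, score = lst[depth]
                seen.setdefault(obj, {})[i] = score
                accesses += 1
        depth += 1
        # The last score read from a list bounds every score not yet read from it.
        last = [lst[depth - 1][1] if depth <= len(lst) else 0 for lst in lists]
        lower = {o: sum(s.values()) for o, s in seen.items()}
        upper = {o: lower[o] + sum(last[i] for i in range(m) if i not in s)
                 for o, s in seen.items()}
        if len(seen) < k:
            continue
        ranked = sorted(seen, key=lower.get, reverse=True)
        top, rest = ranked[:k], ranked[k:]
        threshold = lower[top[-1]]
        # Best possible total of any object outside the current top-k,
        # including objects never seen so far (bounded by sum(last)).
        best_other = max([upper[o] for o in rest] + [sum(last)])
        if threshold >= best_other:
            return top, accesses
    # All lists exhausted: the lower bounds are now exact totals.
    ranked = sorted(seen, key=lambda o: sum(seen[o].values()), reverse=True)
    return ranked[:k], accesses


list_1 = [("a", 9), ("b", 8), ("c", 1)]
list_2 = [("b", 9), ("a", 7), ("c", 2)]
print(nra_top_k([list_1, list_2], 1))
```

In this toy input the loop stops after two rounds (four sorted accesses out of six entries). The abstract's observation is that choosing which list to read next, rather than reading all lists in lockstep as above, may reduce the access count further.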

2.

Localized monitoring of kNN queries in wireless sensor networks

Yuxia Yao Xueyan Tang Ee-Peng Lim 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(1):99-117

Wireless sensor networks have been widely used in civilian and military applications. Primarily designed for monitoring purposes, many sensor applications require continuous collection and processing of sensed data. Due to the limited power supply for sensor nodes, energy efficiency is a major performance concern in query processing. In this paper, we focus on continuous kNN query processing in object tracking sensor networks. We propose a localized scheme to monitor nearest neighbors to a query point. The key idea is to establish a monitoring area for each query so that only the updates relevant to the query are collected. The monitoring area is set up when the kNN query is initially evaluated and is expanded and shrunk on the fly upon object movement. We analyze the optimal maintenance of the monitoring area and develop an adaptive algorithm to dynamically decide when to shrink the monitoring area. Experimental results show that establishing a monitoring area for continuous kNN query processing greatly reduces energy consumption and prolongs network lifetime.

3.

GPU-based efficient join algorithms on Hadoop

Wang Hongzhi Li Ning Wang Zheng Li Jianing 《The Journal of supercomputing》2021,77(1):292-321

The growing data have brought tremendous pressure for query processing and storage, so there are many studies that focus on using GPU to accelerate join operation,...

4.

Privacy-preserving kNN query processing algorithms via secure two-party computation over encrypted database in cloud computing

Kim Hyeong-Jin Lee Hyunjo Kim Yong-Ki Chang Jae-Woo 《The Journal of supercomputing》2022,78(7):9245-9284

Since studies on privacy-preserving database outsourcing have been spotlighted in cloud computing, databases need to be encrypted before being outsourced to the...

5.

GPU-based parallel algorithms for sparse nonlinear systems

V. Galiano H. Migallón V. Migallón J. Penadés 《Journal of Parallel and Distributed Computing》2012

In this work we describe some parallel algorithms for solving nonlinear systems using CUDA (Compute Unified Device Architecture) over a GPU (Graphics Processing Unit). The proposed algorithms are based on both the Fletcher–Reeves version of the nonlinear conjugate gradient method and a polynomial preconditioner type based on block two-stage methods. Several strategies of parallelization and different storage formats for sparse matrices are discussed. The reported numerical experiments analyze the behavior of these algorithms working in a fine grain parallel environment compared with a thread-based environment.

6.

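A survey of GPU-based exact string matching algorithms

张春燕 谭建龙 刘燕兵 郭莉 《Application Research of Computers》2016,33(7)

Exact string matching is a classic problem in computer science. In the big-data era, massive volumes of data pose great challenges for string matching. GPU applications currently attract wide attention from both academia and industry, and in recent years GPU-based string matching algorithms have become a focus of academic research. To present this recent work, this paper surveys GPU-based exact string matching techniques and describes, for different algorithms and GPU architectures, how exact string matching has been improved on GPUs: the improvements differ from algorithm to algorithm, so research must extend each specific algorithm. The paper also compares the strengths and weaknesses of these algorithms. Finally, it introduces the evaluation metrics and discusses future trends.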

7.

Genetic algorithms for approximate similarity queries (cited by: 1; self-citations: 0; citations by others: 1)

Renato Agma J.M. Caetano 《Data & Knowledge Engineering》2007,62(3):459-482

Algorithms to query large sets of simple data (composed of numbers and small character strings) are constructed to retrieve the exact answer, retrieving every relevant element, so the answer is said to be exact. Similarity searching over complex data is much more expensive than searching over simple data. Moreover, comparison operations over complex data usually consider features extracted from each element, instead of the elements themselves. Thus, even if an algorithm retrieves an exact answer, it is ‘exact’ regarding the extracted features, not regarding the original elements themselves. Therefore, trading exactness of the answer for query response time can be worthwhile. In this work we developed two search strategies based on genetic algorithms to allow retrieving approximate data indexed by Metric Access Methods (MAM) within a limited, user-defined amount of time. These strategies allow implementing algorithms to answer both range and k-nearest neighbor queries, and also allow estimating the precision obtained for the approximate answer. Experimental evaluation shows that very good results (corresponding to what the user would expect) can be obtained in a fraction of the time required to obtain the exact answer.

8.

kNN processing with co-space distance in SoLoMo systems

《Expert systems with applications》2014,41(16):6967-6982

With the increasing popularity of smart phones, SoLoMo (Social-Location-Mobile) systems are expected to be fast-growing and become a popular mobile social networking platform. A main challenge in such systems is the creation of stable links between users. For each online user, the current SoLoMo system continuously returns his/her kNN (k Nearest Neighbor) users based on their geo-locations. Such a recommendation approach is simple, but fails to create sustainable friendships. Instead, it would be more effective to tap into the existing social relationships in conventional social networks, such as Facebook and Twitter, to provide “better” friend recommendations. To measure the similarity between users, we propose a new metric, co-space distance, by considering both the user distances in the real world (physical distance) and the virtual world (social distance). The co-space distance measures the similarity of two users in the SoLoMo system. We compute the social distances between users based on their public information in the conventional social networks, which can be achieved by a few MapReduce jobs. To facilitate efficient computation of the social distance, we build a distributed index on top of the key-value store, and maintain the users’ geo-locations using an R-tree. For each query on finding potential friends around a location, we return kNN neighbors to each user based on their co-space distances. We propose a progressive top-k processing strategy and an adaptive-caching strategy to facilitate efficient query processing. Experiments with the Gowalla dataset show the effectiveness and efficiency of our recommendation approach.

9.

Branch-and-bound processing of ranked queries

Yufei Tao Vagelis Hristidis Dimitris Papadias Yannis Papakonstantinou 《Information Systems》2007

Despite the importance of ranked queries in numerous applications involving multi-criteria decision making, they are not efficiently supported by traditional database systems. In this paper, we propose a simple yet powerful technique for processing such queries based on multi-dimensional access methods and branch-and-bound search. The advantages of the proposed methodology are: (i) it is space efficient, requiring only a single index on the given relation (storing each tuple at most once), (ii) it achieves significant (i.e., orders of magnitude) performance gains with respect to the current state-of-the-art, (iii) it can efficiently handle data updates, and (iv) it is applicable to other important variations of ranked search (including the support for non-monotone preference functions), at no extra space overhead. We confirm the superiority of the proposed methods with a detailed experimental study.

10.

Efficient algorithms for descendant-only tree pattern queries

Michaela Götz Christoph Koch Wim Martens 《Information Systems》2009

Tree pattern matching is a fundamental problem that has a wide range of applications in Web data management, XML processing, and selective data dissemination. In this paper we develop efficient algorithms for the tree homeomorphism problem, i.e., the problem of matching a tree pattern with exclusively transitive (descendant) edges. We first prove that deciding whether there is a tree homeomorphism is LOGSPACE-complete, improving on the current LOGCFL upper bound. Furthermore, we develop a practical algorithm for the tree homeomorphism decision problem that is both space- and time-efficient. The algorithm is in LOGDCFL and space consumption is strongly bounded, while the running time is linear in the size of the data tree. This algorithm immediately generalizes to the problem of matching the tree pattern against all subtrees of the data tree, preserving the mentioned efficiency properties.

11.

Analysis and evaluation of V*-kNN: an efficient algorithm for moving kNN queries

Sarana Nutanong Rui Zhang Egemen Tanin Lars Kulik 《The VLDB Journal The International Journal on Very Large Data Bases》2010,19(3):307-332

The moving k nearest neighbor (MkNN) query continuously finds the k nearest neighbors of a moving query point. MkNN queries can be efficiently processed through the use of safe regions. In general, a safe region is a region within which the query point can move without changing the query answer. This paper presents an incremental safe-region-based technique for answering MkNN queries, called the V*-Diagram, as well as analysis and evaluation of its associated algorithm, V*-kNN. Traditional safe-region approaches compute a safe region based on the data objects but independent of the query location. Our approach exploits the knowledge of the query location and the boundary of the search space in addition to the data objects. As a result, V*-kNN has much smaller I/O and computation costs than existing methods. We further provide cost models to estimate the number of data accesses for V*-kNN and a competitive technique, RIS-kNN. The V*-Diagram and V*-kNN are also applicable to the domain of spatial networks and we present algorithms to construct a spatial-network V*-Diagram. Our experimental results show that V*-kNN significantly outperforms the competitive technique. The results also verify the accuracy of the cost models.

12.

Proportional inverse-weight kNN algorithm and its application to stream processing

《Computer Engineering and Design》2015,(12)

13.

Efficient processing of k-hop reachability queries

James Cheng Zechao Shang Hong Cheng Haixun Wang Jeffrey Xu Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(2):227-252

We study the problem of answering $k$-hop reachability queries in a directed graph, i.e., whether there exists a directed path of length $k$ from a source query vertex to a target query vertex in the input graph. The problem of $k$-hop reachability is a generalization of classic reachability (where $k=\infty$). Existing indexes for processing classic reachability queries, as well as for processing shortest path distance queries, are not applicable or not efficient for processing $k$-hop reachability queries. We propose an efficient index for processing $k$-hop reachability queries. Our experimental results on a wide range of real datasets show that our method is efficient and scalable in terms of both index construction and query processing.
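
As a point of reference for the query semantics, an index-free baseline is a breadth-first search that stops expanding after k hops. The sketch below is not the index proposed in the paper; it assumes that "k-hop reachable" means reachable by a directed path of at most k edges, and the adjacency-dictionary layout and the name `k_hop_reachable` are hypothetical.

```python
from collections import deque


def k_hop_reachable(graph, source, target, k):
    """Return True if target is reachable from source via at most k edges.

    graph: dict mapping each vertex to an iterable of its out-neighbours
    (assumed layout; vertices without out-edges may be omitted).
    """
    if source == target:
        return True
    visited = {source}
    frontier = deque([(source, 0)])   # (vertex, hops used to reach it)
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue                  # hop budget spent, do not expand further
        for nxt in graph.get(node, ()):
            if nxt == target:
                return True
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, dist + 1))
    return False


chain = {"a": ["b"], "b": ["c"], "c": ["d"]}
print(k_hop_reachable(chain, "a", "d", 2))  # d is three hops away
print(k_hop_reachable(chain, "a", "d", 3))
```

Each query here may touch every vertex within k hops of the source, which is the per-query cost that a dedicated index aims to avoid.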

14.

Progressive processing of subspace dominating queries

Eleftherios Tiakas Apostolos N. Papadopoulos Yannis Manolopoulos 《The VLDB Journal The International Journal on Very Large Data Bases》2011,20(6):921-948

A top-k dominating query reports the k items with the highest domination score. Algorithms for efficient processing of this query have been recently proposed in the literature. Those methods, either index based or index free, apply a series of pruning criteria toward efficient processing. However, they are characterized by several limitations, such as (1) they lack progressiveness (they report the k best items at the end of the processing), (2) they require a multi-dimensional index or they build a grid-based index on-the-fly, which suffers from performance degradation, especially in high dimensionalities, and (3) they do not support vertically decomposed data. In this paper, we design efficient algorithms that can handle any subset of the dimensions in a progressive manner. Among the studied algorithms, the Differential Algorithm shows the best overall performance.
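
To make the query definition concrete, a brute-force evaluation can be sketched as follows. It is a quadratic baseline, not any of the algorithms studied in the paper. It assumes that smaller values are better in every dimension and that the domination score of an item is the number of items it dominates; the names `dominates`, `top_k_dominating`, and the `dims` parameter are hypothetical.

```python
def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one.

    Assumption: smaller values are better.
    """
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def top_k_dominating(points, k, dims=None):
    """Return the k points with the highest domination score as (point, score).

    dims: optional tuple of dimension indices selecting a subspace;
    None means all dimensions. Ties are broken by input order.
    """
    def project(p):
        return p if dims is None else tuple(p[d] for d in dims)

    projected = [project(p) for p in points]
    scored = [(sum(dominates(p, q) for q in projected), i)
              for i, p in enumerate(projected)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(points[i], score) for score, i in scored[:k]]


data = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)]
print(top_k_dominating(data, 2))
print(top_k_dominating(data, 1, dims=(0,)))
```

This version reports results only after all scores are computed, which is the lack of progressiveness that the abstract lists as limitation (1).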

15.

Efficient processing of enumerative set-based queries

《Information Systems》2016

Many applications often require finding sets of entities of interest that meet certain constraints. Such set-based queries (SQs) can be broadly classified into two types: optimization SQs that involve some optimization constraint and enumerative SQs that do not have any optimization constraint. While there has been much research on the evaluation of optimization SQs, there is very little work on the evaluation of enumerative SQs, which represent the most fundamental fragment of set-based queries. In this paper, we address the problem of evaluating enumerative SQs using RDBMS. While enumerative SQs can be expressed using SQL, existing relational engines, unfortunately, are not able to efficiently evaluate such queries due to their complexity. In this paper, we propose a novel evaluation approach for enumerative SQs. Our experimental results on PostgreSQL demonstrate that our proposed approach outperforms the conventional approach by up to three orders of magnitude.

16.

Spatial skyline queries: exact and approximation algorithms

Mu-Woong Lee Wanbin Son Hee-Kap Ahn Seung-won Hwang 《GeoInformatica》2011,15(4):665-697

As more data-intensive applications emerge, advanced retrieval semantics, such as ranking and skylines, have attracted the attention of researchers. Geographic information systems are a good example of an application using a massive amount of spatial data. Our goal is to efficiently support exact and approximate skyline queries over massive spatial datasets. A spatial skyline query, consisting of multiple query points, retrieves data points that are not farther than any other data points from all query points. To achieve this goal, we present a simple and efficient algorithm that computes the correct results, and also propose a fast approximation algorithm that returns a desirable subset of the skyline results. In addition, we propose a continuous query algorithm to trace changes of skyline points while a query point moves. To validate the effectiveness and efficiency of our algorithm, we provide an extensive empirical comparison between our algorithms and the best known spatial skyline algorithms from several perspectives.

17.

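Continuous query attack algorithm for location-based services

杨琼 俞立峰 《Journal of Computer Applications》2014,34(1):95-98

To address the security risks that continuous query attack algorithms pose to location-based services (LBS), a new continuous query attack algorithm, CQACA, is proposed on the basis of the existing k-anonymity Cloaking algorithm. The algorithm first defines an objective function for the query recognition rate using entropy and a query anonymity measure, and then gives a method for solving the objective function based on a cellular ant colony. Finally, experiments with a moving-object data generator examine in depth the key factors affecting CQACA and compare its performance with that of the Cloaking algorithm: the error between CQACA and the actual data is 13.27%, whereas that of the Cloaking algorithm is 17.35%. The results indicate that CQACA is effective to a certain degree.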

18.

Efficient algorithms for answering queries with unsorted multilists

Billy G. Claybrook Chung-Shu Yang 《Information Systems》1978,3(2):93-97

Three algorithms are given in this paper, each one increasing in complexity and efficiency, for parallel tracing of unsorted multilists. These algorithms avoid multiple record accessing for disjunctive queries, and they attempt to reduce the amount of seek action required to access all the records that satisfy a query. The algorithms are more efficient and no more complex than the general algorithm of Hsiao and Harary. The algorithms can enable, in some cases, the same querying efficiency obtainable with sorted multilists. Hence, efficiencies in on-line querying and updating are simultaneously accomplished.

19.

q-state Potts model metastability study using optimized GPU-based Monte Carlo algorithms

Ezequiel E. Ferrero Juan Pablo De Francesco Nicolás Wolovick Sergio A. Cannas 《Computer Physics Communications》2012,183(8):1578-1587

We implemented a GPU-based parallel code to perform Monte Carlo simulations of the two-dimensional q-state Potts model. The algorithm is based on a checkerboard update scheme and assigns independent random number generators to each thread. The implementation allows simulating systems of up to ~10⁹ spins with an average time per spin flip of 0.147 ns on the fastest GPU card tested, representing a speedup of up to 155× compared with an optimized serial code running on a high-end CPU. The possibility of performing high speed simulations at large enough system sizes allowed us to provide positive numerical evidence about the existence of metastability on very large systems based on Binder's criterion, namely, on the existence or not of specific heat singularities at spinodal temperatures different from the transition one.

20.

Efficient processing of XPath queries using indexes

Sanjay Madria Yan Chen Kalpdrum Passi Sourav Bhowmick 《Information Systems》2007

A number of indexing techniques have been proposed in recent times for optimizing the queries on XML and other semi-structured data models. Most of the semi-structured models use tree-like structures and query languages (XPath, XQuery, etc.) which make use of regular path expressions to optimize the query processing. In this paper, we propose two algorithms called Entry-point algorithm (EPA) and Two-point Entry algorithms that exploit different types of indices to efficiently process XPath queries. We discuss and compare two approaches, namely Root-first and Bottom-first, in implementing the EPA. We present the experimental results of the algorithms using XML benchmark queries and data and compare the results with those of traditional methods of query processing with and without the use of indexes, and the ToXin indexing approach. Our algorithms show better performance than the traditional methods and the ToXin indexing approach.