Found 20 similar articles (search time: 15 ms)
In a number of emerging streaming applications, the data values that are produced have an associated time interval for which they are valid. A useful computation over such streaming data is to produce a continuous and valid skyline summary. Previous work on skyline algorithms has focused only on evaluating skylines over static data sets, and there are no known algorithms for skyline computation in the continuous setting. In this paper, we introduce the continuous time-interval skyline operator, which continuously computes the current skyline over a data stream. We present a new algorithm called LookOut for evaluating such queries efficiently, and empirically demonstrate the scalability of this algorithm. In addition, we examine the effect of the underlying spatial index structure when evaluating skylines. Whereas previous work on skyline computations has considered only the R-tree index structure, we show that, for skyline computations, using an underlying quadtree has significant performance benefits over an R-tree index.
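
To make the time-interval semantics concrete, here is a minimal sketch that recomputes the skyline from scratch at a given time. It is not the LookOut algorithm, which the abstract does not describe in detail. The names `TimedPoint` and `skyline_at`, the half-open validity interval, and the "smaller is better" convention are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
# (point, start, end): the point is assumed valid for start <= t < end.
TimedPoint = Tuple[Point, float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one.

    Smaller values are assumed to be better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_at(items: List[TimedPoint], t: float) -> List[Point]:
    """Naive skyline over the points whose validity interval contains t."""
    live = [p for p, start, end in items if start <= t < end]
    # A point never dominates itself, so no self-exclusion is needed.
    return [p for p in live if not any(dominates(r, p) for r in live)]


items = [((1, 1), 0, 5), ((2, 2), 0, 10), ((3, 0), 3, 8)]
for t in (1, 4, 6, 9):
    print(t, skyline_at(items, t))
```

Note how `(2, 2)` enters the skyline only once `(1, 1)` expires: a continuous operator has to track such dominated-but-still-valid points, which is what makes the streaming setting harder than the static one.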

The reverse skyline query is very useful in many decision-making applications. Given a multi-dimensional dataset P and a query point q, the reverse skyline query returns all the points in P whose dynamic skyline contains q. Although reverse skyline retrieval has been well studied in the literature, there is, to the best of our knowledge, no prior work on one of the most intuitive and practical types of reverse skyline queries, namely, the group-by reverse skyline (GRS) query, which retrieves the reverse skyline for each group in a specified dataset. We formalize the GRS query, including monochromatic and bichromatic versions, identify its properties, and then propose a set of efficient algorithms for computing the group-by reverse skyline. Extensive experimental evaluation using both real and synthetic datasets demonstrates the performance of our proposed algorithms in terms of effectiveness and efficiency under a variety of experimental settings.
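
The definition of the (monochromatic) reverse skyline can be sketched by brute force as follows. This is only an illustration of the definition, not one of the paper's algorithms, and it does not cover the group-by variant. It assumes the usual convention that the dynamic skyline of p is taken over coordinate-wise absolute differences to p, and that p itself is excluded when checking whether q is dominated; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dyn_dominates(r: Sequence[float], q: Sequence[float], p: Sequence[float]) -> bool:
    """True if r dynamically dominates q with respect to p.

    Both points are compared by their per-dimension absolute distance to p.
    """
    dr = [abs(ri - pi) for ri, pi in zip(r, p)]
    dq = [abs(qi - pi) for qi, pi in zip(q, p)]
    return all(x <= y for x, y in zip(dr, dq)) and any(x < y for x, y in zip(dr, dq))


def reverse_skyline(P: List[Point], q: Point) -> List[Point]:
    """Points p in P whose dynamic skyline contains q (O(n^2) brute force)."""
    result = []
    for i, p in enumerate(P):
        others = P[:i] + P[i + 1:]  # assumption: p is not compared against itself
        if not any(dyn_dominates(r, q, p) for r in others):
            result.append(p)
    return result


P = [(1, 1), (2, 2), (10, 10)]
print(reverse_skyline(P, (3, 3)))
```

In this example `(1, 1)` is excluded because, seen from `(1, 1)`, the point `(2, 2)` is closer than the query `(3, 3)` in every dimension.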

With the emergence of location-aware mobile device technologies, communication technologies, and GPS systems, location-based queries have attracted great attention in the database literature. In many user recommendation web services, the spatial preference query is used to suggest objects based on their spatial proximity to facilities. In this paper, we study the problem of the general spatial skyline (GSSKY), which can provide the minimal candidate set of optimal solutions for any monotonic distance-based spatial preference query. An efficient progressive algorithm called P-GSSKY is proposed to significantly reduce the number of non-promising objects in the computation. Moreover, we also propose a spatial-join-based algorithm, called J-GSSKY, which can compute GSSKY efficiently in terms of I/O cost. The paper conducts a comprehensive performance study of the proposed techniques based on both real and synthetic data.

Current skyline evaluation techniques mainly aim to find the outstanding tuples in a large dataset. In this paper, we generalize the concept of the skyline query and introduce a novel type of query, the combinatorial skyline query (CSQ), which finds the outstanding combinations among all combinations of the given tuples. The traditional skyline query is a special case of the combinatorial skyline query. This generalized concept is semantically richer when used in decision making, market analysis, business planning, and quantitative economics research. We first introduce the concept of the CSQ and explain the difficulty in resolving this type of query. Then, we propose two algorithms to solve the problem. The experiments demonstrate the effectiveness and efficiency of the proposed algorithms.

Skyline computation, which returns a set of interesting points from a potentially huge data space, has attracted considerable interest in the big data era. However, the growth of skyline computation still faces many challenges, including information security and privacy-preserving concerns. In this paper, we propose a new efficient and privacy-preserving skyline computation framework across multiple domains, called EPSC. Within the EPSC framework, a skyline result from multiple service providers will be securely computed to provide better services for the client. Meanwhile, minimum privacy disclosure will be elicited from one service provider to another during skyline computation. Specifically, to leverage the service provider’s privacy disclosure and achieve almost real-time skyline processing and transmission, we introduce an efficient secure vector comparison protocol (ESVC) to construct EPSC, which is exclusively based on two novel techniques: the fast secure permutation protocol (FSPP) and the fast secure integer comparison protocol (FSIC). Both protocols allow multiple service providers to calculate the skyline result interactively in a privacy-preserving way. Detailed security analysis shows that the proposed EPSC framework can achieve multi-domain skyline computation without the providers leaking sensitive information to each other. In addition, performance evaluations via extensive simulations also demonstrate EPSC’s efficiency in terms of providing skyline computation and transmission while minimizing the privacy disclosure across different domains.

Skyline queries, together with other advanced query operators, are essential for identifying sets of interesting data points buried within the huge amounts of data readily available these days. A skyline query retrieves the set of non-dominated data points in a multi-dimensional dataset. As computing infrastructures become increasingly pervasive, connected by readily available network services, data storage and management have inevitably become more distributed. In these distributed environments, designing efficient skyline querying with quick response times and progressive returning of answers faces new challenges. To address this, we propose a novel skyline query scheme termed MpSky. MpSky is based on a novel space partitioning scheme that employs the dependency relationships among data points on different servers. By grouping the points of each server using dependencies, we are able to qualify a skyline point by comparing it only with data on dependent servers, and to parallelize the skyline computation among non-dependent partitions that are from different servers or individual servers. By controlling the query propagation among partitions, we are able to generate skyline results progressively and prune partitions and points efficiently. Analytical and extensive simulation results show the effectiveness of the proposed scheme.
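
For readers new to the operator, the notion of "non-dominated data points" can be sketched with a simple window-based scan on a single machine. This is a generic centralized baseline, not the MpSky scheme; it assumes smaller values are better in every dimension, and the function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, keeping a window of current candidates."""
    window: List[Point] = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated by a current candidate, discard it
        # p survives: evict any candidates that p dominates, then keep p.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


hotels = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 5)]  # e.g. (price, distance)
print(skyline(hotels))  # [(1, 9), (3, 3), (9, 1)]
```

Here `(4, 4)` and `(5, 5)` are dropped because `(3, 3)` is better in both dimensions, while the three remaining points are mutually incomparable.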

Continuous distance-based skyline queries in road networks   (Total citations: 1; self-citations: 0; citations by others: 1)
In recent years, the research community has introduced various methods for processing skyline queries in road networks. A skyline query retrieves the skyline points that are not dominated by others in terms of static and dynamic attributes (i.e., the road distance). This paper addresses the issue of efficiently processing continuous skyline queries in road networks. Two novel and important distance-based skyline queries are presented, namely, the continuous \(d_{\varepsilon}\)-skyline query (\(Cd_{\varepsilon}\)-SQ) and the continuous k nearest neighbor-skyline query (Cknn-SQ). A grid index is first designed to effectively manage the information of data objects and then two algorithms are proposed, the \(Cd_{\varepsilon}\)-SQ algorithm and the \(Cd_{\varepsilon}\)-SQ+ algorithm, which are combined with the grid index to answer the \(Cd_{\varepsilon}\)-SQ. Similarly, the Cknn-SQ algorithm and the Cknn-SQ+ algorithm are developed to efficiently process the Cknn-SQ. Extensive experiments using real road network datasets demonstrate the effectiveness and the efficiency of the proposed algorithms.

We present the second output-sensitive skyline computation algorithm, which is faster in the worst case than the only existing output-sensitive skyline computation algorithm [1] because our algorithm does not rely on the existence of a linear-time procedure for finding medians.

Scaling skyline queries over high-dimensional datasets remains challenging because most existing algorithms assume dimensional independence when establishing the worst-case complexity, discarding the correlation distribution. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets with three novel features. First, it offers a fast hash-based method to prune non-skyline points by utilizing data correlation characteristics and to speed up the overall skyline evaluation for correlated datasets. Second, we develop \(HashSkyline_{GPU}\), which can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, the HashSkyline approach uses a pivot cell-based mechanism combined with a correlation threshold to determine the correlation distribution characteristics of a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by automatically switching between \(HashSkyline_{CPU}\) and \(HashSkyline_{GPU}\). We evaluate the validity of HashSkyline using both synthetic and real datasets. Our experiments show that HashSkyline incurs significantly lower pre-processing cost and achieves significantly higher overall query performance compared to existing state-of-the-art algorithms.

With the advent of multicore processors, it has become imperative to write parallel programs if one wishes to exploit the next generation of processors. This paper deals with skyline computation as a case study of parallelizing database operations on multicore architectures. First we parallelize three sequential skyline algorithms, BBS, SFS, and SSkyline, to see if the design principles of sequential skyline computation also extend to parallel skyline computation. Then we develop a new parallel skyline algorithm, PSkyline, based on the divide-and-conquer strategy. Experimental results show that all the algorithms successfully utilize multiple cores to achieve a reasonable speedup. In particular, PSkyline achieves a speedup approximately proportional to the number of cores when parallel computation is needed the most.
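
The divide-and-conquer idea can be sketched as "compute local skylines on chunks in parallel, then merge". The code below is a structural sketch only and should not be read as PSkyline itself, whose details the abstract does not give. It uses a thread pool for portability; in CPython the global interpreter lock means this shows the shape of the computation rather than a real multicore speedup. The names `parallel_skyline` and `workers` are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Sequential skyline of one chunk (quadratic, for illustration)."""
    return [p for p in points if not any(dominates(r, p) for r in points)]


def parallel_skyline(points: List[Point], workers: int = 4) -> List[Point]:
    """Divide: split into chunks. Conquer: local skylines. Merge: final pass."""
    if not points:
        return []
    size = (len(points) + workers - 1) // workers  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        local = list(pool.map(skyline, chunks))
    # A global skyline point is in the skyline of its own chunk, so the
    # union of local skylines is a sufficient candidate set for the merge.
    candidates = [p for part in local for p in part]
    return skyline(candidates)


data = [(i % 7, (i * 5) % 11) for i in range(50)]
print(sorted(set(parallel_skyline(data))))
```

The merge step is where such schemes tend to differ: it is sequential here, whereas a real multicore design would also parallelize it.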

Skyline queries have attracted considerable attention as a way to assist multicriteria analysis of large-scale datasets. In this paper, we focus on multidimensional subspace skyline computation, which has been actively studied under two approaches. First, to narrow down a full-space skyline, users may consider multiple subspace skylines reflecting their interests. For this purpose, we tackle the concept of a skycube, which consists of all possible non-empty subspace skylines in a given full space. Second, to understand the diverse semantics of subspace skylines, we address skyline groups, in which a skyline point (or a set of skyline points) is annotated with decisive subspaces. Our primary contributions are to identify common building blocks of the two approaches and to develop orthogonal optimization principles that benefit both approaches. Our experimental results show the efficiency of the proposed algorithms by comparing them with state-of-the-art algorithms on both synthetic and real-life datasets.
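
The skycube definition (one skyline per non-empty subset of dimensions) can be illustrated by brute force. This sketch enumerates all 2^d - 1 subspaces independently, which is exactly the redundancy that shared-computation approaches try to avoid; it is not the paper's method. The function names and the "smaller is better" convention are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Subspace = Tuple[int, ...]  # indices of the dimensions that are considered


def dominates_in(a: Point, b: Point, dims: Subspace) -> bool:
    """a dominates b when only the dimensions in `dims` are compared."""
    return all(a[i] <= b[i] for i in dims) and any(a[i] < b[i] for i in dims)


def subspace_skyline(points: List[Point], dims: Subspace) -> List[Point]:
    """Points not dominated by any other point within the given subspace."""
    return [p for p in points if not any(dominates_in(r, p, dims) for r in points)]


def skycube(points: List[Point], d: int) -> Dict[Subspace, List[Point]]:
    """Map every non-empty subspace of d dimensions to its skyline."""
    cube: Dict[Subspace, List[Point]] = {}
    for k in range(1, d + 1):
        for dims in combinations(range(d), k):
            cube[dims] = subspace_skyline(points, dims)
    return cube


cube = skycube([(1, 3), (2, 2), (3, 1)], d=2)
for dims, sky in cube.items():
    print(dims, sky)
```

In the example, each single-dimension subspace keeps only the minimum on that dimension, while the full space keeps all three points.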

This paper studies the problem of computing the skyline of a vast-sized spatial dataset in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently. The problem is particularly interesting due to the advent of Big Spatial Data generated by modern applications running on mobile devices, and also because of the importance of the skyline operator for decision-making and supporting business intelligence. To this end, we present a scalable and efficient framework for skyline query processing that operates on top of SpatialHadoop and can be parameterized by individual techniques related to the filtering of candidate points as well as the merging of local skyline sets. Then, we introduce two novel algorithms that follow the pattern of the framework and boost the performance of skyline query processing. Our algorithms employ specific optimizations based on effective filtering and efficient merging, the combination of which is responsible for improved efficiency. We compare our solution against the state-of-the-art skyline algorithm in SpatialHadoop. The results show that our techniques are more efficient and outperform the competitor significantly, especially in the case of large skyline output size.

As data of an unprecedented scale are becoming accessible, it becomes more and more important to help each user identify the ideal results of a manageable size. As such a mechanism, skyline queries have recently attracted a lot of attention for their intuitive query formulation. This intuitiveness, however, has the side effect of retrieving too many results, especially for high-dimensional data. This paper aims to support personalized skyline queries that identify “truly interesting” objects based on a user-specific preference and a retrieval size k. In particular, we abstract personalized skyline ranking as a dynamic search over skyline subspaces guided by user-specific preference. We then develop a novel algorithm that navigates on a compressed structure itself, to reduce the storage overhead. Furthermore, we develop novel techniques to interleave cube construction with navigation for scenarios without an a priori structure. Finally, we extend the proposed techniques to user-specific preferences including equivalence preference. Our extensive evaluation results validate the effectiveness and efficiency of the proposed algorithms on both real-life and synthetic data.

The Journal of Supercomputing - In recent years, numerous applications have been continuously generating large amounts of uncertain data. The advanced analysis queries such as skyline operators are...

Skyline queries have recently received considerable attention as an alternative decision-making operator in the database community. The conventional skyline algorithms have primarily focused on optimizing the dominance of points in order to remove non-skyline points as efficiently as possible, but have neglected to take into account the incomparability of points in order to bypass unnecessary comparisons. To design a scalable skyline algorithm, we first analyze a cost model that copes with both dominance and incomparability, and develop a novel technique to select a cost-optimal point, called a pivot point, that minimizes the number of comparisons in point-based space partitioning. We then implement the proposed pivot point selection technique in the existing sorting- and partitioning-based algorithms. For point insertions/deletions, we also discuss how to maintain the current skyline using a skytree, derived from recursive point-based space partitioning. Furthermore, we design an efficient greedy algorithm for the k representative skyline using the skytree. Experimental results demonstrate that the proposed algorithms are significantly faster than the state-of-the-art algorithms.
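
The role of incomparability in point-based space partitioning can be sketched as follows. Each point is assigned a bitmask describing on which dimensions it is no better than a pivot; a point can only dominate another if its mask is a subset of the other's, so many pairs can be skipped without a full comparison. This is a simplified illustration under stated assumptions: the pivot here is supplied by the caller rather than chosen by the paper's cost model, there is no recursion or skytree, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def region_mask(p: Point, pivot: Point) -> int:
    """Bit i is set when p is no better than the pivot on dimension i."""
    mask = 0
    for i, (x, v) in enumerate(zip(p, pivot)):
        if x >= v:
            mask |= 1 << i
    return mask


def may_dominate(mask_a: int, mask_b: int) -> bool:
    """If a dominates b, every bit set in a's mask is also set in b's mask."""
    return (mask_a & ~mask_b) == 0


def skyline_with_pivot(points: List[Point], pivot: Point) -> Tuple[List[Point], int]:
    """Skyline plus the number of full dominance tests actually performed."""
    masks = [region_mask(p, pivot) for p in points]
    result, comparisons = [], 0
    for i, p in enumerate(points):
        dominated = False
        for j, r in enumerate(points):
            if i == j or not may_dominate(masks[j], masks[i]):
                continue  # regions are incomparable: skip the dominance test
            comparisons += 1
            if dominates(r, p):
                dominated = True
                break
        if not dominated:
            result.append(p)
    return result, comparisons


pts = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 5)]
sky, tests = skyline_with_pivot(pts, pivot=(3, 3))
print(sky, tests)
```

With the pivot `(3, 3)`, the points `(1, 9)` and `(9, 1)` fall into regions that no other point's region can dominate, so they are accepted without any dominance test.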

How to process a skyline query efficiently has received considerable attention in recent years. A skyline query identifies a set of non-dominated data records in a multidimensional dataset. Whereas most previous studies have resolved this problem in a centralized environment, this work considers it in a distributed sensor network environment. An algorithm, known as the Skyline Sensor Algorithm (SkySensor), is presented to efficiently retrieve skyline results from a sensor network. A cluster-based architecture is designed in SkySensor to collect all sensor readings. A pruning method is then proposed to progressively sift out the skyline results from the sensor network. SkySensor avoids the need to collect data from all sensors in the network, which is an extremely expensive action, when searching for the skyline results. The performance study indicates that SkySensor is highly efficient and significantly outperforms previous methods in processing skyline queries.

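D2D (Device-to-Device) communication is a new short-range communication technology that can reduce base station load and improve system resource utilization. Based on the relative distance between the D2D receiver and the cellular user, this paper discusses the mode selection problem both in a conventional cellular system and in a system with relaying, and gives a mode selection scheme based on the geographic positions of the cellular user and the D2D user. Simulation data verify that the probability of the D2D system adopting the reuse mode is inversely proportional to the configured signal-to-interference-plus-noise ratio (SINR) threshold, and show that this probability increases greatly once relaying is introduced, which means that adding relay nodes to a hybrid network can effectively improve the system's spectrum utilization.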

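A novel rank-aware skyline query algorithm for distributed environments is proposed (that is, finding the objects that are not "dominated" by other objects and that have high aggregate values). Existing algorithms consume a large amount of network bandwidth when the number of nodes m is large. A new distributed rank-aware skyline query processing algorithm, Distributed Rank-aware Skylining (DRS), is proposed. DRS completes in only 4 rounds of interaction on any dataset and reduces communication cost by pruning unnecessary objects. The efficiency of DRS is verified on synthetic data. Experiments show that DRS outperforms existing algorithms when the number of nodes m is greater than 4.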

Two-tier streaming settings are a typical dynamic environment where continuous skylines represent an important semantic indicator for multiple attributes. To monitor skylines over the dynamic data in such settings, one needs to continuously update the skyline query results in order to reflect the new data values. This paper tackles the problem of continuous skyline monitoring on a central query server over dynamic data from multiple data sites. Simply sending the updates of tuple values to the server is cost-prohibitive. Therefore, we propose an approach that allows the central server to collaborate with the data sites to monitor the possible skyline changes. By doing so, the processing load is distributed over all the data sites instead of only on the central server. Furthermore, this collaborative approach minimizes the bandwidth consumption between the server and the data sites, which is often critical in a widely distributed environment such as a wide-area sensor network. We give theoretical upper bounds for the computation costs and communication costs of the proposed collaborative approach. We also conduct extensive experiments on both synthetic and real data sets. The experimental results demonstrate that our collaborative approach is efficient, scalable and well-balanced in terms of communication costs and computation costs.
