首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 328 毫秒
1.
This paper studies the problem of probabilistic range query over uncertain data. Although existing solutions could support such query, it still has space for improvement. In this paper, we firstly propose a novel index called S-MRST for indexing uncertain data. For one thing, via using an irregular shape for bounding uncertain data, it has a stronger space pruning ability. For another, by taking the gradient of probability density function into consideration, S-MRST is also powerful in terms of probability pruning ability. More important, S-MRST is a general index which could support multiple types of probabilistic queries. Theoretical analysis and extensive experimental results demonstrate the effectiveness and efficiency of the proposed index.  相似文献   

2.
Flash-memory-based solid-state drives (SSDs) are used widely for secondary storage. To be effective for SSDs, traditional indices have to be redesigned to cope with the special properties of flash memory, such as asymmetric read/write latencies (fast reads and slow writes) and out-of-place updates. Previous flash-optimized indices focus mainly on reducing random writes to SSDs, which is typically accomplished at the expense of a substantial number of extra reads. However, modern SSDs show a narrowing gap between read and write speeds, and read operations on SSDs increasingly affect the overall performance of indices on SSDs. As a consequence, how to optimize SSD-aware indices by reducing both write and read costs is a pertinent and open challenge. We propose a new tree index for SSDs that is able to reduce both writes and extra reads. In particular, we use an update buffer and overflow pages to reduce random writes, and we further exploit Bloom filters to reduce the extra reads to the overflow nodes in the tree. With this mechanism, we construct a read/write-optimized index that is capable of offering better overall performance than previous flash-aware indices. In addition, we present an analysis of the proposed index and show that the read and write costs of the operations on the index can be balanced by only tuning the false-positive rate of the Bloom filters. Our experimental results suggest that our proposal is efficient and represents an improvement over existing methods.  相似文献   

3.
Reachability is a fundamental problem on large-scale networks emerging nowadays in various application domains, such as social networks, communication networks, biological networks, road networks, etc. It has been studied extensively. However, little existing work has studied reachability with realistic constraints imposed on graphs with real-valued edge or node weights. In fact, such weights are very common in many real-world networks, for example, the bandwidth of a link in communication networks, the reliability of an interaction between two proteins in PPI networks, and the handling capacity of a warehouse/storage point in a distribution network. In this paper, we formalize a new yet important reachability query in weighted undirected graphs, called weight constraint reachability (WCR) query that asks: is there a path between nodes \(a\) and \(b\), on which each real-valued edge (or node) weight satisfies a range constraint. We discover an interesting property of WCR, based on which, we design a novel edge-based index structure to answer the WCR query in \(O(1)\) time. Furthermore, we consider the case when the index cannot entirely fit in the memory, which can be very common for emerging massive networks. An I/O-efficient index is proposed, which provides constant I/O (precisely four I/Os) query time with \(O(|V|\log |V|)\) disk-based index size. Extensive experimental studies on both real and synthetic datasets demonstrate the efficiency and scalability of our solutions in answering the WCR query.  相似文献   

4.
With the explosive growth of data, to support efficient data management including queries and updates, the database system is expected to provide tree-like indexes, such as R-tree, M-tree, B+-tree, according to different types of data. In the distributed environment, the indexes have to be scattered across the compute nodes to improve reliability and scalability. Indexes can speed up queries, but they incur maintenance cost when updates occur. In the distributed environment, each compute node maintains a subset of an index tree, so keeping the communication cost small is more crucial, or else it occupies lots of network bandwidth and the scalability and availability of the database system are affected. Further, to achieve the reliability and scalability for queries, several replicas of the index are needed, but keeping the replicas consistent is not straightforward. In this paper, we propose a framework supporting tree-like indexes, based on Chord overlay, which is a popular P2P structure. The framework dynamically tunes the number of replicas of index to balance the query cost and the update cost. Several techniques are designed to improve the efficiency of updates without the cost of performance of the queries. We implement M-tree and R-tree in our framework, and extensive experiments on real- life and synthetic datasets verify the efficiency and scalability of our framework.  相似文献   

5.
陈井爽  陈珂  寿黎但  江大伟  陈刚 《软件学报》2022,33(12):4688-4703
学习型索引通过学习数据分布可以准确地预测数据存取的位置,在保持高效稳定的查询下,显著降低索引的内存占用.现有的学习型索引主要针对只读查询进行优化,而对插入和更新支持不足.针对上述挑战,设计了一种基于Radix Tree的工作负载自适应学习型索引ALERT. ALERT使用Radix Tree来管理不定长的分段,段内采用具有最大误差界的线性插值模型进行预测.同时,ALERT使用一种高效的插入缓冲来降低数据插入更新的代价.针对点查询和范围查询提出两种自适应重组优化方法,通过对工作负载进行感知,动态地调整插入缓冲的组织结构.经实验验证,ALERT与业界流行的学习型索引相比,构建时间平均降低了81%,内存占用平均降低了75%,在保持了优秀读性能的同时,使插入延迟平均降低了50%;此外, ALERT使用自适应重组优化能有效感知查询工作负载特征,与不使用自适应重组优化相比,查询延迟平均降低了15%.  相似文献   

6.
Bit commitment schemes are at the basis of modern cryptography. Since information-theoretic security is impossible both in the classical and in the quantum regime, we examine computationally secure commitment schemes. In this paper we study worst-case complexity assumptions that imply quantum bit commitment schemes. First, we show that QSZK \({\not\subseteq}\) QMA implies a computationally hiding and statistically binding auxiliary-input quantum commitment scheme. We then extend our result to show that the much weaker assumption QIP \({\not\subseteq}\) QMA (which is weaker than PSPACE \({\not\subseteq}\) PP) implies the existence of auxiliary-input commitment schemes with quantum advice. Finally, to strengthen the plausibility of the separation QSZK \({\not\subseteq}\) QMA, we find a quantum oracle relative to which honest-verifier QSZK is not contained in QCMA, the class of languages that can be verified using a classical proof in quantum polynomial time.  相似文献   

7.
8.
The best way of selecting samples in algebraic attacks against block ciphers is not well explored and understood. We introduce a simple strategy for selecting the plaintexts and demonstrate its strength by breaking reduced-round KATAN32, LBlock and SIMON. For each case, we present a practical attack on reduced-round version which outperforms previous attempts of algebraic cryptanalysis whose complexities were close to exhaustive search. The attack is based on the selection of samples using cube attack and ElimLin which was presented at FSE’12, and a new technique called Universal Proning. In the case of LBlock, we break 10 out of 32 rounds. In KATAN32, we break 78 out of 254 rounds. Unlike previous attempts which break smaller number of rounds, we do not guess any bit of the key and we only use structural properties of the cipher to be able to break a higher number of rounds with much lower complexity. We show that cube attacks owe their success to the same properties and therefore can be used as a heuristic for selecting the samples in an algebraic attack. The performance of ElimLin is further enhanced by the new Universal Proning technique, which allows to discover linear equations that are not found by ElimLin.  相似文献   

9.
The Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN) deploys its data collections, simulation and analysis activities on a distributed computing infrastructure involving more than 70 sites worldwide. The historical usage data recorded by this large infrastructure is a rich source of information for system tuning and capacity planning. In this paper we investigate how to leverage machine learning on this huge amount of data in order to discover patterns and correlations useful to enhance the overall efficiency of the distributed infrastructure in terms of CPU utilization and task completion time. In particular we propose a scalable pipeline of components built on top of the Spark engine for large-scale data processing, whose goal is collecting from different sites the dataset access logs, organizing them into weekly snapshots, and training, on these snapshots, predictive models able to forecast which datasets will become popular over time. The high accuracy achieved indicates the ability of the learned model to correctly separate popular datasets from unpopular ones. Dataset popularity predictions are then exploited within a novel data caching policy, called PPC (Popularity Prediction Caching). We evaluate the performance of PPC against popular caching policy baselines like LRU (Least Recently Used). The experiments conducted on large traces of real dataset accesses show that PPC outperforms LRU reducing the number of cache misses up to 20% in some sites.  相似文献   

10.
Point location is an extremely well-studied problem both in internal memory models and recently also in the external memory model. In this paper, we present an I/O-efficient dynamic data structure for point location in general planar subdivisions. Our structure uses linear space to store a subdivision with N segments. Insertions and deletions of segments can be performed in amortized O(log? B N) I/Os and queries can be answered in $O(\log_{B}^{2} N)$ I/Os in the worst-case. The previous best known linear space dynamic structure also answers queries in $O(\log_{B}^{2} N)$ I/Os, but only supports insertions in amortized $O(\log_{B}^{2} N)$ I/Os. Our structure is also considerably simpler than previous structures.  相似文献   

11.
A mode of a multiset S is an element aS of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i,j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (Proceedings of the International Symposium on Algorithms and Computation (ISAAC), pp. 517–526, 2003), requires \(\varTheta (\sqrt{n}\log\log n)\) query time in the worst case. We improve their result and present an O(n)-space data structure that supports range mode queries in \(O(\sqrt{n/\log n})\) worst-case time. In the external memory model, we give a linear-space data structure that requires \(O(\sqrt{n/B})\) I/Os per query, where B denotes the block size. Furthermore, we present strong evidence that a query time significantly below \(\sqrt{n}\) cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two \(\sqrt{n} \times \sqrt{n}\) matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for the dynamic problem (queries and updates in near O(n 3/4) time), for orthogonal range mode in d dimensions (queries in near O(n 1?1/2d ) time) and for half-space range mode in d dimensions (queries in \(O(n^{1-1/d^{2}})\) time). Finally, we complement our dynamic data structure with a reduction from the multiphase problem, again supporting that we cannot hope for much more efficient data structures.  相似文献   

12.
Despite the importance of ranked queries in numerous applications involving multi-criteria decision making, they are not efficiently supported by traditional database systems. In this paper, we propose a simple yet powerful technique for processing such queries based on multi-dimensional access methods and branch-and-bound search. The advantages of the proposed methodology are: (i) it is space efficient, requiring only a single index on the given relation (storing each tuple at most once), (ii) it achieves significant (i.e., orders of magnitude) performance gains with respect to the current state-of-the-art, (iii) it can efficiently handle data updates, and (iv) it is applicable to other important variations of ranked search (including the support for non-monotone preference functions), at no extra space overhead. We confirm the superiority of the proposed methods with a detailed experimental study.  相似文献   

13.
This paper presents the Argonauts multi-agent framework which was developed as part of a one year student project at Technische Universität Dortmund. The Argonauts framework builds on a BDI approach to model rational agents that act cooperatively in a dynamic and indeterministically changing environment. However, our agent model extends the traditional BDI approach in several aspects, most notably by incorporating motivation into the agent’s goal selection mechanism. The framework has been applied by the Argonauts team in the 2010 version of the annual multi-agent programming contest organized by Technische Universität Clausthal. In this paper, we present a high-level specification and analysis of the actual system used for solving the given scenario. We do this by applying the GAIA methodology, a high-level and iterative approach to model communication and roles in multi-agent scenarios. We further describe the technical details and insights gained during our participation in the multi-agent programming contest.  相似文献   

14.
The increasing popularity of location-based social networks encourages more and more users to share their experiences. It deeply impacts the decision of customers when shopping, traveling, and so on. This paper studies the problem of top-K valuable documents query over geo-textual data stream. Many researchers have studied this problem. However, they do not consider the reliability of documents, where some unreliable documents may mislead customers to make improper decisions. In addition, they lack the ability to prune documents with low representativeness. In order to increase user satisfaction in recommendation systems, we propose a novel framework named PDS. It first employs an efficiently machine learning technique named ELM to prune unreliable documents, and then uses a novel index named \(\mathcal {GH}\) to maintain documents. For one thing, this index maintains a group of pruning values to filter low quality documents. For another, it utilizes the unique property of sliding window to further enhance the PDS performance. Theoretical analysis and extensive experimental results demonstrate the effectiveness of the proposed algorithms.  相似文献   

15.
16.
Bézier surfaces are mathematical tools employed in a wide variety of applications. Some works in the literature propose parallelization strategies to improve performance for the computation of Bézier surfaces. These approaches, however, are mainly focused on graphics applications and often are not directly applicable to other domains. In this work, we propose a new method for the computation of Bézier surfaces, together with approaches to efficiently map the method onto different platforms (CPUs, discrete and integrated GPUs). Additionally, we explore CPU–GPU cooperation mechanisms for computing Bézier surfaces using two integrated heterogeneous systems with different characteristics. An exhaustive performance evaluation—including different data-types, rendering and several hardware platforms—is performed. The results show that our method achieves speedups as high as 3.12x (double-precision) and 2.47x (single-precision) on CPU, and 3.69x (double-precision) and 13.14x (single-precision) on GPU compared to other methods in the literature. In heterogeneous platforms, the CPU–GPU cooperation increases the performance up to 2.09x with respect to the GPU-only version. Our method and the associated parallelization approaches can be easily employed in domains other than computer-graphics (e.g., image registration, bio-mechanical modeling and flow simulation), and extended to other Bézier formulations and Bézier constructions of higher order than surfaces.  相似文献   

17.
We study the problem of answering k -hop reachability queries in a directed graph, i.e., whether there exists a directed path of length $k$ , from a source query vertex to a target query vertex in the input graph. The problem of $k$ -hop reachability is a general problem of the classic reachability (where $k=\infty $ ). Existing indexes for processing classic reachability queries, as well as for processing shortest path distance queries, are not applicable or not efficient for processing $k$ -hop reachability queries. We propose an efficient index for processing $k$ -hop reachability queries. Our experimental results on a wide range of real datasets show that our method is efficient and scalable in terms of both index construction and query processing.  相似文献   

18.
Processing moving queries over moving objects using motion-adaptive indexes   总被引:2,自引:0,他引:2  
This paper describes a motion-adaptive indexing scheme for efficient evaluation of moving continual queries (MCQs) over moving objects. It uses the concept of motion-sensitive bounding boxes (MSBs) to model moving objects and moving queries. These bounding boxes automatically adapt their sizes to the dynamic motion behaviors of individual objects. Instead of indexing frequently changing object positions, we index less frequently changing object and query MSBs, where updates to the bounding boxes are needed only when objects and queries move across the boundaries of their boxes. This helps decrease the number of updates to the indexes. More importantly, we use predictive query results to optimistically precalculate query results, decreasing the number of searches on the indexes. Motion-sensitive bounding boxes are used to incrementally update the predictive query results. Furthermore, we introduce the concepts of guaranteed safe radius and optimistic safe radius to extend our motion-adaptive indexing scheme to evaluating moving continual k-nearest neighbor (kNN) queries. Our experiments show that the proposed motion-adaptive indexing scheme is efficient for the evaluation of both moving continual range queries and moving continual kNN queries.  相似文献   

19.
The global skyline, as an important variant of the skyline, has been widely applied in multi-criteria decision making, business planning and data mining. In this paper, we extend our early work and propose the maintenance methods to process the subspace global skyline (SGS) queries in dynamic databases. In the previous work, we proposed the index structure RB-tree, which can effectively manage the data to accelerate the subspace global skyline calculation. Also, the basic single SGS algorithm based on RB-tree (SSRB) and the optimized single SGS algorithm (OSSRB) were proposed to process a single SGS query. In addition, the multiple SGS algorithm (MSRB) was proposed to calculate multiple SGS queries by sharing the scan spaces of different queries. In this paper, we design some data structures and propose the maintenance approaches of SSRB, OSSRB and MSRB to cope with updates that happen to data sets. Thus our extended algorithms can be adopted for dynamic data sets. Finally, the experimental results show that the proposed algorithms OSSRB and MSRB have good performance to process SGS queries and they can be easily maintained with dynamic datasets.  相似文献   

20.
在Web应用中,以XML为格式的信息查询通常会受到网络传输速度有限等因素的影响。为了减少XML的物化视图与其数据源之间的一致性维护中所需的网络数据传输开销,提出了一种面向远程的XML物化视图增量维护方法和系统框架。这种方法根据多用户的查询请求和数据源更新信息,生成视图维护程序代码,以程序代码的网络迁移代替XML视图的重复查询,有效地减少了网络数据传输量。介绍了物化视图增量维护的基本原理、系统框架以及设计实现思路。最后通过性能测试,说明这种增量维护系统能够有效地减少传输开销。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号