共查询到20条相似文献,搜索用时 0 毫秒
1.
As one of the most important technologies for implementing large-scale distributed systems, peer-to-peer (P2P) computing has attracted much attention in both research and industrial communities, for its advantages such as high availability, high performance, and high flexibility to the dynamics of networks. However, multidimensional data indexing remains as a big challenge to P2P computing, because of the inefficiency in search and network maintenance caused by the complicated existing index structures, which greatly limits the scalability of applications and dimensionality of the data to be indexed.We propose SDI (Swift tree structure for multidimensional Data Indexing), a swift index scheme with a simple tree structure for multidimensional data indexing in large-scale distributed systems. While keeping the query efficiency in O(logN) in terms of routing hops, SDI has extremely low maintenance costs which is proved through theoretical analysis. Furthermore, SDI overcomes the root-bottleneck problem existing in most other tree-based distributed indexing systems. Extensive empirical study verifies the superiority of SDI in both query and maintenance performance. 相似文献
2.
The selection of the right I/O scheduler for a given workload can significantly improve the I/O performance. However, this is not an easy task because several factors should be considered, and even the “best” scheduler can change over the time, specially if the workload’s characteristics change too. To address this problem, we present a Dynamic and Automatic Disk Scheduling framework (DADS) that simultaneously compares two different Linux I/O schedulers, and dynamically selects that which achieves the best I/O performance for any workload at any time. The comparison is made by running two instances of a disk simulator inside the Linux kernel. Results show that, by using DADS, the performance achieved is always close to that obtained by the best scheduler. Thus, system administrators are exempted from selecting a suboptimal scheduler which can provide a good performance for some workloads, but may downgrade the system throughput when the workloads change. 相似文献
3.
4.
Today’s data storage systems are increasingly adopting flash-based solid-state drives (SSD), in which new data is written out-of-place. The space occupied by invalidated data is reclaimed by means of a garbage-collection process. This involves additional write operations that result in write amplification, which negatively affects the performance, endurance, and lifetime of SSDs. Several garbage-collection schemes have been proposed in the literature and corresponding models have been developed for assessing their efficiency. We demonstrate that some of these publications arrive at conflicting results. We establish that the discrepancies identified are due to pitfalls in the modeling and analysis of some of the basic garbage-collection schemes. We effectively resolve these discrepancies by rectifying the pitfalls and developing proper analytical models that yield accurate results. We obtain new results for the circular scheme that are subsequently used to develop a new accurate model for the windowed greedy garbage-collection scheme. Results of theoretical and practical importance are analytically derived and experimentally confirmed. They demonstrate that, as the window size decreases, the write amplification increases, illustrating the tradeoff between computation time and write amplification. The results also show that, under certain conditions, the write amplification of the simple circular scheme can be similar to that of the optimum, albeit more complex greedy garbage-collection scheme. In this case, the write amplification for the windowed greedy scheme is essentially found to be independent of the window size. 相似文献
5.
Andreas Wichert Pedro Teixeira Pedro Santos Helena Galhardas 《Journal of Intelligent Information Systems》2010,35(3):495-516
We will show that the hierarchical linear subspace method is a tree that divides the distances between the subspaces. By doing
so, the data is divided into disjoint entities. The asymptotic upper bound estimation of the maximum applicable number of
subspaces is logarithmically constrained by the number of represented elements and their dimension. The search in such a tree
starts at the subspace with the lowest dimension. In this subspace, the set of all possible similar images is determined.
In the next subspace, additional metric information corresponding to a higher dimension is used to reduce this set. The distances
between the subspaces correspond to the values represented by the difference between the mean distance of all the points in
one space and a corresponding mean distance of the objects in a subspace. The theoretical estimation of temporal complexity
of the algorithmic is logarithmic. The costs are equivalent to the search costs in an tree plus the additional costs of the
dimension of the data space. 相似文献
6.
William McCune 《Journal of Automated Reasoning》1992,9(2):147-167
This article addresses the problem of indexing and retrieving first-order predicate calculus terms in the context of automated deduction programs. The four retrieval operations of concern are to find variants, generalizations, instances, and terms that unify with a given term. Discrimination-tree indexing is reviewed, and several variations are presented. The path-indexing method is also reviewed. Experiments were conducted on large sets of terms to determine how the properties of the terms affect the performance of the two indexing methods. Results of the experiments are presented.This was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract W-31-109-Eng-38. 相似文献
7.
Antipole tree indexing to support range search and k-nearest neighbor search in metric spaces 总被引:2,自引:0,他引:2
Cantone D. Ferro A. Pulvirenti A. Recupero D.R. Shasha D. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(4):535-550
Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, in a range searching problem the goal is to find the objects of S within some threshold distance to g, whereas in a k-nearest neighbor searching problem, the k elements of S closest to q must be produced. These problems can obviously be solved with a linear number of distance calculations, by comparing the query object against every object in the database. However, the goal is to solve such problems much faster. We combine and extend ideas from the M-tree, the multivantage point structure, and the FQ-tree to create a new structure in the "bisector tree" class, called the Antipole tree. Bisection is based on the proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament is far enough apart to approximate the diameter of the splitting set. If dist(a, b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole tree outperforms by a factor of approximately two existing structures such as list of clusters, M-trees, and others and, in many cases, it achieves better clustering properties. 相似文献
8.
Goetz Graefe Felix Halim Stratos Idreos Harumi Kuno Stefan Manegold Bernhard Seeger 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(2):303-328
Adaptive indexing initializes and optimizes indexes incrementally, as a side effect of query processing. The goal is to achieve the benefits of indexes while hiding or minimizing the costs of index creation. However, index-optimizing side effects seem to turn read-only queries into update transactions that might, for example, create lock contention. This paper studies concurrency control and recovery in the context of adaptive indexing. We show that the design and implementation of adaptive indexing rigorously separates index structures from index contents; this relaxes constraints and requirements during adaptive indexing compared to those of traditional index updates. Our design adapts to the fact that an adaptive index is refined continuously and exploits any concurrency opportunities in a dynamic way. A detailed experimental analysis demonstrates that (a) adaptive indexing maintains its adaptive properties even when running concurrent queries, (b) adaptive indexing can exploit the opportunity for parallelism due to concurrent queries, (c) the number of concurrency conflicts and any concurrency administration overheads follow an adaptive behavior, decreasing as the workload evolves and adapting to the workload needs. 相似文献
9.
Cristian Duda Donald Kossmann Chong Zhou 《The VLDB Journal The International Journal on Very Large Data Bases》2010,19(5):735-758
Google and other products have revolutionized the way we search for information. There are, however, still a number of research challenges. One challenge that arises specifically in desktop search is to exploit the structure and semantics of documents, as defined by the application program that generated the data (e.g., Word, Excel, or Outlook). The current generation of search products does not understand these structures and therefore often returns wrong results. This paper shows how today’s search technology can be extended in order to take the specific semantics of certain structures into account. The key idea is to extend inverted file index structures with predicates which encode the circumstances under which certain keywords of a document become visible to a user. This paper provides a framework that allows to express the semantics of structures in documents and algorithms to construct enhanced, predicate-based indexes. Furthermore, this paper shows how keyword and phrase queries can be processed efficiently on such enhanced indexes. It is shown that the proposed approach has superior retrieval performance with regard to both recall and precision and has tolerable space and query running time overheads. 相似文献
10.
Recent papers have indicated that indexing is a promising approach to fast model-based object recognition because it allows most of the possible matches between sets of image features and sets of model features to be quickly eliminated from consideration. This correspondence describes a system that is capable of indexing using sets of three points undergoing 3D transformations and projection by taking advantage of the probabilistic peaking effect. To be able to index using sets of three points, we must allow false negatives. These are overcome by ensuring that we examine several correct hypotheses. The use of these techniques to speed up the alignment method is described. Results are given on real and synthetic data 相似文献
11.
The increasing performance and wider spread use of automated semantic annotation and entity linking platforms has empowered the possibility of using semantic information in information retrieval. While keyword-based information retrieval techniques have shown impressive performance, the addition of semantic information can increase retrieval performance by allowing for more accurate sense disambiguation, intent determination, and instance identification, just to name a few. Researchers have already delved into the possibility of integrating semantic information into practical search engines using a combination of techniques such as using graph databases, hybrid indices and adapted inverted indices, among others. One of the challenges with the efficient design of a search engine capable of considering semantic information is that it would need to be able to index information beyond the traditional information stored in inverted indices, including entity mentions and type relationships. The objective of our work in this paper is to investigate various ways in which different data structure types can be adopted to integrate three types of information including keywords, entities and types. We will systematically compare the performance of the different data structures for scenarios where (i) the same data structure types are adopted for the three types of information, and (ii) different data structure types are integrated for storing and retrieving the three different information types. We report our findings in terms of the performance of various query processing tasks such as Boolean and ranked intersection for the different indices and discuss which index type would be appropriate under different conditions for semantic search. 相似文献
12.
13.
Andrew E. Jackson Robert R. Safford William W. Swart 《Computers & Industrial Engineering》1993,25(1-4):303-306
America's Space Exploration program is characterized by cutting edge technology in support of NASA's goals and research objectives. One principal component of the Space Shuttle System is the Orbiter Vehicle which undergoes necessary repairs and refurbishment at the Kennedy Space Center following each mission. The Orbiter's Thermal Protection System (TPS) is a mission critical component which protects the orbiter from the heat of re-entry from space. The University of Central Florida (UCF) research team was tasked with evaluating TPS processing requirements to reduce overall cycle times between operational mission. A job/task performance analysis was conducted to identify candidate TPS processes for evaluation and possible improvements. The research team developed a computer generated Improvement Potential Index (IPI) following a systematic data collection process. The IPI was then used to identify the top twenty processes which TPS engineers, quality assurance personnel and technicians identified as tasks which would most likely benefit from process improvement. The specified tasks were then analyzed for methods optimization. This paper documents the results of this effort to date, with emphasis on the Improvement Potential Index. 相似文献
14.
引言随着计算机科技的发展,无纸办公日益成为各单位日常办公的主要形式。而随着USB存储设备日益广泛的使用,数据泄漏的危害也越来越严重。因此在单位内部对USB存储设备的操作权限进行控制是很有必要的。本设计可将不同的USB存储设备(包括安全存储设 相似文献
15.
单片机实现对CF卡的读写 总被引:5,自引:0,他引:5
CF 卡是一种包含了控制器和大容量 Flash 存储器的标准器件,具有容量大、体积小、高性能、携带方便等优点,已广泛应用在数据采集系统和许多消费类电子产品中。本文详细介绍 C F 卡在单片机系统中的硬件接口电路,以及单片机对 CF 卡进行标准文件读写的实现,且写入的文件能被 Windows 操作系统读写。 相似文献
16.
Kruliš Martin Osipyan Hasmik Marchand-Maillet Stéphane 《Multimedia Tools and Applications》2017,76(9):11859-11887
Multimedia Tools and Applications - Permutation-based indexing is one of the most popular techniques for the approximate nearest-neighbor search problem in high-dimensional spaces. Due to the... 相似文献
17.
Authenticated indexing for outsourced spatial databases 总被引:1,自引:0,他引:1
Yin Yang Stavros Papadopoulos Dimitris Papadias George Kollios 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(3):631-648
In spatial database outsourcing, a data owner delegates its data management tasks to a location-based service (LBS), which
indexes the data with an authenticated data structure (ADS). The LBS receives queries (ranges, nearest neighbors) originating
from several clients/subscribers. Each query initiates the computation of a verification object (VO) based on the ADS. The VO is returned to the client that can verify the result correctness using the public key of the owner. Our first contribution
is the MR-tree, a space-efficient ADS that supports fast query processing and verification. Our second contribution is the
MR*-tree, a modified version of the MR-tree, which significantly reduces the VO size through a novel embedding technique. Finally, whereas most ADSs must be constructed and maintained by the owner, we
outsource the MR- and MR*-tree construction and maintenance to the LBS, thus relieving the owner from this computationally
intensive task. 相似文献
18.
Data deduplication (dedupe for short) is a special data compression technique. It has been widely adopted to save backup time as well as storage space, particularly in backup storage systems. Therefore, most dedupe research has primarily focused on improving dedupe write performance. However, backup storage dedupe read performance is also a crucial problem for storage recovery. This paper designs a new dedupe storage read cache for backup applications that improves read performance by exploiting a special characteristic: the read sequence is the same as the write sequence. Consequently, for better cache utilization, by looking ahead for future references within a moving window, it evicts victims from the cache having the smallest future access. Moreover, to further improve read cache performance, it maintains a small log buffer to judiciously cache future access data chunks. Extensive experiments with real-world backup workloads demonstrate that the proposed read cache scheme improves read performance by up to 64.3% 相似文献
19.
文章介绍.一种基于单片机的感应卡读写器的软、硬件设计,系统地介绍了硬件电路的原理,给出了读写器电路原理图以及软件程序框图,该读写器在进行读写操作时只需将卡片放在其附近即可进行数据交换,无需任何接触,实现了读写器与感应卡之间的无线数据通讯。 相似文献
20.
Multidimensional indexing for recognizing visual shapes 总被引:3,自引:0,他引:3
Califano A. Mohan R. 《IEEE transactions on pattern analysis and machine intelligence》1994,16(4):373-392