
1.

Exploiting the performance gains of modern disk drives by enhancing data locality (Total citations: 2; self-citations: 0; citations by others: 2)

Yuhui Deng 《Information Sciences》2009,179(14):2494-2511

Due to the widening performance gap between RAM and disk drives, a large number of I/O optimization methods have been proposed to alleviate the impact of this gap. One of the most effective approaches to improving disk access performance is enhancing data locality, because it can increase the hit ratio of the disk cache and reduce seek time and rotational latency. Disk drives have developed dramatically since the first disk drive was announced in 1956. This paper investigates some important characteristics of modern disk drives. Based on these characteristics and the observation that data access on disk drives is highly skewed, the frequently accessed data blocks and the correlated data blocks are clustered into objects and moved to the outer zones of a modern disk drive. The idea is to enhance spatial locality, improve the efficiency of aggressive sequential prefetching, and take advantage of Zoned Bit Recording (ZBR). A simulation is employed to investigate the performance gains generated by the enhanced data locality. The performance gains are analyzed by breaking down the disk access time into seek time, rotational latency, data transfer time, and hit ratio of the disk cache. Experimental results provide useful insights into the performance behaviour of a modern disk drive with enhanced data locality.
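
The breakdown of disk access time named in this abstract can be illustrated with a back-of-the-envelope model. The Python sketch below is not taken from the paper: the function name, its parameters, the assumed cost of a cache hit, and the assumption that average rotational latency equals half a revolution are all illustrative choices.

```python
def expected_access_time_ms(seek_ms, rpm, transfer_mb_s, request_kb,
                            cache_hit_ratio, cache_hit_ms=0.1):
    """Expected service time per request, in milliseconds.

    Assumptions (hypothetical, for illustration only):
    - a cache hit costs a fixed cache_hit_ms;
    - a cache miss pays seek + rotational latency + transfer time;
    - average rotational latency is half a revolution.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    transfer_ms = request_kb / 1024.0 / transfer_mb_s * 1000.0
    miss_ms = seek_ms + rotational_latency_ms + transfer_ms
    return cache_hit_ratio * cache_hit_ms + (1.0 - cache_hit_ratio) * miss_ms


# Hypothetical "before" and "after" settings: better locality is modelled as a
# shorter seek, a faster outer-zone transfer rate, and a higher cache hit ratio.
baseline = expected_access_time_ms(8.0, 7200, 64.0, 64, cache_hit_ratio=0.3)
clustered = expected_access_time_ms(4.0, 7200, 90.0, 64, cache_hit_ratio=0.6)
print(f"baseline: {baseline:.2f} ms, clustered: {clustered:.2f} ms")
```

The parameter values are made up; the point is only that each of the four components named in the abstract enters the expected time separately, so a simulation can attribute a gain to one of them.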

2.

Improving multimedia systems performance using constant-density recording disks

Philip Kwok Chung Tse, Clement H.C. Leung 《Multimedia Systems》2000,8(1):47-56

Multimedia systems store and retrieve large amounts of data that require extremely high disk bandwidth, and their performance depends critically on the efficiency of disk storage. However, existing magnetic disks are designed for the small data retrievals typical of traditional operations, with speed improvements focused mainly on reducing seek time and rotational latency. When the same mechanism is applied to multimedia systems, overheads in disk I/O can result in dramatic deterioration in system performance. In this paper, we present a mathematical model to evaluate the performance of constant-density recording disks, and use this model to analyze quantitatively the performance of multimedia data request streams. We show that high disk throughput may be achieved by suitably adjusting the relevant parameters. In addition to demonstrating quantitatively that constant-density recording disks perform significantly better than traditional disks for multimedia data storage, we present a novel disk-partitioning scheme that places data according to their bandwidths.

3.

A hybrid filesystem for hard disk drives in tandem with flash memory

Nils Fisher, Zhen He, Mitzi McCarthy 《Computing》2012,94(1):21-68

The traditional hard disk drive (HDD) is often a bottleneck in the overall performance of modern computer systems. With the development of solid state drives (SSD) based on flash memory, new possibilities are available to improve secondary storage performance. In this work, we propose a new hybrid SSD-HDD storage system and a selection of algorithms designed to assign pages across an HDD and an SSD to optimise I/O performance. The hybrid system combines the SSD's fast random access with the sequential access speed and large storage capacity of the HDD to produce significantly improved performance in a variety of situations. We further improve performance by allowing concurrent access across the two types of storage devices. We show that the drive assignment problem is NP-complete and accordingly propose effective heuristic solutions. Extensive experiments using both synthetic and real data sets show that our system with a small SSD can outperform a striped dual HDD and remain competitive with a dual SSD.

4.

A Multiresolution Terrain Model for Efficient Visualization Query Processing

Kai Xu, Xiaofang Zhou, Xuemin Lin, Heng Tao Shen, Ke Deng 《IEEE Transactions on Knowledge and Data Engineering》2006,18(10):1382-1396

Multiresolution Triangular Mesh (MTM) models are widely used to improve the performance of large terrain visualization by replacing the original model with a simplified one. MTM models, which consist of both original and simplified data, are commonly stored in spatial database systems due to their size. The relatively slow access speed of disks makes data retrieval the bottleneck of such terrain visualization systems. Existing spatial access methods proposed to address this problem rely on main-memory MTM models, which leads to significant overhead during query processing. In this paper, we approach the problem from a new perspective and propose a novel MTM called direct mesh that is designed specifically for secondary storage. It supports available indexing methods natively and requires no modification to the MTM structure. Experimental results based on two real-world data sets show an average performance improvement of 5 to 10 times over the existing methods.

5.

Databases deepen the Web (Total citations: 2; self-citations: 0; citations by others: 2)

Ghanem T.M., Aref W.G. 《Computer》2004,37(1):116-117

The Web has become the preferred medium for many database applications, such as e-commerce and digital libraries. These applications store information in huge databases that users access, query, and update through the Web. Database-driven Web sites have their own interfaces and access forms for creating HTML pages on the fly. Web database technologies define the way that these forms can connect to and retrieve data from database servers. The number of database-driven Web sites is increasing exponentially, and each site creates pages dynamically; such pages are hard for traditional search engines to reach. Such search engines crawl and index static HTML pages; they do not send queries to Web databases. The information hidden inside Web databases is called the "deep Web" in contrast to the "surface Web" that traditional search engines access easily. We expect deep Web search engines and technologies to improve rapidly and to dramatically affect how the Web is used by providing easy access to many more information resources.

6.

Performance bottleneck of subsequence matching in time-series databases: Observation, solution, and performance evaluation

Sang-Wook Kim, Byeong-Soo Jeong 《Information Sciences》2007,177(22):4841-4858

Subsequence matching is an operation that finds, in time-series databases, subsequences whose changing patterns are similar to a given query sequence. This paper identifies a performance bottleneck in subsequence matching and then proposes an effective method that substantially improves the performance of entire subsequence matching by resolving that bottleneck. First, through preliminary experiments, we analyze the disk access and CPU processing times required during the index searching and post-processing steps of subsequence matching. Based on these results, we show that the post-processing step is the main performance bottleneck in subsequence matching. We then argue that optimizing the post-processing step is a crucial issue overlooked in previous approaches. To resolve the bottleneck, we propose a simple yet highly effective method for expediting the post-processing step. By rearranging the order in which candidate subsequences are compared with a query sequence, our method completely eliminates the redundant disk accesses and CPU processing that occur in the post-processing step. Our method is efficient and does not incur any false dismissal. We quantitatively demonstrate the superiority of our method through extensive experimentation. The results show that our method produces a significantly faster post-processing step: with a data set of real-world stock sequences it was 43.36-96.75 times faster than previous methods, and with data sets of large numbers of synthetic sequences it was 12.48-26.95 times faster. The results also show that our method reduces the share of the post-processing step in entire subsequence matching from more than 97% to less than 67%, which implies that it successfully resolves the performance bottleneck. As a result, our method provides excellent performance in entire subsequence matching: compared with previous methods, it is 16.17-32.64 times faster with a data set of real-world stock sequences and 8.64-14.29 times faster with data sets of large numbers of synthetic sequences.
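
The abstract states that reordering candidates removes redundant disk accesses but does not say which order is used. As a minimal sketch of why reordering can help, the Python below assumes each candidate is a hypothetical `(page_id, offset)` pair and that only one data page is buffered at a time; both the representation and the sort-by-page order are assumptions for illustration, not the authors' method.

```python
def page_reads(candidates):
    """Count page fetches when candidates are processed in the given order,
    assuming a single-page buffer (a page is re-read whenever it changes)."""
    reads = 0
    current_page = None
    for page_id, _offset in candidates:
        if page_id != current_page:
            reads += 1
            current_page = page_id
    return reads


def reorder_by_page(candidates):
    """Group candidates that live on the same page so each page is read once."""
    return sorted(candidates)


# Candidates as they might come out of an index search (hypothetical data).
candidates = [(3, 0), (1, 5), (3, 8), (1, 2), (2, 7)]
print(page_reads(candidates), page_reads(reorder_by_page(candidates)))
```

With a larger buffer the unsorted order would do better than this sketch suggests, so the size of the gain depends on the buffering assumption.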

7.

Full-temporal index for moving objects based on a distributed in-memory database

周翔宇, 程春玲, 杨雁莹 《计算机科学》 (Computer Science) 2016,43(7):203-207, 216

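Existing moving-object indexes are optimized only for the two-level memory/disk hierarchy and ignore the cache sensitivity of index nodes in memory. To address this, a full-temporal index structure based on a distributed in-memory database, the DFTB^x-tree, is proposed. The index is optimized for the three-level hierarchy of CPU cache, main memory, and disk; the size of in-memory index nodes is chosen according to several factors, including the cache line, the instruction count, and the number of TLB misses. The size of nodes in the historical-data migration chain is likewise chosen according to the disk page size, so that the cache and memory can read an index node or a migration-chain node in a single access, avoiding the latency of multiple reads. In addition, a historical-data migration chain is built to persist historical data, thereby supporting full-temporal indexing of moving objects. Experimental results show that the DFTB^x-tree achieves higher query and update efficiency than the B^x-tree, the B^dual-tree, the TPR*-tree, and the STRIPES algorithm.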

8.

Optimizing sort order query execution in balanced and nested gridfiles

Mueck T.A., Schauer M.J. 《IEEE Transactions on Knowledge and Data Engineering》1995,7(2):246-260

Disk input/output (I/O) efficient query execution is an important topic with respect to DBMS performance. In this context, we elaborate on the construction of disk access plans for sort order queries in balanced and nested grid files. The key idea is to use the order information contained in the directory of the multiattribute search structure. The presented algorithms are shown to yield a significant decrease in the number of disk I/O operations by appropriate use of the order information. Two algorithms for the construction of appropriate disk access plans are proposed, namely a greedy approach and a heuristic divide-and-conquer approach. Both approaches yield considerable I/O savings compared to straightforward query processing without consideration of any directory order information. The former performs well for small buffer page allocations, i.e., for a small number of buffer pages relative to the number of data buckets processed in the query. The latter is superior to the greedy algorithm with respect to the total number of I/O operations and with respect to the overall maximum of buffer pages needed to achieve the minimal number of disk I/O operations. Both approaches rely on a binary trie as a temporary data structure. This trie is used as an explicit representation of the order information. The storage consumption of the temporary data structure is shown to be negligible in realistic cases. Even for pathological cases with respect to degenerated balanced and nested grid files, reasonable upper bounds can be given.

9.

Secure query processing against encrypted XML data using Query-Aware Decryption

Jae-Gil Lee, Kyu-Young Whang 《Information Sciences》2006,176(13):1928-1947

Dissemination of XML data on the internet could breach the privacy of data providers unless access to the disseminated XML data is carefully controlled. Recently, methods using encryption have been proposed for such access control. However, these methods do not address query processing performance. A query processor cannot identify the contents of encrypted XML data unless the data are decrypted. This limitation incurs the overhead of decrypting parts of the XML data that do not contribute to the query result. In this paper, we propose the notion of Query-Aware Decryption for efficient processing of queries against encrypted XML data. Query-Aware Decryption allows us to decrypt only those parts that contribute to the query result. For this purpose, we disseminate an encrypted XML index along with the encrypted XML data. This index, when decrypted, tells us where the query results are located in the encrypted XML data, thus preventing unnecessary decryption of other parts of the data. Since this index is much smaller than the encrypted XML data, the cost of decrypting it is negligible compared with that of unnecessarily decrypting the data itself. The experimental results show that our method improves the performance of query processing by up to six times compared with existing methods. Finally, we formally prove that dissemination of the encrypted XML index does not compromise security.

10.

The Design and Evaluation of a Magnetic/Optical Access Structure for Temporal Databases

Vram Kouramajian, Ramez Elmasri 《Journal of Systems Integration》1997,7(1):47-75

This paper describes a magnetic/optical access structure for append-only temporal databases. We formally define the properties of an access structure, called the Monotonic B⁺-Tree (MBT), that is well suited for Write-Once Read-Many optical disks. We present an insertion algorithm for the MBT that does not require splitting of index nodes and give the time analysis for this algorithm. We also describe a storage architecture where optical disks work in tandem with magnetic disks. Magnetic disks are used for storing current versions and recent past versions, whereas optical disks are dedicated to archiving older past versions. Our archiving techniques: (1) allow temporal data and the MBT access structure to span magnetic disks and optical disks; (2) minimize the overhead of the migration process by taking advantage of the append-only nature of temporal databases; (3) gracefully handle object versions with very long time intervals so that the delay in the migration process is kept to a minimum; and (4) ensure that no false magnetic or optical disk address lookup is performed during search operations by duplicating some closed versions on both magnetic and optical disks. To validate our claims for the efficiency of the migration techniques, we analyze the performance of temporal access structures partitioned between magnetic and optical disks. We show that the migration process has a minimal effect on the search time. Our simulation identifies important parameters and shows how they affect the performance of the temporal access structures. These include the mean version lifespan, block size, query time interval length, and total number of versions.

11.

A comprehensive analytical performance model for disk devices under random workloads

Triantafillou P., Christodoulakis S., Georgiadis C.A. 《IEEE Transactions on Knowledge and Data Engineering》2002,14(1):140-155

Our goal is to contribute a common theoretical framework for studying the performance of disk-storage devices. Understanding the performance behavior of these devices will allow prediction of the I/O cost in modern applications. Current disk technologies differ in terms of the fundamental modeling characteristics, which include the magnetic/optical nature, angular and linear velocities, storage capacities, and transfer rates. Angular and linear velocities, storage capacities, and transfer rates are made constant or variable in different existing disk products. Related work in this area has studied Constant Angular Velocity (CAV) magnetic disks and Constant Linear Velocity (CLV) optical disks. We present a comprehensive analytical model, validated through simulations, for the random retrieval performance of disk devices which takes into account all the above-mentioned fundamental characteristics and includes, as special cases, all the known disk-storage devices. Such an analytical model can be used, for example, in the query optimizer of large traditional databases as well as in an admission controller of multimedia storage servers. Besides the known models for magnetic CAV and optical CLV disks, our unifying model is also reducible to a model for a more recent disk technology, called zoned disks, the retrieval performance of which has not been modeled in detail before. The model can also be used to study the retrieval performance of possible future technologies which combine a number of the above characteristics and in environments containing different types of disks (e.g., magnetic-disk-based secondary storage and optical-disk-based tertiary storage). Using our model, we contribute an analysis of the performance behavior of zoned disks and we compare it against that for the traditional CAV disks, as well as against that of some possible future technologies. This allows us to gain insights into the fundamental performance trade-offs.

12.

A study of real-time scheduling algorithms for non-fixed dual-head mirrored disks

秦啸, 庞丽萍, 韩宗芬, 李胜利 《软件学报》 (Journal of Software) 1999,10(9):996-1002

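This paper presents a formal model of a real-time mirrored disk system with non-fixed dual heads. Each dual-head disk in the model has two mutually independent arms, each of which can seek independently. Three real-time scheduling algorithms are studied for this disk system. Simulation experiments show that the "ignore-missed-deadline" scheduling algorithm performs best, because it skips real-time requests that have already missed their deadlines. The paper also analyzes the performance difference between fixed and non-fixed dual-head mirrored disks. The experimental results show that non-fixed dual-head mirrored disks outperform fixed dual-head mirrored disks, because the two heads of a non-fixed dual-head disk can seek independently.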

13.

Clustering spatial networks for aggregate query processing: A hypergraph approach

Engin Demir, Cevdet Aykanat, B. Barla Cambazoglu 《Information Systems》2008

In spatial networks, clustering adjacent data to disk pages is highly likely to reduce the number of disk page accesses made by the aggregate network operations during query processing. For this purpose, different techniques based on the clustering graph model are proposed in the literature. In this work, we show that the state-of-the-art clustering graph model is not able to correctly capture the disk access costs of aggregate network operations. Moreover, we propose a novel clustering hypergraph model that correctly captures the disk access costs of these operations. The proposed model aims to minimize the total number of disk page accesses in aggregate network operations. Based on this model, we further propose two adaptive recursive bipartitioning schemes to reduce the number of allocated disk pages while trying to minimize the number of disk page accesses. We evaluate our clustering hypergraph model and recursive bipartitioning schemes on a wide range of road network datasets. The results of the conducted experiments show that the proposed model is quite effective in reducing the number of disk accesses incurred by the network operations.

14.

A performance-optimized storage and access strategy for small files (Total citations: 1; self-citations: 0; citations by others: 1)

赵跃龙, 谢晓玲, 蔡咏才, 王国华, 刘霖 《计算机研究与发展》 (Journal of Computer Research and Development) 2012,49(7):1579-1586

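In distributed file systems, small-file management generally suffers from poor access performance and considerable wasted storage space. To address these problems, a performance-optimized small-file storage and access (SFSA) strategy is proposed. SFSA stores logically contiguous data in contiguous physical disk space wherever possible, uses a cache to play the role of the metadata server, and raises cache utilization through simplified file information nodes, thereby improving small-file access performance. When writing, it aggregates the updated data and the related data in its folder domain into a single I/O request, which reduces the number of file fragments and improves storage space utilization. When transferring files, it exploits the principle of locality to send batches of frequently accessed small files in advance, which lowers the overhead of establishing network connections and improves file transfer performance. Theoretical analysis and experiments show that the design and methods of SFSA effectively optimize the storage and access performance of small files.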

15.

An empirical study of a CDC 844-41 disk subsystem

J William Atwood, Alexander MacLeod, Keh-Chiang Yu 《Performance Evaluation》1982,2(1):29-56

Three methods (seek overlap, seek arm scheduling, static file repositioning) for improvement of disk subsystem performance are reviewed. Detailed measured data are reported for seek time, probability of zero-length seek, latency, and transfer time, for a 12-spindle CDC 844-41 disk subsystem shared between two CDC Cyber 170/750 central processors. The probability of zero-length seeks is shown to be high, and the spindle queue lengths are observed to be low. The transfer time data are very different from the data published by others for IBM systems. A detailed simulation model of the measured system is validated against the measurements. This model is then used to demonstrate that seek arm scheduling is unlikely to produce much improvement, that static file repositioning can improve performance by about 20%, and that seek overlap can almost double the interactive capacity of the system measured.

16.

A low-overhead, high-performance second-level cache replacement algorithm for network storage systems

赵英杰, 肖侬 《计算机工程与科学》 (Computer Engineering & Science) 2012,34(5):84-88

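To address the degradation of second-level cache performance caused by the access characteristics of network storage, a high-performance second-level cache replacement algorithm is proposed. The algorithm uses a sequential-page detection mechanism and makes replacement decisions according to how sequential the cached pages are, in order to reduce the number of random disk accesses caused by cache misses and avoid unnecessary head seeks and rotational overhead, thereby improving the performance of the whole storage system. Experimental results show that, across a range of cache sizes, the algorithm significantly reduces the effective response time and improves the performance of the network storage system.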

17.

Design and Evaluation of a Smart Disk Cluster for DSS Commercial Workloads

《Journal of Parallel and Distributed Computing》2001,61(11):1633-1664

The requirements for storage space and computational power of large-scale applications are increasing rapidly. Clusters seem to be the most attractive architecture for such applications, due to their low costs and high scalability. On the other hand, smart disk systems, with their large storage capacities and growing computational power, are becoming increasingly popular. In this work, we compare the performance of these architectures with a single host-based system using representative queries from Decision Support System (DSS) databases. We show how to implement individual database operations in the smart disk system and also show how to optimize the execution of the whole query by bundling frequently occurring operations together and executing the bundle in a single invocation. Besides decreasing the overall execution time, operation bundling also offers an easy-to-program and easy-to-use interface to access the data on smart disks. We also present a protocol for minimizing the communication time in the smart-disk-based system. To measure the response times, we have developed DBsim, an accurate simulator which can simulate the database operations for the single host-based, cluster-based, and smart-disk-based systems. Using this simulator, we illustrate that the smart disk architecture offers substantial benefits in terms of overall query execution times of the TPC-D benchmark suite. In particular, the average response time of the smart disk architecture for the representative queries from the TPC-D benchmark in our base configuration is 71% smaller than the response time on the single host-based system and 4.2% smaller than the response time on the fastest cluster architecture. We also demonstrate the effectiveness of operation bundling and compare the scalabilities of the cluster-based and smart-disk-based systems.

18.

Local heat transfer characteristics in a single rotating disk and co-rotating disks

H. H. Cho, C. H. Won, G. Y. Ryu, D. H. Rhee 《Microsystem Technologies》2003,9(6-7):399-408

The present study investigates convective heat/mass transfer and flow characteristics inside rotating disks. The rotating disks are modelled on the commonly used 3.5-inch hard disk drives (HDD). The experiments are conducted for hub heights of 5, 10 and 15 mm in a single rotating disk and 4, 6 and 8 mm in co-rotating disks, and for rotating Reynolds numbers of 5.53 × 10⁴, 8.53 × 10⁴ and 1.13 × 10⁵. To accommodate the general operating conditions of HDD, the experiments are also conducted with an obstruction of rectangular cross-section in the space, which simulates a read-write head arm. A naphthalene sublimation technique is employed to determine the detailed local heat transfer coefficients on the rotating disks using the heat and mass transfer analogy. Flow field measurements are conducted using laser Doppler anemometry (LDA), and numerical calculations are performed simultaneously to analyze the flow patterns induced by disk rotation. The results for a single rotating disk show that the heat transfer on the rotating disk is enhanced considerably as the hub height is reduced and the rotating Reynolds number is increased. The head arm inserted in the cavity between the rotating disk and the cover enhances uniformity of the heat/mass transfer on the disk due to the deficit of the momentum in the average flow despite the enhancement of the tangential component of fluctuation velocity. The heat/mass transfer rates on the co-rotating disks have very low values near the hub in the inner region of the solid-body rotation and increase rapidly toward the outer region. The change of heat/mass transfer for various hub heights is negligible. The authors wish to acknowledge support for this study by the Ministry of Science and Technology through their National Research Laboratory program and by the KOSEF through the Center of Information Storage Device.

19.

Load balanced and optimal disk allocation strategy for partial match queries on multidimensional files

Das S.K., Pinotti C.M. 《IEEE Transactions on Parallel and Distributed Systems》2002,13(12):1211-1219

A multidimensional file is one whose data are characterized by several attributes, each specified in a given domain. A partial match query on a multidimensional file extracts all data whose attributes match the values of one or more attributes specified in the query. The disk allocation problem of a multidimensional file F on a database system with multiple disks accessible in parallel is the problem of distributing F among the disks such that the data qualifying for each partial match query are distributed as evenly as possible among the disks of the system. We propose an optimal solution to this problem for multidimensional files with pairwise prime domains based on a large and flexible class of maximum distance separable codes, namely, the redundant residue codes. We also introduce a new family of residue codes, called the redundant nonpairwise prime residue codes, to deal with files whose attribute domains are nonpairwise prime.
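
To make the disk allocation problem in this abstract concrete, the Python sketch below counts how many qualifying buckets of a partial match query land on each disk. It deliberately uses the simple "disk modulo" rule (sum of attribute values modulo the number of disks) as a stand-in; it is not the residue-code scheme proposed in the paper, and the function names and the tiny 3 × 4 example are assumptions for illustration.

```python
from itertools import product
from math import ceil


def disk_modulo(bucket, num_disks):
    """Stand-in allocation rule: sum of attribute values modulo the disk count."""
    return sum(bucket) % num_disks


def query_load(domains, query, num_disks):
    """Buckets per disk for a partial match query.

    domains: size of each attribute domain, e.g. (3, 4)
    query:   one entry per attribute, a fixed value or None for "unspecified"
    """
    ranges = [range(d) if q is None else [q] for d, q in zip(domains, query)]
    load = [0] * num_disks
    for bucket in product(*ranges):
        load[disk_modulo(bucket, num_disks)] += 1
    return load


# Two attributes with domain sizes 3 and 4, spread over 3 disks.
load = query_load((3, 4), (0, None), num_disks=3)
print(load, "optimal" if max(load) == ceil(sum(load) / 3) else "not optimal")
```

An allocation is optimal for a query when the busiest disk holds no more than the ceiling of (qualifying buckets / disks); the check in the last line tests exactly that for one query.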

20.

Storage scheduling strategy for multi-channel video surveillance systems

周兵, 方俊 《计算机工程与应用》 (Computer Engineering and Applications) 2004,40(18):85-87

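With a traditional "round-robin" video surveillance system, multi-channel storage suffers from problems such as low storage efficiency and storage "jitter" caused by changing the storage path when a hard disk runs out of space. This paper describes a multi-disk storage scheduling algorithm aimed at improving multi-channel storage efficiency and a storage pre-allocation algorithm that addresses the jitter problem, together with their design and implementation. Practical use shows that the multi-disk storage scheduling algorithm makes good use of the large capacity of multiple hard disks: based on the capacity and access count of each disk, it distributes the multi-channel storage "evenly" across the disks and improves storage access efficiency, while resource reservation and the pre-allocation algorithm resolve the storage jitter problem.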