
1.

陈昊罡 汪小林 王振林 靳辛欣 温翔 罗英伟 李晓明 《计算机科学与探索》2010,4(12):1073-1088

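In virtual machine (VM) systems, as the number of VMs and the demands of applications keep growing, memory capacity has become the main bottleneck for application performance. To improve the page-swapping performance of memory-intensive and I/O-intensive programs, the paper proposes REMOCA, a remote disk cache mechanism for virtual machines that lets a VM running on one physical host use the memory of other physical hosts as its second-level disk cache. Because network transfer latency is far lower than disk access latency, replacing disk accesses with network transfers effectively reduces the VM's average disk access latency. REMOCA therefore aims to eliminate as many disk accesses as possible. It runs inside the virtual machine monitor and works by intercepting and handling VM events such as page evictions and disk accesses. REMOCA can be combined with existing VM memory management mechanisms, such as ballooning and shadow caches, to provide more flexible memory resource management policies. Experimental results show that REMOCA effectively reduces the impact of page thrashing on VM performance and substantially improves the performance of I/O-intensive applications in the VM.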
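The second-level cache idea in this abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not REMOCA's implementation: the class `TwoLevelDiskCache`, the helper `replay`, the LRU policy at both levels, and the exclusive (move-on-hit) behavior are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class TwoLevelDiskCache:
    """Toy model: a local LRU page cache, a remote-memory cache, then disk."""

    def __init__(self, local_pages, remote_pages):
        self.local = OrderedDict()    # pages cached in the VM's own memory (LRU order)
        self.remote = OrderedDict()   # evicted pages parked in another host's memory
        self.local_pages = local_pages
        self.remote_pages = remote_pages
        self.stats = {"local": 0, "remote": 0, "disk": 0}

    def _insert_local(self, page):
        self.local[page] = True
        if len(self.local) > self.local_pages:
            victim, _ = self.local.popitem(last=False)   # least recently used page
            self.remote[victim] = True                   # intercepted eviction
            if len(self.remote) > self.remote_pages:
                self.remote.popitem(last=False)          # remote cache is full too

    def read(self, page):
        if page in self.local:
            self.local.move_to_end(page)
            self.stats["local"] += 1
            return
        if page in self.remote:
            del self.remote[page]                        # page moves back to local
            self.stats["remote"] += 1
        else:
            self.stats["disk"] += 1
        self._insert_local(page)


def replay(trace, local_pages, remote_pages):
    """Run a page-read trace through the toy cache and return the hit counters."""
    cache = TwoLevelDiskCache(local_pages, remote_pages)
    for page in trace:
        cache.read(page)
    return cache.stats
```

For the cyclic trace `[1, 2, 3, 1, 2, 3]` with two local pages, adding two remote pages turns three of the six disk reads into remote-memory reads, which is the effect the abstract relies on when network latency is lower than disk latency.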

2.

Clustering spatial networks for aggregate query processing: A hypergraph approach

Engin Demir Cevdet Aykanat B. Barla Cambazoglu 《Information Systems》2008

In spatial networks, clustering adjacent data to disk pages is highly likely to reduce the number of disk page accesses made by the aggregate network operations during query processing. For this purpose, different techniques based on the clustering graph model are proposed in the literature. In this work, we show that the state-of-the-art clustering graph model is not able to correctly capture the disk access costs of aggregate network operations. Moreover, we propose a novel clustering hypergraph model that correctly captures the disk access costs of these operations. The proposed model aims to minimize the total number of disk page accesses in aggregate network operations. Based on this model, we further propose two adaptive recursive bipartitioning schemes to reduce the number of allocated disk pages while trying to minimize the number of disk page accesses. We evaluate our clustering hypergraph model and recursive bipartitioning schemes on a wide range of road network datasets. The results of the conducted experiments show that the proposed model is quite effective in reducing the number of disk accesses incurred by the network operations.

3.

Efficient video file allocation schemes for video-on-demand services

Yuewei Wang Jonathan C.L. Liu David H.C. Du Jenwei Hsieh 《Multimedia Systems》1997,5(5):283-296

A video-on-demand (VOD) server needs to store hundreds of movie titles and to support thousands of concurrent accesses. This, technically and economically, imposes a great challenge on the design of the disk storage subsystem of a VOD server. Due to different demands for different movie titles, the numbers of concurrent accesses to each movie can differ widely. We define access profile as the number of concurrent accesses to each movie title that should be supported by a VOD server. The access profile is derived based on the popularity of each movie title and thus serves as a major design goal for the disk storage subsystem. Since some popular (hot) movie titles may be concurrently accessed by hundreds of users and a current high-end magnetic disk array (disk) can only support tens of concurrent accesses, it is necessary to replicate and/or stripe the hot movie files over multiple disk arrays. The consequence of replication and striping of hot movie titles is the potential increase in the required number of disk arrays. Therefore, how to replicate, stripe, and place the movie files over a minimum number of magnetic disk arrays such that a given access profile can be supported is an important problem. In this paper, we formulate the problem of the video file allocation over disk arrays, demonstrate that it is NP-hard, and present some heuristic algorithms to find near-optimal solutions. The result of this study can be applied to the design of the storage subsystem of a VOD server to economically minimize the cost or to maximize the utilization of disk arrays.
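The arithmetic behind "hundreds of users versus tens of accesses per array" can be made concrete with a small sketch. It is not one of the paper's heuristics: the functions `arrays_spanned` and `array_lower_bound`, the example titles, and the per-array limit of 50 streams are hypothetical, and storage capacity is ignored so that only bandwidth constrains the result.

```python
import math


def arrays_spanned(concurrent_accesses, streams_per_array):
    """Fewest disk arrays one title must be replicated or striped across."""
    return math.ceil(concurrent_accesses / streams_per_array)


def array_lower_bound(access_profile, streams_per_array):
    """Bandwidth-only lower bound on the number of arrays for a whole profile."""
    total_streams = sum(access_profile.values())
    return math.ceil(total_streams / streams_per_array)


# Hypothetical access profile: title -> concurrent accesses to support.
profile = {"hot": 300, "warm": 40, "cold": 5}
per_title = {title: arrays_spanned(n, 50) for title, n in profile.items()}
```

With these assumed numbers the hot title must span at least six arrays, while the whole profile needs at least seven. Finding a placement that meets such a bound is the allocation problem the abstract describes as NP-hard.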

4.

Optimal secondary storage access sequence for performing relational join

Fotouhi F. Pramanik S. 《Knowledge and Data Engineering, IEEE Transactions on》1989,1(3):318-328

Two graph models are developed to determine the minimum required buffer size for achieving the theoretical lower bound on the number of disk accesses for performing relational joins. Here, the lower bound implies only one disk access per joining block or page. The first graph model is based on the block connectivity of the joining relations. Using this model, the problem of determining an ordered list of joining blocks that requires the smallest buffer is considered. It is shown that this problem as well as the problem of computing the least upper bound on the buffer size is NP-hard. The second graph model represents the page connectivity of the joining relations. It is shown that the problem of computing the least upper bound on the buffer size for the page connectivity model is also NP-hard. Heuristic procedures are presented for the page connectivity model and it is shown that the sequence obtained using the heuristics requires a near-optimal buffer size. The authors also show the performance improvement of the proposed heuristics over the hybrid-hash join algorithm for a wide range of join factors.

5.

A unified benchmarking and model-based framework for building QoS-aware streaming media services

Ludmila Cherkasova Wenting Tang Amin Vahdat 《Multimedia Systems》2006,11(6):532-549

A number of technology and workload trends motivate us to consider the appropriate resource allocation mechanisms and policies for streaming media services in shared cluster environments. We present MediaGuard – a model-based infrastructure for building streaming media services – that can efficiently determine the fraction of server resources required to support a particular client request over its expected lifetime. The proposed solution is based on a unified cost function that uses a single value to reflect overall resource requirements such as the CPU, disk, memory, and bandwidth necessary to support a particular media stream based on its bit rate and whether it is likely to be served from memory or disk. We design a novel, time-segment-based memory model of a media server to efficiently determine in linear time whether a request will incur memory or disk access when given the history of previous accesses and the behavior of the server's main memory file buffer cache. Using the MediaGuard framework, we design two media services: (1) an efficient and accurate admission control service for streaming media servers that accounts for the impact of the server's main memory file buffer cache, and (2) a shared streaming media hosting service that can efficiently allocate the predefined shares of server resources to the hosted media services, while providing performance isolation and QoS guarantees among the hosted services. Our evaluation shows that, relative to a pessimistic admission control policy that assumes that all content must be served from disk, MediaGuard (as well as services that are built using it) delivers a factor of two improvement in server throughput.

6.

Efficient scheduling of page access in index-based join processing

Chee Yong Chan Beng Chin Ooi 《Knowledge and Data Engineering, IEEE Transactions on》1997,9(6):1005-1011

The paper examines the issue of scheduling page accesses in join processing, and proposes new heuristics for the following scheduling problems: 1) an optimal page access sequence for a join such that there are no page reaccesses using the minimum number of buffer pages, and 2) an optimal page access sequence for a join such that the number of page reaccesses for a given number of buffer pages is minimum. The experimental performance results show that the new heuristics perform better than existing heuristics for the first problem and also perform better for the second problem, provided that the number of available buffer pages is not much less than the optimal buffer size.

7.

An effective approach to vertical partitioning for physical design of relational databases

Cornell D.W. Yu P.S. 《IEEE Transactions on Software Engineering》1990,16(2):248-258

Vertical partitioning can be used to enhance the performance of relational database systems by reducing the number of disk accesses. The authors identify the key parameters for capturing the behavior of an access plan and propose a two-step methodology consisting of a query analysis step to estimate the parameters and a binary partitioning step which can be applied recursively. The partitioning uses an integer linear programming technique to minimize the number of disk accesses. A significant performance benefit is achieved for joins when the partitioned (inner) relation, but not the original relation, fits into the memory buffer under the inner-outer loop join method, or into the sort buffer under the sort-merge join method. For cases where a segment scan or a cluster index scan is used, vertical partitioning of the relation with the algorithm described is still often found to lead to substantial performance improvement.

8.

Page access scheduling in join processing

Andrew Kwan Oon 《Data & Knowledge Engineering》2001,37(3):267-284

The join relational operation is one of the most expensive among database operations. In this study, we consider the problem of scheduling page accesses in join processing. This raises two interesting problems: (1) determining a page access sequence that uses the minimum number of buffer pages without any page reaccesses, and (2) determining a page access sequence that minimizes the number of page reaccesses for a given buffer size. We use a graph model to represent the pages from the relations that contain tuples to be joined, and present new heuristics for the two problems. Our experimental results show that the new heuristics perform well.
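To see why the order of page accesses matters, the sketch below replays a sequence of page pairs (edges of the page graph) through a fixed-size buffer and counts fetches and reaccesses. It is an illustration only and not one of the paper's heuristics: the function `count_page_fetches`, the page names, and the LRU replacement policy are assumptions made here.

```python
from collections import OrderedDict


def count_page_fetches(join_pairs, buffer_pages):
    """Replay (r_page, s_page) pairs in order; return (fetches, reaccesses)."""
    if buffer_pages < 2:
        raise ValueError("both pages of a pair must fit in the buffer")
    buffer = OrderedDict()   # pages currently buffered, in LRU order
    seen = set()             # pages fetched at least once
    fetches = 0
    reaccesses = 0
    for r_page, s_page in join_pairs:
        for page in (r_page, s_page):
            if page in buffer:
                buffer.move_to_end(page)      # buffer hit, no I/O
                continue
            fetches += 1
            if page in seen:
                reaccesses += 1               # page had to be read again
            seen.add(page)
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)    # evict least recently used page
    return fetches, reaccesses


# Four page pairs that must be joined, visited in two different orders.
order_a = [("R1", "S1"), ("R2", "S2"), ("R1", "S2"), ("R2", "S1")]
order_b = [("R1", "S1"), ("R1", "S2"), ("R2", "S2"), ("R2", "S1")]
```

With a two-page buffer, `order_a` needs seven fetches (three reaccesses) while `order_b` needs five (one reaccess); with four buffer pages either order reads each page exactly once. These are the two trade-offs, buffer size and access order, that the two scheduling problems formalize.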

9.

Performance of a Mass-Storage System for Video-on-Demand

《Journal of Parallel and Distributed Computing》1995,30(2):147-167

Advancements in storage technology, along with the fast deployment of high-speed networks, have made the storage, transmission, and manipulation of multimedia information such as text, graphics, still images, video, and audio feasible. Our study focused on the performance of the mass storage system for a large-scale video-on-demand server. Different video file striping schemes, such as application level striping and device driver level striping, were examined in order to study scalability and performance issues. To study the impact of different concurrent access patterns on the performance of a server, experimental results were obtained on group access on a single video file and multiple group accesses on multiple video files. All of our experiments were conducted on a fully configured Silicon Graphics Inc. Onyx computer system. The Onyx machine had 20 MIPS R4400 processors and 768 MBytes of main memory, and was connected to 31 SCSI-2 channels with 496 GBytes of disk storage. From the experimental results, the storage system of the Onyx machine can potentially provide about 360 concurrent video accesses with guaranteed quality of service.

10.

External memory page remapping for embedded multimedia systems

王力生 康珊 《计算机应用研究》2008,25(9):2697-2699

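The paper introduces a hardware-based, programmable external memory page remapping mechanism that noticeably improves performance and lowers power consumption by reducing accesses on the external memory bus. It also proposes an efficient algorithm for mapping application data and instruction memory onto external memory pages. The algorithm uses graph coloring to guide the page mapping, with the goal of avoiding page misses by remapping conflicting pages to different memory banks.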

11.

Characterization of database access pattern for analytic prediction of buffer hit probability

Asit Dan Ph.D. Philip S. Yu Ph.D. Jen-Yao Chung Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1995,4(1):127-154

The analytic prediction of buffer hit probability, based on the characterization of database accesses from real reference traces, is extremely useful for workload management and system capacity planning. The knowledge can be helpful for proper allocation of buffer space to various database relations, as well as for the management of buffer space for a mixed transaction and query environment. Access characterization can also be used to predict the buffer invalidation effect in a multi-node environment which, in turn, can influence transaction routing strategies. However, it is a challenge to characterize the database access pattern of a real workload reference trace in a simple manner that can easily be used to compute buffer hit probability. In this article, we use a characterization method that distinguishes three types of access patterns from a trace: (1) locality within a transaction, (2) random accesses by transactions, and (3) sequential accesses by long queries. We then propose a concise way to characterize the access skew across randomly accessed pages by logically grouping the large number of data pages into a small number of partitions such that the frequency of accessing each page within a partition can be treated as equal. Based on this approach, we present a recursive binary partitioning algorithm that can infer the access skew characterization from the buffer hit probabilities for a subset of the buffer sizes. We validate the buffer hit predictions for single and multiple node systems using production database traces. We further show that the proposed approach can predict the buffer hit probability of a composite workload from those of its component files.

12.

An ARC-based buffer algorithm for flash databases

梁鑫 林铭炜 姚志强 《计算机系统应用》2018,27(3):156-161

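Flash memory is a purely electronic device with advantages such as small size, fast data reads, low energy consumption, and strong shock resistance, and it is used to partially replace mechanical hard disks to improve storage system performance. However, existing buffer replacement algorithms are designed and optimized for the physical characteristics of mechanical hard disks, so buffer replacement needs to be redesigned around the physical characteristics of flash. The paper proposes CF-ARC, a new buffer replacement algorithm for flash-based databases. The algorithm introduces a new page replacement mechanism that takes access frequency into account when replacing clean or dirty pages: it preferentially evicts clean pages with low access frequency, so that hot pages stay in the buffer and the hit rate improves, which yields better performance. A comparative analysis of the experimental results shows that CF-ARC outperforms other replacement algorithms in most cases.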
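The eviction preference described in this abstract can be sketched as a victim-selection function. This is not the CF-ARC algorithm itself, which builds on ARC's list structure. The `Page` record, the `choose_victim` function, and the `hot_threshold` parameter are hypothetical and only illustrate "evict cold clean pages first, keep hot pages".

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_id: int
    dirty: bool       # dirty pages cost a flash write when evicted
    frequency: int    # number of accesses recorded for the page


def choose_victim(pages, hot_threshold=3):
    """Pick a page to evict: cold clean first, then cold dirty, then coldest."""
    cold_clean = [p for p in pages if not p.dirty and p.frequency < hot_threshold]
    cold_dirty = [p for p in pages if p.dirty and p.frequency < hot_threshold]
    for group in (cold_clean, cold_dirty, list(pages)):
        if group:
            return min(group, key=lambda p: p.frequency)
    raise ValueError("buffer is empty")
```

Under these assumptions a frequently accessed clean page is kept ahead of a rarely accessed dirty one, which is the difference from a plain clean-first policy that ignores frequency.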

13.

Performance improvement of parallel programs on a broadcast-based distributed shared memory multiprocessor by simulation

《Simulation Modelling Practice and Theory》2008,16(3):338-352

Due to advances in fiber optics and VLSI technology, interconnection networks that allow simultaneous broadcasts are becoming feasible. Distributed shared memory (DSM) implementations on such networks promise high performance even for small applications with small granularity. This paper, after summarizing the architecture of one such implementation called the Simultaneous Multiprocessor Optical Exchange Bus (SOME-Bus), presents simple algorithms for improving the performance of parallel programs running on the SOME-Bus multiprocessor implementing cache-coherent DSM. The algorithms are based on run-time data redistribution via a dynamic page migration protocol. They use memory access references together with the information of average channel utilization, average channel waiting time, number of messages in the channel queue or short-term average channel waiting time reported by each node and gathered by hardware monitors to make correct decisions related to the placement of shared data. Simulations with four parallel codes on a 64-processor SOME-Bus show that the algorithms yield significant performance improvements such as reduction in the execution times, number of remote memory accesses, average channel waiting times, average network latencies and increase in average channel utilizations.

14.

HAT: an efficient buffer management method for flash-based hybrid storage systems

Yanfei LV Bin CUI Xuexuan CHEN Jing LI 《Frontiers of Computer Science》2014,8(3):440-455

Flash solid-state drives (SSDs) provide much faster access to data compared with traditional hard disk drives (HDDs). The current price and performance of SSDs suggest they can be adopted as a data buffer between main memory and HDD, and buffer management policy in such hybrid systems has attracted more and more interest from the research community recently. In this paper, we propose a novel approach to manage the buffer in flash-based hybrid storage systems, named hotness aware hit (HAT). HAT exploits a page reference queue to record the access history as well as the status of accessed pages, i.e., hot, warm, and cold. Additionally, the page reference queue is further split into hot and warm regions which correspond to the memory and flash in general. The HAT approach updates the page status and deals with the page migration in the memory hierarchy according to the current page status and hit position in the page reference queue. Compared with the existing hybrid storage approaches, the proposed HAT can manage the memory and flash cache layers more effectively. Our empirical evaluation on benchmark traces demonstrates the superiority of the proposed strategy against the state-of-the-art competitors.

15.

Heuristics for join processing using nonclustered indexes

Omiecinski E.R. 《IEEE Transactions on Software Engineering》1989,15(1):18-25

The author examines join processing when the access paths available are nonclustered indexes on the joining attribute(s) for both relations involved in the join. He uses a bipartite graph model to represent the pages from the two relations that contain tuples to be joined. The minimization of the number of page accesses needed to compute a join in the author's database environment is explored from two perspectives. The first is to reduce the maximum buffer size so that no page is accessed more than once, and the second is to reduce the number of page accesses for a fixed buffer size. The author has developed heuristics for these problems. He gives performance comparisons of these heuristics and another method that recently appeared in the literature. Results show that one particular heuristic performs very well for addressing the problem from either perspective.

16.

Analysis and simulation of an out-of-order execution model in vector multiprocessor systems

《Parallel Computing》1997,23(13):1963-1986

Memory conflict is a major phenomenon which may cause dramatic loss of performance in vector pipeline multiprocessors. Various techniques have been proposed and implemented to avoid such conflicts. They rely mostly on well-tuned vector element allocation in memory banks (either using programming tools or hard-wired features). We tackle this problem in another way. Instead of trying to avoid memory contention, we aim to enhance the performance of the memory system by scheduling vector element accesses in order to increase memory accesses. This scheduling depends on the memory bank activity when an access is issued, leading to out-of-order access to vector elements. An out-of-order pipeline execution is associated with this out-of-order memory access in order to maintain the processor functional unit chaining. In this paper we study some factors influencing this execution model: vector length, number of processors and number of banks. An analysis of this model using the Markov chain technique and simulation results are also presented. They show the importance of this model in comparison with the classical one encountered in pipelined vector supercomputers.

17.

A memory interference model for regularly patterned multiple stream vector accesses

Qing Yang Tao Yang 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(5):520-530

Most existing analytical models for memory interference generally assume random bank selection for each memory access. In vector computers, however, memory accesses are typically regularly patterned with a number of data items being accessed concurrently from different banks. Very little is known about the queueing behavior of memory interferences in multiple stream vector accesses. This paper presents an analytical model for memory interferences due to vector accesses in multiple vector processor systems. The model captures the effects of both bank conflicts among elements within one vector access stream and conflicts among multiple vector access streams on system performance. The model is based on a closed queueing network assuming an ideal interconnection network. An approximation technique is proposed to solve the memory queueing system that serves customers in a complicated way (non-FIFO). We also carry out extensive simulation experiments to study memory interference and validate our analytical model. Simulation results and analytical results are in very good agreement, indicating that the model is very accurate. We further validate our analysis by comparing the numerical results obtained from our analytical model with those measurement results that were published by other researchers. Based on our analytical model and simulations, we carry out performance evaluation of the multiple vector processor systems. Our numerical results show that memory access conflicts pose a severe limitation on the number of useful processors in the system, implying that memory system design is essential to high-performance computing.

18.

A cost model for spatio-temporal queries using the TPR-tree

《Journal of Systems and Software》2004,73(1):101-112

A query optimizer requires cost models to calculate the costs of various access plans for a query. An effective method to estimate the number of disk (or page) accesses for spatio-temporal queries has not yet been proposed. The TPR-tree is an efficient index that supports spatio-temporal queries for moving objects. Existing cost models for spatial indexes such as the R-tree do not accurately estimate the number of disk accesses for spatio-temporal queries using the TPR-tree, because they do not consider the future locations of moving objects, which change continuously as time passes. In this paper, we propose an efficient cost model for spatio-temporal queries to solve this problem. We present analytical formulas which accurately calculate the number of disk accesses for spatio-temporal queries. Extensive experimental results show that our proposed method accurately estimates the number of disk accesses over various queries to spatio-temporal data combining real-life spatial data and synthetic temporal data. To evaluate the effectiveness of our method, we compared our spatio-temporal cost model (STCM) with an existing spatial cost model (SCM). The existing SCM yields average error ratios from 52% to 93%, whereas our STCM yields average error ratios from 11% to 32%.

19.

Linked instruction caches for enhancing power efficiency of embedded systems

Chang-Jung Ku Ching-Wen Chen An Hsia Chun-Lin Chen 《Microprocessors and Microsystems》2014

The power consumed by memory systems accounts for 45% of the total power consumed by an embedded system, and the power consumed during a memory access is 10 times higher than during a cache access. Thus, increasing the cache hit rate can effectively reduce the power consumption of the memory system and improve system performance. In this study, we increased the cache hit rate and reduced the cache-access power consumption by developing a new cache architecture known as a single linked cache (SLC) that stores frequently executed instructions. By adding a new link field, SLC combines the low power consumption and low access delay of a direct-mapped cache with a high cache hit rate similar to that of a two-way set-associative cache. In addition, we developed another design known as multiple linked caches (MLC) to further reduce the power consumption during each cache access and avoid unnecessary cache accesses when the requested data is absent from the cache. In MLC, the linked cache is split into several small linked caches that store frequently executed instructions to reduce the power consumption during each access. To avoid unnecessary cache accesses when a requested instruction is not in the linked caches, the addresses of the frequently executed blocks are recorded in the branch target buffer (BTB). By consulting the BTB, a processor can access the memory to obtain the requested instruction directly if the instruction is not in the cache. In the simulation results, our method performed better than selective compression, traditional cache, and filter cache in terms of the cache hit rate, power consumption, and execution time.
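The link between hit rate and power can be checked with a back-of-the-envelope model. The sketch below is not taken from the paper. It assumes every access pays the cache energy and every miss additionally pays a memory access costing ten times as much, and the function name `energy_per_access` and its parameters are hypothetical.

```python
def energy_per_access(hit_rate, cache_energy=1.0, memory_ratio=10.0):
    """Average energy per instruction fetch, in units of one cache access."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie in [0, 1]")
    miss_rate = 1.0 - hit_rate
    # Every access probes the cache; misses also pay for a memory access.
    return cache_energy + miss_rate * memory_ratio * cache_energy
```

Under this assumed model, raising the hit rate from 0.90 to 0.95 lowers the average energy from about 2.0 to about 1.5 cache-access units, which is why a modest gain in hit rate translates into a noticeable power saving.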

20.

On using cache conscious clustering for improving OODBMS performance

《Information and Software Technology》2006,48(11):1073-1082

The two main techniques of improving I/O performance of Object Oriented Database Management Systems (OODBMS) are clustering and buffer replacement. Clustering is the placement of objects accessed near to each other in time into the same page. Buffer replacement involves the selection of a page to be evicted, when the buffer is full. The page evicted ideally should be the page needed least in the future. These two techniques both influence the likelihood of a requested object being memory resident. We believe an effective way of reducing disk I/O is to take advantage of the synergy that exists between clustering and buffer replacement. Hence, we design a framework, whereby clustering algorithms incorporating buffer replacement cache behaviour can be conveniently employed for enhancing the I/O performance of OODBMS. We call this new type of clustering algorithm, Cache Conscious Clustering (C3). In this paper, we present the C3 framework, and a C3 algorithm that we have developed, namely C3-GP. We have tested C3-GP against three well known clustering algorithms. The results show that C3-GP outperforms them by up to 40% when using popular buffer replacement algorithms such as LRU and CLOCK. C3-GP offers the same performance as the best existing clustering algorithm when the buffer size compared to the database size is very small.
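How object placement and buffer replacement interact can be illustrated with a small simulation of the CLOCK policy named in the abstract. This is a sketch under assumptions, not C3-GP: the functions `clock_page_faults` and `faults_for_placement`, the object names, and both placements are hypothetical, and newly loaded pages start with their reference bit set, which is one common CLOCK variant.

```python
def clock_page_faults(page_trace, frames):
    """Count page faults for a page trace under CLOCK (second-chance) replacement."""
    slots = [None] * frames     # page held in each buffer frame
    ref = [False] * frames      # reference bit per frame
    where = {}                  # page -> frame index
    hand = 0
    faults = 0
    for page in page_trace:
        if page in where:
            ref[where[page]] = True          # hit: give the page a second chance
            continue
        faults += 1
        while ref[hand]:                     # skip and clear recently used frames
            ref[hand] = False
            hand = (hand + 1) % frames
        if slots[hand] is not None:
            del where[slots[hand]]           # evict the page under the hand
        slots[hand] = page
        where[page] = hand
        ref[hand] = True
        hand = (hand + 1) % frames
    return faults


def faults_for_placement(object_trace, placement, frames):
    """Map an object access trace to pages via a placement, then count faults."""
    return clock_page_faults([placement[obj] for obj in object_trace], frames)


object_trace = ["a", "b", "a", "b", "c", "d", "c", "d"]
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}   # co-accessed objects share a page
scattered = {"a": 0, "c": 0, "b": 1, "d": 1}   # co-accessed objects are split
```

With a single buffer frame, the clustered placement incurs two page faults on this trace while the scattered placement incurs eight. With two frames both incur two, which shows how the benefit of a placement depends on the buffer size and replacement behaviour.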