1.
The rapid development of the internet calls for high-performance file systems, and much effort has already been devoted to assigning nonpartitioned files in a parallel file system with the aim of ensuring prompt responses to requests. Yet most existing strategies still fail to achieve optimal system mean response time, so new strategies that perform better on this metric are indispensable for parallel file systems. Addressing the assignment of nonpartitioned files in parallel file systems where file accesses exhibit Poisson arrival rates and fixed service times, this paper presents an on-line file assignment strategy, named prediction-based dynamic file assignment (PDFA), that minimizes the mean response time across disks under different workload conditions, and compares PDFA with well-known file assignment algorithms such as HP and SOR. Comprehensive experimental results show that PDFA consistently outperforms the compared algorithms in terms of mean response time.
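The abstract does not detail PDFA's internals, so the following is only a minimal illustrative sketch of the general idea of assigning nonpartitioned files to disks so as to minimize a predicted mean response time; the M/D/1 approximation, the greedy placement, and all names are assumptions rather than the authors' algorithm.

```python
def md1_response_time(arrival_rate, service_time):
    """Mean response time of an M/D/1 queue (Poisson arrivals, fixed service time)."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        return float("inf")  # the disk would be saturated
    return service_time + arrival_rate * service_time ** 2 / (2 * (1 - rho))

def assign_files(files, num_disks):
    """Greedily place each file (arrival_rate, service_time) on the disk whose
    predicted mean response time grows the least after adding it."""
    disks = [[] for _ in range(num_disks)]
    placement = []
    for rate, svc in files:
        best_disk, best_cost = 0, float("inf")
        for d in range(num_disks):
            rates = [r for r, _ in disks[d]] + [rate]
            svcs = [s for _, s in disks[d]] + [svc]
            total_rate = sum(rates)
            # approximate the disk's service time by the request-weighted mean
            avg_svc = sum(r * s for r, s in zip(rates, svcs)) / total_rate
            cost = md1_response_time(total_rate, avg_svc)
            if cost < best_cost:
                best_disk, best_cost = d, cost
        disks[best_disk].append((rate, svc))
        placement.append(best_disk)
    return placement

print(assign_files([(5.0, 0.01), (3.0, 0.02), (8.0, 0.005), (1.0, 0.05)], num_disks=2))
```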
2.
A. Calderón, F. García-Carballeira, L. M. Sánchez, J. D. García, J. Fernandez 《The Journal of Supercomputing》2009,47(3):312-334
Parallelism in file systems is obtained by using several independent server nodes supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in a single node can stop the whole system. To avoid this problem, data must be stored using some kind of redundant technique, so that any data stored in a faulty element can be recovered. Fault tolerance can be provided in I/O systems by using replication or RAID-based schemes. However, most current systems apply the same technique to all files in the system.

This paper describes the fault tolerance support provided by Expand, a parallel file system based on standard servers. This support can be applied to other parallel file systems and offers several benefits: fault tolerance at the file level, flexible definition of the fault-tolerance scheme to use, the possibility of changing the fault-tolerance support used for a file, etc.
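As an illustration of per-file fault-tolerance selection (not Expand's actual API; the policy, thresholds, and function names below are invented for the sketch), a client library might choose between replication and RAID-5-style parity striping per file:

```python
from dataclasses import dataclass

@dataclass
class FaultTolerancePolicy:
    scheme: str     # "replication" or "parity" (RAID-5 style)
    width: int      # number of copies, or stripe width including parity

def choose_policy(file_size_bytes, critical):
    """Illustrative per-file policy: small or critical files are replicated,
    large non-critical files use parity striping to save space."""
    if critical or file_size_bytes < (1 << 20):   # < 1 MiB
        return FaultTolerancePolicy("replication", width=3)
    return FaultTolerancePolicy("parity", width=8)

def place_block(policy, block_index, num_servers):
    """Servers that hold (a copy of, or a piece of the stripe containing)
    the given logical block, using simple round-robin layouts."""
    if policy.scheme == "replication":
        return [(block_index + k) % num_servers for k in range(policy.width)]
    stripe = block_index // (policy.width - 1)      # width-1 data blocks per stripe
    start = (stripe * policy.width) % num_servers
    return [(start + k) % num_servers for k in range(policy.width)]

policy = choose_policy(10 << 20, critical=False)
print(policy, [place_block(policy, b, num_servers=8) for b in range(3)])
```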
3.
The ever-growing demand for high-performance computation calls for progressively larger parallel distributed file systems to match its requirements. These file systems can achieve high performance for large I/O operations by distributing load across numerous data servers. However, they fail to provide quality service for applications dealing with small files. In this paper, we propose a delegable metadata service (DMS) for hiding the latency of metadata accesses and optimizing small-file performance. In addition, four techniques have been designed to maintain consistency and efficiency in DMS: pre-allocation of serial metahandles, directory-based metadata replacement, packing of transaction operations, and fine-grained lock revocation. These schemes have been employed in the Cappella parallel distributed file system, and various experiments complying with industrial standards have been conducted to evaluate its efficiency. The results show that our design achieves significant improvement in the performance of both metadata operations and small-file access. Moreover, the scheme is widely applicable and can be integrated into many other distributed file systems.
4.
A dynamic and adaptive load balancing strategy for parallel file system with large-scale I/O servers
Bin Dong, Xiuqiao Li, Qimeng Wu, Limin Xiao, Li Ruan 《Journal of Parallel and Distributed Computing》2012
Many solutions have been proposed to tackle the load imbalance issue of parallel file systems. However, these solutions either adopt centralized algorithms or fail to consider both network transmission and the tradeoff between the benefits and side-effects of each dynamic file migration. Existing solutions are therefore prohibitively inefficient in large-scale parallel file systems. To address this problem, this paper presents SALB, a dynamic and adaptive load balancing algorithm based entirely on a distributed architecture. To remain aware of network transmission, SALB adopts an adaptively adjusted load-collection threshold to reduce the message exchanges needed for load collection, and it employs an on-line load prediction model to reduce the decision delay caused by network transmission latency. Moreover, SALB employs an optimization model for selecting migration candidates so as to balance the benefits and side-effects of each dynamic file migration. Extensive experiments demonstrate the effectiveness of SALB: among the compared schemes it achieves the best performance in both mean response time and resource utilization, and simulation results indicate that it also delivers high scalability.
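The following toy sketch illustrates two of the ideas mentioned in the abstract, suppressing load reports that stay within a threshold and weighing a file's migration benefit against its transfer cost; the concrete formulas, parameters, and names are assumptions, not SALB's actual algorithm.

```python
class LoadReporter:
    """Report a server's load only when it drifts past a threshold, cutting down
    load-collection messages (a simplified stand-in for SALB's adaptive scheme)."""
    def __init__(self, threshold=0.10):
        self.threshold = threshold
        self.last_reported = None

    def maybe_report(self, current_load):
        if (self.last_reported is None
                or abs(current_load - self.last_reported) > self.threshold):
            self.last_reported = current_load
            return current_load      # worth sending to the balancer
        return None                  # suppress the message

def pick_migration_candidates(files, overload, bandwidth, alpha=1.0):
    """Pick files to migrate while the benefit (load removed) outweighs the
    side-effect (transfer time); `files` is a list of (load, size_bytes)."""
    chosen, removed = [], 0.0
    for load, size in sorted(files, key=lambda f: f[0] / max(f[1], 1), reverse=True):
        transfer_time = size / bandwidth
        if removed >= overload or load <= alpha * transfer_time:
            break
        chosen.append((load, size))
        removed += load
    return chosen

print(pick_migration_candidates(
    files=[(0.30, 2 << 20), (0.05, 64 << 20), (0.20, 1 << 20)],
    overload=0.40, bandwidth=100 << 20))
```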
5.
To meet applications' strict requirements on performance, reliability, and cost, a moderately parallel computer system must provide very high availability. The CompactPCI (CPCI) bus hot-swap specification, as an industry standard, provides a solid foundation for building highly available moderately parallel computer systems. This paper proposes an architecture for highly available moderately parallel computer systems based on hot-swap technology. A moderately parallel computer system designed with this architecture can perform efficient fault-tolerant parallel computing, and its availability service mechanism has a standard structure, satisfying application requirements well.
6.
Modular redundancy and temporal redundancy are traditional techniques for increasing system reliability. With technology advancements, slack time in a system can be used not only as temporal redundancy but also by energy management schemes to save energy. In this paper, we consider the combination of modular and temporal redundancy to achieve energy-efficient, reliable real-time service provided by multiple servers. We first propose an efficient adaptive parallel recovery scheme that processes service requests in parallel to increase the number of faults that can be tolerated and thus system reliability. We then explore schemes to determine the optimal redundant configurations of the parallel servers, either to minimize system energy consumption for a given reliability goal or to maximize system reliability for a given energy budget. Our analysis shows that small requests, optimistic approaches, and parallel recovery favor lower levels of modular redundancy, while large requests, pessimistic approaches, and restricted serial recovery favor higher levels of modular redundancy.
7.
8.
We present a highly available system for environments such as stock trading, where high request rates and low latency requirements dictate that service disruptions on the order of seconds can be unacceptable. After a node failure, our system avoids processing delays due to detecting the failure or transferring control to a back-up node. We achieve this by using multiple primary nodes which process transactions concurrently as peers. If a primary node fails, the remaining primaries continue executing without being delayed at all by the failed primary. Nodes agree on a total ordering for processing requests with a novel low-overhead wait-free algorithm that utilizes a small amount of shared memory accessible to the nodes and a simple compare-and-swap-like protocol which allows the system to progress at the speed of the fastest node. We have implemented our system on an IBM z990 zSeries eServer mainframe and show experimentally that our system performs well and can transparently handle node failures without delaying transaction processing. The efficient implementation of our algorithm for ordering transactions is a critically important factor in achieving good performance.
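A minimal sketch of the core ordering idea described in the abstract: primaries claim consecutive slots in a global total order through a compare-and-swap on a shared counter. The shared memory and CAS are only emulated here with a Python lock, and the retry loop is lock-free rather than truly wait-free; the paper's actual algorithm and data layout are not reproduced.

```python
import threading

class SharedCounter:
    """Emulates a shared-memory word updated with compare-and-swap.
    (On real hardware this would be a single CAS instruction; a lock is
    used here only to emulate atomicity in Python.)"""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def read(self):
        with self._lock:
            return self._value

def next_sequence_number(counter):
    """Each primary claims the next slot in the global total order by retrying
    CAS; a node's progress does not depend on how slow its peers are."""
    while True:
        current = counter.read()
        if counter.compare_and_swap(current, current + 1):
            return current

counter = SharedCounter()
results = []

def primary(node_id):
    for _ in range(3):
        seq = next_sequence_number(counter)
        results.append((seq, node_id))   # process the request tagged with seq

threads = [threading.Thread(target=primary, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))                   # a gap-free total order 0..11
```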
9.
10.
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. In a previous paper we showed that prefetching and caching have the potential to deliver the performance benefits of parallel file systems to parallel applications. In this paper we describe experiments with practical prefetching policies that base decisions only on on-line reference history and that can be implemented efficiently. We also test the effectiveness of those policies across a range of architectural parameters.
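As a toy illustration of a prefetching policy driven only by on-line reference history (far simpler than the policies evaluated in the paper; all names are hypothetical), a prefetcher might predict the next block after observing a short sequential run:

```python
from collections import deque

class SequentialPrefetcher:
    """Minimal history-based prefetcher: if the last few accesses were
    sequential, speculatively read the next block into the cache."""
    def __init__(self, history=3):
        self.recent = deque(maxlen=history)

    def on_access(self, block):
        self.recent.append(block)
        run = list(self.recent)
        if len(run) == self.recent.maxlen and all(
                b - a == 1 for a, b in zip(run, run[1:])):
            return block + 1          # block to prefetch
        return None                   # no confident prediction

p = SequentialPrefetcher()
for blk in [10, 11, 12, 13, 40, 41]:
    print(blk, "->", p.on_access(blk))
```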
11.
12.
We introduce a new model for replication in distributed systems. The primary motivation for replication is fault tolerance. Although there are different kinds of replication approaches, our model combines the advantages of modular redundancy and primary-standby approaches to give more flexibility with respect to system configuration. To implement this model, we select the IBM PC-net with the MS-DOS environment as our base. Transparency and fault-tolerant file access are the highlights of our system design. To fulfil these requirements, we incorporate the ideas of directory-oriented replication and extended prefix tables in the system design. The implementation consists of a command shell, a DOS manager, and a recovery manager. Through this design, we can simulate a UNIX-like distributed file system whose functionality is compatible with MS-DOS.
13.
Integration – supporting multiple application classes with heterogeneous performance requirements – is an emerging trend in networks, file systems, and operating systems. We evaluate two architectural alternatives – partitioned and integrated – for designing next-generation file systems. Whereas a partitioned server employs a separate file system for each application class, an integrated file server multiplexes its resources among all application classes; we evaluate the performance of the two architectures with respect to sharing of disk bandwidth among the application classes. We show that although the problem of sharing disk bandwidth in integrated file systems is conceptually similar to that of sharing network link bandwidth in integrated services networks, the arguments that demonstrate the superiority of integrated services networks over separate networks are not applicable to file systems. Furthermore, we show that: an integrated server outperforms the partitioned server in a large operating region and has slightly worse performance in the remaining region; the capacity of an integrated server is larger than that of the partitioned server; and an integrated server outperforms the partitioned server by a factor of up to 6 in the presence of bursty workloads.
14.
This paper presents two main contributions: semi-passive replication and Lazy Consensus. The former is a replication technique with parsimonious processing; it is based on the latter, a variant of Consensus allowing the lazy evaluation of proposed values. Parsimonious processing means that, in the normal case, each request is processed by only one single process. The most significant aspect of semi-passive replication is that it requires a weaker system model than existing techniques of the same family. For semi-passive replication, we give an algorithm based on Lazy Consensus. Lazy Consensus is a variant of the Consensus problem that allows the lazy evaluation of proposed values, hence the name. The main difference from Consensus is the introduction of an additional property of laziness, which requires that proposed values are computed only when they are actually needed. We present an algorithm based on Chandra and Toueg's Consensus algorithm for asynchronous distributed systems with a ◇S failure detector.
15.
Anne Putkonen 《International Journal of Parallel Programming》1980,9(5):351-369
In inverted file systems, queries can be written as Boolean expressions of inverted attributes. In response to a query, the system accesses the address lists associated with the attributes in the query, merges them, and selects those records that satisfy the search logic. In this paper we consider the minimization of the CPU time needed for the merging operation. The time can sometimes be reduced by taking address lists that occur in several product terms as a common factor of those products; this means that the union operation is performed before the intersection operation. We present formulas that can be used to decide whether this method is advantageous. The time can also be reduced by choosing the order of intersection operations so that it takes into account how often each address list occurs in the products and how long the lists are. For choosing the order of intersection operations we give a heuristic algorithm that minimizes the total time needed for intersections.
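A small sketch of the two ideas in the abstract, assuming sorted address (postings) lists: intersecting the shortest lists first, and factoring a common list out of a union of products so the union is performed before the intersection. The code is illustrative only, not the paper's algorithm or cost formulas.

```python
def intersect(a, b):
    """Merge-intersect two sorted address lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def intersect_all(lists):
    """Heuristic from the abstract: intersect the shortest lists first so the
    intermediate results, and thus the merge cost, stay as small as possible."""
    lists = sorted(lists, key=len)
    result = lists[0]
    for lst in lists[1:]:
        result = intersect(result, lst)
        if not result:
            break
    return result

# The query (A AND B) OR (A AND C) can be factored as A AND (B OR C),
# performing the (cheap) union before the intersection with A.
A = [1, 3, 5, 7, 9, 11]
B = [3, 4, 5]
C = [7, 8]
union_bc = sorted(set(B) | set(C))
print(intersect_all([A, union_bc]))     # -> [3, 5, 7]
```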
16.
《International Journal of Computer Mathematics》2012,89(3):303-313
In 1994, Yan and Chung produced a fast algorithm for solving a diagonally dominant symmetric Toeplitz tridiagonal system of linear equations Ax = b. In this work, a method is presented that allows problems of this type to be split into two separate systems which can be solved in parallel, then combined and corrected to obtain the solution to the original system. An error analysis is provided along with example cases and timing comparison results.
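For reference, a serial Thomas-algorithm solve of the kind of system the paper targets (constant diagonal d and off-diagonal e, with |d| > 2|e| for diagonal dominance). This is only the baseline being parallelized, not Yan and Chung's method nor the splitting-and-correction scheme itself:

```python
def solve_toeplitz_tridiagonal(d, e, b):
    """Thomas-algorithm solve of a symmetric Toeplitz tridiagonal system:
    every main-diagonal entry is d, every off-diagonal entry is e
    (diagonal dominance guarantees stability without pivoting)."""
    n = len(b)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = e / d
    d_prime[0] = b[0] / d
    for i in range(1, n):                       # forward elimination
        denom = d - e * c_prime[i - 1]
        c_prime[i] = e / denom if i < n - 1 else 0.0
        d_prime[i] = (b[i] - e * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x

# 4x4 example with d=4, e=1 and exact solution x = [1, 2, 3, 4]
print(solve_toeplitz_tridiagonal(4.0, 1.0, [6.0, 12.0, 18.0, 19.0]))
```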
17.
Data distribution in memory or on disks is an important factor influencing the performance of parallel applications. Moreover, programs and systems, such as a parallel file system, frequently redistribute data between memory and disks. This paper presents a generalization of previous approaches to the redistribution problem. We introduce algorithms for mapping between two arbitrary distributions of a data set. The algorithms are optimized for multidimensional array partitions. We motivate our approach and present potential uses. The paper also presents a case study: the employment of mapping functions and redistribution algorithms in a parallel file system.
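As a sketch of what a redistribution mapping computes in the simplest one-dimensional case (the paper's algorithms handle arbitrary multidimensional partitions), the overlap of the old and new block partitions yields the transfer schedule; the function names and block layout below are assumptions:

```python
def block_ranges(n_elems, n_parts):
    """Contiguous block partition of [0, n_elems) into n_parts ranges."""
    base, extra = divmod(n_elems, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def redistribution_schedule(n_elems, old_parts, new_parts):
    """For every (old owner, new owner) pair, compute which index range must
    be transferred when switching distributions: the overlaps of the two
    partitions, excluding data that stays on the same owner."""
    schedule = []
    for src, (s0, s1) in enumerate(block_ranges(n_elems, old_parts)):
        for dst, (t0, t1) in enumerate(block_ranges(n_elems, new_parts)):
            lo, hi = max(s0, t0), min(s1, t1)
            if lo < hi and src != dst:
                schedule.append((src, dst, lo, hi))
    return schedule

print(redistribution_schedule(10, old_parts=2, new_parts=3))
```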
18.
Adaptive mesh refinement (AMR) is a type of multiscale algorithm that achieves high resolution in localized regions of dynamic, multidimensional numerical simulations. One of the key issues related to AMR is dynamic load balancing (DLB), which allows large-scale adaptive applications to run efficiently on parallel systems. In this paper, we present an efficient DLB scheme for structured AMR (SAMR) applications. This scheme interleaves a grid-splitting technique with direct grid movements (e.g., direct movement from an overloaded processor to an underloaded processor), for which the objective is to efficiently redistribute workload among all the processors so as to reduce the parallel execution time. The potential benefits of our DLB scheme are examined by incorporating our techniques into a SAMR cosmology application, the ENZO code. Experiments show that by using our scheme, the parallel execution time can be reduced by up to 57% and the quality of load balancing can be improved by a factor of six, as compared to the original DLB scheme used in ENZO.
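A toy static rebalancer illustrating the two ingredients named in the abstract, grid splitting interleaved with movement of grids to underloaded processors; the real DLB scheme works incrementally on a running SAMR hierarchy, so everything below is a simplified assumption:

```python
import heapq

def balance_grids(grid_work, num_procs):
    """Split any grid whose workload exceeds the per-processor target, then
    place the pieces largest-first on the currently least-loaded processor."""
    target = sum(grid_work) / num_procs
    pool = list(grid_work)
    pieces = []
    while pool:                                   # grid-splitting phase
        w = pool.pop()
        if w > target:
            pool.extend([w / 2.0, w / 2.0])
        else:
            pieces.append(w)
    heap = [(0.0, p) for p in range(num_procs)]   # (load, proc) min-heap
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    for w in sorted(pieces, reverse=True):        # direct movement phase
        load, p = heapq.heappop(heap)
        assignment[p].append(w)
        heapq.heappush(heap, (load + w, p))
    return assignment

print(balance_grids([8.0, 3.0, 2.0, 1.0, 1.0, 1.0], num_procs=4))
```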
19.