20 similar documents found; search took 0 ms.
Contemporary applications continuously modify large volumes of multidimensional data that must be accessed efficiently and, more importantly, must be updated in a timely manner. Single-server storage approaches are insufficient for managing such volumes of data, while the high frequency of data modification renders classical indexing methods inefficient. To address these two problems we introduce a distributed storage manager for multidimensional data based on a Cluster-of-Workstations. The manager addresses the above challenges through a set of mechanisms that, through selective on-line data reorganization, collectively maintain a balanced load across a cluster of workstations. With the help of a fast, highly efficient self-tuning mechanism based on a new data structure called stat-index, together with a query aggregation and clustering algorithm, our storage manager attains short query response times even in the presence of massive modifications and highly skewed access patterns. Furthermore, we provide a data migration cost model used to determine the best data redistribution strategy. Through extensive experimentation with our prototype, we establish that our storage manager can sustain significant update rates with minimal overhead.

Clusters of workstations employ flexible topologies: regular, irregular, and hierarchical topologies have been used in such systems. This flexibility poses challenges for developing efficient collective communication algorithms, since the network topology can have a strong impact on communication performance. In this paper, we consider the all-to-all broadcast operation on clusters with cut-through and store-and-forward switches. We show that near-optimal all-to-all broadcast on a cluster with any topology can be achieved by using only the links in a spanning tree of the topology when the message size is sufficiently large. The result implies that increasing network connectivity beyond the minimum tree connectivity does not improve the performance of the all-to-all broadcast operation when the most efficient topology-specific algorithm is used. All-to-all broadcast algorithms that achieve near-optimal performance are developed for clusters with cut-through switches and clusters with store-and-forward switches. We evaluate the algorithms through experiments and simulations. The empirical results confirm our theoretical finding.

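The randomized pattern matching algorithm is improved and, based on the characteristics of inter-task communication in the MPICH parallel programming environment, an improved MPICH-based parallel randomized pattern matching algorithm is designed. The text string is divided into overlapping substrings according to the number of processes running on the COW (cluster of workstations), and each process performs pattern matching on one text substring. Experimental results show that the improved parallel randomized pattern matching algorithm effectively speeds up pattern matching and improves the resource utilization of the cluster of workstations.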
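
The overlapping partition described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, Python's built-in `str.find` stands in for the randomized matcher, and a plain loop stands in for the MPICH processes. Each piece is extended by the pattern length minus one so that a match straddling a boundary is still found.

```python
def overlapping_chunks(text, pattern_len, num_procs):
    """Split text into num_procs pieces (hypothetical helper).

    Each piece is extended by pattern_len - 1 characters so that a match
    crossing a piece boundary is still fully contained in one piece.
    """
    base = -(-len(text) // num_procs)  # ceiling division
    chunks = []
    for rank in range(num_procs):
        start = rank * base
        end = min(len(text), start + base + pattern_len - 1)
        chunks.append((start, text[start:end]))
    return chunks


def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps included)."""
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits


def partitioned_search(text, pattern, num_procs):
    """Search each piece independently; the loop stands in for the processes."""
    hits = set()
    for start, chunk in overlapping_chunks(text, len(pattern), num_procs):
        hits.update(start + p for p in find_all(chunk, pattern))
    return sorted(hits)
```

In a real MPI program each rank would search only its own piece and the per-rank hit lists would be gathered at one process.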

By splitting a large broadcast message into segments and broadcasting the segments in a pipelined fashion, pipelined broadcast can achieve high performance in many systems. In this paper, we investigate techniques for efficient pipelined broadcast on clusters connected by multiple Ethernet switches. Specifically, we develop algorithms for computing various contention-free broadcast trees that are suitable for pipelined broadcast on Ethernet switched clusters, extend the parametrized LogP model for predicting appropriate segment sizes for pipelined broadcast, show that the segment sizes computed based on the model yield high performance, and evaluate various pipelined broadcast schemes through experimentation on Ethernet switched clusters with various topologies. The results demonstrate that our techniques are practical and efficient for contemporary Fast Ethernet and Gigabit Ethernet clusters.

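An Algorithm for Determining the Optimal Number of Clusters Using FCM   Total citations: 2 (self-citations: 0, citations by others: 2)
In the algorithm that uses FCM to determine the optimal number of clusters, the cluster centers must be reinitialized every time FCM is applied, and because FCM is sensitive to the initial cluster centers, the algorithm is very unstable. This paper improves the algorithm by proposing a merging function that makes the cluster centers for (c-1) clusters depend on the cluster centers for c clusters. Simulation experiments show that the new algorithm is stable and runs noticeably faster than the old one.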

The network of workstations (NOW) has become a widely accepted platform for high-performance parallel computing. As in conventional multicomputers, parallel programs running on such a platform are often written in an SPMD form to exploit data parallelism. Each workstation in a NOW is treated similarly to a processing element in a multicomputer system. However, workstations are far more powerful and flexible than the processing elements in conventional multicomputers. In this paper, we discuss how workstations in a NOW can be used to exploit more parallelism in an SPMD program, especially parallelism induced by concurrent activities.

Volker Strumpen, Software, 1995, 25(3): 291-304
We present a highly scalable approach to distributed parallel computing on workstations on the Internet which provides significant speed-up to molecular biology sequence analysis. Recent developments show that smaller numbers of workstations connected via a local area network can be used efficiently for parallel computing. This work emphasizes scalability with respect to the number of workstations employed. We show that a massively parallel approach using several hundred workstations, dispersed over all continents, can successfully be applied to problems with low requirements on communication bandwidth. We calculated the optimal local alignment scores between a single genetic sequence and all sequences of a genetic sequence database using the ssearch code that is well known among molecular biologists. In a heterogeneous network with more than 800 workstations this job terminated after several minutes, in contrast to the several days it would have taken on a single machine.

An important problem in distributed systems is to detect termination of a distributed computation. A computation is said to have terminated when all processes have become passive and all channels have become empty. In this paper, we present a suite of algorithms for detecting termination of a non-diffusing computation for an arbitrary communication topology under a variety of conditions. All our termination detection algorithms have optimal message complexity. Furthermore, they have optimal detection latency when message processing time is ignored. A preliminary version of the paper first appeared in the 18th Symposium on Distributed Computing (DISC), 2004 [27].

Computational Fluid Dynamics (CFD) applications place heavy demands on parallel computing. Many such applications have been moved from expensive MPP boxes to cost-effective Networks of Workstations (NOW). Auto-CFD-NOW is a pre-compiler that transforms sequential Fortran CFD programs into efficient message-passing parallel programs running on NOW. Our work makes the following three unique contributions. First, this pre-compiler is highly automatic, requiring a minimum number of user directives for parallelization. Second, we have applied a dependency analysis technique for CFD applications, called analysis after partitioning. We propose a mirror-image decomposition technique to parallelize self-dependent field loops that are hard to parallelize by existing methods. Finally, traditional optimizations of communication focus on eliminating redundant synchronizations. We have developed an optimization scheme which combines all the non-redundant synchronizations in CFD programs to further reduce the communication overhead. Auto-CFD-NOW has been implemented on networks of workstations and has been successfully used for automatically parallelizing structured CFD application programs. Our experiments show its effectiveness and scalability for parallelizing large CFD applications. This work is supported in part by the China National Aerospace Science Foundation, and by the U.S. National Science Foundation under grants CCR-9812187, CCR-0098055, CCF-0325760, CCF-0514078, and CNS-0549006.

The availability of a large number of workstations connected through a network can represent an attractive option for high-performance computing for many applications. The message-passing interface (MPI) software environment is an effort from many organisations to define a de facto message-passing standard. As such, the original specification was not designed as a comprehensive parallel programming environment, and some researchers agree that the standard should be kept as simple and clean as possible. Nevertheless, a software environment such as MPI should have some scheduling mechanism for the effective submission of parallel applications on networks of workstations. This paper presents an alternative lightweight approach called Selective-MPI (S-MPI), which was designed to enhance the efficiency of the scheduling of applications on an MPI implementation environment.

Cloud computing is emerging as an important platform for business, personal and mobile computing applications. In this paper, we study a stochastic model of cloud computing, where jobs arrive according to a stochastic process and request resources like CPU, memory and storage space. We consider a model where the resource allocation problem can be separated into a routing or load balancing problem and a scheduling problem. We study the join-the-shortest-queue routing and power-of-two-choices routing algorithms with the MaxWeight scheduling algorithm. These algorithms were already known to be throughput optimal. In this paper, we show that they are also queue length optimal in the heavy traffic limit.
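
The two routing rules named in this abstract can be illustrated with a minimal sketch. The function names and the list-of-queue-lengths representation are assumptions made for illustration; the paper's stochastic model and the MaxWeight scheduler are not reproduced here.

```python
import random


def route_jsq(queue_lengths):
    """Join-the-shortest-queue: inspect every server, pick the shortest queue."""
    return min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])


def route_power_of_two(queue_lengths, rng):
    """Power-of-two-choices: sample two distinct servers, pick the shorter queue.

    rng is a random.Random instance so that runs can be made repeatable.
    """
    i, j = rng.sample(range(len(queue_lengths)), 2)
    return i if queue_lengths[i] <= queue_lengths[j] else j
```

The design trade-off is that join-the-shortest-queue needs the state of every server for each arrival, while power-of-two-choices needs only two samples.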

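Computing the Maximum Speedup of Parallel Programs on Cluster Systems   Total citations: 1 (self-citations: 0, citations by others: 1)
Speedup is one of the key metrics of a parallel program. In most parallel systems, for a fixed data size, the speedup of a program increases as node workstations are added. In most cluster systems, however, the node workstations share the physical transmission medium, so the speedup of many parallel programs decreases as nodes are added once the number of nodes exceeds a certain value. By analyzing the execution time of parallel programs on cluster systems, this paper discusses, for a fixed data size, the maximum speedup and the shortest computation time a program can achieve, as well as the number of nodes at which this speedup and computation time are obtained.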
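
To make the effect concrete, here is a minimal sketch under a purely hypothetical cost model, which is not taken from the paper: computation time shrinks as work / p, while communication on the shared medium grows linearly as comm_per_node * p. Under that assumption the run time has a minimum at a finite node count, beyond which the speedup falls again.

```python
def run_time(work, comm_per_node, p):
    """Assumed model: perfectly divisible work plus shared-medium communication."""
    return work / p + comm_per_node * p


def best_node_count(work, comm_per_node, max_nodes):
    """Scan 1..max_nodes and return the node count with the shortest run time."""
    return min(range(1, max_nodes + 1),
               key=lambda p: run_time(work, comm_per_node, p))


def max_speedup(work, comm_per_node, max_nodes):
    """Return (best node count, speedup relative to the sequential time work)."""
    p = best_node_count(work, comm_per_node, max_nodes)
    return p, work / run_time(work, comm_per_node, p)
```

For example, with work = 100 and comm_per_node = 1 the model's run time is smallest at 10 nodes, and adding further nodes only lowers the speedup.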

Recently, many organisms have had their DNA entirely sequenced. This creates the need to compare long DNA sequences, which is a challenging task due to its high demands for computational power and memory. Sequence comparison is a basic operation in DNA sequencing projects, and most sequence comparison methods currently in use are based on heuristics, which are faster but offer no guarantee of producing the best possible alignments. To address this problem, Smith and Waterman proposed an algorithm that obtains the best local alignments, but at the expense of very high computing power and huge memory requirements. In this article, we present and evaluate our experiments involving three strategies to run the Smith–Waterman algorithm in a cluster of workstations using a Distributed Shared Memory System. Our results on an eight-machine cluster showed very good speed-up and indicate that impressive improvements can be achieved depending on the strategy used. In addition, we present a number of theoretical remarks concerning how to reduce the amount of memory used.
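
For readers unfamiliar with the algorithm, the following is a minimal sequential sketch of the Smith–Waterman score computation. It is not one of the three distributed strategies from the article; the function name, the linear gap penalty, and the scoring values are assumptions chosen for illustration. It keeps only two rows of the dynamic-programming table, which is one common way to cut memory when only the best score is needed.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty.

    Only the previous and current rows are stored, so memory is O(len(b))
    instead of O(len(a) * len(b)). The alignment itself is not recovered.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Recovering the alignment itself requires a traceback, which needs either the full table or a divide-and-conquer scheme.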

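Design and Implementation of a Web-Based Remote Cluster Monitoring System   Total citations: 2 (self-citations: 0, citations by others: 2)
Building cluster systems from commodity components gives them a high price-performance ratio but also leaves them with poor availability and manageability, which makes cluster monitoring particularly important. Drawing on the Linux cluster system at the National High Performance Computing Center (Xi'an), this paper presents an architectural framework and implementation strategy for a Web-based cluster monitoring system and describes in detail the implementation of the data acquisition, information collection and storage, and status visualization modules. The Web-based implementation strategy makes the system platform-independent and allows monitoring to be performed remotely.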

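This paper introduces VP~4, a general-purpose performance visualization tool for PVM parallel programs. Tailored to the characteristics of workstation clusters, it uses a multi-level performance data collection method and an event-based collection strategy, so that all performance data can be collected and aggregated while keeping the intrusion effect as small as possible. After processing the aggregated performance data, VP~4 uses graphics and animation to generate a variety of easy-to-use visual performance views. Experiments show that the tool effectively helps users find performance bottlenecks and assists them in developing high-performance parallel programs.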

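To improve the disassembly efficiency of U-shaped disassembly lines with a fixed number of workstations and to reduce the potential threat that hazardous parts pose to operators, this paper addresses the disassembly requirements of high-value and hazardous parts by proposing the partial disassembly line balancing problem for U-shaped lines with a fixed number of workstations. An optimization model is established with the objectives of minimizing the cycle time, minimizing the number of high-risk workstations, and balancing the workload, and an improved variable neighborhood search algorithm is designed to solve it. For encoding, a selection strategy based on the release positions of parts is proposed to reduce the influence of the disassembly order of predecessor parts on the encoding; a minimum-deviation bisection method is proposed to effectively reduce the number of decoding iterations; and a bottleneck-squeezing local search strategy is proposed to optimize the cycle time and load-balancing indicators. Comparison with other algorithms shows that the improved variable neighborhood search algorithm is superior and can efficiently solve the partial disassembly balancing problem for U-shaped disassembly lines with a fixed number of workstations.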

Multicore clusters, which have become the most prominent form of High Performance Computing (HPC) systems, challenge the performance of MPI applications with non-uniform memory accesses and shared cache hierarchies. Recent advances in MPI collective communications have alleviated the performance issues exposed by deep memory hierarchies by carefully considering the mapping between the collective topology and the hardware topologies, as well as by using single-copy, kernel-assisted mechanisms. However, in distributed environments, a single-level approach cannot encompass the extreme variations not only in bandwidth and latency capabilities, but also in the capability to support duplex communications or operate multiple concurrent copies. This calls for a collaborative approach between multiple layers of collective algorithms, dedicated to extracting the maximum degree of parallelism from the collective algorithm by consolidating the intra- and inter-node communications.

The problem of using the idle cycles of a number of high-performance workstations, interconnected by a high-speed network, for solving computationally intensive tasks is discussed. The classes of distributed applications examined require some form of synchronization among the subtasks, hence the need for coscheduling to guarantee that subtasks start at the same time and execute at the same pace on a group of workstations. A model of the system is presented that allows the definition of an objective function to be maximized. Then a quadratic-time, linear-space algorithm is derived for computing the optimal coschedule for the given model and class of applications.

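A Parallel Decomposition-Coordination Optimization Algorithm for Large-Scale Chemical Process Systems   Total citations: 2 (self-citations: 0, citations by others: 2)
Zhang Fan (张帆), Computer Simulation (计算机仿真), 2004, 21(6): 74-77
To address the lack of computing power in the optimization of large-scale chemical process systems, this paper studies a decomposition-coordination algorithm suited to solving large systems. Building on a decomposed computation of the SQP algorithm, an unconstrained optimization algorithm is used for coordination, and parallel techniques are adopted to improve solution efficiency. A simulation environment was built using both a single machine and a cluster system, and a heat exchanger system was solved as a practical test case. The results of this example show that the algorithm is effective and can be extended to the optimization of large-scale process systems.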
