Similar literature
 Found 20 similar documents; search took 125 ms
1.
A traditional HPC (High Performance Computing) cluster is built on physical machines. Reassigning these machines to other tasks is usually impractical because software installation is time-consuming, so they are typically dedicated to cluster use. Virtualization technology provides an abstraction layer that allows several operating systems (with different software packages) to run on one physical machine. Cloud computing gives the user an easy way to manage and interact with the computing resources (the virtual machines in this case). In this work, we demonstrate the feasibility of building a cloud-based HPC cluster on a set of desktop computers interconnected by Fast Ethernet. Our cluster has several advantages. For instance, deployment is fast: only 5 min are needed to deploy a cluster of 30 machines. In addition, several performance benchmarks have been carried out. As expected, for the embarrassingly parallel problem, performance scales linearly with cluster size.

2.
A Parallel Algorithm for Power Flow Calculation in Distribution Network Systems   Total citations: 1, self-citations: 0, citations by others: 1
To address problems such as slow convergence and long computing time in power flow computation for current medium- and large-scale power distribution systems, this paper presents a parallel algorithm based on the MPI programming model and the characteristics of power distribution systems. A performance analysis is then described. The algorithm has been implemented with MPICH. Finally, the validity of the algorithm is verified by a medium-scale sample computation on 8 CPUs in a small cluster with 128 CPUs.

3.
With advances in high-speed computer network technologies, workstation clusters are becoming the main environment for parallel processing. Finite element linear systems of equations are common throughout structural analysis in civil engineering. The preconditioned conjugate gradient method (PCGM) is an iterative method for solving finite element systems of equations with symmetric positive definite system matrices. In this paper, the PCGM algorithm is parallelized and implemented on a DELL workstation cluster. Optimization techniques for sparse matrix-vector multiplication are adopted in the implementation, and the storage scheme is analyzed in detail. The experimental results show that the designed parallel algorithm achieves high speedup and good efficiency on the high-performance workstation cluster. This illustrates the power of parallel computing in solving large problems much faster than on a single processor.
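For readers unfamiliar with the method named in this abstract, the following is a minimal serial sketch of preconditioned conjugate gradients. It is not the paper's implementation: the Jacobi (diagonal) preconditioner, the dense list-of-lists matrix, and the function name `pcg_jacobi` are all assumptions made for illustration, and the paper's parallelization and sparse storage scheme are not reproduced.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite matrix A.

    Illustrative sketch: A is a dense list of lists and the preconditioner
    is M = diag(A) (Jacobi). Both choices are assumptions, not the paper's.
    """
    n = len(b)

    def matvec(v):
        # Dense matrix-vector product; a real solver would use sparse storage.
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                              # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]   # z = M^{-1} r
    p = list(z)                              # first search direction
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:           # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)              # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                   # makes the next direction A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


solution = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In a parallel version, the matrix-vector product and the dot products are the steps that would typically be distributed across nodes, which is presumably why the abstract singles out sparse matrix-vector multiplication for optimization.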

4.
Currently, cloud computing systems use simple key-value data processing, which cannot support similarity search effectively due to the lack of efficient index structures; moreover, as dimensionality increases, the existing tree-like index structures can suffer from "the curse of dimensionality". In this paper, a novel VF-CAN indexing scheme is proposed. VF-CAN integrates a content addressable network (CAN) based routing protocol and the improved vector approximation file (VA-file) index. There are two index levels in this scheme: global index and local index. The local index VAK-file is built for the data in each storage node. VAK-file is the k-means clustering result of VA-file approximation vectors according to their degree of proximity. Each cluster forms a separate local index file, and each file stores the approximate vectors that are contained in the cluster. The vector of each cluster center is stored in the cluster center information file of the corresponding storage node. In the global index, storage nodes are organized into an overlay network CAN, and in order to reduce the cost of calculation, only clustering information of the local index is issued to the entire overlay network through the CAN interface. The experimental results show that VF-CAN reduces the index storage space and improves query performance effectively.

5.
Performance is one of the key problems in both high performance computing and GRID applications. Performance data must be collected and analyzed in order to co-allocate resources efficiently and to obtain high performance and fault tolerance. Furthermore, with the development of the Internet and GRID, the exchange of data between virtual organizations is becoming more and more important, and the types of performance data are multiplying along with the types of resources, which requires a proper representation of the performance data. This paper studies the collection, analysis, and representation of performance data, and presents a Grid-oriented performance tool prototype, THGPT, which can acquire runtime performance data, describe the data in XML, and provide a browser-based visualization tool for performance data analysis.

6.
Cloud computing has made possible services such as data storage, application execution, and resource scalability. The ability to measure computing resources according to demand allows a great quantity of data to be stored in Data Kernels. Hadoop is a promising solution for problems involving large data sets. Mahout is a project developed with Hadoop that, by default, implements several clustering and classification algorithms, such as K-Means and Mean Shift, which are clustering algorithms used successfully in recent years on small databases. This paper presents a performance analysis of K-Means and Mean Shift in a standard implementation of Mahout in the MapReduce distributed paradigm.

7.
Session-based applications are now among the typical applications on the Internet, and they are built on clusters for the sake of scalability. Scheduling in such a cluster is a key technology, since system performance depends on it. In this paper, we investigate the Round-Robin algorithm in the context of session-based applications. An analytical model for such systems is proposed. Through both theoretical analysis and simulation, we identify the main factor in system performance. The results also show that the algorithm performs significantly differently under various conditions.
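As a rough illustration of what Round-Robin scheduling can look like for session-based traffic, the sketch below assigns each new session to the next server in rotation and keeps later requests of that session on the same server. The session-affinity rule, the class name `SessionRoundRobin`, and the server labels are assumptions for illustration; the abstract does not describe the paper's model at this level of detail.

```python
from itertools import cycle


class SessionRoundRobin:
    """Hypothetical dispatcher: round-robin over sessions, not single requests."""

    def __init__(self, servers):
        self._rotation = cycle(servers)   # endless rotation over the server list
        self._assigned = {}               # session id -> server chosen for it

    def route(self, session_id):
        # A session seen for the first time takes the next server in rotation;
        # every later request of that session is sent to the same server.
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._rotation)
        return self._assigned[session_id]


dispatcher = SessionRoundRobin(["server-a", "server-b"])
first = dispatcher.route("session-1")
second = dispatcher.route("session-2")
repeat = dispatcher.route("session-1")    # stays on the server chosen earlier
```

Because the rotation counts sessions rather than work, sessions of very different lengths can leave servers unevenly loaded, which is one plausible reason performance would vary across conditions.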

8.
System of Grid Resource Monitoring Service   Total citations: 1, self-citations: 0, citations by others: 1
Resource monitoring is a key component of a grid system. It helps in understanding performance limits and forecasting system failures, and it informs the scheduling and configuration of grid applications, among other uses. This paper puts forward a grid resource monitoring service system based on MDS (Monitoring and Discovery Service), which uses the Globus Toolkit to achieve real-time monitoring and visualization of historical monitoring data for computing nodes in the grid. It introduces the design ideas of the service system, presents its architecture, and discusses its support for low delay, low performance impact, scalability, and manageability.

9.
We design a task mapper TPCM for assigning tasks to virtual machines, and an application-aware virtual machine scheduler TPCS oriented for parallel computing to achieve a high performance in virtual computing systems. To solve the problem of mapping tasks to virtual machines, a virtual machine mapping algorithm (VMMA) in TPCM is presented to achieve load balance in a cluster. Based on such mapping results, TPCS is constructed including three components: a middleware supporting an application-driven scheduling, a device driver in the guest OS kernel, and a virtual machine scheduling algorithm. These components are implemented in the user space, guest OS, and the CPU virtualization subsystem of the Xen hypervisor, respectively. In TPCS, the progress statuses of tasks are transmitted to the underlying kernel from the user space, thus enabling virtual machine scheduling policy to schedule based on the progress of tasks. This policy aims to exchange completion time of tasks for resource utilization. Experimental results show that TPCM can mine the parallelism among tasks to implement the mapping from tasks to virtual machines based on the relations among subtasks. The TPCS scheduler can complete the tasks in a shorter time than can Credit and other schedulers, because it uses task progress to ensure that the tasks in virtual machines complete simultaneously, thereby reducing the time spent in pending, synchronization, communication, and switching. Therefore, parallel tasks can collaborate with each other to achieve higher resource utilization and lower overheads. We conclude that the TPCS scheduler can overcome the shortcomings of present algorithms in perceiving the progress of tasks, making it better than schedulers currently used in parallel computing.

10.
Low bandwidth hinders the development of mobile computing. Besides providing relatively higher bandwidth at the communication layer, constructing adaptable upper-layer applications is important. In this paper, a framework for auto-adapting distributed objects is proposed, and methods for evaluating object performance are given as well. Within the framework, distributed objects can adjust their behavior automatically and maintain relatively good performance when serving requests from remote applications. It is an efficient way to implement performance transparency for mobile clients.

11.
This paper introduces the design and implementation of BCL-3, a high-performance low-level communication software running on a cluster of SMPs (CLUMPS) called DAWNING-3000. BCL-3 provides flexible and sufficient functionality to fulfill the communication requirements of fundamental system software developed for DAWNING-3000 while guaranteeing security, scalability, and reliability. Important features of BCL-3 are presented in the paper, including special support for SMP and heterogeneous network environments, semi-user-level communication, reliable and ordered data transfer, and scalable flow control. The performance evaluation of BCL-3 over Myrinet is also given.

12.
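Research on IP Protocol Support in the User-Level Communication Protocol BCL-3   Total citations: 2, self-citations: 0, citations by others: 2
Chen Zhihui, Ma Jie, Chen Guoliang, Gao Fan. Journal of Software (《软件学报》), 2003, 14(9): 1629-1634
To take full advantage of high-performance networks, researchers have developed a variety of user-level communication protocols. These protocols can deliver the high bandwidth and low latency offered by the underlying hardware. However, because they expose entirely different application programming interfaces, user-level communication protocols can often support only scientific computing, and not traditional network applications that are based on the Socket interface and use kernel-level communication protocols. By adding an IP protocol support module, the BCL-3 user-level communication protocol can effectively support existing TCP/IP-based network applications while still supporting scientific computing. Moreover, based on an analysis of the software overhead of the TCP/IP protocol, the IP support module adopts targeted optimization techniques so that TCP/IP running on BCL-3 achieves high network performance. The improved BCL-3 already runs on the Dawning 3000L superserver. On the Dawning 3000L, TCP/IP running over BCL-3 achieved a maximum bandwidth of 938 Mbps and a minimum one-way latency of 48.1 μs.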

13.
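This paper presents the design and implementation of high-performance TCP/IP communication (BCL/IP), built on the BCL low-level cluster communication protocol, for cluster system area networks (SANs). As an important component of the Dawning 4000L superserver system, BCL/IP fully exploits the performance of the efficient underlying BCL protocol and the high-speed Myrinet network while also achieving binary compatibility with most existing network applications. Performance test results for BCL/IP on the Linux platform are given with a brief analysis. The paper also provides a method for implementing a high-performance TCP/IP protocol in cluster system area networks on top of an existing low-level communication protocol, with no (or minimal) modification to the system kernel.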

14.
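A Slice-Based Parallel H.264 Video Encoding Algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Ning Hua, Mei Zheng, Li Jintao. Computer Engineering (《计算机工程》), 2005, 31(4): 181-182
Starting from the characteristics of the H.264 video coding standard, this paper proposes a slice-level parallel algorithm for H.264 video encoding. The algorithm maintains load balance among nodes, reduces data dependencies between nodes, and makes full use of the available computing power. Experimental results on the Dawning 3000 are given at the end.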

15.
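This paper describes the hardware configuration of the CENTUM CS 3000 system and the selection of I/O cards in the water treatment system of Lianzhong Stainless Steel. Using practical examples, it focuses on the application software configuration of the CS3000 system for regulatory control, operation interlocks, sequence control, and MODBUS communication, and summarizes some problems encountered during commissioning.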

16.
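Research on Key Issues in the Design of the Dawning 3000 Superserver   Total citations: 1, self-citations: 0, citations by others: 1
Sun Ninghui, Meng Dan. Chinese Journal of Computers (《计算机学报》), 2002, 25(11): 1121-1132
The Dawning 3000 superserver is a general-purpose computer system based on the SMP cluster architecture, characterized by scalability, usability, manageability, and high availability. This paper focuses on several key issues in the design of the Dawning 3000, including scalability issues related to the SMP cluster architecture, important usability design in the system software, the multiple applications of low-level communication, and the cross-platform support design of the cluster management system. It also discusses open problems in superserver design and the authors' views on them.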

17.
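Considering the current state of the dispatching communication system at a hydroelectric power plant, this paper analyzes the networking topology of the original dispatching communication system and points out its problems. It proposes a networking topology built from the Coral IPX3000 dispatching switch and Coral Office self-switching remote module equipment, and describes the system structure of the Coral IPX3000 dispatching switch and the main performance requirements for its hardware and software platforms. The application of the Coral IPX3000 system in the hydropower plant is analyzed and compared.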

18.
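Performance Study of the COSMOS Cluster File System Based on a High-Speed Communication Protocol   Total citations: 4, self-citations: 0, citations by others: 4
As an important component of the Dawning 3000 superserver, the COSMOS cluster file system has served as the basis for a comprehensive, in-depth study of cluster file system protocols, structure, and performance optimization. This paper first describes the implementation of the COSMOS file system on BCL-3, the high-speed communication protocol of the Dawning 3000 cluster. It then introduces concurrent bandwidth utilization to describe the degree to which communication and I/O affect cluster file system performance. Finally, it presents the related performance experiments and explains the results.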

19.
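Pang Xiaoxia, Wang Sheng. Industry and Mine Automation (《工矿自动化》), 2011, 37(10): 95-97
Mineral processing plants that monitor each process section by adding remote I/O stations or setting up separate small control rooms incur higher investment costs and find it harder to centrally monitor the whole process flow and collect information from all equipment. To address this, the paper proposes a scheme that uses the Profibus-DP protocol for data communication between the CS3000 and a SIEMENS PLC subsystem. Taking the ALP111 communication module of the CS3000 as an example, it details the hardware configuration and software configuration of the scheme. Practical application shows that the scheme achieves reliable data communication between the CS3000 master station and the SIEMENS PLC subsystem, with good on-site monitoring and control.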

20.
Parallel Algorithm Design on Some Distributed Systems   Total citations: 3, self-citations: 0, citations by others: 3
Some testing results on DAWNING-1000, Paragon, and a workstation cluster are described in this paper. On the home-made parallel system DAWNING-1000 with 32 computational processors, practical performance of 1.1777 Gflops and 1.58 Gflops has been measured in solving a dense linear system and doing matrix multiplication, respectively. The scalability is also investigated. The importance of designing efficient parallel algorithms for evaluating parallel systems is emphasized.
