Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Rajkumar Buyya 《Software》2000,30(7):723-739
Workstation/PC clusters have become a cost‐effective solution for high performance computing. C‐DAC's PARAM 10000 (or OpenFrame, internal code name) is a large cluster of high‐performance workstations interconnected through low‐latency and high bandwidth networks. The management and control of such a huge system is a tedious and challenging task since workstations/PCs are typically designed to work as a standalone system rather than part of a cluster. We have designed and developed a tool called PARMON that allows effective monitoring and control of large clusters. It supports the monitoring of critical system resource activities and their utilization at three different levels: entire system, node and component level. It also allows the monitoring of multiple instances of the same component; for instance, multiple processors in SMP type cluster nodes. PARMON is a portable, flexible, interactive, scalable, location‐transparent, and comprehensive environment based on client–server technology. The major components of PARMON are parmon‐server, a provider of system resource activity and utilization information, and parmon‐client, a GUI based client responsible for interacting with parmon‐server and users to gather data in real time and present it graphically for visualization. The client is developed as a Java application and the server is developed as a multithreaded server using C and POSIX/Solaris threads since Java does not support interfaces to access system internals. PARMON is regularly used to monitor the PARAM 10000 supercomputer, a cluster of 48+ Ultra‐4 workstations powered by the Solaris operating system. The recent popularity of Beowulf‐class clusters (dedicated Linux clusters) in terms of price–performance ratio has motivated us to port PARMON to Linux (accomplished by porting system dependent portions of parmon‐server). This enables management/monitoring of both Solaris and Linux‐based clusters (federated clusters) through a single user interface.
Copyright © 2000 John Wiley & Sons, Ltd.

2.
Cluster architectures are increasingly used to solve high‐performance computing applications. To build more computational power, sets of clusters, interconnected by high‐speed networks, can be combined to form a cluster grid. In this type of architecture, it is difficult to exploit all the internal resources of a cluster, because each one can be shielded by a firewall and is usually configured with machines where there is only one visible IP front‐end node that hides all its internal nodes from the external world. The exploitation of resources is even more complicated if we consider the general case where each internal node of a cluster can be a front‐end node of another cluster. This type of architecture has been defined as a multilayer cluster grid. In this paper, a Parallel Virtual Machine (PVM) extension is presented which, through a middleware solution based on the H2O distributed metacomputing framework, permits the building of a parallel virtual machine in a multilayer cluster grid environment. In addition, the existing code written for PVM can be executed in this environment without modifications. Copyright © 2007 John Wiley & Sons, Ltd.

3.
On the basis of cluster size and cluster cohesion, we propose a generalized cluster‐reliability (CR) measure, which indicates the overall reliability of arguments in a cluster. Taking the reliability of clusters as order‐inducing variables, we introduce a generalized cluster‐reliability‐induced ordered weighted averaging (CRI‐OWA) operator from the viewpoint of combining representative arguments of clusters. Furthermore, we propose a grid‐based cohesion measure for grid‐based clusters. On the basis of this cohesion measure, we obtain the special CR measure and CRI‐OWA operator for the grid‐based clusters. Then we introduce two other special CR measures for graph‐based and prototype‐based clusters, respectively. Taking the CR values computed by these two measures as order‐inducing variables, we obtain two other kinds of CRI‐OWA operators for graph‐based and prototype‐based clusters, respectively. © 2012 Wiley Periodicals, Inc.
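
A minimal Python sketch of an induced ordered weighted average, the general operator family this abstract builds on, may help make the idea concrete. It assumes each cluster has already been reduced to one representative argument and one reliability score; how the paper derives CR from cluster size and cohesion is not given in the abstract and is not modelled here. The function name `induced_owa` and the sample values are hypothetical.

```python
def induced_owa(pairs, weights):
    """Induced OWA over (order_inducing_value, argument) pairs.

    The arguments are reordered by descending order-inducing value
    (here: an assumed cluster reliability score), then combined with
    position weights that must sum to 1.
    """
    if len(pairs) != len(weights):
        raise ValueError("need exactly one weight per argument")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Most reliable cluster first; its representative gets weights[0].
    ordered = sorted(pairs, key=lambda p: p[0], reverse=True)
    return sum(w * arg for w, (_, arg) in zip(weights, ordered))


# Hypothetical input: (reliability, representative argument) per cluster.
clusters = [(0.2, 10.0), (0.9, 4.0), (0.5, 6.0)]
result = induced_owa(clusters, [0.5, 0.3, 0.2])
```

With these made-up inputs the representatives are ordered 4, 6, 10, so the aggregate is 0.5·4 + 0.3·6 + 0.2·10 = 5.8.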

4.
K‐means clustering can be highly accurate when the number of clusters and the initial cluster centres are appropriate. An inappropriate choice of the number of clusters or of the initial cluster centres decreases the accuracy of K‐means clustering. However, determining these values is problematic. To solve these problems, we used density‐based spatial clustering of applications with noise (DBSCAN) because it does not require a predetermined number of clusters; however, it has some significant drawbacks. Using DBSCAN with high‐dimensional data and data with potentially different densities decreases the accuracy to some degree. Therefore, the objective of this research is to improve the efficiency of DBSCAN through a selection of region clusters based on density DBSCAN to automatically find the appropriate number of clusters and initial cluster centres for K‐means clustering. In the proposed method, DBSCAN is used to perform clustering and to select the appropriate clusters by considering the density of each cluster. Subsequently, the appropriate region data are chosen from the selected clusters. The experimental results yield the appropriate number of clusters and the appropriate initial cluster centres for K‐means clustering. In addition, the results of the selection of region clusters based on density DBSCAN method are more accurate than those obtained by traditional methods, including DBSCAN and K‐means and related methods such as Partitioning‐based DBSCAN (PDBSCAN) and PDBSCAN by applying the Ant Clustering Algorithm DBSCAN (PACA‐DBSCAN).

5.
The trends in parallel processing system design and deployment have been toward networked distributed systems such as cluster computing systems. Since the overall performance of such distributed systems often depends on the efficiency of their communication networks, performance analysis of the interconnection networks for such distributed systems is paramount. In this paper, we develop an analytical model, under non‐uniform traffic and in the presence of communication locality, for the m‐port n‐tree family of interconnection networks commonly employed in large‐scale cluster computing systems. We use the proposed model to study two widely used interconnection network flow control mechanisms, namely wormhole and store‐and‐forward. The proposed analytical model is validated through comprehensive simulation. The results of the simulation demonstrate that the proposed model exhibits a good degree of accuracy for various system organizations and under different working conditions. Copyright © 2007 John Wiley & Sons, Ltd.

6.
Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi‐core to high‐performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents particular problems in terms of varying granularity and irregular task sizes that do not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open‐source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross‐platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared memory multi‐core nodes, mid‐scale distributed clusters of (multi‐core) nodes and full‐blown high‐performance computing systems, comprising large‐scale tightly connected networks of multi‐core nodes. This requires us to develop new cross‐layer programming abstractions in the form of new domain‐specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high‐performance systems comprising up to 32000 cores, as well as on ubiquitous multi‐core systems and distributed clusters. The work reported here paves the way towards full‐scale exploitation of symbolic computation by high‐performance computing systems, and we demonstrate the potential with two major case studies. © 2016 The Authors. Concurrency and Computation: Practice and Experience Published by John Wiley & Sons Ltd.

7.
In this paper, we present a generalized model for the performance evaluation of scheduling compute-intensive jobs with unknown service times in computational clusters. We propose the application of parameters defined in the SPECpower_ssj2008 benchmark of the Standard Performance Evaluation Corporation to construct a performance evaluation model. In addition, we define a method to rank physical servers based on either the high performance priority or the energy efficiency priority, and measures to characterize the performance of computational clusters. We investigate three schemes (separate queue, class queue and common queue) for buffering jobs in a computational cluster that is built from Commercial Off-The-Shelf (COTS) servers. Numerical results show that the buffering schemes have no impact on performance measures related to the energy consumption of the investigated cluster. However, the buffering schemes play an important role in ensuring quality of service parameters such as the waiting time and the response time experienced by arriving jobs. Furthermore, Dynamic Voltage and Frequency Scaling should be carefully applied to reduce the energy consumption of computational clusters.
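
The two ranking priorities mentioned in this abstract could be sketched as follows. This is an illustrative assumption, not the paper's method: the abstract does not state which SPECpower_ssj2008 parameters enter the ranking, so the sketch simply uses throughput (ssj_ops) for the performance priority and throughput per watt for the energy efficiency priority. The `Server` type, its fields, and the sample figures are hypothetical.

```python
from typing import NamedTuple


class Server(NamedTuple):
    name: str
    ssj_ops: float  # throughput at full load (made-up figure)
    watts: float    # average active power at full load (made-up figure)


def rank_servers(servers, priority):
    """Return server names, best first, under the chosen priority."""
    if priority == "performance":
        key = lambda s: s.ssj_ops            # raw throughput
    elif priority == "energy":
        key = lambda s: s.ssj_ops / s.watts  # throughput per watt
    else:
        raise ValueError("priority must be 'performance' or 'energy'")
    return [s.name for s in sorted(servers, key=key, reverse=True)]


fleet = [Server("a", 1000, 200), Server("b", 800, 100), Server("c", 1200, 400)]
by_performance = rank_servers(fleet, "performance")
by_energy = rank_servers(fleet, "energy")
```

The same fleet can be ordered differently under the two priorities, which is why a scheduler has to pick one explicitly.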

8.
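Research on Monitoring Methods for Cloud Computing Cluster Server Systems   Total citations: 1, self-citations: 0, citations by others: 1
As cloud computing technology is applied ever more widely across the information industry, the demand for monitoring and managing cluster server systems in cloud environments keeps growing. A cluster server system in the cloud consists mainly of a series of server clusters built on a distributed architecture, and may comprise more than ten thousand servers. Managing a cloud cluster server system of this size and keeping it running at high performance requires an effective cloud cluster monitoring system to monitor and regulate it. However, traditional cluster monitoring systems have a number of shortcomings. This paper studies a high-performance monitoring and scheduling scheme for cloud computing cluster systems, discusses the architecture of the cloud monitoring system, data collection, and load-balancing scheduling, and constructs a cloud system scheme that keeps the cloud computing cluster system operating at high performance.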

9.
10.
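Computer and network hardware has gradually become commoditized and standardized, PCs and workstations offer ever higher performance at ever lower prices, and open-source Linux microkernel and cluster-tool middleware technologies have matured and stabilized. As a result, high-performance computing clusters have developed steadily and become the mainstream high-performance computing platform, gradually replacing dedicated, expensive supercomputers for prototyping, debugging, and running large-scale parallel applications. Research on the rapid deployment, reliability, and manageability of high-performance computing based on PCs or workstations is of far-reaching significance for applying high-performance computing clusters in scientific research and engineering computation and for promoting the use of high-performance computing technology. Taking an OSCAR cluster as an example, this paper deploys a five-node cluster environment and runs a simple parallel test case.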

11.
Workstation and PC clusters interconnected by SCI (scalable coherent interface) are very promising technologies for high-performance cluster computing. Using commercial SBus to SCI interface cards and system software and drivers, a two-workstation cluster has been constructed for initial testing and evaluation. The PVM system has been adapted to operate on this cluster using both raw channel and shared-memory access to the SCI interconnect, and preliminary communications performance tests have been carried out. To achieve mutual exclusion in accessing shared-memory segments, two protocols were used. Our preliminary results indicate that communications throughput in the range of 17.7 Mbytes/s, and round-trip latencies of 80 μs using the first and 140 μs using the second protocol, can be obtained on SCI clusters. These figures are significantly better (by a factor of 2 to 4) for small and large messages than those attainable on Fast Ethernet LANs. Since these performance results are very encouraging, we expect that, in the very near future, SCI networks will be capable of delivering several tens of Mbytes/s bandwidth and a few tens of microseconds latencies, and will significantly enhance the viability of cluster computing. Copyright © 1999 John Wiley & Sons, Ltd.

12.
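Research on Middleware Technology for Pervasive Computing   Total citations: 3, self-citations: 1, citations by others: 3
Xu Lei 《计算机工程》(Computer Engineering) 2004,30(20):113-115
As a new computing paradigm, pervasive computing places special demands on middleware technology. Starting from the characteristics of pervasive computing, this paper analyzes the problems and requirements that pervasive computing middleware should address, proposes four design principles for building pervasive middleware, analyzes the technical characteristics of COTS middleware technology and of typical pervasive computing middleware, and points out the development trends of pervasive computing middleware technology.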

13.
Cluster computing is an attractive approach to provide high‐performance computing for solving large‐scale applications. Owing to the advances in processor and networking technology, expanding clusters have resulted in system heterogeneity; thus, it is crucial to dispatch jobs to heterogeneous computing resources for better resource utilization. In this paper, we propose a new job allocation system for heterogeneous multi‐cluster environments named the Adaptive Job Allocation Strategy (AJAS), in which a self‐scheduling scheme is applied in the scheduler to dispatch jobs to the most appropriate computing resources. Our strategy focuses on increasing resource utilization by dispatching jobs to computing nodes with similar performance capacities. By doing so, execution times among all nodes can be equalized. The experimental results show that AJAS can improve the system performance. Copyright © 2011 John Wiley & Sons, Ltd.

14.
Very large scale networks have become common in distributed systems. To efficiently manage these networks, various techniques are being developed in the distributed and networking research community. In this paper, we focus on one of those techniques, network clustering, i.e., the partitioning of a system into connected subsystems. The clustering we compute is size-oriented: given a parameter K of the algorithm, we compute, as far as possible, clusters of size K. We present an algorithm to compute a binary hierarchy of nested disjoint clusters. A token browses the network and recruits nodes to its cluster. When a cluster reaches a maximal size defined by a parameter of the algorithm, it is divided when possible, and tokens are created in both of the new clusters. The new clusters are then built and divided in the same fashion. The token browsing scheme chosen is a random walk, in order to ensure local load balancing. To allow the division of clusters, a spanning tree is built for each cluster. At each division, information on how to route messages between the clusters is stored. The naming process used for the clusters, along with the information stored during each division, allows routing between any two clusters.

15.
Modern cloud computing applications are developed from different interoperable services that interface with each other in a loosely coupled fashion. This work proposes the concept of Virtual Machine (VM) cluster migration, meaning that services can be migrated to various clouds based on different constraints such as computational resources and better economic offerings. Since cloud services are instantiated as VMs, an application can be seen as a cluster of VMs that together provide its functionality. We focus on VM cluster migration by exploring a more sophisticated method with regard to VM network configurations. In particular, networks are hard to manage because their internal setup changes after a migration, and this is related to the configuration parameters used during re-instantiation on the new cloud platform. To address this issue, we introduce a Software Defined Networking (SDN) service that breaks the problem of network configuration into tractable pieces and involves virtual bridges instead of references to static endpoints. The architecture is modular, is based on the SDN OpenFlow protocol, and allows VMs to be paired in cluster groups that communicate with each other independently of the cloud platform on which they are deployed. The experimental analysis demonstrates migrations of VM clusters and provides a detailed discussion of service performance for different cases.

16.
With the recent emergence of cloud computing based services on the Internet, MapReduce and distributed file systems like HDFS have emerged as the paradigm of choice for developing large scale data intensive applications. Given the scale at which these applications are deployed, minimizing power consumption of these clusters can significantly cut down operational costs and reduce their carbon footprint, thereby increasing the utility from a provider’s point of view. This paper addresses energy conservation for clusters of nodes that run MapReduce jobs. The algorithm dynamically reconfigures the cluster based on the current workload and turns cluster nodes on or off when the average cluster utilization rises above or falls below administrator specified thresholds, respectively. We evaluate our algorithm using the GridSim toolkit and our results show that the proposed algorithm achieves an energy reduction of 33% under average workloads and up to 54% under low workloads.
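
The threshold rule described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract gives neither the threshold values nor how many nodes are switched per step, so the defaults below (30% and 80%, one node at a time, at least one node kept on) and the function name `reconfigure` are hypothetical.

```python
def reconfigure(active_nodes, total_nodes, avg_utilization,
                low=0.3, high=0.8, min_active=1):
    """Return the number of active nodes after one control step.

    Turns one node on when average utilization exceeds `high`,
    turns one node off when it drops below `low`, and otherwise
    leaves the cluster unchanged. All defaults are assumed values.
    """
    if avg_utilization > high and active_nodes < total_nodes:
        return active_nodes + 1  # scale up: power on one more node
    if avg_utilization < low and active_nodes > min_active:
        return active_nodes - 1  # scale down: power off one node
    return active_nodes          # within the band, or at a limit


# One step per measurement interval, on a hypothetical 10-node cluster.
active = 4
for utilization in (0.9, 0.85, 0.5, 0.1):
    active = reconfigure(active, 10, utilization)
```

Keeping a gap between the two thresholds avoids switching nodes on and off repeatedly when utilization hovers near a single cut-off.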

17.
Reconstruction‐based one‐class classification has been shown to be very effective in a number of domains. This approach works by attempting to capture the underlying structure of the normal class, typically, by means of clusters of objects. It has the main disadvantage, however, that one has to indicate the number of clusters in advance, for this yields an efficient way of computing a clustering. In this paper, we introduce a new algorithm, OCKRA++, which achieves a better performance, by enhancing a clustering‐based one‐class ensemble classifier (OCKRA) with a cluster validity index that is used to set the best number of clusters during the classifier's training process. We have thoroughly tested OCKRA++ in a particular domain, namely masquerade detection. For this purpose, we have used the Windows‐Users and ‐Intruder simulation Logs data set repository, which contains 70 different masquerade data sets. We have found that OCKRA++ is currently the algorithm that achieves the best area under the curve, with a significant difference, in masquerade detection using the file system navigation approach.

18.
Wireless sensor networks are rapidly evolving technological platforms with numerous applications in several domains. Since sensor nodes are battery powered and may be used in dangerous or inaccessible environments, it is difficult to replace or recharge their power supplies. Clustering is an effective approach to achieve energy efficiency in wireless sensor networks. In clustering-based routing protocols, cluster heads are selected among all sensor nodes within the network, and then clusters are formed by simply assigning each node to the nearest cluster head. The main drawback is that there is no control over the distribution of cluster heads across the network. In addition to the problem of generating unbalanced clusters, almost all routing protocols are designed for a certain application scope, and cannot cover all applications. In this paper, we propose a swarm intelligence based fuzzy routing protocol (named SIF) to overcome these drawbacks. In SIF, the fuzzy c-means clustering algorithm is used to cluster all sensor nodes into balanced clusters, and then appropriate cluster heads are selected via a Mamdani fuzzy inference system. This strategy not only guarantees balanced clusters over the network, but also has the ability to determine the precise number of clusters. In fuzzy-based routing protocols in the literature, the fuzzy rule base table is defined manually, which is not optimal for all applications. Since tuning the fuzzy rules strongly affects the performance of the fuzzy system, we use a hybrid swarm intelligence algorithm based on the firefly algorithm and simulated annealing to optimize the fuzzy rule base table of SIF. The fitness function can be defined according to the application specifications. Unlike other routing protocols, which have been designed for a certain application scope, the main objective of our methodology is to prolong the network lifetime based on the application specifications. In other words, SIF not only prolongs the network lifetime, but is also applicable to any kind of application. Simulation results over 10 heterogeneous networks show that SIF outperforms the existing clustering-based protocols in terms of generating balanced clusters and prolonging the network lifetime.

19.
Chee Shin Yeo  Rajkumar Buyya 《Software》2006,36(13):1381-1419
In utility‐driven cluster computing, cluster Resource Management Systems (RMSs) need to know the specific needs of different users in order to allocate resources according to their needs. This in turn is vital to achieve service‐oriented Grid computing that harnesses resources distributed worldwide based on users' objectives. Recently, numerous market‐based RMSs have been proposed to make use of real‐world market concepts and behavior to assign resources to users for various computing platforms. The aim of this paper is to develop a taxonomy that characterizes and classifies how market‐based RMSs can support utility‐driven cluster computing in practice. The taxonomy is then mapped to existing market‐based RMSs designed for both cluster and other computing platforms to survey current research developments and identify outstanding issues. Copyright © 2006 John Wiley & Sons, Ltd.

20.
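Cluster-based routing can save energy in sensor networks through data aggregation, but existing cluster-based routing schemes have some shortcomings. To address them, a clustering routing protocol based on five-color marking is proposed. The protocol selects the maximum-degree node that satisfies the conditions as the cluster head, so that each cluster head covers as many nodes as possible, which reduces the number of clusters formed and the inter-cluster traffic. Nodes within the communication radius are chosen as cluster members, which reduces intra-cluster traffic. Traffic across the whole network is thereby kept in check, achieving energy efficiency. In addition, transferring the cluster-head role helps keep the network in a stable state and reduces the energy overhead and network delay of frequent cluster formation. The result is an energy-efficient routing protocol for sensor networks.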
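
The maximum-degree cluster-head idea can be illustrated with a small greedy sketch in Python. It is a simplification under explicit assumptions: the five-color marking, the eligibility conditions for heads, and cluster-head transfer are not specified in the abstract and are left out. The graph representation, the tie-break by smallest node id, and the function name `form_clusters` are hypothetical.

```python
def form_clusters(adjacency):
    """Greedy maximum-degree clustering.

    adjacency maps each node to the set of nodes within its
    communication radius. Returns {cluster_head: set_of_members}.
    """
    unassigned = set(adjacency)
    clusters = {}
    while unassigned:
        # Degree counts only neighbours that are still unassigned;
        # sorting first makes ties resolve to the smallest node id.
        head = max(sorted(unassigned),
                   key=lambda n: len(adjacency[n] & unassigned))
        members = adjacency[head] & unassigned
        clusters[head] = members
        unassigned -= members | {head}
    return clusters


# Hypothetical six-node topology.
topology = {
    1: {2, 3, 4},
    2: {1},
    3: {1},
    4: {1, 5},
    5: {4, 6},
    6: {5},
}
clusters = form_clusters(topology)
```

On this topology node 1 is chosen first because it covers three neighbours, and nodes 5 and 6 form the second cluster, so two heads cover all six nodes.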
