首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
S.J Shyu  R.C.T Lee   《Parallel Computing》1990,16(2-3):343-350
This paper describes a vectorized algorithm for the partition problem, a famous NP-complete problem. A set of partition problems that required 518 seconds to be solved on a VAX/8550 computer would require only 2 seconds on the Cyber 205 under the vector mode. A partition problem with n equal to 1000 was solved in only 6.34 seconds.  相似文献   

2.
背包问题属于著名的NP完全问题,在信息密码学和数论研究中有着极其重要的应用。在深入分析背包问题现有并行算法的基础上,本文提出了一种基于采样和MIMD结构的背包问题并行求解算法,并给出了算法性能的理论分析和在IBMP690超级计算机上的实验结果。实验结果表明,当背包实例的维数n≥40时,本算法的并行效率可达60%以上。因此,本并行算法具有较好的可扩展性,能应用于各种MIMD结构的并行机上有效地求解背包问题。  相似文献   

3.
Intensive use of computer experiments in modern science introduces qualitative changes into experimental resources. This implies the change in techniques used to solve relevant problems. An analysis of technological chains (from the problem statement to its solution) shows that often a particular problem can be solved in a variety of ways with the use of modern multiprocessor computers, which are also called supercomputers. The multiplicity of approaches to solving a problem requires that researches possess certain skills in using supercomputers. It is difficult for novice users of multiprocessor computers to find bearings when developing software for solving applied problems. The practice shows that main difficulties reveal themselves when it is required to develop portable and efficient parallel software. This is because tools that facilitate the development and provide full access to debugging information have yet to be elaborated. Actually, the problem is in the absence of standards for development and debugging tools for supercomputers, which is explained by the fact that computer science is yet young. For the same reason, no logically complete basic texts for concurrent programming courses for novices are available. On the basis of Russian-language literature, an attempt is made at setting up beacons that mark certain common and promising technologies in using supercomputers. The emphasis is made on problems encountered by programmers when solving applied problems with the use of supercomputers. The development of multiprocessor computers is closely related to concurrent programming technologies, both universal and oriented to specific supercomputer architectures. By programming technology, i.e., by memory management, we mean the use of tools designed for managing a particular computer system. It should be noted that when developing software for supercomputers (both management software and programs for solving applied problems), one must pay special attention to programming technique, i.e., to designing the logical architecture of a program. This implies the development and extending parallelizing algorithms, which enhances the efficiency of execution on multiprocessor computers. This review was compiled on the basis of publications in Russian journals and in the Russian Internet zone.  相似文献   

4.
A code for the direct numerical simulation (DNS) of incompressible flows with one periodic direction has been developed. It provides a fairly good performance on both Beowulf clusters and supercomputers. Since the code is fully explicit, from a parallel point-of-view, the main bottleneck is the Poisson equation. To solve it, a Fourier diagonalization is applied in the periodic direction to decompose the original 3D system into a set of mutually independent 2D systems. Then, different strategies can be used to solved them. In the previous version of the code, that was conceived for low-cost PC clusters with poor network performance, a Direct Schur-complement Decomposition (DSD) algorithm was used to solve them. Such a method, that is very efficient for PC clusters, cannot be used with an arbitrarily large number of processors and mesh sizes, mainly due to the RAM memory requirements. To do so, a new version of the solver is presented in this paper. It is based on the DSD algorithm that is used as a preconditioner for a Conjugate Gradient method. Numerical experiments showing the scalability and the flexibility of the method on both the MareNostrum supercomputer and a PC cluster with a conventional 100 Mbits/s network are presented and discussed. Finally, illustrative DNS results of an air-filled differentially heated cavity at Ra = 1011 are also presented.  相似文献   

5.
基于我国超级计算机平台,开展了大规模并行时域有限差分法(Finite-Difference Time-DomainFDTD)的性能和应用研究。在我国首台百万亿次"魔方"超级计算机、具有国产CPU的"神威蓝光"超级计算机和当前排名世界第一的"天河二号"超级计算机上就并行FDTD方法的并行性能进行了测试,并分别突破了10000 CPU核,100000 CPU核和300000 CPU核的并行规模。在不同测试规模下,该算法的并行效率均达到了50%以上,表明了本文并行算法具有良好的可扩展性。通过仿真分析多个微带天线阵的辐射特性和某大型飞机的散射特性,表明本文方法可以在不同架构的超级计算机上对复杂电磁问题进行精确高效电磁仿真。  相似文献   

6.
HPCG基准测试程序是一种新的超级计算机排名度量标准.该测试基准主要用于衡量超级计算机解决大规模稀疏线性系统的能力,更贴近实际应用,近年来广受关注.基于国产超级计算机研究异构众核并行HPCG软件具有非常重要的意义,其不仅可以提升国产超级计算机HPCG的排名,还对很多应用提供了并行算法、优化技术等方面的参考.面向某国产复杂异构超级计算机开展研究,首先采用了分块图着色算法对HPCG进行并行,并提出一种适用于结构化网格的图着色算法.该算法并行性能高于传统的JPL、CC等算法,且着色质量高,运用于HPCG后,迭代次数减少了3次,整体性能提升了6%.分析了复杂异构系统各个部件传输的开销,提出一套更适用于HPCG的任务划分方法,并从稀疏矩阵存储格式、稀疏矩阵重排、访存等角度开展了细粒度的优化.在多进程计算时,还采用内外区划分算法将核心函数SpMV、SymGS中的邻居通信操作进行了隐藏.最终整机测试时,性能达到了国产超级计算机峰值性能的1.67%,与单节点相比,整机弱可扩展性并行效率达到了92%.  相似文献   

7.
矩量法是广泛使用的高精度电磁数值算法之一。在仿真复杂电磁问题时,该算法需要处理大型复数稠密矩阵方程,这导致其面临内存需求高、计算时间长的问题。与传统基函数相比,本文采用的高阶多项式基函数能够在保证计算精度的前提下大幅度降低未知量,进而降低矩阵阶数。在此基础上,本文设计了基于分块矩阵的高效并行策略,在国内超级计算机平台开展了并行高阶矩量法的超级电磁计算研究,大幅度提升了矩量法的仿真能力。在国产神威蓝光超级计算机上,以机载天线阵列的辐射特性计算为例,对并行规模高达30720 CPU核时的算法性能进行了评估,测试结果表明算法在并行规模扩大20倍以上时仍可获得50%以上的并行效率。在当前排名世界第一的天河2号超级计算机上,以飞机散射特性计算为例,对并行规模高达201600 CPU核时的算法性能进行了评估,测试结果表明算法在并行规模扩大约8倍时可获得50%以上的并行效率。数值仿真结果表明并行高阶矩量法可以在不同架构的超级计算机上高效完成复杂电大目标的精确电磁计算。  相似文献   

8.
近两年,超级计算机、计算机图形学和科学可视化是三个引人注目的论题。今天,大量的计算机应用问题是运用超级计算机以及图形系统和科学可视化来解决的。本文介绍了超级计算机图形学产生和发展的背景,着重阐述了更高级的图形学及其应用的研究与发展,总结了超级计算机可视化的研究现状和发展趋势。最后,本文指出,基于超级计算机图形学的可视化时代已经到来。  相似文献   

9.
测试集问题的集合覆盖贪心算法的深入近似   总被引:1,自引:0,他引:1  
崔鹏  刘红静 《软件学报》2006,17(7):1494-1500
测试集问题是一个有着广泛应用的NP难问题.集合覆盖贪心算法是测试集问题的一个常用近似算法,其由集合覆盖问题得到的近似比21nn+1能否改进是一个公开的问题.集合覆盖贪心算法的推广被用来求解生物信息学中出现的冗余测试集问题.通过分析条目对被区分次数的分布情况,用去随机方法证明了集合覆盖贪心算法对测试集问题的近似比可以为1.51nn+0.5lnlnn+2,从而缩小了这种算法近似比分析的间隙.另外,给出了集合覆盖贪心算法对冗余度为n-1的加权冗余测试集问题的近似比的紧密下界(2-o(1))lnn-Θ 1).  相似文献   

10.
针对海洋数值模式大规模连续性运行保障的需求,基于国产神威超级计算机设计和开发一套切实可行的海洋数值模式运行管理系统。采用B/S的结构模式,基于JeeSite框架实现系统快速开发,实现了模式流程管理、模式软件与数据管理、模式性能分析和故障快速定位等主要功能。该系统规范了海洋数值模式软件的管理和使用流程,并能够及时发现模式运行异常和运行故障,有利于保障大规模模式作业的连续性运行,提高国产超级计算机的易用性。  相似文献   

11.
“神威·太湖之光”高效能计算机系统是世界上首台峰值运算速度超过 10 亿亿次量级的超级计算机,HPSEPS (High Performance Symmetric Eigenproblem Solvers) 是自主开发的大规模对称稠密矩阵特征问题并行求解器,包括标准对称稠密矩阵特征问题的并行计算方法, 对大规模数据问题的计算,表现出较好的性能,本文分别在中科院的“元”超级计算机上和神威·太湖之光超级计算机上进行了移植, 对比了两种超级计算机的系统性能, 并且在“神威·太湖之光”上分别链接适合其异构众核结构的 xMath 数学库和 mkl 数学库, 对求解器在链接两种不同数学库的计算机效果进行了测试与分析。  相似文献   

12.
HPCG基准测试程序是一种新的超级计算机排名度量标准.该测试基准主要用于衡量超级计算机解决大规模稀疏线性系统的能力,更贴近实际应用,近年来广受关注.基于国产超级计算机研究异构众核并行HPCG软件具有非常重要的意义,其不仅可以提升国产超级计算机HPCG的排名,还对很多应用提供了并行算法、优化技术等方面的参考.本文面向某国产复杂异构超级计算机开展研究,首先采用了分块图着色算法对HPCG进行并行,并提出一种适用于结构化网格的图着色算法,该算法并行性能高于传统的JPL、CC等算法,且着色质量高,运用于HPCG后,迭代次数减少了3次,整体性能提升了6%.本文还分析了复杂异构系统各个部件传输的开销,提出一套更适用于HPCG的任务划分方法,并从稀疏矩阵存储格式、稀疏矩阵重排、访存等角度开展了细粒度的优化.另外在多进程计算时,还采用了内外区划分算法将核心函数SpMV、SymGS中的邻居通信操作进行了隐藏.最终整机测试时,性能达到国产超级计算机峰值性能的1.67%,相比单节点,整机弱可扩展性并行效率达到了92%.  相似文献   

13.
In this paper, we introduce carousel greedy, an enhanced greedy algorithm which seeks to overcome the traditional weaknesses of greedy approaches. We have applied carousel greedy to a variety of well-known problems in combinatorial optimization such as the minimum label spanning tree problem, the minimum vertex cover problem, the maximum independent set problem, and the minimum weight vertex cover problem. In all cases, the results are very promising. Since carousel greedy is very fast, it can be used to solve very large problems. In addition, it can be combined with other approaches to create a powerful, new metaheuristic. Our goal in this paper is to motivate and explain the new approach and present extensive computational results.  相似文献   

14.
Micro-structural finite element (μFE) analysis based on high-resolution computed tomography represents the current gold standard to predict bone stiffness and strength. Recent progress in solver technology makes possible simulations on large supercomputers that involve billions of degrees of freedom.In this paper we present an improved solver that has a significantly smaller memory footprint compared to the currently used solvers. This new approach fully exploits the information that is contained in the underlying CT image itself. It admits to execute all steps in the underlying multigrid-preconditioned conjugate gradient algorithm in matrix-free form.The reduced memory footprint allows to solve bigger bone models on a given hardware. It is an important step forward to the clinical usage of μFE simulations.The new solver is fully parallel. We show almost perfect scalability up to 8000 cores of a Cray XT-5 supercomputer.  相似文献   

15.
The increasing availability and use of supercomputers has highlighted the need for better software development techniques and tools. Supercomputers have traditionally been extensively used by engineers and scientists whose preference for Fortran is well established and recognized. However, in the parallel environment offered by the latest configurations of supercomputers, more sophisticated languages and tools are required. The present experiment is concerned with devising a syntax-directed integrated programming environment based on the language Actus, which enables a user to develop and debug programs before submitting them to an actual supercomputer. Actus is a high-level, Pascal-like language, with SIMD parallel processing features. It enables the user to ignore the idiosyncrasies of the chosen supercomputer by abstracting the parallel processing detail to a higher level. The editing, compilation and testing phases of program development are all integrated, providing a single consistent interface for all activities relating to program development.  相似文献   

16.
With the increase of system scale, the inherent reliability of supercomputers becomes lower and lower. The cost of fault handling and task recovery increases so rapidly that the reliability issue will soon harm the usability of supercomputers. This issue is referred to as the “reliability wall”, which is regarded as a critical problem for current and future supercomputers. To address this problem, we propose an autonomous fault-tolerant system, named Iaso, in MilkyWay-2 system. Iaso introduces the concept of autonomous management in supercomputers. By autonomous management, the computer itself, rather than manpower, takes charge of the fault management work. Iaso automatically manage the whole lifecycle of faults, including fault detection, fault diagnosis, fault isolation, and task recovery. Iaso endows the autonomous features with MilkyWay-2 system, such as self-awareness, self-diagnosis, self-healing, and self-protection. With the help of Iaso, the cost of fault handling in supercomputers reduces from several hours to a few seconds. Iaso greatly improves the usability and reliability of MilkyWay-2 system.  相似文献   

17.
Supercomputers are prevalent and vital to scientific research and industrial fields, and may be used to represent the level of national scientific development. A summary of the evolution of supercomputers will help direct the future development of supercomputers and supercomputing applications. In this paper, we summarize the accomplishments in supercomputing, predict the trend of future supercomputers, and present several breakthroughs in supercomputer architecture research.  相似文献   

18.
In this paper, we describe the results of several tests that check the accuracy of numerical computation on the Cray supercomputer in vector and scalar modes. The known tests were modified to identify the critical point where roundings start causing problems. After describing the tests, we present an interval library called libavi.a. It was developed in Fortran 90 on the Cray Y-MP2E supercomputer of UFRGS-Brazil. This library makes interval mathematics accessible to the Cray supercomputers users. It works with real and complex intervals and intervals matrices and vectors. The library allows overloading of operators and functions. It is organized in four modules: real intervals, interval vectors and matrices, complex intervals, and linear algebra applications.  相似文献   

19.
Supercomputers are prevalent and vital to scientific research and industrial fields, and may be used to represent the level of national scientific development. A summary of the evolution of supercomputers will help direct the future development of supercomputers and supercomputing applications. In this paper, we summarize the accomplishments in supercomputing, predict the trend of future supercomputers, and present several breakthroughs in supercomputer architecture research.  相似文献   

20.
Given an undirected, vertex-weighted graph, the goal of the minimum weight vertex cover problem is to find a subset of the vertices of the graph such that the subset is a vertex cover and the sum of the weights of its vertices is minimal. This problem is known to be NP-hard and no efficient algorithm is known to solve it to optimality. Therefore, most existing techniques are based on heuristics for providing approximate solutions in a reasonable computation time.Population-based search approaches have shown to be effective for solving a multitude of combinatorial optimization problems. Their advantage can be identified as their ability to find areas of the space containing high quality solutions. This paper proposes a simple and efficient population-based iterated greedy algorithm for tackling the minimum weight vertex cover problem. At each iteration, a population of solutions is established and refined using a fast randomized iterated greedy heuristic based on successive phases of destruction and reconstruction. An extensive experimental evaluation on a commonly used set of benchmark instances shows that our algorithm outperforms current state-of-the-art approaches.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号