
1.

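陈军, 莫则尧, 李晓梅, 袁国兴《计算机研究与发展》2000,37(11):1382-1388

To support future ultra-large-scale parallel computing, algorithms and application programs must scale well. Earlier scalability research concentrated on analyzing algorithms and rarely examined in depth why real programs scale poorly, so it could not give users targeted guidance for improving their programs. This paper proposes numerical scalability and parallel scalability to describe how the numerical performance and the parallel performance of a parallel system scale. It discusses in depth the possible causes of low numerical and parallel scalability and proposes a set of scalability evaluation criteria. Using these criteria together with the near-optimal scalability method, the authors analyze a large-scale application, a parallel program for the two-dimensional plasma particle-in-cell method. The results show that the criteria help locate the causes of low scalability. They also show that, for real large-scale applications, given execution information from small-scale problems, near-optimal scalability analysis offers a way to predict how many processors a larger problem can reasonably run on. Here "reasonably" means that the execution time is close to the minimum while efficiency is considerably improved.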

2.

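Research on and Application of Parallel Computing Scalability in Distributed-Memory Environments

陈军, 李晓梅《计算机工程与科学》2001,23(4):109-109

The research and contributions of this work are summarized as follows. (1) Parallel computing models are the foundation for studying the scalability of parallel computing. Building on an in-depth analysis of existing parallel computing models, this work classifies the commonly used models and identifies their scope of application, strengths, and weaknesses. (2) It analyzes in depth the relationships between scalability and execution time and between scalability and single-processor performance. The results show that a one-sided emphasis on execution time or single-processor performance may harm scalability. The effect of task and data allocation strategies on parallel-system scalability is analyzed both theoretically and experimentally. (3) It proposes, for the first time, a near-optimal scalability model from the standpoint of cost-effectiveness. The model not only describes the scalability of a parallel system but can also, based on the performance metrics of a small-scale system, …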

3.

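A More Effective Scalability Model for Parallel Systems (cited 12 times in total; self-citations: 0; citations by others: 12)

王与力, 杨晓东《计算机学报》2001,24(1):84-90

The paper first analyzes the characteristics of three scalability models for parallel systems: isoefficiency, isospeed, and equal ratio of parallel overhead to computation. It proves that the three conditions are equivalent and points out that these models are unintuitive and limited when describing scalability. It then proposes a new scalability model that directly reflects how the performance of a parallel system scales as the machine size and the problem size grow. Case studies show that the model addresses the following problems more effectively: (1) studying the scalability of parallel systems quantitatively; (2) reflecting comprehensively how program, machine, and environment factors affect scalability; (3) guiding how to maintain the scalability of a parallel system.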

4.

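Design and Implementation of a Virtual Network Experiment System Based on Distributed Objects

龚向坚, 邹腊梅, 隆重《微机发展》2010,(1):116-119,123

Networks play an increasingly important role in the information age, and more and more people want an inexpensive, practical, simple, and highly scalable network experiment environment in which to learn about and study networks. Computer network simulation emerged to meet this need. The paper studies the parallel implementation and optimization of distributed objects and proposes a parallel model for computer network simulation based on distributed objects. With distributed object technology at its core, combined with parallel programming and simulation techniques, it designs and implements a virtual computer network experiment system. Experimental results show that the system is reasonably designed, parallelizes well, and responds at a moderate speed, and that it has good application value in both network teaching and research.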

5.

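Design and Implementation of a Virtual Network Experiment System Based on Distributed Objects

龚向坚, 邹腊梅, 隆重《计算机技术与发展》2010,20(1):116-119,123

Networks play an increasingly important role in the information age, and more and more people want an inexpensive, practical, simple, and highly scalable network experiment environment in which to learn about and study networks. Computer network simulation emerged to meet this need. The paper studies the parallel implementation and optimization of distributed objects and proposes a parallel model for computer network simulation based on distributed objects. With distributed object technology at its core, combined with parallel programming and simulation techniques, it designs and implements a virtual computer network experiment system. Experimental results show that the system is reasonably designed, parallelizes well, and responds at a moderate speed, and that it has good application value in both network teaching and research.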

6.

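Collision Detection Based on Data Parallelism

《计算机工程》2017,(9):1-6

In precise collision detection for building information modeling, data volumes keep growing, yet serial execution can no longer keep speeding up as processor clock frequency increases. To address this problem, the paper constructs a data-parallel computing model for multi-core and many-core processors and, based on it, proposes a data-parallel collision detection method. The models taking part in collision detection are subdivided into cubes to remove data dependencies; data-parallel procedures for model combination, conflict detection, and reduction are designed; and the abstract form and theoretical execution time of the algorithm are analyzed. Experimental results show that the method is feasible and sustainably scalable, and that it offers an efficient data-parallel approach to data-intensive problems.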

7.

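Deterministic Multithreaded Scheduling Based on Long-Parallel-Distance-First

马超, 尹杰, 江凌波, 甄凯《小型微型计算机系统》2012,(10):2177-2181

As multi-core technology advances, multithreading is used ever more widely in software. Because execution is nondeterministic, however, multithreaded programs are very hard to troubleshoot and debug. A deterministic multithreading system makes a multithreaded program execute deterministically: repeated runs of the same program follow the same order and produce the same result, which greatly simplifies troubleshooting and debugging. Deterministic multithreading systems, however, degrade the performance of multithreaded programs. This paper proposes a deterministic multithreaded scheduling algorithm based on long-parallel-distance-first, which runs threads with a long parallel distance first to reduce the total thread waiting time and thus improve the efficiency of multithreaded programs. Experimental results show that the method improves the performance of multithreaded programs by 10% and scales well.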

8.

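Design of a Visual Simulation Platform for Flight Control Systems

刘希, 朱凡, 蔡满意, 张健, 陈冰《计算机工程》2011,37(16):260-262

The paper proposes a general-purpose visual simulation platform for flight control systems. The overall framework is designed around parallel computing, and the flight dynamics system and the flight control system are solved in parallel using OpenMP multithreaded multi-core programming and the single program, multiple data (SPMD) technique. The platform can load various flight controllers, flight dynamics models, and digital maps for simulation, and can display the results as numbers, curves, and 3D animation. Validation with the simulation design of the Beaver multi-mode autopilot shows that the platform executes efficiently, is easy to extend, and is highly general.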

9.

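A Quantitative GPGPU Performance Model for the OpenCL Architecture

朱俊峰, 陈钢, 张珂良, 吴百锋《小型微型计算机系统》2013,34(5)

To evaluate how data-level parallel (DLP) applications perform on GPU architectures after parallelization, the paper proposes a quantitative GPGPU performance model for the OpenCL architecture. The model takes into account the factors that affect GPGPU program performance: global memory access, local memory access, overlap of computation and memory access, conditional branching, and synchronization. By statically analyzing a DLP application and fixing a specific OpenCL execution configuration, the model can estimate the application's execution time on a GPU architecture without requiring an actual GPGPU program to be written. Analysis and experiments with matrix multiplication and parallel prefix sum on an AMD Radeon HD 5870 GPU and an NVIDIA GeForce GTX 280 GPU show that the model estimates the execution time of parallelized DLP applications relatively accurately.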

10.

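Design of a Transactional Memory Framework for Data Centers

孙勇《计算机工程与应用》2011,47(27):74-76

Targeting the characteristics of cloud computing data centers built from computer clusters, the paper proposes a distributed programming framework based on transactional memory. The framework encapsulates cloud computing tasks as transactions and automatically handles scheduling and execution, load balancing, and failure recovery for all transactions. It encapsulates the data center's distributed data as transactional objects and guarantees the ACID properties when transactions access them. Compared with similar work, it does not require users to manage the program's parallel control, which makes it simple to use. The framework has been implemented in a simulation environment, and experimental results show that it has good scalability and fault tolerance.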

11.

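Research on Parallel Discrete Event Simulation Based on the Optimistic Strategy

李俊红, 杨洪斌, 吴悦《计算机工程与设计》2006,27(1):12-14,40

Parallel discrete event simulation (PDES), also called distributed simulation, improves simulation performance by executing a discrete event simulation program in parallel on multiple processors. The optimistic strategy performs well in resolving the synchronization among the parts of a parallel simulation. The paper introduces the principles of PDES based on the optimistic strategy, discusses the open problems, and gives corresponding solutions.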

12.

Asynchronous parallel simulation of parallel programs

Prakash S., Deelman E., Bagrodia R. 《IEEE Transactions on Software Engineering》2000,26(5):385-400

Parallel simulation of parallel programs for large datasets has been shown to offer significant reduction in the execution time of many discrete event models. The paper describes the design and implementation of MPI-SIM, a library for the execution-driven parallel simulation of task and data parallel programs. MPI-SIM can be used to predict the performance of existing programs written using MPI for message passing, or written in UC, a data parallel language, compiled to use message passing. The simulation models can be executed sequentially or in parallel. Parallel execution of the models is synchronized using a set of asynchronous conservative protocols. The paper demonstrates how protocol performance is improved by the use of application-level, runtime analysis. The analysis targets the communication patterns of the application. We show the application-level analysis for message passing and data parallel languages. We present the validation and performance results for the simulator for a set of applications that include the NAS Parallel Benchmark suite. The application-level optimization described in the paper yielded significant performance improvements in the simulation of parallel programs, and in some cases completely eliminated the synchronizations in the parallel execution of the simulation model.

13.

HLA based architecture for molecular communication simulation

《Simulation Modelling Practice and Theory》2014

In parallel with developments in nano-scale machines and engineered cells, communication among them is receiving more attention. The use of molecules to transfer information from one nanomachine to another is one of the mechanisms for building networks at the nano-scale. Research on communication models and performance evaluation of molecular communication depends heavily on simulations to verify the proposed models. The existing simulation tools for computer networking cannot be used directly for molecular communication because of the different communication model, the channel characteristics of the fluid environment, and the carrier wave properties. Simulation of molecular communication requires modeling the new communication paradigm, which comprises different options for encoding, transmission, propagation, reception, and decoding. It should consider possible architectural design options and performance evaluation of molecular communication networks. Since molecular communication involves modeling a large number of nano-scale objects, the scalability of the simulation tool is another important concern. In this paper, we introduce a simulator design that aims to address important design issues of the molecular communication model, focusing on scalability. High Level Architecture (HLA), which is standardized under IEEE 1516, is used to design and develop a distributed simulation tool for molecular communication. The results show that different scalability options can be used to benefit from additional processing power to shorten the execution time. This also enables modeling large systems, which may not be possible otherwise.

14.

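Research on Synchronization Mechanisms for Parallel Discrete Event Simulation (cited 2 times in total; self-citations: 0; citations by others: 2)

李俊红, 解建军, 王喜年, 陈丽娟《计算机工程与设计》2006,27(13):2375-2377,2389

Logic simulation plays an important role in designing new systems. Computer simulation gives real-time feedback on outputs and exposes potential problems early, which shortens the design cycle and lowers development cost. Parallel discrete event simulation reduces simulation time by distributing the computation across a parallel machine or multiple network nodes, and is regarded as an effective way to address simulation speed. Among the factors that affect simulation performance, synchronization among the parallel subsystems is one of the key factors that directly affect parallel performance. The paper examines the synchronization mechanisms of parallel discrete event simulation, introduces their basic principles, characteristics, and existing problems, and describes possible improvements.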

15.

Asynchronous parallel discrete event simulation

Yi-Bing Lin, Fishwick P.A. 《IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans》1996,26(4):397-412

Complex models may have model components distributed over a network and generally require significant execution times. The field of parallel and distributed simulation has grown over the past fifteen years to accommodate the need to simulate complex models using a distributed rather than a sequential method. In particular, asynchronous parallel discrete event simulation (PDES) has been widely studied, and yet we envision greater acceptance of this methodology as more readers are exposed to PDES introductions that carefully integrate real-world applications. With this in mind, we present two key methodologies (conservative and optimistic) which have been adopted as solutions to PDES systems. We discuss PDES terminology and methodology under the umbrella of the personal communications services application.

16.

ArchSim: A System-Level Parallel Simulation Platform for the Architecture Design of High Performance Computer (cited 2 times in total; self-citations: 0; citations by others: 2)

Yong-Qin Huang 《计算机科学技术学报》2009,24(5):901-912

A high performance computer (HPC) is a large, complex system whose architecture design faces increasing difficulties and risks. Traditional methods, such as theoretical analysis, component-level simulation, and sequential simulation, are not applicable to system-level simulations of HPC systems. Even parallel simulation on large-scale parallel machines faces many difficulties in scalability, reliability, generality, and efficiency. According to the current needs of HPC architecture design, this paper proposes a system-level parallel simulation platform: ArchSim. We first introduce the architecture of the ArchSim simulation platform, which is composed of a global server (GS), local server agents (LSA), and entities. Secondly, we emphasize some key techniques of ArchSim, including the synchronization protocol, the communication mechanism, and the distributed checkpointing/restart mechanism. We then run a comprehensive test of the main performance indices of ArchSim with the phold benchmark and analyze the extra overhead generated by ArchSim. Finally, based on ArchSim, we construct a parallel event-driven interconnection network simulator and a system-level simulator for a small-scale HPC system with 256 processors. The results of the performance test and the HPC system simulations demonstrate that ArchSim achieves a high speedup ratio and high scalability on the parallel host machine and supports system-level simulations for the architecture design of HPC systems.

17.

A simulator for adaptive parallel applications

Basile Schaeli Sebastian Gerlach Roger D. Hersch 《Journal of Computer and System Sciences》2008,74(6):983-999

Dynamically allocating computing nodes to parallel applications is a promising technique for improving the utilization of cluster resources. Detailed simulations can help identify allocation strategies and problem decomposition parameters that increase the efficiency of parallel applications. We describe a simulation framework supporting dynamic node allocation which, given a simple cluster model, predicts the running time of parallel applications taking CPU and network sharing into account. Simulations can be carried out without modifying the application code. Thanks to partial direct execution, simulation times and memory requirements are reduced. In partial direct execution simulations, the application's parallel behavior is retrieved via direct execution, and the duration of individual operations is obtained from a performance prediction model or from prior measurements. Simulations may then vary cluster model parameters, operation durations, and problem decomposition parameters to analyze their impact on the application performance and identify the limiting factors. We implemented the proposed techniques by adding direct execution simulation capabilities to the Dynamic Parallel Schedules parallelization framework. We introduce the concept of dynamic efficiency to express the resource utilization efficiency as a function of time. We verify the accuracy of our simulator by comparing the effective running time and the dynamic efficiency of parallel program executions with the running time and the dynamic efficiency predicted by the simulator under different parallelization and dynamic node allocation strategies.

18.

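Parallel Simulation of Many-Core Processors and Many-Core Clusters

吕慧伟, 程元, 白露, 陈明宇, 范东睿, 孙凝晖《计算机研究与发展》2013,50(5):1110-1117

Simulators are an important tool in computer architecture research. In recent years, the evolution of parallel computer architectures has posed great challenges for computer simulation. On the one hand, as architectures move toward multi-core and many-core processors, the size of the simulated target system keeps growing, with the number of simulated cores increasing at the pace of Moore's law. On the other hand, serial simulation speed has stagnated because clock frequency gains on the host machines that run the simulators have slowed. For these two reasons, traditional serial simulation cannot meet the scale and speed requirements of simulating emerging architectures. Taking many-core processors and many-core clusters as examples, the paper shows that parallel simulation is both necessary and feasible for simulating parallel computer architectures. For many-core processor simulation, parallel discrete event simulation is used for acceleration and increases simulation speed 10.9-fold with no loss of simulation accuracy. For many-core cluster simulation, the simulated target system reaches a total of 1024 cores and supports a hybrid MPI/Pthreads programming runtime environment.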

19.

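Near-Optimal Scalability: A Practical Scalability Metric (cited 2 times in total; self-citations: 0; citations by others: 2)

陈军, 李晓梅《计算机学报》2001,24(2):179-182

Good scalability is an important performance goal for designers of parallel algorithms and parallel machines. Earlier scalability models each considered only one aspect of the problem in isolation, such as a particular performance measure or the maximum usable resources, without weighing the trade-offs as a whole. Those models satisfy computer researchers, who focus on higher efficiency and utilization, but application scientists care more about short execution times. The near-optimal scalability model proposed in this paper considers both the efficiency and the execution time of a parallel system. Analysis of two example algorithms on a typical MPP shows that the model not only describes the scalability of a parallel algorithm but also, when the system is scaled along an appropriate scalability curve, keeps the execution time close to the minimum while efficiency stays reasonably high. This helps guide the optimal matching of algorithms to parallel machines and benefits the design and improvement of parallel algorithms.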

20.

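Research on Time Management in Conservative PDES (cited 1 time in total; self-citations: 0; citations by others: 1)

蔡吉淼, 汪厚祥《计算机工程与设计》2007,28(14):3536-3538

Parallel discrete event simulation is a very useful tool for analyzing and solving large-scale complex problems and has become a hot topic in the simulation community in recent years. Parallel simulation algorithms are the core issue in parallel discrete event simulation: for a given application system, different parallel simulation algorithms lead to large differences in simulation performance. Starting from the fundamentals of conservative PDES, the paper describes the problems encountered in time management, analyzes and resolves them, and then presents a simple conservative PDES system architecture.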