Similar literature
 Found 20 similar documents; search took 15 ms
1.
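In recent years, high-performance computing and parallel techniques have advanced considerably, and the data scale of particle simulations has grown rapidly with them, now exceeding one billion particles. Visualizing such results is far beyond the capacity of a single machine. To post-process and visualize large-scale particle simulation results, this paper combines parallel processing with level-of-detail (LOD) techniques. The parallel processing component is data-parallel, which removes the single-machine limit; LOD lets the user control how much data enters the visualization pipeline, reducing the load on the pipeline and speeding up rendering. The paper presents an initial implementation of visualization for large-scale particle simulation results, laying the groundwork for further work.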

2.
Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms: regular mesh field equations with unusual boundary conditions, and graph analysis algorithms. The differing optimization techniques required for these two paradigms cover many of the challenges faced when developing GPU applications. We discuss the relevance of these application paradigms to simulation engines and games. GPUs were aimed primarily at the accelerated graphics market, but since this is often closely coupled to advanced game products it is interesting to speculate about the future of fully integrated accelerator hardware for both visualization and simulation combined. As well as reporting the speed-up performance on selected simulation paradigms, we discuss suitable data-parallel algorithms and present code examples for exploiting GPU features like large numbers of threads and localized texture memory. We find a surprising variation in the performance that can be achieved on GPUs for our applications and discuss how these findings relate to past known effects in parallel computing such as memory speed-related super-linear speed-up. Copyright © 2009 John Wiley & Sons, Ltd.

3.
This paper presents a parallel framework for simulating fluids with the Smoothed Particle Hydrodynamics (SPH) method. For low computational costs per simulation step, efficient parallel neighbourhood queries are proposed and compared. To further minimize the computing time for entire simulation sequences, strategies for maximizing the time step and the respective consequences for parallel implementations are investigated. The presented experiments illustrate that the parallel framework can efficiently compute large numbers of time steps for large scenarios. In the context of neighbourhood queries, the paper presents optimizations for two efficient instances of uniform grids, that is, spatial hashing and index sort. For implementations on parallel architectures with shared memory, the paper discusses techniques with improved cache-hit rate and reduced memory transfer. The performance of the parallel implementations of both optimized data structures is compared. The proposed solutions focus on systems with multiple CPUs. Benefits and challenges of potential GPU implementations are only briefly discussed.
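
As a rough illustration of the uniform-grid neighbourhood query that this abstract mentions, the sketch below bins particles into cubic cells of edge length `h` and, for one particle, scans only the 27 surrounding cells. It is a serial, simplified stand-in and not the paper's parallel implementation: the names `build_grid` and `neighbours`, the support radius `h`, and the use of a Python dictionary keyed by cell coordinates as the hash table are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import product
from math import floor


def build_grid(positions, h):
    """Map each cell index (ix, iy, iz) to the list of particle indices inside it."""
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(positions):
        grid[(floor(x / h), floor(y / h), floor(z / h))].append(idx)
    return grid


def neighbours(i, positions, grid, h):
    """Return sorted indices of particles within distance h of particle i."""
    xi, yi, zi = positions[i]
    cx, cy, cz = floor(xi / h), floor(yi / h), floor(zi / h)
    found = []
    # With cell size equal to the search radius, only the 3x3x3 block
    # of cells around the particle's own cell can contain neighbours.
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
            if j == i:
                continue
            xj, yj, zj = positions[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 <= h * h:
                found.append(j)
    return sorted(found)


# Hypothetical usage with four particles on a line and radius h = 1.0.
positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (-0.9, 0.0, 0.0)]
grid = build_grid(positions, 1.0)
print(neighbours(0, positions, grid, 1.0))
```

Choosing the cell size equal to the search radius keeps the candidate set to a fixed 27 cells per query, which is the usual motivation for uniform grids over an all-pairs search.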

4.
Recent advances in computing architectures and networking are bringing parallel computing systems to the masses, thereby increasing the number of potential users of these kinds of systems. In particular, two important technological evolutions are happening at the ends of the computing spectrum: at the “small” scale, processors now include an increasing number of independent execution units (cores), to the point that a mere CPU can be considered a parallel shared-memory computer; at the “large” scale, the Cloud Computing paradigm allows applications to scale by offering resources from a large pool on a pay-as-you-go model. Multi-core processors and Clouds both require applications to be suitably modified to take advantage of the features they provide. Despite lying at the extremes of the computing architecture spectrum (multi-core processors at the small scale, Clouds at the large scale), they share an important common trait: both are specific forms of parallel/distributed architectures. As such, they present developers with well-known problems of synchronization, communication, workload distribution, and so on. Is parallel and distributed simulation ready for these challenges? In this paper, we analyze the state of the art of parallel and distributed simulation techniques, and assess their applicability to multi-core architectures or Clouds. It turns out that most of the current approaches exhibit limitations in terms of usability and adaptivity which may hinder their application to these new computing architectures. We propose an adaptive simulation mechanism, based on the multi-agent system paradigm, to partially address some of those limitations. While it is unlikely that a single approach will work well in both settings above, we argue that the proposed adaptive mechanism has useful features which make it attractive both in a multi-core processor and in a Cloud system. These features include the ability to reduce communication costs by migrating simulation components, and the support for adding (or removing) nodes to the execution architecture at runtime. We will also show that, with the help of an additional support layer, parallel and distributed simulations can be executed on top of unreliable resources.

5.
Dynamic terrain is useful for enhancing realism and immersion in VR simulation. Current approaches typically limit the resolution or size of the editable region of terrain, or handle only one or a few clients simultaneously. Our primary goals are to simulate large regions of terrain with dynamic attributes and to handle many clients simultaneously. To achieve these goals, we present two models of distributed dynamic-terrain databases, which we are prototyping using the resources of a high-performance computing cluster (HPCC). The first model breaks the terrain into mutually exclusive pieces, which are distributed among compute nodes, while the second model uses total terrain-data replication across multiple compute nodes. We compare the advantages and disadvantages of each model and discuss optimization techniques for each.

6.
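Qin Bo (秦勃), Zhu Yong (朱勇), Qin Xue (秦雪). Computer Engineering & Science (《计算机工程与科学》), 2015, 37(12): 2216-2221
Tide-bound water level calculation is an important part of marine environmental information processing, characterized by heavy computation, high computational complexity, and long run times. Implementing it in the traditional cluster computing mode suffers from high computing cost and poor scalability and interactivity. To address these problems, the paper proposes a Spark-based platform for tide-bound water level calculation and visualization. Building on a study of Spark's task scheduling algorithms, it designs and implements a task scheduling algorithm based on node computing capability. This enables parallel retrieval, acquisition, numerical calculation, and feature visualization of long-time-series, multi-task tide-bound water level data, and thus computation and visualization of massive marine environmental data. Experimental results show that the proposed Spark-based platform effectively improves the efficiency of distributed parallel processing of massive tide-bound water level data, offering a new approach to faster and more efficient tide-bound water level calculation.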

7.
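Luo Lihui (罗立辉), Zhang Yaonan (张耀南). Journal of Software (《软件学报》), 2013, 24(S2): 80-88
To assess land surface characteristics accurately from multiple angles, improve the simulation performance of land surface process models, and give researchers a complete land surface modeling system from data processing through simulation analysis, the authors build such a system using multiple scripting languages and model-data fusion methods. The system integrates observational data, land surface process models, high-performance computing, data processing and analysis methods, and visualization. Application demonstrations with two different land surface process models show the potential of modeling systems built on different scripting languages in current high-performance computing environments, and the role of different visualization schemes within a land surface modeling system.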

8.
Cloud Computing has evolved to become an enabler for delivering access to large-scale distributed applications running on managed network-connected computing systems. This makes it possible to host Distributed Enterprise Information Systems (dEISs) in cloud environments, while enforcing strict performance and quality of service requirements, defined using Service Level Agreements (SLAs). SLAs define the performance boundaries of distributed applications, and are enforced by a cloud management system (CMS) dynamically allocating the available computing resources to the cloud services. We present two novel VM-scaling algorithms focused on dEIS systems, which optimally detect the most appropriate scaling conditions using performance models of distributed applications derived from constant-workload benchmarks, together with SLA-specified performance constraints. We simulate the VM-scaling algorithms in a cloud simulator and compare against trace-based performance models of dEISs. We compare a total of three SLA-based VM-scaling algorithms (one using prediction mechanisms) based on a real-world application scenario involving a large variable number of users. Our results show that it is beneficial to use autoregressive predictive SLA-driven scaling algorithms in cloud management systems for guaranteeing performance invariants of distributed cloud applications, as opposed to using only reactive SLA-based VM-scaling algorithms.

9.
In this paper we describe a GPU-based technique for creating illustrative visualization through interactive manipulation of volumetric models. It is partly inspired by medical illustrations, where it is common to depict cuts and deformation in order to provide a better understanding of anatomical and biological structures or surgical processes, and partly motivated by the need for a real-time solution that supports the specification and visualization of such illustrative manipulation. We propose two new feature-aligned techniques, namely surface alignment and segment alignment, and compare them with the axis-aligned techniques reported in previous work on volume manipulation. We also present a mechanism for defining features using texture volumes, and methods for computing correct normals for the deformed volume with respect to different alignments. We describe a GPU-based implementation to achieve real-time performance of the techniques and a collection of manipulation operators including peelers, retractors, pliers and dilators, which are adaptations of the metaphors and tools used in surgical procedures and medical illustrations. Our approach is directly applicable in medical and biological illustration, and we demonstrate how it works as an interactive tool for focus+context visualization, as well as a generic technique for volume graphics.

10.
Extreme scale supercomputers available before the end of this decade are expected to have 100 million to 1 billion computing cores. The power and energy efficiency issue has become one of the primary concerns of extreme scale high performance scientific computing. This paper surveys the research on saving power and energy for numerical linear algebra algorithms in high performance scientific computing on supercomputers around the world. We first stress the significance of numerical linear algebra algorithms in high performance scientific computing nowadays, followed by a background introduction on widely used numerical linear algebra algorithms and software libraries and benchmarks. We summarize commonly deployed power management techniques for reducing power and energy consumption in high performance computing systems by presenting power and energy models and two fundamental types of power management techniques: static and dynamic. Further, we review the research on saving power and energy for high performance numerical linear algebra algorithms from four aspects: profiling, trading off performance, static saving, and dynamic saving, and summarize state-of-the-art techniques for achieving power and energy efficiency in each category individually. Finally, we discuss potential directions of future work and summarize the paper.

11.
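Data partitioning techniques for parallel load balancing*   Total citations: 1, self-citations: 0, citations by others: 1
This paper reviews traditional data partitioning techniques and the effect of load balancing on the performance of large-scale parallel computing. It describes the characteristics of typical geometric partitioning and graph partitioning methods, and analyzes and compares the practical strengths and weaknesses of various partitioning algorithms and common partitioning packages (ParMETIS, Zoltan, JOSTLE, and others). It examines in depth how data partitioning divides very large numerical simulation workloads efficiently to solve the load-balancing problem, with the aim of providing a reference for researchers working on parallel computing and parallel performance optimization.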

12.
In this paper, parallel mesh-partitioning algorithms are proposed for generating submeshes with optimal shape using evolutionary computing techniques. A formulation for mesh partitioning is employed that maintains a constant number of design variables irrespective of the size of the mesh. Two distinct parallel computing models have been employed. The first model of parallel evolutionary algorithm uses the master–slave concept (single population model), and a new synchronous model is proposed to optimise the performance even on heterogeneous parallel hardware. Alternatively, a multiple population model is also developed which simulates its sequential counterpart. The advantage of the second model is that it can fit large problems with large populations even on moderate-capacity parallel computing nodes. The performance of the evolutionary computing based mesh-partitioning algorithm is demonstrated first by solving several practical engineering problems and several benchmark test problems available in the literature, and comparing the results with the multilevel algorithms. Later the speedup of the parallel evolutionary algorithms on parallel hardware is evaluated by solving large-scale practical engineering problems.

13.
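To make full use of a game grid's computing resources and its strong parallel computing capability, an online game deployed on a game grid must be partitioned into multiple services that can run in parallel. The paper proposes a game-grid service partitioning algorithm based on a dynamic binary tree, and discusses how to organize service nodes in a binary tree data structure and dynamically adjust the service partition according to node load. Finally, it implements a simulated game-grid environment, and experimental results show that the algorithm achieves good performance.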

14.
Ray casting architectures for volume visualization   Total citations: 8, self-citations: 0, citations by others: 8
Real-time visualization of large volume data sets demands high-performance computation, pushing the storage, processing and data communication requirements to the limits of current technology. General-purpose parallel processors have been used to visualize moderate-size data sets at interactive frame rates; however, the cost and size of these supercomputers inhibit their widespread use for real-time visualization. This paper surveys several special-purpose architectures that seek to render volumes at interactive rates. These specialized visualization accelerators have cost, performance and size advantages over parallel processors. All architectures implement ray casting using parallel and pipelined hardware. We introduce a new metric that normalizes performance to compare these architectures. The architectures included in this survey are VOGUE, VIRIM, Array-Based Ray Casting, EM-Cube and VIZARD II. We also discuss future applications of special-purpose accelerators.

15.
The numerical investigation of the interaction of large, solid particles with fluids is an important area of research for many manufacturing processes. Such studies frequently lead to models that are very large and require the use of parallel solution techniques. This paper presents the results of a parallel implementation of a serial code for the direct numerical simulation of solid-liquid flows. The base code is a serial, arbitrary Lagrangian-Eulerian (ALE) formulation of the equations of motion, which views the particles as solid bodies embedded in the flow domain. This particular model poses some interesting difficulties for domain-decomposition approaches to parallel solution. In particular, it is not fully understood how the partitioning of the particles among the subdomains influences the performance of parallel solvers. We present several strategies for the partitioning of the solid particles, focusing on the effectiveness of these techniques in terms of parallel speedup and efficiency.

16.
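Research on scheduling strategies for mixed multiple applications on distributed computing platforms*   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes and analyzes several scheduling strategies for mixed multiple applications on distributed computing platforms. The strategies address scheduling among multiple parallel applications rather than within an application; scheduling within an application uses the common fault-tolerant work-queue scheduling algorithm. Unlike knowledge-based scheduling, which depends on resource information, these strategies are knowledge-free, which makes them simpler and easier to implement and better suited to highly volatile distributed computing systems. Experimental scenarios are defined by computational intensity, resource availability, and task granularity, and the strategies are evaluated and compared across them. The results show that each strategy has its own strengths and weaknesses, and that they can serve as effective strategies for evaluating parallel distributed applications in large-scale distributed computing environments.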

17.
We are witnessing the consolidation of heterogeneous computing in parallel computing, with architectures such as the Cell Broadband Engine (Cell BE) or Graphics Processing Units (GPUs) present in a myriad of developments for high performance computing. These platforms provide a Software Development Kit (SDK) to maximize performance at the expense of dealing with complex and low-level architectural details, which makes software development a daunting task. This paper explores stencil computations in several heterogeneous programming models like Cell SDK, CellSs, ALF and CUDA to optimize the Jacobi method for solving Laplace's differential equation. We describe the programming techniques to extract the maximum performance on the Cell BE and the GPU, and compare their computing paradigms. Experimental results are shown on two Nvidia Teslas and one IBM BladeCenter QS20 blade which incorporates two 3.2 GHz Cell BEs v 5.1. The speed-up factor for our set of GPU optimizations reaches 3-4×, and the execution times beat those of the Cell BE by an order of magnitude, also showing great scalability when moving towards newer GPU generations and/or more demanding problem sizes.
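
For readers unfamiliar with the stencil named in this abstract, the sketch below shows a plain serial Jacobi sweep for Laplace's equation on a 2D grid: each interior value is replaced by the average of its four neighbours, reading only from the previous iterate. It is an illustrative baseline in pure Python, not the Cell or CUDA code the paper describes; the function name `jacobi_laplace` and the list-of-lists grid layout are assumptions made for this example.

```python
def jacobi_laplace(grid, iterations):
    """Run `iterations` Jacobi sweeps on a 2D grid with fixed boundary values.

    `grid` is a list of equal-length rows; the outermost rows and columns
    are treated as Dirichlet boundary values and are never modified.
    The input grid is left untouched and a new grid is returned.
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for _ in range(iterations):
        # Jacobi reads only the previous iterate, so every interior point
        # can be updated independently (the property GPUs exploit).
        nxt = [row[:] for row in current]
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                nxt[i][j] = 0.25 * (
                    current[i - 1][j] + current[i + 1][j]
                    + current[i][j - 1] + current[i][j + 1]
                )
        current = nxt
    return current


# Hypothetical usage: a 3x3 grid whose boundary is held at 1.0.
grid = [[1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0]]
print(jacobi_laplace(grid, 1)[1][1])  # prints 1.0
```

Because each sweep writes to a separate buffer, the updates within a sweep have no data dependencies on one another, which is why this kernel is a common benchmark for data-parallel hardware.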

18.
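This paper surveys research progress in China on parallel computing for large-scale reservoir numerical simulation. It provides computational examples and application results of fine-scale reservoir simulation on a domestically built Beowulf system, gives numerical simulation results and performance analysis for a reservoir case with a million grid points at different processor counts, and implements a post-processing display system that visualizes massive data as 3D plots, 2D plots, and tables.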

19.
20.
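A survey of CPU/GPU collaborative parallel computing   Total citations: 6, self-citations: 3, citations by others: 3
With their strong computing power, high cost-effectiveness, and low energy consumption, CPU/GPU heterogeneous hybrid parallel systems have become a new kind of high-performance computing platform, but their complex architecture poses great challenges for parallel computing research. CPU/GPU collaborative parallel computing is an emerging research area and an open topic. The paper divides research on it into three classes according to the scale of computing resources used, then describes several hybrid computing projects in terms of their rationale, research content, and research methods, and points out directions for further research, with the aim of providing a reference for domain scientists working on collaborative parallel computing.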
