Similar literature
 10 similar documents found
1.
2.
Implementing lattice Boltzmann computation on graphics hardware   (cited 14 times in total: 0 self-citations, 14 citations by others)
The Lattice Boltzmann Model (LBM) is a physically based approach that simulates the microscopic movement of fluid particles using simple, identical, local rules. We accelerate LBM computation on general-purpose graphics hardware by grouping particle packets into 2D textures and mapping the Boltzmann equations entirely onto rasterization and frame buffer operations. We apply stitching and packing to further improve performance. In addition, we propose two techniques, range scaling and range separation, that systematically transform variables into the range required by the graphics hardware and thus prevent overflow. Our approach can be extended to accelerate the computation of any cellular automata model.
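The abstract does not give the formulas behind range scaling, so the following is only a minimal sketch of the general idea, assuming a plain affine mapping into a fixed-precision range. The function name `make_range_scaler`, the 8-bit storage width, and the example bounds are all hypothetical and are not taken from the paper.

```python
def make_range_scaler(lo, hi, bits=8):
    """Build encode/decode functions for values expected to lie in [lo, hi].

    Assumption: the hardware stores unsigned integers of `bits` bits, so a
    value is mapped affinely onto 0 .. 2**bits - 1 and clamped at the ends.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = (1 << bits) - 1

    def encode(x):
        # Normalize to [0, 1], clamp to avoid overflow, then quantize.
        t = (x - lo) / (hi - lo)
        t = min(1.0, max(0.0, t))
        return round(t * levels)

    def decode(q):
        # Inverse affine map from the stored integer back to the value range.
        return lo + (q / levels) * (hi - lo)

    return encode, decode


# Hypothetical bounds for a simulated variable.
encode, decode = make_range_scaler(-0.5, 1.5)
stored = encode(0.25)
recovered = decode(stored)
# Worst-case round-trip error is half of one quantization step.
max_error = (1.5 - (-0.5)) / 255 / 2
```

Values outside the assumed bounds are clamped rather than wrapped, which is the property that prevents overflow in this sketch; the cost is a quantization error bounded by `max_error`.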

3.
View-dependent multiresolution rendering places a heavy load on the CPU. This paper presents a new method for view-dependent refinement of multiresolution meshes that uses the computational power of modern programmable graphics hardware (GPU). The method uses two rendering passes. During the first pass, level-of-detail selection is performed in the fragment shaders. The resulting buffer from the first pass is used as the input texture to the second rendering pass through vertex texturing, and node culling and triangulation are then performed in the vertex shaders. Our approach can generate adaptive meshes in real time and can be implemented entirely on the GPU. The method improves the efficiency of mesh simplification and significantly reduces the computing load on the CPU.

4.
A methodology is developed for the automatic design of internal pins in injection mold CAD through the automatic recognition of undercut features. An approach for automatically identifying the undercut features is proposed first. For the given parting directions, all inner and outer undercut features are identified based on the topological relationships of geometrical entities. The outer edge loop, which represents the largest cross-section boundary along the given parting directions, is extracted and patched up. The molding surfaces are then identified based on the classification of their geometrical entities. To identify the deep inner undercuts in the molding, the projections of the main core, the internal pins, and their bounding boxes along the parting direction and the pin withdrawal direction are generated. The deep inner undercuts are located by determining whether the bounding boxes of any two internal pins and the main core projection intersect. The complete methodology is implemented and verified through case studies, which illustrate its efficiency in handling complex molded parts.

5.
Graphics Processing Units (GPUs) were originally designed to manipulate images, but because of their intrinsically parallel nature, they have become a powerful tool for scientific applications. In this article, we evaluate GPU performance in an implementation of a traditional stochastic simulation: correlated Brownian motion. This motion can be described by the Generalized Langevin Equation (GLE), a stochastic integro-differential equation with applications in many areas, such as anomalous diffusion, transport in porous media, noise analysis, and quantum dynamics. Our results show the power of GPU programming compared to traditional CPUs (Intel): we observed speedups of up to sixty times when using an NVIDIA GPU in place of a single-core Intel CPU.

6.
Voxelization of solids, that is, the representation of a solid by a set of voxels that approximates it, is an operation with important applications in fields such as solid modeling, physical simulation, and volume graphics. Moreover, the new generation of affordable 3D raster displays has renewed interest in fast voxelization algorithms, as the scan-conversion of a solid is a basic operation on these devices. This paper presents a hardware-accelerated method for computing a voxelization of a polyhedron. The algorithm is simple, efficient, and robust, and it handles any kind of polyhedron (self-intersecting, with or without holes, manifold or non-manifold). Three different implementations are described in detail. The first is a conventional CPU implementation, the second is a hardware-accelerated implementation that uses standard OpenGL primitives, and the third exploits the capabilities of modern GPUs by using vertex programs.

7.
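Self-collision detection is the most time-consuming stage in the simulation of deformable bodies, and this paper proposes a fast algorithm for it that uses graphics hardware. The algorithm takes mass points rather than triangles as the basic unit of self-collision detection. It encloses the local region centered on each mass point in a sphere, encloses that sphere's motion trajectory in an AABB, and organizes the data into textures that are sent to the GPU. Two off-screen rendering passes then compute the set of colliding pairs and the time at which each pair collides. The complexity of the algorithm is O(n). Experimental results show that the algorithm detects self-collisions efficiently in large-scale cloth simulation.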
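The bounding step described in this abstract can be illustrated with a minimal CPU-side sketch, assuming each mass point moves in a straight line during one time step. The function names `swept_sphere_aabb` and `aabbs_overlap` and the sample coordinates are hypothetical. The texture packing, the two off-screen rendering passes, and the computation of collision times are not shown.

```python
def swept_sphere_aabb(p0, p1, radius):
    """Axis-aligned box enclosing a sphere that moves from p0 to p1.

    Assumption: straight-line motion over the time step, so the box only
    needs to cover both end positions expanded by the sphere radius.
    """
    lo = tuple(min(a, b) - radius for a, b in zip(p0, p1))
    hi = tuple(max(a, b) + radius for a, b in zip(p0, p1))
    return lo, hi


def aabbs_overlap(box_a, box_b):
    """Two boxes overlap only if their intervals overlap on every axis."""
    (a_lo, a_hi), (b_lo, b_hi) = box_a, box_b
    return all(a_lo[k] <= b_hi[k] and b_lo[k] <= a_hi[k] for k in range(3))


# Two mass points whose paths cross, and a third that stays far away.
box1 = swept_sphere_aabb((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.1)
box2 = swept_sphere_aabb((0.5, 0.15, 0.0), (0.5, -0.5, 0.0), 0.1)
box3 = swept_sphere_aabb((5.0, 5.0, 5.0), (5.0, 5.0, 6.0), 0.1)

candidate_pair = aabbs_overlap(box1, box2)
rejected_pair = aabbs_overlap(box1, box3)
```

An overlap here only marks a pair as a candidate. A pair whose boxes overlap may still miss each other, so a finer test is needed to confirm the collision and find when it occurs.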

8.
Control of autonomous systems subject to stochastic uncertainty is a challenging task. In guided airdrop applications, random wind disturbances play a crucial role in determining landing accuracy and terrain avoidance. This paper describes a stochastic parafoil guidance system that couples uncertainty propagation with optimal control to protect against wind and parameter uncertainty in the presence of impact area obstacles. The algorithm uses real-time Monte Carlo simulation performed on a graphics processing unit (GPU) to evaluate the robustness of candidate trajectories in terms of delivery accuracy, obstacle avoidance, and other considerations. Building upon prior theoretical developments, this paper explores the performance of the stochastic guidance law compared to standard deterministic guidance schemes, particularly with respect to obstacle avoidance. Flight test results are presented that compare the proposed stochastic guidance algorithm with a standard deterministic one. Through a comprehensive set of simulation results, key implementation aspects of the stochastic algorithm are explored, including tradeoffs between the number of candidate trajectories considered, algorithm runtime, and overall guidance performance. Overall, simulation and flight test results demonstrate that the stochastic guidance scheme provides a more robust approach to obstacle avoidance while largely maintaining delivery accuracy.
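The idea of scoring candidate trajectories by Monte Carlo sampling can be sketched with a toy model. Everything in it is an assumption made for illustration and not the paper's guidance law: a candidate is reduced to a 2D aim point, wind is modeled as an independent Gaussian offset on the landing position, the obstacle is a circle, and the cost weights are arbitrary. The names `evaluate_candidate` and `pick_best` are hypothetical, and the sketch runs serially on the CPU rather than on a GPU.

```python
import math
import random


def evaluate_candidate(aim, target, obstacle, wind_sigma, n_samples, rng,
                       penalty=100.0):
    """Estimate mean miss distance and obstacle hit probability by sampling.

    Toy assumption: the landing point is the aim point plus Gaussian wind drift.
    """
    ox, oy, o_radius = obstacle
    total_miss = 0.0
    hits = 0
    for _ in range(n_samples):
        x = aim[0] + rng.gauss(0.0, wind_sigma)
        y = aim[1] + rng.gauss(0.0, wind_sigma)
        total_miss += math.hypot(x - target[0], y - target[1])
        if math.hypot(x - ox, y - oy) <= o_radius:
            hits += 1
    mean_miss = total_miss / n_samples
    hit_prob = hits / n_samples
    # Hypothetical cost: accuracy term plus a weighted obstacle-risk term.
    return mean_miss + penalty * hit_prob, mean_miss, hit_prob


def pick_best(candidates, target, obstacle, wind_sigma, n_samples, rng):
    """Return the candidate aim point with the lowest estimated cost."""
    scored = [
        (evaluate_candidate(c, target, obstacle, wind_sigma, n_samples, rng)[0], c)
        for c in candidates
    ]
    return min(scored)[1]


rng = random.Random(0)  # fixed seed so repeated runs give the same estimate
target = (0.0, 0.0)
obstacle = (4.0, 0.0, 2.0)  # hypothetical circular obstacle: x, y, radius
candidates = [(0.0, 0.0), (-2.0, 0.0), (4.0, 0.0)]
best = pick_best(candidates, target, obstacle, wind_sigma=3.0,
                 n_samples=2000, rng=rng)
```

Each candidate and each sample is independent of the others, which is why this kind of evaluation parallelizes well. Raising `n_samples` or the number of candidates improves the estimate at the cost of runtime, the same tradeoff the abstract mentions.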

9.
The Sora platform, a fully programmable, high-performance software radio platform based on a commodity general-purpose PC, has recently received significant attention. However, the acceleration techniques used in Sora are too complicated for developers, which can prevent researchers from modifying physical layer (PHY) processing. This paper presents the CuSora platform, which integrates the Sora platform with a popular multi-core graphics processing unit (GPU) as the modem processor to achieve high-speed PHY signal processing. CuSora also exploits software techniques to fulfill the requirements of real-time communication. A software controller is presented to achieve multi-mode communication. The single-instruction, multiple-data parallel computation of the GPU is also used to accelerate PHY processing. Several wireless protocols, such as WiFi (802.11a) and WiMAX (802.16), are demonstrated on the CuSora platform for verification. CuSora meets the requirements of real-time communication and has excellent bit error ratio performance. CuSora offers higher performance, a shorter development cycle, and better coding flexibility than the Sora platform.

10.
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly programmable, offering the potential for dramatic speedups for a variety of general-purpose applications compared to contemporary general-purpose processors (CPUs). This paper uses NVIDIA’s C-like CUDA language and an engineering sample of their recently introduced GTX 260 GPU to explore the effectiveness of GPUs for a variety of application types, and describes some specific coding idioms that improve their performance on the GPU. GPU performance is compared to both single-core and multicore CPU performance, with multicore CPU implementations written using OpenMP. The paper also discusses advantages and inefficiencies of the CUDA programming model and some desirable features that might allow for greater ease of use and more readily support a larger body of applications.
