Similar Literature
20 similar articles found.
1.
This paper discusses the current status and recent advances of 3D graphics on mobile platforms and describes open issues concerning its use in different applications. We treat two application fields. First, we deal with the visualization of complex data structures on mobile devices. The 3D visualization renderer for the Symbian platform is written as a C++ application and uses the DieselEngine® as its rendering engine. The 3D visualization of the data is generated as a Virtual Reality Modelling Language (VRML) file, so any 3D content written in the VRML file format can be rendered on such a device. The renderer resulted from a project whose objective was to provide a user interface on a mobile platform that displays a visualization of hierarchical Grid monitoring data. Second, we describe a system that brings face animation to embedded platforms. Face animation is considered one of the toughest tasks in computer animation today, and its delivery to mobile platforms opens possibilities for new, innovative, and attractive services for the mobile market.

2.
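A GPU-based real-time rendering algorithm for virtual endoscopy scenes   Total citations: 1, self-citations: 0, citations by others: 1
Zhu Bing (朱兵), Fu Feiran (付飞蚺). Chinese Journal of Liquid Crystals and Displays (《液晶与显示》), 2013, 28(1): 127-131
To meet the demand for real-time rendering of high-resolution, massive data in image-guided surgery (IGS), a GPU-based real-time rendering algorithm for virtual endoscopy scenes is proposed. The algorithm exploits a characteristic of virtual endoscopy rendering data: lumen data accounts for only a small share of the total, about 5%. It first segments the images automatically to obtain the lumen tissue data. Only the segmented data is then loaded into graphics memory, in a single pass, and rendered with a ray-casting algorithm, with optimizations for load distribution across multiple GPUs. By making full use of the GPU's rendering and parallel computing capabilities, real-time rendering of massive data (1 024×1 024×1 024) is achieved.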
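
The ray-casting step named in the abstract above can be illustrated with a minimal, serial Python sketch of front-to-back compositing along one ray. This is not the authors' GPU implementation: the function names, the transfer function, and the opacity cutoff are assumptions made for illustration, and the only idea carried over from the abstract is that voxels outside the segmented lumen contribute nothing.

```python
# Hypothetical sketch: function and parameter names are illustrative,
# not taken from the paper.

def composite_ray(samples, transfer, opacity_cutoff=0.99):
    """Front-to-back compositing of scalar samples along a single ray.

    `transfer` maps a sample value to (intensity, opacity), both in [0, 1].
    Returns the accumulated (intensity, opacity).
    """
    intensity = 0.0
    opacity = 0.0
    for value in samples:
        c, a = transfer(value)
        # Weight each sample by the light not yet absorbed in front of it.
        intensity += (1.0 - opacity) * a * c
        opacity += (1.0 - opacity) * a
        # Early ray termination: later samples can no longer contribute visibly.
        if opacity >= opacity_cutoff:
            break
    return intensity, opacity


def lumen_transfer(value):
    """Assumed transfer function: voxels outside the segmented lumen hold 0
    and are fully transparent; lumen voxels are white and half opaque."""
    return (1.0, 0.5) if value > 0 else (0.0, 0.0)
```

On a GPU each ray would typically be handled by its own thread; calling `composite_ray([0, 1, 1], lumen_transfer)` walks one such ray through an empty voxel followed by two lumen voxels.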

3.
Graphics processing units (GPUs) have recently emerged as a new vehicle for high-performance, general-purpose computing. It is attractive to unleash the power of GPUs for Electronic Design Automation (EDA) computations to cut the design turn-around time of VLSI systems. EDA algorithms, however, generally depend on irregular data structures such as sparse matrices and graphs, which pose major challenges for efficient GPU implementations. In this paper, we propose high-performance GPU implementations for a set of important irregular EDA computing patterns, including sparse matrix, graph, and message-passing algorithms. In the sparse matrix domain, we solve a core problem, the sparse-matrix vector product (SMVP). On a wide range of EDA problem instances, our SMVP implementation outperforms all prior work and achieves a speedup of up to 50× over the CPU baseline implementation. The GPU-based SMVP procedure is applied to accelerate two core EDA computing engines, timing analysis and linear system solution. In the graph algorithm domain, we developed an SMVP-based formulation to efficiently solve the breadth-first search (BFS) problem on GPUs. We also developed efficient solutions for two message-passing algorithms, survey propagation (SP) based SAT solution and register-transfer level (RTL) simulation. Our results show that GPUs have strong potential to accelerate EDA computing through designing GPU-friendly algorithms and/or re-organizing the computing structures of sequential algorithms.
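
The idea of expressing BFS as repeated sparse-matrix vector products can be sketched in plain Python as follows. This is a serial illustration under assumptions, not the paper's GPU code: the compressed sparse row (CSR) layout, the Boolean product, and all function names are choices made here for clarity.

```python
# Illustrative sketch only: the CSR layout and helper names are assumptions,
# not taken from the paper.

def csr_from_edges(n, edges):
    """Build CSR arrays (row_ptr, col_idx) for a directed graph with n vertices.

    Row v lists the in-neighbours of v, so (A x)[v] combines x[u] over
    every edge u -> v.
    """
    counts = [0] * n
    for _, v in edges:
        counts[v] += 1
    row_ptr = [0] * (n + 1)
    for v in range(n):
        row_ptr[v + 1] = row_ptr[v] + counts[v]
    col_idx = [0] * len(edges)
    fill = row_ptr[:-1].copy()  # next free slot in each row
    for u, v in edges:
        col_idx[fill[v]] = u
        fill[v] += 1
    return row_ptr, col_idx


def spmv_bool(row_ptr, col_idx, x):
    """Boolean sparse matrix-vector product: y[v] = OR of x[u] over row v.

    Rows are independent of each other, which is the parallelism a GPU
    kernel would exploit.
    """
    n = len(row_ptr) - 1
    y = [False] * n
    for v in range(n):
        for k in range(row_ptr[v], row_ptr[v + 1]):
            if x[col_idx[k]]:
                y[v] = True
                break
    return y


def bfs_levels(n, edges, source):
    """Return the BFS level of every vertex (-1 if unreachable)."""
    row_ptr, col_idx = csr_from_edges(n, edges)
    level = [-1] * n
    level[source] = 0
    frontier = [i == source for i in range(n)]
    depth = 0
    while any(frontier):
        depth += 1
        reached = spmv_bool(row_ptr, col_idx, frontier)
        # Keep only vertices seen for the first time as the next frontier.
        frontier = [reached[v] and level[v] == -1 for v in range(n)]
        for v in range(n):
            if frontier[v]:
                level[v] = depth
    return level
```

Each BFS step is one product of the adjacency structure with the frontier vector followed by a mask against already-visited vertices, so a fast SMVP kernel carries most of the work.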

4.
This paper proposes new models of GPU energy consumption from the perspectives of hardware architects and graphics programmers by performing an architecture-independent analysis of the classical graphics rendering pipeline, which is still in widespread use today. The detailed analysis covers graphics rendering workload, memory bandwidth, and energy consumption. Although the models are derived from the classical 3D pipeline, they are extensible to programmable pipelines. Many factors affect the performance and energy consumption of 3D graphics rendering, such as the number of textures, vertex sharing, level of detail, and rendering algorithms. The proposed models are validated by our simulation study and used to guide our 3D graphics hardware design and 3D graphics programming in order to optimize the performance and energy consumption of our GPU prototypes, which have been successfully fabricated in SMIC 0.13 μm CMOS technology.

5.
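Real-time infrared simulation of sea-surface backgrounds   Total citations: 1, self-citations: 0, citations by others: 1
Research on real-time infrared simulation of sea-surface backgrounds has important theoretical significance and practical value in the development of modern infrared technology. Working within current computer hardware and software conditions and introducing several optimization techniques, a real-time infrared simulation method for sea-surface backgrounds is proposed that achieves a high frame rate. A 2D FFT is used to compute the sea-surface geometry data quickly, Geomipmap is introduced to organize the geometry data, the GPU accelerates the infrared radiation calculation, and pre-generated offline data together with viewpoint-based rendering further raise the real-time rendering speed. Experimental results show that the method renders the infrared sea-surface background at more than 300 frames/s, which meets the requirements of real-time simulation and leaves substantial computing resources for the other parts of the simulation system.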

6.
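Pan Weiguo (潘卫国), He Ning (何宁), Xue Jian (薛健), Lü Ke (吕科), Zhai Rui (翟锐), Dai Shuangfeng (代双凤). Acta Electronica Sinica (《电子学报》), 2016, 44(2): 472-478
In recent years, with the rapid growth of scientific data, visual analysis of massive data has become an urgent problem. More and more methods for processing massive data are moving toward parallel and distributed processing. This paper proposes a hybrid framework for processing massive ultrasound data that integrates multiple hardware environments and computing resources. All data is stored in a data-sharing center on a high-speed network. A front-end workstation with a high-performance graphics card distributes the time-consuming processing tasks to compute nodes on the network and itself handles display and interaction. The visualization algorithms are parallelized on the GPU and CPU using OpenCL and OpenMP, and out-of-core algorithms are used in the framework to handle massive volume data. Experimental results show that the proposed framework not only handles massive data but also offers good interactive performance.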

7.
GPU Computing   Total citations: 9, self-citations: 0, citations by others: 9
The graphics processing unit (GPU) has become an integral part of today's mainstream computing systems. Over the past six years, there has been a marked increase in the performance and capabilities of GPUs. The modern GPU is not only a powerful graphics engine but also a highly parallel programmable processor featuring peak arithmetic and memory bandwidth that substantially outpaces its CPU counterpart. The GPU's rapid increase in both programmability and capability has spawned a research community that has successfully mapped a broad range of computationally demanding, complex problems to the GPU. This effort in general-purpose computing on the GPU, also known as GPU computing, has positioned the GPU as a compelling alternative to traditional microprocessors in high-performance computer systems of the future. We describe the background, hardware, and programming model for GPU computing, summarize the state of the art in tools and techniques, and present four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications.

8.
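To examine how string-matching algorithms perform across implementation platforms, the algorithm was tested and compared on a CPU, a GPU, and an FPGA. A GPU has many compute units, which gives it a substantial efficiency gain on compute-intensive applications, while an FPGA offers very high flexibility, programmability, and a large number of logic units, which makes it fast at string matching. Analysis of the three implementations against the Snort rule set shows that the FPGA is the fastest, 10 times faster than the GPU, and that the serial CPU implementation is the slowest. The FPGA consumes the most resources, followed by the GPU; the CPU consumes the least and is the simplest to implement.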

9.
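To address the slow computation and weak real-time capability of digital holographic reconstruction algorithms, and the poor cross-platform portability of existing GPU acceleration strategies, this paper proposes a scheme that uses the Open Computing Language (OpenCL) architecture to improve the execution efficiency of digital holographic reconstruction. The scheme makes full use of OpenCL's heterogeneous cooperative computing capability, designs the digital holographic convolution reconstruction algorithm for heterogeneous CPU+GPU execution, and implements it in a data-parallel programming model. Tests on digital holograms of different resolutions and on different GPU platforms show that the average execution time of the accelerated scheme is in every case an order of magnitude lower than on the CPU. The highest overall speedup reaches 54.2, and the speedup of the parallel computation alone reaches 94.7. The scheme scales with problem size and ports well across platforms, so it is better suited to engineering implementations and real-time applications of digital holography.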

10.
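Wang Gang (王纲), Ji Zhenzhou (季振洲), Zhang Zexu (张泽旭). Acta Electronica Sinica (《电子学报》), 2012, 40(9): 1746-1751
This paper proposes a real-time rendering method for large-scale realistic snow scenes. First, a snowflake model and its motion model during snowfall are established. Second, the wind velocity is split into a mean wind speed and a random wind speed, which simplifies the wind-field computation; the random wind is generated with Perlin noise and the results are stored in a 3D texture to lighten the particle system's computational load. Third, accumulation and melting models are built to simulate snow piling up and melting. Finally, a GPU-based particle system is built to improve rendering efficiency. Simulation results show that the method is suitable for real-time rendering of large-scale snow scenes, and it has been applied successfully in a helicopter simulator.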

11.
Gibeom Gu, Duksu Kim. ETRI Journal, 2020, 42(4): 608-618
We present a novel GPU‐based ray‐casting algorithm for volume rendering of unstructured grid data. Our volume rendering system uses a ray‐casting method that guarantees accurate rendering results. We also employ the per‐pixel intersection list concept in the Bunyk algorithm to guarantee an accurate result for non‐convex meshes. For efficient memory access for the lists on the GPU, we represent the intersection lists for all faces as an array with our novel construction algorithm. With the intersection lists, we perform ray‐casting on a GPU, and a GPU thread handles each ray. To increase ray‐coherency in a thread block and improve memory access efficiency, we extend a prior image‐tile‐based work distribution method to fit modern GPU architectures. We also show that a prior approach using a per‐thread local buffer to reduce redundant computation is not appropriate for modern GPU architectures. Instead, we take an on‐demand calculation strategy that achieves better performance even though it allows duplicate computations. We applied our method to three unstructured grid datasets with different characteristics. With a GPU, our method achieved up to 36.5 times higher performance for the ray‐casting process and 19.7 times higher performance for the whole volume rendering process compared with the Bunyk algorithm using a CPU core. Also, our approach showed up to 8.2 times higher performance than a GPU‐based cell projection method while generating more accurate rendering results. These results demonstrate the efficiency and accuracy of our method.

12.
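High-resolution terrain elevation and imagery data place a heavy burden on interactive 3D terrain visualization applications, chiefly in data storage, scheduling and transfer, and real-time rendering. This paper designs a high-performance terrain data compression method based on the lifting wavelet transform and parallel hybrid entropy coding, and combines it with ray casting on the graphics processing unit (GPU) to visualize large-scale 3D terrain. First, a wavelet transform model of multiresolution terrain tiles is built to map their refinement and simplification operations. Second, multiresolution quadtrees of the gridded digital elevation model (DEM) and of the surface texture are each constructed with the lifting wavelet transform, and the quantized sparse wavelet coefficients are compressed with a hybrid entropy coder that combines parallel run-length coding with parallel variable-length Huffman coding. The compressed data is organized into a multi-sequence progressive bitstream for real-time decompression and rendering. The lifting wavelet transform and the hybrid entropy coding are implemented on the GPU with the Compute Unified Device Architecture (CUDA). Experiments show that the method outperforms comparable methods on the combined measures of compression ratio, signal-to-noise ratio, and encode/decode data throughput, and that the high frame rate of real-time rendering meets the requirements of interactive visualization.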
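
Two of the building blocks named in the abstract above, a lifting-scheme wavelet step and run-length coding of the resulting sparse coefficients, can be sketched in serial Python. The sketch assumes a one-level, one-dimensional integer Haar transform and a simple (value, count) run format; the paper's actual wavelet, quantization, Huffman stage, and CUDA parallelization are not reproduced, and all names are hypothetical.

```python
# Hypothetical sketch: a 1D integer Haar transform written as lifting steps,
# followed by plain run-length coding. Not the paper's implementation.

def lifting_forward(signal):
    """One lifting level: split into even/odd samples, predict, then update."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even = signal[0::2]
    odd = signal[1::2]
    # Predict: each odd sample is predicted by its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: adjust the even samples so they carry the (floored) pair average.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order; exact for integers."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    out = [0] * (2 * len(approx))
    out[0::2] = even
    out[1::2] = odd
    return out


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out
```

For a smooth elevation profile such as `[100, 100, 102, 102, 105, 105, 105, 105]`, the detail coefficients are all zero and collapse into a single run, which is the kind of sparsity an entropy coder benefits from. Because every lifting step is undone exactly, the integer transform is lossless before any quantization is applied.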

13.
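The new generation of artificial intelligence is characterized by drawing on high-performance distributed computing, such as GPU computing and cloud computing, and by using machine learning algorithms, typified by deep learning, that are trained on big data to simulate, extend, and expand human intelligence. Different data sources and different physical computing locations expose machine learning to serious privacy leakage, so privacy-preserving machine learning (PPM) has become a widely followed research area. Using cryptographic tools to solve privacy problems in machine learning is an important technique in this area. This paper introduces the cryptographic tools commonly used in privacy-preserving machine learning, including general secure multi-party computation (SMPC), privacy-preserving set operations, and homomorphic encryption (HE), and reviews the methods and current state of research on applying them to the privacy problems that arise in each stage of machine learning: data preparation, model training, model testing, and data prediction.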
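
As a small illustration of the kind of tool that secure multi-party computation builds on, the following Python sketch shows additive secret sharing, a standard primitive in which parties hold random-looking shares and can add shared values without revealing them. Whether the surveyed paper covers this exact construction is not stated in the abstract; the modulus and function names are assumptions made for the example.

```python
import secrets

# Illustrative modulus (an assumption): all arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]
```

For example, two data owners could share the values 20 and 22 among three parties; the parties add their shares locally, and only the reconstructed sum, 42, is ever revealed.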

14.
In this paper, a 3D display processor embedding a programmable 3D graphics rendering engine is proposed. The proposed processor combines a 3D graphics rendering engine and a 3D image synthesis engine to support both true realism and interactivity for future multimedia applications. Using the high coherence between 3D graphics data and 3D display inputs, both pipelines are merged by sharing buffers such that the 3D display engine directly uses the output of the 3D graphics rendering engine. The merged architecture has synergetic coupling effects, such as freely providing various rendering effects to 3D images and easily computing disparities without complex extraction processes. In the 3D image synthesis engine, we adopt a view interpolation algorithm and propose a real-time synthesis method, the pixel-by-pixel process. The view interpolation algorithm reduces the number of images to be rendered, reducing external memory size to 64.8% compared to the conventional synthesis process. The proposed pixel-by-pixel process synthesizes 3D images at 36 fps through a bandwidth reduction of 26.7% and decreases internal memory size to 64.2% compared to the typical image-by-image process. The 3D graphics rendering engine is programmable and supports the instruction sets of the latest 3D graphics standard APIs, Pixel Shader 3.0 and OpenGL|ES 2.0. The die contains about 1.7 M transistors, occupies 5 mm × 5 mm in 0.18 μm CMOS, and dissipates 379 mW at 1.85 V.

15.
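Mainstream 3D digital earth platforms currently have compatibility shortcomings in data support and data management. To meet the needs of building a 3D digital earth platform, this work studies the real-time terrain rendering mechanism of osgEarth and the technical difficulties of data loading and ocean wave simulation in 3D digital earth development. For real-time data loading, a database-backed LOD quadtree model is proposed; for insufficiently realistic wave simulation, a wave simulation based on an improved Higgins algorithm is proposed. Drawing on component-based software design, a 3D digital earth platform for practical engineering applications was developed.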

16.
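Zhan Hongchen (詹洪陈), Yuan Jie (袁杰). Modern Electronics Technique (《现代电子技术》), 2012, 35(20): 87-90, 94
Using two platforms, Matlab and Visual C++, parallel acceleration of image-processing projects is implemented, and two acceleration schemes, Jacket and CUDA, are introduced to explain the workflow and performance benefits of high-performance parallel computing on the GPU. Finally, test results and comparisons of the computing performance of two image-processing projects after parallelization are given. The results show that the parallelized projects are markedly more efficient, with accurate results and short computation times.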

17.
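Zhao Yang (赵杨). Electronic Test (《电子测试》), 2020, (4): 37-39, 97
Targeting practical engineering applications of image and video stylization, this paper proposes a GPU-accelerated real-time rendering algorithm for the artistic stylization of images and video and implements a real-time system for it. The system makes good use of GPU parallelism to accelerate the time-consuming per-pixel traversal and processing, converts input images and video quickly into a Van Gogh-style flowing-stroke oil painting, and gives users a good interactive experience.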

18.
Medical simulations of lung dynamics promise to be effective tools for teaching and training clinical and surgical procedures related to lungs. Their effectiveness may be greatly enhanced when visualized in an augmented reality (AR) environment. However, the computational requirements of AR environments limit the availability of the central processing unit (CPU) for the lung dynamics simulation for different breathing conditions. In this paper, we present a method for computing lung deformations in real time by taking advantage of the programmable graphics processing unit (GPU). This will save the CPU time for other AR-associated tasks such as tracking, communication, and interaction management. An approach for the simulations of the three-dimensional (3-D) lung dynamics using Green's formulation in the case of upright position is taken into consideration. We extend this approach to other orientations as well as the subsequent changes in breathing. Specifically, the proposed extension presents a computational optimization and its implementation in a GPU. Results show that the computational requirements for simulating the deformation of a 3-D lung model are significantly reduced for point-based rendering.

19.

Recent advances in general-purpose graphics processing units (GPGPUs) have resulted in massively parallel hardware that is widely available to achieve high performance in desktop, notebook, and even mobile computer systems. While multicore technology has become the norm in modern computers, programming such systems requires an understanding of the underlying hardware architecture and hence poses a great challenge for average programmers, who might be professionals in specific domains but not experts in parallel programming. This paper presents a GUI tool called GPUBlocks that can facilitate parallel programming on multicore computer systems. GPUBlocks is developed based on the OpenBlocks framework, an extendable tool for graphical programming, to construct a GUI-based programming environment for the CUDA and OpenCL parallel computing platforms. Programmers simply need to drag and drop blocks, fill in the fields of the blocks, and connect them according to the array or matrix computations specified by their algorithms. GPUBlocks can then translate the block-based code into CUDA or OpenCL programs. Furthermore, a couple of optimization constructs are offered for rapid program optimization. Experimental results show that the generated CUDA and OpenCL programs can achieve reasonable speedups on GPUs. Consequently, GPUBlocks can be used as a tool for fast prototyping of GPU applications or as a platform for teaching parallel programming.


20.
Parallel processing over the Internet is now becoming a realistic possibility. There are numerous off-the-shelf high-performance computing (HPC) platforms available with Internet access, on which to implement computationally intensive algorithms. HPC can be applied in the field of computational electromagnetics. The networking capabilities of the Internet now allow these computing resources to be used as a remote service. Additionally, the pragmatics of their utilization can be abstracted by adopting a World Wide Web (WWW) interface. A Web-based environment can provide the supportive tools for data entry, program initiation, result visualization, and even interactive modifications of the geometry and/or electromagnetic (EM) properties. For realistic interaction, the emerging question is which algorithm to use that supports the exploitation of parallelism. In order to exploit all the available performance of current and predicted HPC platforms, inherently parallel algorithms have to be devised. One such algorithm is the parallel method of moments/method of auxiliary sources, P(MoM/MAS), introduced in this paper. The resulting algorithm parallelization enables the MoM/MAS method to be applied to solving electrically large and complex EM structures on various computational platforms. This paper concentrates on the parallel-processing issues and on the importance of adopting suitable algorithms, such as the MoM/MAS technique.
