
1.

Parallel Shear-Warp Factorization Volume Rendering Using Efficient 1-D and 2-D Partitioning Schemes for Distributed Memory Multicomputers

Ching-Feng Lin Don-Lin Yang Yeh-Ching Chung 《The Journal of supercomputing》2002,22(3):277-302

3-D data visualization is very useful for medical imaging and computational fluid dynamics. Volume rendering can be used to exhibit the shape and volumetric properties of 3-D objects. However, volume rendering requires a considerable amount of time to process the large volume of data. To deliver the necessary rendering rates, parallel hardware architectures such as distributed memory multicomputers offer viable solutions. The challenge is to design efficient parallel algorithms that utilize the hardware parallelism effectively. In this paper, we present two efficient parallel volume rendering algorithms, the 1D-partition and 2D-partition methods, based on the shear-warp factorization for distributed memory multicomputers. The 1D-partition method has a performance bound on the size of the volume data. If the number of processors is less than a threshold, the 1D-partition method can deliver a good rendering rate. If the number of processors is over a threshold, the 2D-partition method can be used. To evaluate the performance of these two algorithms, we implemented the proposed methods along with the slice data partitioning, volume data partitioning, and sheared volume data partitioning methods on an IBM SP2 parallel machine. Six volume data sets were used as the test samples. The experimental results show that the proposed methods outperform other compatible algorithms for all test samples. When the number of processors is over a threshold, the experimental results also demonstrate that the 2D-partition method is better than the 1D-partition method.

2.

Scalable Ray Tracing Using the Distributed FrameBuffer

Will Usher Ingo Wald Jefferson Amstutz Johannes Günther Carson Brownlee Valerio Pascucci 《Computer Graphics Forum》2019,38(3):455-466

Image‐ and data‐parallel rendering across multiple nodes on high‐performance computing systems is widely used in visualization to provide higher frame rates, support large data sets, and render data in situ. Specifically for in situ visualization, reducing bottlenecks incurred by the visualization and compositing is of key concern to reduce the overall simulation runtime. Moreover, prior algorithms have been designed to support either image‐ or data‐parallel rendering and impose restrictions on the data distribution, requiring different implementations for each configuration. In this paper, we introduce the Distributed FrameBuffer, an asynchronous image‐processing framework for multi‐node rendering. We demonstrate that our approach achieves performance superior to the state of the art for common use cases, while providing the flexibility to support a wide range of parallel rendering algorithms and data distributions. By building on this framework, we extend the open‐source ray tracing library OSPRay with a data‐distributed API, enabling its use in data‐distributed and in situ visualization applications.

3.

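Analysis and Evaluation of Four Volume Rendering Algorithms (cited by: 9; self-citations: 0; citations by others: 9)

Yin Xuesong, Zhang Qian, Wu Guohua, Pan Zhigeng 《Computer Engineering and Applications》2004,40(16):97-100

Volume rendering is an important component of visualization in scientific computing, with considerable research value and broad application prospects. There are two approaches to volume rendering: indirect volume rendering and direct volume rendering. Direct volume rendering (volume rendering for short) has received increasing attention from researchers because of its strengths in processing volume data and conveying feature information. This article describes four typical volume rendering algorithms, outlines the improvements made to them, and analyzes and evaluates the performance of each.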

4.

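Classification and Evaluation of 3D Visualization Methods for Medical Volume Data (cited by: 23; self-citations: 0; citations by others: 23)

Shen Haige, Ke You'an 《Journal of Image and Graphics》2000,5(7):545-550

Three-dimensional visualization of medical images has great potential in medical research and in clinical diagnosis and treatment, and is an important area of modern medical imaging research. Methods for 3D visualization of medical images are usually divided into surface rendering and volume rendering. Many specific algorithms have been proposed; some combine characteristics of both surface rendering and volume rendering and are classified as hybrid rendering methods. This paper surveys some typical algorithms, discusses their characteristics and interrelationships, and analyzes and evaluates the application scenarios and prospects of each class of methods.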

5.

Adaptive parallel rendering on multiprocessors and workstation clusters

Wai-Sum Lin Lau R.W.H. Kai Hwang Xiaola Lin Cheung P.Y.S. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(3):241-258

This paper presents the design and performance of a new parallel graphics renderer for 3D images. This renderer is based on an adaptive supersampling approach that works for time/space-efficient execution on two classes of parallel computers. Our rendering scheme takes subpixel supersamples only along polygon edges. This leads to a significant reduction in rendering time and in buffer memory requirements. Furthermore, we offer a balanced rasterization of all transformed polygons. Experimental results prove these advantages on both a shared-memory SGI multiprocessor server and a Unix cluster of Sun workstations. We reveal performance effects of the new rendering scheme on subpixel resolution, polygon number, scene complexity, and memory requirements. The balanced parallel renderer demonstrates scalable performance with respect to increase in graphic complexity and in machine size. Our parallel renderer outperforms Crow's scheme in the benchmark experiments performed. The improvements are made on three fronts: (1) reduction in rendering time, (2) higher efficiency with balanced workload, and (3) adaptation to the available buffer memory size. The balanced renderer can be more cost-effectively embedded within many 3D graphics algorithms, such as those for edge smoothing and 3D visualization. Our parallel renderer is MPI-coded, offering high portability and cross-platform performance. These advantages can greatly improve the QoS in 3D imaging and in real-time interactive graphics.

6.

Analysis of a parallel volume rendering system based on the shear-warp factorization

Lacroute P. 《IEEE transactions on visualization and computer graphics》1996,2(3):218-231

This paper presents a parallel volume rendering algorithm that can render a 256×256×225 voxel medical data set at over 15 Hz and a 512×512×334 voxel data set at over 7 Hz on a 32-processor Silicon Graphics Challenge. The algorithm achieves these results by minimizing each of the three components of execution time: computation time, synchronization time, and data communication time. Computation time is low because the parallel algorithm is based on the recently reported shear-warp serial volume rendering algorithm, which is over five times faster than previous serial algorithms. The algorithm uses run-length encoding to exploit coherence and an efficient volume traversal to reduce overhead. Synchronization time is minimized by using dynamic load balancing and a task partition that minimizes synchronization events. Data communication costs are low because the algorithm is implemented for shared-memory multiprocessors, a class of machines with hardware support for low-latency fine-grain communication and hardware caching to hide latency. We draw two conclusions from our implementation. First, we find that on shared-memory architectures data redistribution and communication costs do not dominate rendering time. Second, we find that cache locality requirements impose a limit on parallelism in volume rendering algorithms. Specifically, our results indicate that shared-memory machines with hundreds of processors would be useful only for rendering very large data sets.
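
The abstract above mentions run-length encoding as the way coherence is exploited. As a rough illustration only, the sketch below encodes one voxel scanline into runs of transparent and non-transparent voxels so that transparent runs can be skipped in a single step. The function names, the opacity threshold, and the list-of-tuples layout are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

Run = Tuple[bool, int]  # (is_transparent, run_length); hypothetical layout


def run_length_encode(opacities: List[float], threshold: float = 0.0) -> List[Run]:
    """Encode a scanline of voxel opacities as runs of (transparent, length)."""
    runs: List[Run] = []
    for alpha in opacities:
        transparent = alpha <= threshold
        if runs and runs[-1][0] == transparent:
            # Same class as the previous voxel: extend the current run.
            runs[-1] = (transparent, runs[-1][1] + 1)
        else:
            # Class changed (or first voxel): start a new run.
            runs.append((transparent, 1))
    return runs


def visited_voxel_indices(runs: List[Run]) -> List[int]:
    """Return indices of voxels a renderer would touch, skipping transparent runs."""
    indices: List[int] = []
    position = 0
    for transparent, length in runs:
        if not transparent:
            indices.extend(range(position, position + length))
        position += length  # a transparent run is skipped in one step
    return indices


scanline = [0.0, 0.0, 0.0, 0.5, 0.8, 0.0, 0.0, 1.0]
encoded = run_length_encode(scanline)
touched = visited_voxel_indices(encoded)
```

In this toy scanline only three of the eight voxels are visited; the saving grows with the share of empty space in the volume.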

7.

Analysis of multidimensional images on the Connection Machine system

Giampiero Marcenaro Massimo Tistarelli 《Concurrency and Computation》1991,3(6):699-713

The Connection Machine (CM) has been demonstrated to be an efficient and fast computational engine for the solution of many problems related to image processing. The high-level parallelism of the CM naturally fits many large-scale data-intensive applications. In this paper the implementation of parallel algorithms for the analysis of multidimensional images on the CM is presented. Different aspects in the analysis of multidimensional images are considered. In the field of artificial vision, the implementation of algorithms for the filtering of image sequences (both in space and time) and the estimation of the optical flow is described and some results in terms of accuracy and computation time are presented. The processing of three-dimensional images is investigated in the field of biomedical engineering. In this case the goal is the development of algorithms for the 3-D reconstruction of human body segments and their visualization. The parallel implementations exploit the fine-grain parallelism allowed by the CM, processing each point of the data on a different processor. This mechanism is allowed by the possibility of dynamically reconfiguring the connectivity of the CM nodes and of defining a huge number of virtual processors. Moreover, as the CM processors operate on one-bit data, it is possible to tune the number of bits for each data point to match the accuracy required by the application.

8.

DPGL: The Direct3D9-based Parallel Graphics Library for Multi-display Environment

Zhen Liu Jiao-Ying Shi 《International Journal of Automation and Computing》2007,4(1):30-37

The emergence of high performance 3D graphics cards has opened the way to PC clusters for high performance multi-display environments. In order to exploit the rendering ability of PC clusters, we should design appropriate parallel rendering algorithms and parallel graphics library interfaces. Due to the rapid development of Direct3D, we bring forward DPGL, the Direct3D9-based parallel graphics library in the D3DPR parallel rendering system, which implements Direct3D9 interfaces to support parallelization of existing Direct3D9 applications with no modification. Based on the parallelism analysis of the Direct3D9 rendering pipeline, we briefly introduce the D3DPR parallel rendering system. DPGL is the fundamental component of D3DPR. After presenting DPGL's three-layer architecture, we discuss rendering resource interception and management. Finally, we describe the design and implementation of DPGL in detail, including the rendering command interception layer, the rendering command interpretation layer, and the rendering resource parallelization layer.

9.

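A Parallel Fuzzy Clustering Algorithm Based on Kernel Methods (cited by: 1; self-citations: 0; citations by others: 1)

Peng Qiusheng, Wei Wenhong 《Computer Engineering and Design》2008,29(8):1881-1883

This paper introduces and analyzes the fuzzy C-means clustering algorithm, the kernel-based fuzzy C-means clustering algorithm, and the hard clustering algorithm. It combines hard clustering with fuzzy clustering, using the hard clustering algorithm to initialize the cluster centers, which effectively reduces the number of iterations of the fuzzy clustering algorithm. To handle massive data sets, the improved algorithm is parallelized, which effectively increases data processing speed and efficiency, and its performance is tested in an environment of distributed, interconnected PCs. The test results show that the kernel-based parallel fuzzy clustering algorithm has good scalability and speedup.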

10.

Applying Visual Analytics to Physically Based Rendering

G. Simons S. Herholz V. Petitjean T. Rapp M. Ament H. Lensch C. Dachsbacher M. Eisemann E. Eisemann 《Computer Graphics Forum》2019,38(1):197-208

Physically based rendering is a well‐understood technique to produce realistic‐looking images. However, different algorithms exist for efficiency reasons, which work well in certain cases but fail or produce rendering artefacts in others. Few tools allow a user to gain insight into the algorithmic processes. In this work, we present such a tool, which combines techniques from information visualization and visual analytics with physically based rendering. It consists of an interactive parallel coordinates plot, with a built‐in sampling‐based data reduction technique to visualize the attributes associated with each light sample. Two‐dimensional (2D) and three‐dimensional (3D) heat maps depict any desired property of the rendering process. An interactively rendered 3D view of the scene displays animated light paths based on the user's selection to gain further insight into the rendering process. The provided interactivity enables the user to guide the rendering process for more efficiency. To show its usefulness, we present several applications based on our tool. This includes differential light transport visualization to optimize light setup in a scene, finding the causes of and resolving rendering artefacts, such as fireflies, as well as a path length contribution histogram to evaluate the efficiency of different Monte Carlo estimators.

11.

Efficient External Memory Algorithms by Simulating Coarse-Grained Parallel Algorithms

Dehne Dittrich Hutchinson 《Algorithmica》2003,36(2):97-122

External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work on parallel algorithms to EM, but with limited success. The combination of EM computing, on multiple disks, with multiprocessor parallelism has been posted as a challenge by the ACM Working Group on Storage I/O for Large-Scale Computing. In this paper we provide a simulation technique which produces efficient parallel EM algorithms from efficient BSP-like parallel algorithms. The techniques obtained can accommodate one or multiple processors on the EM target machine, each with one or more disks, and they also adapt to the disk blocking factor of the target machine. When applied to existing BSP-like algorithms, our simulation technique produces improved parallel EM algorithms for a large number of problems.

12.

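A Parallel Dataflow Approach to Optimizing and Processing Parallel Database Queries (cited by: 1; self-citations: 0; citations by others: 1)

Li Jianzhong 《Journal of Software》1998,9(3):174-180

This paper studies how to optimize and process parallel database queries using parallel dataflow techniques. It proposes a complete set of related algorithms and presents a complete design of a parallel database query optimizer and processor based on the parallel dataflow approach. These algorithms and the corresponding query optimizer and processor have been used in a prototype parallel database management system designed by the author. Practice shows that the parallel dataflow approach not only allows a parallel database management system to be implemented quickly and effectively, but also supports effective optimization and processing of parallel database queries.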

13.

Table‐driven Adaptive Importance Sampling

David Cline Daniel Adams Parris Egbert 《Computer Graphics Forum》2008,27(4):1115-1123

Monte Carlo rendering algorithms generally rely on some form of importance sampling to evaluate the measurement equation. Most of these importance sampling methods only take local information into account, however, so the actual importance function used may not closely resemble the light distribution in the scene. In this paper, we present Table‐driven Adaptive Importance Sampling (TAIS), a sampling technique that augments existing importance functions with tabular importance maps that direct sampling towards undersampled regions of path space. The importance maps are constructed lazily, relying on information gathered during the course of sampling. During sampling the importance maps act either in parallel with or as a preprocess to existing importance sampling methods. We show that our adaptive importance maps can be effective at reducing variance in a number of rendering situations.
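
For readers unfamiliar with the basic estimator that this abstract builds on, the sketch below shows plain (non-adaptive) importance sampling on a one-dimensional toy integral. It is not the TAIS method. The integrand f(x) = x^2 on [0, 1], the sampling density p(x) = 2x, and the function name are assumptions chosen so that the exact answer, 1/3, is known.

```python
import math
import random


def importance_sampling_estimate(num_samples: int, seed: int = 0) -> float:
    """Estimate the integral of x^2 over [0, 1] using samples drawn from p(x) = 2x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        u = 1.0 - rng.random()      # uniform in (0, 1], avoids x == 0
        x = math.sqrt(u)            # inverse CDF of p(x) = 2x, whose CDF is x^2
        f = x * x                   # integrand value
        p = 2.0 * x                 # sampling density at x
        total += f / p              # weight each sample by f(x) / p(x)
    return total / num_samples


estimate = importance_sampling_estimate(20000)
```

The closer p is to the shape of f, the lower the variance of the weights f(x)/p(x), which is the property adaptive schemes try to improve during sampling.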

14.

A GPU implementation for LBG and SOM training (cited by: 1; self-citations: 1; citations by others: 0)

Yi Xiao Chi Sing Leung Tze-Yui Ho Ping-Man Lam 《Neural computing & applications》2011,20(7):1035-1042

Vector quantization (VQ) is an effective technique applicable in a wide range of areas, such as image compression and pattern recognition. The most time-consuming procedure of VQ is codebook training, and two of the frequently used training algorithms are LBG and self-organizing map (SOM). Nowadays, desktop computers are usually equipped with programmable graphics processing units (GPUs), whose parallel data-processing ability is ideal for codebook training acceleration. Although there are some GPU algorithms for LBG training, their implementations suffer from a large amount of data transfer between CPU and GPU and a large number of rendering passes within a training iteration. This paper presents a novel GPU-based implementation of LBG and SOM training. More specifically, we utilize the random write ability of the vertex shader to reduce the overheads mentioned above. Our experimental results show that our approach can run four times faster than the previous approach.
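
To make the cost of codebook training concrete, the sketch below shows one nearest-codeword assignment and centroid update, the iteration at the core of LBG training. It is a CPU-only illustration under stated assumptions: the function names and the tuple-based data layout are hypothetical, codebook splitting is omitted, and nothing here reflects the paper's GPU implementation.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lbg_iteration(samples: List[Vector], codebook: List[Vector]) -> List[Vector]:
    """Assign each sample to its nearest codeword, then move codewords to centroids."""
    dims = len(codebook[0])
    sums = [[0.0] * dims for _ in codebook]
    counts = [0] * len(codebook)
    for sample in samples:
        # Nearest-codeword search: the step that dominates training time.
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(sample, codebook[k]))
        counts[nearest] += 1
        for d, value in enumerate(sample):
            sums[nearest][d] += value
    updated: List[Vector] = []
    for k, codeword in enumerate(codebook):
        if counts[k] == 0:
            updated.append(tuple(codeword))  # keep codewords that attracted no samples
        else:
            updated.append(tuple(v / counts[k] for v in sums[k]))
    return updated


samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
new_codebook = lbg_iteration(samples, [(1.0, 1.0), (9.0, 9.0)])
```

Every sample is compared against every codeword independently, which is why the assignment step maps well to data-parallel hardware.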

15.

A relaxation scheme for increasing the parallelism in Jacobi-SVD

Sanguthevar Rajasekaran Mingjun Song 《Journal of Parallel and Distributed Computing》2008

The Singular Value Decomposition (SVD) is a vital problem that finds a place in numerous application domains in science and engineering. As an example, SVDs are used in processing voluminous datasets. Many sequential and parallel algorithms have been proposed to compute SVDs. The best known sequential algorithms take cubic time. This amount of time may not be acceptable especially when the data size is large. Thus parallel algorithms are desirable. In this paper, we present a novel technique for the parallel computation of SVDs. This technique yields impressive speedups.

16.

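Application of a Compound Parallel Pipeline to Parallel Rendering on PC Clusters (cited by: 2; self-citations: 0; citations by others: 2)

Peng Haoyu, Jin Zhefan, Qin Aihong, Xiong Hua, Shi Jiaoying 《Journal of Computer-Aided Design & Computer Graphics》2006,18(10):1581-1586

A hybrid architecture based on dynamic rendering groups is proposed. In addition to the parallel processing pipeline across dynamic rendering groups, a frame-buffering parallel pipeline is designed inside each dynamic rendering group to improve the workflow. Together these form a compound parallel rendering pipeline that greatly improves the overall performance of a parallel graphics rendering system based on a PC cluster. A prototype system using this compound pipeline performed very well in practical tests, with considerably higher performance than a rendering system with a single-level parallel rendering pipeline.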

17.

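Recovering Depth Information from Concentric Mosaics (cited by: 2; self-citations: 0; citations by others: 2)

Li Yin, Lu Hanqing, Shen Xiangyang 《Chinese Journal of Computers》2000,23(12):1306-1312

Concentric mosaics (CM) are an important image-based rendering method. Using depth information can further improve rendering quality and reduce the amount of data. The authors found that approximate epipolar plane images (EPIs) also exist in CM sequences, and that the slope of the trajectory of an image point on an EPI is approximately linearly related to the depth of the corresponding object point. Based on this finding, the paper proposes a method for automatically recovering depth information from CM sequences. The method first estimates the depth range of the scene through spectral analysis of the EPIs. Then, following the principle of plenoptic sampling, it divides the EPI slope space uniformly into voting bins and votes on the spectral energy within a given window to find the direction of maximum energy, which gives the depth corresponding to that window. Finally, the results are assembled into a scene depth map for the CM. Experimental results confirm the effectiveness of the method.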

18.

Algorithms for rendering realistic terrain image sequences and their parallel implementation

Gennady Agranov Craig Gotsman 《The Visual computer》1995,11(9):455-464

We present algorithms for rendering realistic images of large terrains and their implementation on a parallel computer for rapid production of terrain-animation sequences. “Large” means datasets too large for RAM. A hybrid ray-casting and projection technique incorporates quadtree subdivision techniques and filtering using precomputed bit masks. Hilbert space-filling curves determine the image-pixel rendering order. A parallel version of the algorithm is based on a Meiko parallel computer architecture, designed to relieve dataflow bottlenecks and exploit temporal image coherence. Our parallel system, incorporating 26 processors, can generate a full-color terrain image at video resolution (without noticeable aliasing artifacts) every 2 s, including I/O and communication overheads.
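
The abstract states that Hilbert space-filling curves determine the pixel rendering order. The sketch below uses the standard iterative index-to-coordinate conversion for a Hilbert curve on an n x n grid, where n is assumed to be a power of two. The function names are hypothetical, and how the paper maps the curve onto its actual image sizes is not specified in the abstract.

```python
from typing import List, Tuple


def hilbert_index_to_xy(n: int, d: int) -> Tuple[int, int]:
    """Map position d along a Hilbert curve to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate by swapping axes
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_pixel_order(n: int) -> List[Tuple[int, int]]:
    """Return all pixels of an n x n image in Hilbert-curve order."""
    return [hilbert_index_to_xy(n, d) for d in range(n * n)]


order = hilbert_pixel_order(8)
```

Consecutive pixels in this order are always grid neighbours, so data fetched for one pixel tends to be reused for the next, which matters when the terrain does not fit in RAM.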

19.

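Research on Volume Rendering Techniques for Irregular Data Fields

Hong Xiong, Dai Guangming 《Microcomputer Development》2004,14(8):44-46

The core of visualization in scientific computing is the visualization of 3D data fields, and the current research focus in 3D visualization is volume rendering. This paper introduces the state of research on volume rendering of 3D irregular data fields. On this basis, by analyzing and comparing existing volume rendering techniques and algorithms for irregular data fields, it predicts future trends in this area and the research directions that deserve attention. Besides improving existing algorithms and combining different algorithms, research should also address hardware and system acceleration techniques, and should draw on walkthrough techniques to develop efficient visualization techniques and parallel algorithms for irregular data fields in 3D space.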

20.

Parallel computing of 3D smoking simulation based on OpenCL heterogeneous platform

Zhiyong Yuan Weixin Si Xiangyun Liao Zhaoliang Duan Yihua Ding Jianhui Zhao 《The Journal of supercomputing》2012,61(1):84-102

Open Computing Language (OpenCL) is an open royalty-free standard for general purpose parallel programming across Central Processing Units (CPUs), Graphics Processing Units (GPUs) and other processors. This paper introduces OpenCL to implement real-time smoking simulation in a virtual surgery training simulation system. Firstly, Computational Fluid Dynamics (CFD) is adopted to construct the real-time smoking simulation model based on the Navier-Stokes (N-S) equations of an incompressible fluid under the condition of normal temperature and pressure. Then we propose a parallel computing technique based on OpenCL to accomplish the parallel computing of the smoking simulation model on CPU and GPU, respectively. Finally, we render the smoke in real time by using a three-dimensional (3D) texture volume rendering method. Experimental results show that the parallel computing technique we have proposed achieves a satisfactory effect on image quality and rendering rate both on CPU and GPU.