Similar Documents (20 results)
1.
In this paper, we propose an efficient solution that addresses the performance problems of current single-pass GPU raycasting algorithms. Our approach provides more control over the rendering process by introducing tighter ray segments for raycasting, while avoiding the introduction of any new rendering artefacts. We achieve this by dynamically generating, on the GPU, a coarsely fitted proxy geometry, composed of spheres, for the active blocks. The spheres are then rasterised into two z-buffers in a single rendering pass. The resulting z-buffers are used as the first-hit and last-hit points for the subsequent raycaster. With this approach, only the valid ray segments between the two z-buffers need to be sampled during raycasting. This also yields more coherent parallelism on the GPU, since ray lengths are more consistent and the overheads and dynamic branching of per-sample checks during the raycasting pass are avoided.
Our technique is ideal for dynamic data exploration in which both the transfer function and the view parameters need to be changed frequently at runtime. The rendering results of our algorithm are identical to those of the general cube-based proxy geometry algorithm, but performance can be up to 15.7 times faster. Furthermore, the approach can be adopted by any existing raycasting system in a straightforward way.
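As a rough illustration of the idea, a minimal CPU sketch follows (not the authors' GPU code): each pixel's ray is sampled only between its first-hit and last-hit depths. The density() function and the buffer contents are hypothetical placeholders.

```cpp
#include <vector>
#include <cmath>
#include <cstdio>

float density(float x, float y, float z) {        // placeholder volume
    return std::exp(-(x * x + y * y + z * z));    // single Gaussian blob
}

int main() {
    const int W = 4, H = 4;
    // In the paper these come from rasterising the sphere proxies into
    // two z-buffers; here they are filled with dummy values.
    std::vector<float> firstHit(W * H, 1.0f), lastHit(W * H, 3.0f);
    const float step = 0.1f;

    for (int p = 0; p < W * H; ++p) {
        float accum = 0.0f;
        // Sample only the valid segment [firstHit, lastHit]; everything
        // outside it is known to be empty space.
        for (float t = firstHit[p]; t <= lastHit[p]; t += step)
            accum += density(0.0f, 0.0f, t - 2.0f) * step;
        std::printf("pixel %d: %.3f\n", p, accum);
    }
}
```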

2.
Metaballs are implicit surfaces widely used to model curved objects, represented by the isosurface of a density field defined by a set of points. Recently, the results of particle-based simulations have often been visualized using a large number of metaballs; however, such visualizations have high rendering costs. In this paper we propose a fast technique for rendering metaballs on the GPU. Instead of using polygonization, the isosurface is evaluated directly in a per-pixel manner. For such evaluation, all metaballs contributing to the isosurface need to be extracted along each viewing ray, within the limited memory of GPUs. We handle this by keeping a list of the metaballs contributing to the isosurface and updating it efficiently. Our method requires neither expensive precomputation nor the acceleration data structures often used in existing ray tracing techniques. With several optimizations, we can display a large number of moving metaballs quickly.
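A small CPU sketch of the per-pixel evaluation idea, not the paper's GPU implementation: the density field is the sum of Gaussian kernels, the viewing ray is marched until the threshold is crossed, and the hit is refined by bisection. The ball positions, kernel, and iso value are invented for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstdio>

struct Ball { float x, y, z, r; };
const std::array<Ball, 2> balls = {{{0, 0, 2, 1}, {0.5f, 0, 2.5f, 1}}};

float field(float x, float y, float z) {
    float f = 0;
    for (const Ball& b : balls) {
        float d2 = (x-b.x)*(x-b.x) + (y-b.y)*(y-b.y) + (z-b.z)*(z-b.z);
        f += std::exp(-d2 / (b.r * b.r));         // Gaussian kernel
    }
    return f;
}

int main() {
    const float iso = 0.8f;
    float prev = 0, t = 0;
    for (t = 0; t < 5; t += 0.05f) {              // march the viewing ray
        float cur = field(0, 0, t);
        if ((prev < iso) != (cur < iso)) break;   // threshold crossing found
        prev = cur;
    }
    float lo = t - 0.05f, hi = t;                 // refine by bisection
    for (int i = 0; i < 20; ++i) {
        float mid = 0.5f * (lo + hi);
        if (field(0, 0, mid) < iso) lo = mid; else hi = mid;
    }
    std::printf("isosurface hit at t = %.4f\n", 0.5f * (lo + hi));
}
```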

3.
We develop an approach for hardware-accelerated, high-quality rendering of volume data using trivariate splines. The proposed quasi-interpolating schemes permit real-time reconstruction. The low total degrees provide several advantages for our GPU implementation. In particular, intersecting rays with spline isosurfaces for direct Phong illumination is performed by simple root-finding algorithms (analytic and iterative), while the necessary normals result from blossoming. Since visualizations are computed on a per-fragment basis, our isosurface renderer includes an automatic level of detail. While we use well-known spatial data structures in the CPU part of the algorithm for hierarchical view-frustum culling and memory reduction, our GPU implementation has to take the highly complex structure of the splines into account. This includes an appropriate organization of the data streams, i.e. we develop an advanced encoding scheme for the spline coefficients, as well as an implicit scheme for bounding geometry retrieval. In addition, we propose an elaborate clipping procedure to be performed in the fragment shader. These features substantially reduce bus traffic, memory consumption, and data access on the GPU, leading to interactive frame rates for renderings of high visual quality. Compared with pure CPU implementations and existing GPU implementations for trivariate polynomials, frame rates increase by factors between 10 and 100.

4.
Adaptive Caustic Maps Using Deferred Shading
Caustic maps provide an interactive image-space method to render caustics, the focusing of light via reflection and refraction. Unfortunately, caustic mapping suffers problems similar to shadow mapping: aliasing from poor sampling and map projection, as well as temporal incoherence from frame-to-frame sampling variations. To reduce these problems, researchers have suggested methods ranging from caustic blurring to building a multiresolution caustic map. Yet these all require a fixed photon sampling, precluding the use of importance-based photon densities. This paper introduces adaptive caustic maps. Instead of densely sampling photons via a rasterization pass, we adaptively emit photons using a deferred shading pass. We describe deferred rendering for refractive surfaces, which speeds rendering of refractive geometry by up to 25% and, with adaptive sampling, speeds caustic rendering by up to 200%. These benefits are particularly noticeable for complex geometry or when using millions of photons. While developed for a GPU rasterizer, adaptive caustic map creation can be performed by any renderer that individually traces photons, e.g., a GPU ray tracer.
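A rough sketch of the adaptive emission idea, under assumed simplifications: start from a coarse grid of candidate photon positions and subdivide only cells that land on refractive geometry, so photon density follows importance. The hitsRefractor() test is a made-up stand-in for the deferred-shading lookup.

```cpp
#include <vector>
#include <cstdio>

struct Cell { float x, y, size; };

bool hitsRefractor(const Cell& c) {           // hypothetical G-buffer test
    return c.x > 0.5f && c.y > 0.5f;          // refractor in one corner
}

int main() {
    std::vector<Cell> work = {{0.25f, 0.25f, 0.5f}, {0.75f, 0.25f, 0.5f},
                              {0.25f, 0.75f, 0.5f}, {0.75f, 0.75f, 0.5f}};
    int emitted = 0;
    while (!work.empty()) {
        Cell c = work.back(); work.pop_back();
        if (!hitsRefractor(c)) continue;      // no photons needed here
        if (c.size <= 0.125f) { ++emitted; continue; }   // emit photon
        float h = c.size * 0.25f, s = c.size * 0.5f;
        work.push_back({c.x - h, c.y - h, s}); // subdivide into 4 children
        work.push_back({c.x + h, c.y - h, s});
        work.push_back({c.x - h, c.y + h, s});
        work.push_back({c.x + h, c.y + h, s});
    }
    std::printf("emitted %d photons adaptively\n", emitted);
}
```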

5.
In this paper, we present a rapid prototyping framework for GPU-based volume rendering. To this end, we propose a dynamic shader pipeline based on the SuperShader concept and discuss the design decisions, along with the key requirements that drove the development of our system. In our approach, we break down the rendering shader into areas containing code for different computations, which are defined as freely combinable, modularized shader blocks. Hence, high-level changes to the rendering configuration result in the implicit modification of the underlying shader pipeline. Furthermore, the prototyping system allows custom shader code to be inserted between shader blocks of the pipeline at run-time. A suitable user interface is available within the prototyping environment to allow intuitive modification of the shader pipeline. Thus, appropriate solutions for visualization problems can be developed interactively. We demonstrate the usage and usefulness of our framework with implementations of dynamic rendering effects for medical applications.
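A minimal sketch of such a block-based pipeline, assuming a SuperShader-style assembly: the renderer concatenates freely combinable shader blocks, with room for user code between them, into one GLSL fragment shader at run-time. The block names and GLSL snippets are illustrative, not the paper's actual blocks.

```cpp
#include <string>
#include <vector>
#include <iostream>

int main() {
    // Each entry is one modularized shader block; custom user code can
    // be spliced in between blocks before assembly.
    std::vector<std::string> blocks = {
        "vec4 sampleVolume(vec3 p) { return texture(tex, p); }",
        "vec4 classify(vec4 s)     { return texture(tfn, s.r); }",
        "vec4 shade(vec4 c)        { return c; }  // e.g. lighting block",
    };
    std::string shader = "#version 330 core\n";
    for (const auto& b : blocks) shader += b + "\n";
    std::cout << shader;  // would then be compiled by the GL driver
}
```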

6.
Visualizing dynamic participating media in particle form by fully solving equations from light transport theory is a computationally very expensive process. In this paper, we present a computational pipeline for particle volume rendering that is easily accelerated by current GPUs. To fully harness their massively parallel computing power, we transform the input particles into a volumetric density field using a GPU-assisted, adaptive density estimation technique that iteratively adapts the smoothing length for local grid cells. The volume data is then visualized efficiently based on the volume photon mapping method, where our GPU techniques further improve the rendering quality offered by previous implementations while keeping the rendering computation within acceptable time. We demonstrate that high-quality volume renderings can easily be produced from large particle datasets in time frames of a few seconds to less than a minute.
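A CPU sketch in the spirit of the adaptive density estimation step (the particles, target count, and hat kernel are hypothetical): the smoothing length h of a cell is grown or shrunk until the number of particles it covers hits a target, and the density is then estimated with that h.

```cpp
#include <vector>
#include <cmath>
#include <cstdio>

int main() {
    std::vector<float> particles = {0.1f, 0.2f, 0.25f, 0.9f, 1.4f};
    const float cellCenter = 0.2f;
    const int target = 3;                 // desired neighbours per cell
    float h = 0.05f;                      // initial smoothing length

    for (int iter = 0; iter < 32; ++iter) {
        int count = 0;
        for (float p : particles)
            if (std::fabs(p - cellCenter) < h) ++count;
        if (count == target) break;
        h *= (count < target) ? 1.5f : 0.75f;   // adapt smoothing length
    }
    // Density estimate with a simple hat kernel of radius h.
    float rho = 0;
    for (float p : particles) {
        float d = std::fabs(p - cellCenter);
        if (d < h) rho += (1.0f - d / h) / h;
    }
    std::printf("h = %.3f, density = %.3f\n", h, rho);
}
```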

7.
We present an efficient and scalable system that enables programmable motion effects on GPUs. Our system is based on the framework proposed by Schmid et al. [SSBG10] that extends the concept of a surface shader to that of a programmable motion effect. While capable of expressing a variety of motion depiction styles, the execution of motion effect programs requires global knowledge about all portions of an object's surface that pass in front of a pixel during an arbitrarily long period of time, resulting in extremely high memory usage and significantly restricting the degree of parallelism of typical GPU rendering algorithms, which parallelize computations over the pixels of each frame of an animation. To address this problem, we design our system to process multiple frames of a pixel in parallel. This new parallelization approach enables better utilization of GPU memory and also makes it possible to design an efficient out-of-core algorithm, which is required for rendering real-world animations. We also develop an analytical visibility algorithm to resolve depth conflicts between objects, reducing the required temporal resampling rate and further exposing parallelism. Experiments show that we are able to handle very large scenes and improve runtime performance by up to an order of magnitude.

8.
In this paper, we present a new approach for shape-grammar-based generation and rendering of huge cities in real time on the graphics processing unit (GPU). Traditional approaches rely on evaluating a shape grammar and storing the generated geometry as a preprocessing step. During rendering, the pregenerated data is then streamed to the GPU. By interweaving generation and rendering, we overcome the problems and limitations of streaming pregenerated data. Using our methods of visibility pruning and adaptive level of detail, we are able to dynamically generate only the geometry needed to render the current view, in real time, directly on the GPU. We also present a robust and efficient way to dynamically update a scene's derivation tree and geometry, enabling us to exploit frame-to-frame coherence. Our combined generation and rendering is significantly faster than all previous work. For detailed scenes, we are capable of generating geometry more rapidly than even just copying pregenerated data from main memory, enabling us to render cities with thousands of buildings at up to 100 frames per second, even with the camera moving at supersonic speed.
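A toy derivation loop in the spirit of grammar-based generation, with invented rules: non-terminal shapes are repeatedly rewritten by split rules (a building into floors, a floor into facade tiles) until only terminal geometry remains.

```cpp
#include <vector>
#include <string>
#include <cstdio>

struct Shape { std::string symbol; float height; };

int main() {
    std::vector<Shape> work = {{"building", 9.0f}}, terminals;
    while (!work.empty()) {
        Shape s = work.back(); work.pop_back();
        if (s.symbol == "building") {
            for (int i = 0; i < 3; ++i)            // split into floors
                work.push_back({"floor", s.height / 3});
        } else if (s.symbol == "floor") {
            for (int i = 0; i < 4; ++i)            // split into facade tiles
                work.push_back({"tile", s.height});
        } else {
            terminals.push_back(s);                // terminal: emit geometry
        }
    }
    std::printf("%zu terminal shapes generated\n", terminals.size());
}
```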

9.
We present a hybrid ray tracing system, where the work is divided between the CPU cores and the GPU in an integrated chip, and communication occurs via shared memory. Rays are organized in large packets that can be distributed among the two units as needed. Testing visibility between rays and the scene is mostly performed using an optimized kernel on the GPU, but the CPU can help as necessary. The CPU cores typically handle most or all shading, which makes it easy to support complex appearances. For efficiency, the CPU cores shade whole batches of rays by sorting them by material and shading each material using a vectorized kernel. In addition, we introduce a method to support light paths with arbitrary recursion, such as multiple recursive Whitted-style ray tracing and adaptive sampling, where the result of one ray is examined before the next is sent, while still batching up rays for the benefit of GPU-accelerated traversal and vectorized shading. This allows our system to achieve high rendering performance while maintaining the flexibility to accommodate different rendering algorithms.
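A minimal sketch of the material-sorted batch shading described above: rays are sorted by material id so that each material's shader runs over one contiguous, vectorizable batch. The material ids and ray payloads are made up for illustration.

```cpp
#include <algorithm>
#include <vector>
#include <cstdio>

struct Ray { int materialId; float hitT; };

int main() {
    std::vector<Ray> batch = {{2, 1.f}, {0, 2.f}, {2, .5f}, {1, 3.f}, {0, 1.f}};
    std::sort(batch.begin(), batch.end(),
              [](const Ray& a, const Ray& b) { return a.materialId < b.materialId; });

    // Shade each material's contiguous run with one (ideally SIMD) kernel.
    for (size_t i = 0; i < batch.size();) {
        size_t j = i;
        while (j < batch.size() && batch[j].materialId == batch[i].materialId) ++j;
        std::printf("material %d: shading %zu rays in one batch\n",
                    batch[i].materialId, j - i);
        i = j;
    }
}
```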

10.
Higher-order finite element methods have emerged as an important discretization scheme for simulation. They are increasingly used in contemporary numerical solvers, generating a new class of data that must be analyzed by scientists and engineers. Currently available visualization tools for this type of data are either batch oriented or limited to certain cell types and polynomial degrees. Other approaches approximate higher-order data by resampling, trading off interactivity against quality. To overcome these limitations, we have developed a distributed visualization system that allows interactive exploration of non-conforming unstructured grids, resulting from space-time discontinuous Galerkin simulations, in which each cell has its own higher-order polynomial solution. Our system employs GPU-based raycasting for direct volume rendering of complex grids which feature non-convex, curvilinear cells with varying polynomial degree. Frequency-based adaptive sampling accounts for the high variations along rays. For distribution across a GPU cluster, the initial object-space partitioning is determined by cell characteristics such as the polynomial degree and is adapted at runtime by a load-balancing mechanism. The performance and utility of our system are evaluated for different aeroacoustic simulations involving the propagation of shock fronts.

11.
Level-of-Detail structures are a key component for scalable rendering. Built from raw 3D data, these structures are often defined as Bounding Volume Hierarchies, providing coarse-to-fine adaptive approximations that are well-adapted for many-view rasterization. Here, the total number of pixels in each view is usually low, while the cost of choosing the appropriate LoD for each view is high. This task represents a challenge for existing GPU algorithms. We propose ManyLoDs, a new GPU algorithm to efficiently compute many LoDs from a Bounding Volume Hierarchy in parallel by balancing the workload within and among LoDs. Our approach is not specific to a particular rendering technique, can be used on lazy representations such as polygon soups, and can handle dynamic scenes. We apply our method to various many-view rasterization applications, including Instant Radiosity, Point-Based Global Illumination, and reflection/refraction mapping. For each of these, we achieve real-time performance in complex scenes at high resolutions.
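A sketch of the basic per-view LoD-cut selection that such a system must perform, under a deliberately crude projected-size metric: the hierarchy is walked breadth-first and refinement stops once a node is small enough for the view. The hierarchy and thresholds are simplified stand-ins, not the ManyLoDs scheduling itself.

```cpp
#include <queue>
#include <vector>
#include <cstdio>

struct Node { float extent; int left, right; };  // -1 means no child (leaf)

int main() {
    std::vector<Node> bvh = {
        {4.0f, 1, 2}, {2.0f, 3, 4}, {2.0f, -1, -1},
        {1.0f, -1, -1}, {1.0f, -1, -1}};
    const float viewDistance = 10.0f;     // crude projected-size metric
    const float pixelThreshold = 0.15f;

    std::queue<int> work;
    work.push(0);
    while (!work.empty()) {
        int n = work.front(); work.pop();
        float projected = bvh[n].extent / viewDistance;
        bool refinable = bvh[n].left >= 0;
        if (projected > pixelThreshold && refinable) {
            work.push(bvh[n].left);       // too coarse for this view: descend
            work.push(bvh[n].right);
        } else {
            std::printf("node %d selected for the cut\n", n);
        }
    }
}
```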

12.
We present a performance comparison of bounding volume hierarchies and kd-trees for ray tracing on many-core architectures (GPUs). The comparison is focused on rendering times and traversal characteristics on the GPU using data structures that were optimized for very high performance of tracing rays. To achieve low rendering times, we extensively examine the constants used in termination criteria for the two data structures. We show that for a contemporary GPU architecture (NVIDIA Kepler), bounding volume hierarchies have higher ray tracing performance than kd-trees for simple and moderately complex scenes. On the other hand, kd-trees have higher performance for complex scenes, in particular for those with high depth complexity. Finally, we analyse the causes of the performance discrepancies using the profiling characteristics of the ray tracing kernels.

13.
We present a novel multi-view, projective texture mapping technique. While previous multi-view texturing approaches lead to blurring and ghosting artefacts if the 3D geometry and/or camera calibration are imprecise, we propose a texturing algorithm that warps ("floats") projected textures at run-time to preserve crisp, detailed texture appearance. Our GPU implementation achieves interactive to real-time frame rates. The method is very generally applicable and can be used in combination with many image-based rendering methods or projective texturing applications. By using Floating Textures in conjunction with, e.g., visual hull rendering, light field rendering, or free-viewpoint video, improved rendering results are obtained from fewer input images, less accurately calibrated cameras, and coarser 3D geometry proxies.

14.
We propose a versatile pipeline to render B-Rep models interactively, precisely, and without rendering-related artifacts such as cracks. Our rendering method is based on dynamic surface evaluation using both tessellation and ray-casting, and on direct GPU surface trimming. An initial rendering of the scene is performed using dynamic tessellation. The algorithm we propose reliably detects and then fills cracks in the rendered image. Crack detection works in image space, using depth information, while crack filling is either achieved in image space using a simple classification process, or performed in object space through selective ray-casting. The crack-filling method can be changed dynamically at runtime. Our image-space crack filling approach has a limited runtime cost and enables high-quality, real-time navigation. Our higher-quality, object-space approach results in a rendering of similar quality to full-scene ray-casting, but is 2 to 6 times faster, can be used during navigation, and provides accurate, reliable rendering. Integration of our work with existing tessellation-based rendering engines is straightforward.
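An image-space sketch of the crack detection idea on a single scanline: a pixel whose depth disagrees sharply with both horizontal neighbours, while those neighbours agree with each other, is flagged as a crack and filled from them. The depth values and thresholds are fabricated for illustration.

```cpp
#include <vector>
#include <cmath>
#include <cstdio>

int main() {
    // 1D scanline of depths; the far value 9.9 is a one-pixel crack.
    std::vector<float> depth = {1.0f, 1.01f, 9.9f, 1.02f, 1.03f};
    const float eps = 0.1f;

    for (size_t i = 1; i + 1 < depth.size(); ++i) {
        bool neighboursAgree = std::fabs(depth[i-1] - depth[i+1]) < eps;
        bool pixelDisagrees  = std::fabs(depth[i] - depth[i-1]) > eps;
        if (neighboursAgree && pixelDisagrees) {
            depth[i] = 0.5f * (depth[i-1] + depth[i+1]);  // fill the crack
            std::printf("filled crack at pixel %zu\n", i);
        }
    }
}
```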

15.
We present a GPU-accelerated volume ray casting system interactively driving a multi-user light field display. The display, driven by a single programmable GPU, is based on a specially arranged array of projectors and a holographic screen, and provides full horizontal parallax. The characteristics of the display are exploited to develop a specialized volume rendering technique able to give multiple freely moving naked-eye viewers the illusion of seeing and manipulating virtual volumetric objects floating in the display workspace. In our approach, a GPU ray-caster follows rays generated by a multiple-center-of-projection technique while sampling pre-filtered versions of the dataset at resolutions that match the varying spatial accuracy of the display. The method achieves interactive performance and provides rapid visual understanding of complex volumetric data sets even when using depth-oblivious compositing techniques.

16.
Empty-space skipping is an essential acceleration technique for volume rendering. Image-order empty-space skipping is not well suited to GPU implementation, since it must perform checks on, essentially, a per-sample basis, as in kd-tree traversal; this leads to a great deal of divergent branching at runtime, which is very expensive in a modern GPU pipeline. In contrast, object-order empty-space skipping is extremely fast on a GPU and has negligible overheads compared with approaches without empty-space skipping, since it employs the hardware rasterisation unit. However, previous object-order algorithms have been able to skip only exterior empty space, and not the interior empty space that lies inside or between volume objects. In this paper, we address these issues by proposing a multi-layer depth-peeling approach that can obtain all of the depth layers of the tight-fitting bounding geometry of the isosurface in a single rasterising pass. The maximum number of layers peeled by our approach can reach into the thousands while maintaining 32-bit floating-point accuracy, which was not possible previously. By raytracing only the valid ray segments between each consecutive pair of depth layers, we can skip both interior and exterior empty space efficiently. In comparisons with three state-of-the-art GPU isosurface rendering algorithms, this technique achieved much faster rendering across a variety of data sets.
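A CPU sketch of the segment skipping that the peeled layers enable: given all depth layers along a ray in front-to-back order, only the segments between consecutive entry/exit pairs are sampled. The layer depths are hypothetical.

```cpp
#include <vector>
#include <cstdio>

int main() {
    // Depths where the ray enters/leaves the tight bounding geometry,
    // as produced by peeling: [enter, exit, enter, exit, ...].
    std::vector<float> layers = {1.0f, 1.4f, 3.2f, 3.9f};
    const float step = 0.1f;

    for (size_t i = 0; i + 1 < layers.size(); i += 2) {
        int samples = 0;
        for (float t = layers[i]; t <= layers[i + 1]; t += step)
            ++samples;                      // the volume is sampled here
        std::printf("segment [%.1f, %.1f]: %d samples\n",
                    layers[i], layers[i + 1], samples);
    }
    // Everything between an exit and the next entry (interior empty
    // space) is skipped entirely.
}
```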

17.
We present novel parallel algorithms for collision detection and separation distance computation for rigid and deformable models that exploit the computational capabilities of many-core GPUs. Our approach uses thread and data parallelism to perform fast hierarchy construction, updating, and traversal using tight-fitting bounding volumes such as oriented bounding boxes (OBBs) and rectangular swept spheres (RSSs). We also describe efficient algorithms to compute a linear bounding volume hierarchy (LBVH) and update it using refitting methods. Moreover, we show that tight-fitting bounding volume hierarchies offer improved performance on GPU-like throughput architectures. We use our algorithms to perform discrete and continuous collision detection, including self-collisions, as well as separation distance computation between non-overlapping models. In practice, our approach (gProximity) can perform these queries in a few milliseconds on a PC with an NVIDIA GTX 285 card for models composed of tens or hundreds of thousands of triangles, as used in cloth simulation, surgical simulation, virtual prototyping, and N-body simulation. Moreover, we observe more than an order of magnitude performance improvement over prior GPU-based algorithms.
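A sketch of the usual first step of linear BVH (LBVH) construction, shown in 2D for brevity: primitive centroids are quantized, their coordinate bits interleaved into Morton codes, and the primitives sorted along the resulting space-filling curve. The input points are invented; the paper itself builds and refits the hierarchy on the GPU.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>
#include <cstdio>

uint32_t spreadBits(uint32_t v) {        // insert a 0 between each bit
    v &= 0xFFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}
uint32_t morton2D(uint32_t x, uint32_t y) {
    return (spreadBits(y) << 1) | spreadBits(x);
}

int main() {
    struct Prim { float x, y; uint32_t code; };
    std::vector<Prim> prims = {{0.9f, 0.1f, 0}, {0.1f, 0.1f, 0}, {0.5f, 0.6f, 0}};
    for (auto& p : prims)                 // quantize centroids to 16 bits
        p.code = morton2D(uint32_t(p.x * 65535), uint32_t(p.y * 65535));
    std::sort(prims.begin(), prims.end(),
              [](const Prim& a, const Prim& b) { return a.code < b.code; });
    for (const auto& p : prims)
        std::printf("(%.1f, %.1f) code %08x\n", p.x, p.y, p.code);
    // A hierarchy is then emitted over this sorted order and refitted
    // per frame for deforming geometry.
}
```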

18.
Historically, rendering system development has been mainly focused on improving the numerical accuracy of the rendering algorithms and their runtime efficiency. In this paper, we propose a method to improve the correctness not of the algorithms themselves, but of their implementation. Specifically, we show that by combining static type checking and generic programming, rendering system and shader development can take advantage of compile-time checking to perform dimensional analysis, i.e. to enforce the correctness of physical dimensions and units in light transport, and geometric space analysis, i.e. to ensure that geometric computations respect the spaces in which points, vectors and normals were defined. We demonstrate our methods by implementing a CPU path tracer and a GPU renderer which previews direct illumination. While we build on prior work to develop our implementations, the main contribution of our work is to show that dimensional analysis and geometric space checking can be successfully integrated into the development of rendering systems and shaders.
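A minimal sketch of compile-time dimensional analysis in the same spirit, using C++ templates rather than the paper's actual system: quantities carry their physical dimensions in the type, so dimensionally inconsistent expressions fail to compile. The dimension encoding is illustrative.

```cpp
#include <cstdio>

template <int Length, int Power>   // exponents of metres and watts
struct Quantity {
    double value;
};

// Multiplying two quantities adds their dimension exponents.
template <int L1, int P1, int L2, int P2>
Quantity<L1 + L2, P1 + P2> operator*(Quantity<L1, P1> a, Quantity<L2, P2> b) {
    return {a.value * b.value};
}

using Area     = Quantity<2, 0>;   // m^2
using Radiance = Quantity<-2, 1>;  // W / m^2 (solid angle omitted here)
using Power    = Quantity<0, 1>;   // W

int main() {
    Area a{2.0};
    Radiance r{5.0};
    Power p = r * a;               // dimensions check out: compiles
    // Power bad = r * r;          // would not compile: wrong dimensions
    std::printf("power = %f W\n", p.value);
}
```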

19.
We present an efficient Graphics Processing Unit (GPU)-based implementation of the Projected Tetrahedra (PT) algorithm. By reducing most of the CPU–GPU data transfer, the algorithm achieves interactive frame rates (up to 2.0 M tets/s) on current graphics hardware. Since no topology information is stored, it requires substantially less memory than recent interactive ray casting approaches. The method uses a two-pass GPU approach with two fragment shaders. This work also includes extended volume inspection capabilities, supporting interactive transfer function editing and isosurface highlighting using a Phong illumination model.

20.
Polyhedral meshes consisting of triangles, quads, pentagons, and polar configurations cover all major sampling and modeling scenarios. We give an algorithm for efficient local, parallel conversion of such meshes to an everywhere-smooth surface consisting of low-degree polynomial pieces. Quadrilateral facets with 4-valent vertices are 'regular' and are mapped to bi-cubic patches so that adjacent bi-cubics join C2, as for cubic tensor-product splines. The algorithm can be implemented in the vertex and geometry shaders of the GPU pipeline and does not use the fragment shader. Its DirectX 10 implementation achieves conversion plus rendering at 659 frames per second, with 42.5 million triangles per second, on an input model of 1300 facets of which 60% are not regular.
