Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose a standard ray tracing algorithm into several data‐parallel stages that are mapped efficiently to the massively parallel architecture of modern GPUs. These stages include: ray sorting into coherent packets, creation of frustums for packets, breadth‐first frustum traversal through a bounding volume hierarchy for the scene, and localized ray‐primitive intersections. We utilize the well‐known parallel primitives scan and segmented scan in order to process irregular data structures, to remove the need for a stack, and to minimize branch divergence in all stages. Our ray sorting stage is based on applying hash values to individual rays, ray stream compression, sorting and decompression. Our breadth‐first BVH traversal is based on parallel frustum‐bounding box intersection tests and a parallel scan per BVH level. We demonstrate our algorithm with area light sources to get a soft shadow effect and show that our concept is reasonable for GPU implementation. For the same data sets and ray‐primitive intersection routines our pipeline is ~3x faster than an optimized standard depth‐first ray tracer implemented in one kernel.
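
As a rough illustration of how a scan primitive can handle irregular data without a stack, the sketch below performs stream compaction with an exclusive prefix sum. It is a sequential Python stand-in, not the authors' CUDA pipeline; the function names `exclusive_scan` and `compact` are hypothetical.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: out[i] is the sum of flags[0..i-1]."""
    if not flags:
        return []
    inclusive = list(accumulate(flags))
    return [0] + inclusive[:-1]


def compact(items, keep):
    """Keep items whose flag is 1, in order, using scan-derived write offsets.

    On a GPU each element would write to out[offsets[i]] independently;
    here the loop just emulates those independent writes.
    """
    if not items:
        return []
    offsets = exclusive_scan(keep)
    total = offsets[-1] + keep[-1]
    out = [None] * total
    for i, item in enumerate(items):
        if keep[i]:
            out[offsets[i]] = item
    return out


if __name__ == "__main__":
    # Hypothetical example: nodes "b" is culled, the rest survive.
    print(compact(["a", "b", "c", "d"], [1, 0, 1, 1]))
```

Because every surviving element knows its output slot from the scan alone, no thread needs to coordinate with its neighbours when writing.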

2.
State‐of‐the‐art density estimation methods for rendering participating media rely on a dense photon representation of the radiance distribution within a scene. A critical bottleneck of such kernel‐based approaches is the excessive number of photons that are required in practice to resolve fine illumination details, while controlling the amount of noise. In this paper, we propose a parametric density estimation technique that represents radiance using a hierarchical Gaussian mixture. We efficiently obtain the coefficients of this mixture using a progressive and accelerated form of the Expectation‐Maximization algorithm. After this step, we are able to create noise‐free renderings of high‐frequency illumination using only a few thousand Gaussian terms, where millions of photons are traditionally required. Temporal coherence is trivially supported within this framework, and the compact footprint is also useful in the context of real‐time visualization. We demonstrate a hierarchical ray tracing‐based implementation, as well as a fast splatting approach that can interactively render animated volume caustics.

3.
Domain‐continuous visibility determination algorithms have proved to be very efficient at reducing noise otherwise prevalent in stochastic sampling. Even though they come with an increased overhead in terms of geometrical tests and visibility information management, their analytical nature provides such a rich integral that the pay‐off is often worth it. This paper presents a time‐continuous, primary visibility algorithm for motion blur aimed at ray tracing. Two novel intersection tests are derived and implemented. The first is for ray versus moving triangle and the second for ray versus moving AABB intersection. A novel take on shading is presented as well, where the time continuum of visible geometry is adaptively point‐sampled. Static geometry is handled using supplemental stochastic rays in order to reduce spatial aliasing. Finally, a prototype ray tracer with a full time‐continuous traversal kernel is presented in detail. The results are based on a variety of test scenarios and show that even though our time‐continuous algorithm has limitations, it outperforms multi‐jittered quasi‐Monte Carlo ray tracing in terms of image quality at equal rendering time, within wide sampling rate ranges.

4.
Our hybrid display model combines multiple automultiscopic elements volumetrically to support horizontal and vertical parallax at a larger depth of field and better accommodation cues compared to single layer elements. In this paper, we introduce a framework to analyze the bandwidth of such display devices. Based on this analysis, we show that multiple layers can achieve a wider depth of field using less bandwidth compared to single layer displays. We present a simple algorithm to distribute an input light field to multiple layers, and devise an efficient ray tracing algorithm for synthetic scenes. We demonstrate the effectiveness of our approach by both software simulation and two corresponding hardware prototypes.

5.
We propose a novel algorithm for construction of bounding volume hierarchies (BVHs) for multi‐core CPU architectures. The algorithm constructs the BVH by a divisive top‐down approach using a progressively refined cut of an existing auxiliary BVH. We propose a new strategy for refining the cut that significantly reduces the workload of individual steps of BVH construction. Additionally, we propose a new method for integrating spatial splits into the BVH construction algorithm. The auxiliary BVH is constructed using a very fast method such as LBVH based on Morton codes. We show that the method provides a very good trade‐off between the build time and ray tracing performance. We evaluated the method within the Embree ray tracing framework and show that it compares favorably with the Embree BVH builders regarding build time while maintaining comparable ray tracing speed.
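
For readers unfamiliar with the Morton codes that LBVH-style builders sort on, the following is a minimal sketch of the common 30-bit 3D encoding (10 bits per axis, bit-interleaved). It assumes coordinates already normalized to [0, 1]; the names `expand_bits` and `morton3d` are illustrative and not taken from the paper or from Embree.

```python
def expand_bits(v):
    """Spread the low 10 bits of v so that two zero bits follow each bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def morton3d(x, y, z):
    """30-bit Morton code for a point with coordinates in [0, 1]."""
    def quantize(c):
        # Map [0, 1] onto the integer grid 0..1023, clamping out-of-range input.
        return min(max(int(c * 1024.0), 0), 1023)

    return (
        (expand_bits(quantize(x)) << 2)
        | (expand_bits(quantize(y)) << 1)
        | expand_bits(quantize(z))
    )


if __name__ == "__main__":
    # Hypothetical primitive centroids, sorted along the Morton curve.
    centroids = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.5, 0.2, 0.7)]
    print(sorted(centroids, key=lambda p: morton3d(*p)))
```

Sorting primitives by this key places spatially nearby centroids close together in memory, which is what makes a fast auxiliary hierarchy possible.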

6.
We propose a method for creating a bounding volume hierarchy (BVH) that is optimized for all frames of a given animated scene. The method is based on a novel extension of the surface area heuristic to the temporal domain (T‐SAH). We perform iterative BVH optimization using T‐SAH and create a single BVH accounting for scene geometry distribution at different frames of the animation. Having a single optimized BVH for the whole animation makes our method extremely easy to integrate into any application using BVHs, limiting the per‐frame overhead only to refitting the bounding volumes. We evaluated the T‐SAH optimized BVHs in the scope of real‐time GPU ray tracing. We demonstrate that our method can handle even highly complex inputs with large deformations and significant topology changes. The results show that in a vast majority of tested scenes our method provides significantly better run‐time performance than traditional SAH and also better performance than GPU‐based per‐frame BVH rebuild.
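
The sketch below evaluates the classical (single-frame) surface area heuristic for one candidate split, as background for the temporal extension mentioned above. It does not implement T‐SAH, whose exact formulation is not given in this abstract. The cost constants `c_trav` and `c_isect` and all function names are assumptions chosen for illustration.

```python
def surface_area(lo, hi):
    """Surface area of an axis-aligned box given its min and max corners."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_split_cost(parent, left, right, n_left, n_right,
                   c_trav=1.0, c_isect=1.0):
    """Classical SAH cost of splitting `parent` into `left` and `right`.

    Each box is a (lo, hi) pair of 3-tuples. A child's hit probability is
    approximated by the ratio of its surface area to the parent's.
    """
    sa_parent = surface_area(*parent)
    weighted = (surface_area(*left) * n_left
                + surface_area(*right) * n_right)
    return c_trav + c_isect * weighted / sa_parent


if __name__ == "__main__":
    parent = ((0, 0, 0), (2, 1, 1))
    left = ((0, 0, 0), (1, 1, 1))
    right = ((1, 0, 0), (2, 1, 1))
    # Hypothetical split with 3 primitives on the left and 1 on the right.
    print(sah_split_cost(parent, left, right, n_left=3, n_right=1))
```

A builder would evaluate this cost for many candidate splits and keep the cheapest one.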

7.
The generation of discrete stream surfaces is an important and challenging task in scientific visualization, which can be considered a particular instance of geometric modeling. The quality of numerically integrated stream surfaces depends on a number of parameters that can be controlled locally, such as time step or distance of adjacent vertices on the front line. In addition there is a parameter that cannot be controlled locally: stream surface meshes tend to show high quality, well‐shaped elements only if the current front line is “globally” approximately perpendicular to the flow direction. We analyze the impact of this geometric property and present a novel solution – a stream surface integrator that forces the front line to be perpendicular to the flow and that generates quad‐dominant meshes with well‐shaped and well‐aligned elements. It is based on the integration of a scaled version of the flow field, and requires repeated minimization of an error functional along the current front line. We show that this leads to computing the 1‐dimensional kernel of a bidiagonal matrix: a linear problem that can be solved efficiently. We compare our method with existing stream surface integrators and apply it to a number of synthetic and real world data sets.

8.
This survey gives an overview of the current state of the art in GPU techniques for interactive large‐scale volume visualization. Modern techniques in this field have brought about a sea change in how interactive visualization and analysis of giga‐, tera‐ and petabytes of volume data can be enabled on GPUs. In addition to combining the parallel processing power of GPUs with out‐of‐core methods and data streaming, a major enabler for interactivity is making both the computational and the visualization effort proportional to the amount and resolution of data that is actually visible on screen, i.e. ‘output‐sensitive’ algorithms and system designs. This leads to recent output‐sensitive approaches that are ‘ray‐guided’, ‘visualization‐driven’ or ‘display‐aware’. In this survey, we focus on these characteristics and propose a new categorization of GPU‐based large‐scale volume visualization techniques based on the notions of actual output‐resolution visibility and the current working set of volume bricks—the current subset of data that is minimally required to produce an output image of the desired display resolution. Furthermore, we discuss the differences and similarities of different rendering and data traversal strategies in volume rendering by putting them into a common context—the notion of address translation. For our purposes here, we view parallel (distributed) visualization using clusters as an orthogonal set of techniques that we do not discuss in detail but that can be used in conjunction with what we present in this survey.

9.
Simulation of light transport through lens systems plays an important role in graphics. While basic imaging properties can be conveniently derived from linear models (like ABCD matrices), these approximations fail to describe nonlinear effects and aberrations that arise in real optics. Such effects can be computed by proper ray tracing, for which, however, finding suitable sampling and filtering strategies is often not a trivial task. Inspired by aberration theory, which describes the deviation from the linear ray transfer in terms of wavefront distortions, we propose a ray‐space formulation for nonlinear effects. In particular, we approximate the analytical solution to the ray tracing problem by means of a Taylor expansion in the ray parameters. This representation enables a construction‐kit approach to complex optical systems in the spirit of matrix optics. It is also very simple to evaluate, which allows for efficient execution on CPU and GPU alike, including the computation of mixed derivatives of any order. We evaluate fidelity and performance of our polynomial model, and show applications in high‐quality offline rendering and at interactive frame rates.
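
To make the linear baseline concrete, here is a small sketch of paraxial ABCD (ray transfer) matrices acting on a ray described by height and angle. It shows only the first-order model that the abstract contrasts with its polynomial approach; the helper names and the 50 mm focal length are illustrative assumptions.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def propagate(d):
    """Free-space propagation over distance d."""
    return [[1.0, d], [0.0, 1.0]]


def thin_lens(f):
    """Thin lens with focal length f."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def apply(m, ray):
    """Apply a transfer matrix to a paraxial ray (height, angle)."""
    y, theta = ray
    return (m[0][0] * y + m[0][1] * theta,
            m[1][0] * y + m[1][1] * theta)


if __name__ == "__main__":
    f = 50.0  # assumed focal length, arbitrary units
    # Elements compose right to left: the lens first, then propagation by f.
    system = matmul2(propagate(f), thin_lens(f))
    # A ray parallel to the axis at height 2 should cross the axis at the focus.
    print(apply(system, (2.0, 0.0)))
```

Composing elements by matrix multiplication is the "construction-kit" property that a higher-order model would need to preserve.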

10.
We present a performance comparison of bounding volume hierarchies and kd‐trees for ray tracing on many‐core architectures (GPUs). The comparison is focused on rendering times and traversal characteristics on the GPU using data structures that were optimized for very high performance of tracing rays. To achieve low rendering times, we extensively examine the constants used in termination criteria for the two data structures. We show that for a contemporary GPU architecture (NVIDIA Kepler) bounding volume hierarchies have higher ray tracing performance than kd‐trees for simple and moderately complex scenes. On the other hand, kd‐trees have higher performance for complex scenes, in particular for those with high depth complexity. Finally, we analyse the causes of the performance discrepancies using the profiling characteristics of the ray tracing kernels.

11.
We present a practical real‐time approach for rendering lens‐flare effects. While previous work employed costly ray tracing or complex polynomial expressions, we present a coarser, but also significantly faster solution. Our method is based on a first‐order approximation of the ray transfer in an optical system, which allows us to derive a matrix that maps lens flare‐producing light rays directly to the sensor. The resulting approach is easy to implement and produces physically‐plausible images at high framerates on standard off‐the‐shelf graphics hardware.

12.
Existing algorithms can efficiently render refractive objects of constant refractive index. For a medium with a continuously varying index of refraction, most algorithms use the ray equation of geometric optics to compute piecewise‐linear approximations of the non‐linear rays. By assuming a constant refractive index within each tracing step, these methods often need a large number of small steps to generate satisfactory images. In this paper, we present a new approach for tracing non‐constant, refractive media based on the ray equations of gradient‐index optics. We show that in a medium of constant index gradient, the ray equation has a closed‐form solution, and the intersection point between a ray and the medium boundaries can be efficiently computed using the bisection method. For general non‐constant media, we model the refractive index as a piecewise‐linear function and render the refraction by tracing the tetrahedron‐based representation of the media. Our algorithm can be easily combined with existing rendering algorithms such as photon mapping to generate complex refractive caustics at interactive frame rates. We also derive analytic ray formulations for tracing mirages – a special gradient‐index optical phenomenon.
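
The bisection step mentioned above can be sketched generically: given a curved ray path and a planar boundary, find the parameter at which the path crosses the plane. The curve used here is a made-up parabola chosen only so the answer is easy to check; it is not the closed-form solution derived in the paper, and `bisect_root` is a hypothetical helper name.

```python
def bisect_root(g, lo, hi, tol=1e-9, max_iter=200):
    """Find t in [lo, hi] with g(t) = 0, assuming g changes sign on the interval."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo * g_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0 or (hi - lo) < tol:
            return mid
        if g_lo * g_mid < 0:
            hi = mid              # sign change lies in the lower half
        else:
            lo, g_lo = mid, g_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative curved path height y(t) and a boundary plane at y = 0.375.
    def path_height(t):
        return t - 0.5 * t * t

    def signed_distance(t):
        return path_height(t) - 0.375

    print(bisect_root(signed_distance, 0.0, 1.0))
```

Bisection needs only evaluations of the path, so it pairs naturally with any closed-form ray expression.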

13.
Compared with its competitors such as the bounding volume hierarchy, a drawback of the kd‐tree structure is that a large number of triangles are repeatedly duplicated during its construction, which often leads to inefficient, large and tall binary trees with high triangle redundancy. In this paper, we propose a space‐efficient kd‐tree representation where, unlike commonly used methods, an inner node is allowed to optionally store a reference to a triangle, so highly redundant triangles in a kd‐tree can be culled from the leaf nodes and moved to the inner nodes. To avoid the construction of ineffective kd‐trees entailing computational inefficiencies due to early, possibly unnecessary, ray‐triangle intersection calculations that now have to be performed in the inner nodes during the kd‐tree traversal, we present heuristic measures for determining when and how to choose triangles for inner nodes during kd‐tree construction. Based on these metrics, we describe how the new form of kd‐tree is constructed and stored compactly using a carefully designed data layout. Our experiments with several example scenes showed that our kd‐tree representation technique significantly reduced the memory requirements for storing the kd‐tree structure, while effectively suppressing the unavoidable frame‐rate degradation observed during ray tracing.

14.
We propose two hardware mechanisms to decrease energy consumption on massively parallel graphics processors for ray tracing. First, we use a streaming data model and configure part of the L2 cache into a ray stream memory to enable efficient data processing through ray reordering. This increases L1 hit rates and reduces off‐chip memory energy substantially through better management of off‐chip memory access patterns. To evaluate this model, we augment our architectural simulator with a detailed memory system simulation that includes accurate control, timing and power models for memory controllers and off‐chip dynamic random‐access memory. These details change the results significantly over previous simulations that used a simpler model of off‐chip memory, indicating that this type of memory system simulation is important for realistic simulations that involve external memory. Secondly, we employ reconfigurable special‐purpose pipelines that are constructed dynamically under program control. These pipelines use shared execution units that can be configured to support the common compute kernels that are the foundation of the ray tracing algorithm. This reduces the overhead incurred by on‐chip memory and register accesses. These two synergistic features yield a ray tracing architecture that reduces energy by optimizing both on‐chip and off‐chip memory activity when compared to a more traditional approach.

15.
Ray‐traced global illumination (GI) is becoming widespread in production rendering but incoherent secondary ray traversal limits practical rendering to scenes that fit in memory. Incoherent shading also leads to intractable performance with production‐scale textures forcing renderers to resort to caching of irradiance, radiosity, and other values to amortize expensive shading. Unfortunately, such caching strategies complicate artist workflow, are difficult to parallelize effectively, and contend for precious memory. Worse, these caches involve approximations that compromise quality. In this paper, we introduce a novel path‐tracing framework that avoids these tradeoffs. We sort large, potentially out‐of‐core ray batches to ensure coherence of ray traversal. We then defer shading of ray hits until we have sorted them, achieving perfectly coherent shading and avoiding the need for shading caches.

16.
We present a real‐time rendering algorithm for inhomogeneous, single scattering media, where all‐frequency shading effects such as glows, light shafts, and volumetric shadows can all be captured. The algorithm first computes source radiance at a small number of sample points in the medium, then interpolates these values at other points in the volume using a gradient‐based scheme that is efficiently applied by sample splatting. The sample points are dynamically determined based on a recursive sample splitting procedure that adapts the number and locations of sample points for accurate and efficient reproduction of shading variations in the medium. The entire pipeline can be easily implemented on the GPU to achieve real‐time performance for dynamic lighting and scenes. Rendering results of our method are shown to be comparable to those from ray tracing.

17.
We present a new algorithm for efficient rendering of high‐quality depth‐of‐field (DoF) effects. We start with a single rasterized view (reference view) of the scene, and sample the light field by warping the reference view to nearby views. We implement the algorithm using NVIDIA's CUDA to achieve parallel processing, and exploit atomic operations to resolve visibility when multiple pixels warp to the same image location. We then directly synthesize DoF effects from the sampled light field. To reduce aliasing artifacts, we propose an image‐space filtering technique that compensates for spatial undersampling using MIP mapping. The main advantages of our algorithm are its simplicity and generality. We demonstrate interactive rendering of DoF effects in several complex scenes. Compared to existing methods, ours does not require ray tracing and hence scales well with scene complexity.

18.
We present an automatic image‐recoloring technique for enhancing color contrast for dichromats whose computational cost varies linearly with the number of input pixels. Our approach can be efficiently implemented on GPUs, and we show that for typical image sizes it is up to two orders of magnitude faster than the current state‐of‐the‐art technique. Unlike previous approaches, ours preserves temporal coherence and, therefore, is suitable for video recoloring. We demonstrate the effectiveness of our technique by integrating it into a visualization system and showing, for the first time, real‐time high‐quality recolored visualizations for dichromats.

19.
Signed distance functions (SDF) to explicit or implicit surface representations are intensively used in various computer graphics and visualization algorithms. Among others, they are applied to optimize collision detection, are used to reconstruct data fields or surfaces, and, in particular, are an obligatory ingredient for most level set methods. Level set methods are common in scientific visualization to extract surfaces from scalar or vector fields. Usual approaches for the construction of an SDF to a surface are either based on iterative solutions of a special partial differential equation or on marching algorithms involving a polygonization of the surface. We propose a novel method for a non‐iterative approximation of an SDF and its derivatives in a vicinity of a manifold. We use a second‐order algebraic fitting scheme to ensure high accuracy of the approximation. The manifold is defined (explicitly or implicitly) as an isosurface of a given volumetric scalar field. The field may be given at a set of irregular and unstructured samples. Stability and reliability of the SDF generation is achieved by a proper scaling of weights for the Moving Least Squares approximation, accurate choice of neighbors, and appropriate handling of degenerate cases. We obtain the solution in an explicit form, such that no iterative solving is necessary, which makes our approach fast.

20.
There is a vast number of applications that require distance field computation over triangular meshes. State‐of‐the‐art algorithms have quadratic or sub‐quadratic worst‐case complexity, making them impractical for interactive applications. While most of the research on this subject has been focused on reducing the computation complexity of the algorithms, in this work we propose an approximate algorithm that achieves similar results working in lower resolutions of the input meshes. The creation of lower resolution meshes is the essence of our proposal. The idea is to identify regions on the input mesh that can be unfolded into planar regions with minimal area distortion (i.e. quasi‐developable charts). Once charts are computed, their interior is re‐triangulated to reduce the number of triangles, which results in a collection of simplified charts that we call a base mesh. Due to the properties of quasi‐developable regions, we are able to compute distance fields over the base mesh instead of over the input mesh. This reduces the memory footprint and data processed for distance computations, which is the bottleneck of these algorithms. We present results that are one order of magnitude faster than current exact solutions, with low approximation errors.
