Stackless KD-Tree Traversal for High Performance GPU Ray Tracing   (Cited by: 1 total, 1 self-citation, 1 by others)
Significant advances have been achieved for realtime ray tracing recently, but realtime performance for complex scenes still requires large computational resources not yet available from the CPUs in standard PCs. Incidentally, most of these PCs also contain modern GPUs that do offer much larger raw compute power. However, limitations in the programming and memory model have so far kept the performance of GPU ray tracers well below that of their CPU counterparts. In this paper we present a novel packet ray traversal implementation that completely eliminates the need for maintaining a stack during kd-tree traversal and that reduces the number of traversal steps per ray. While CPUs benefit moderately from the stackless approach, it improves GPU performance significantly. We achieve a peak performance of over 16 million rays per second for reasonably complex scenes, including complex shading and secondary rays. Several examples show that with this new technique GPUs can actually outperform equivalent CPU-based ray tracers.

We present a performance comparison of bounding volume hierarchies and kd-trees for ray tracing on many-core architectures (GPUs). The comparison is focused on rendering times and traversal characteristics on the GPU using data structures that were optimized for very high performance of tracing rays. To achieve low rendering times, we extensively examine the constants used in termination criteria for the two data structures. We show that for a contemporary GPU architecture (NVIDIA Kepler) bounding volume hierarchies have higher ray tracing performance than kd-trees for simple and moderately complex scenes. On the other hand, kd-trees have higher performance for complex scenes, in particular for those with high depth complexity. Finally, we analyse the causes of the performance discrepancies using the profiling characteristics of the ray tracing kernels.

Interactive global illumination for fully deformable scenes with dynamic relighting is currently a very elusive goal in the area of realistic rendering. In this work we propose a highly efficient and scalable system based on explicit visibility calculations. The rendering equation defines the light exchange between surfaces, which we approximate by subsampling. By utilizing the power of modern parallel GPUs through the CUDA framework, we achieve interactive frame rates. Since we update the global illumination continuously in an asynchronous fashion, we maintain interactivity at all times for moderately complex scenes. We show that we can achieve higher frame rates for scenes with moving light sources, diffuse indirect illumination and dynamic geometry than other current methods, while maintaining a high image quality.

We introduce a screen-space statistical filtering method for real-time rendering with global illumination. It is inspired by the statistical filtering proposed by Meyer et al., which reduces the noise in global illumination over a period of time by estimating the principal components from all rendered frames. Our work extends their method to achieve nearly real-time performance on modern GPUs. More specifically, our method employs candid covariance-free incremental PCA to overcome several limitations of the original algorithm by Meyer et al., such as its high computational cost and memory usage, which hinder its implementation on GPUs. By combining reprojection and per-pixel weighting techniques, our method also handles view changes and object movement in dynamic scenes.
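As a rough illustration of the candid covariance-free incremental PCA (CCIPCA) update mentioned above, the following Python sketch tracks only the first principal component of a stream of zero-mean samples without ever forming a covariance matrix. It follows the commonly published CCIPCA recurrence; the function names, the `amnesic` parameter default, and the synthetic 2D data are assumptions made for this example and are not taken from the paper, which applies the idea to rendered frames on the GPU.

```python
import math
import random


def ccipca_update(v, u, n, amnesic=0.0):
    """One CCIPCA step for the first principal component.

    v: current unnormalized estimate (its length approximates the eigenvalue)
    u: new zero-mean sample
    n: 1-based index of the sample
    """
    if n == 1 or v is None:
        return list(u)  # initialize the estimate with the first sample
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return list(u)
    proj = sum(a * b for a, b in zip(u, v)) / norm_v  # u . v / ||v||
    w_old = (n - 1 - amnesic) / n  # weight of the previous estimate
    w_new = (1 + amnesic) / n      # weight of the new observation
    return [w_old * vi + w_new * proj * ui for vi, ui in zip(v, u)]


def first_component(samples):
    """Stream over the samples once and return the unit-length estimate."""
    v = None
    for n, u in enumerate(samples, start=1):
        v = ccipca_update(v, u, n)
    norm_v = math.sqrt(sum(x * x for x in v))
    return [x / norm_v for x in v]


def demo_samples(count=2000, seed=0):
    """Hypothetical zero-mean 2D data stretched along the direction (0.6, 0.8)."""
    rng = random.Random(seed)
    out = []
    for _ in range(count):
        s = rng.gauss(0.0, 3.0)
        out.append((0.6 * s + rng.gauss(0.0, 0.1),
                    0.8 * s + rng.gauss(0.0, 0.1)))
    return out
```

Calling `first_component(demo_samples())` should return a unit vector roughly parallel to (0.6, 0.8), up to sign. Each update touches only the current sample and the running estimate, which is the property that keeps memory usage low.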

Robust and efficient rendering of complex lighting effects, such as caustics, remains a challenging task. While algorithms like vertex connection and merging can render such effects robustly, their significant overhead over a simple path tracer is not always justified and, as we show in this paper, also not necessary. In current rendering solutions, caustics often require the user to enable a specialized algorithm, usually a photon mapper, and hand-tune its parameters. But even with carefully chosen parameters, photon mapping may still trace many photons that the path tracer could sample well enough, or, even worse, that are not visible at all. Our goal is robust, yet lightweight, caustics rendering. To that end, we propose a technique to identify and focus computation on the photon paths that offer significant variance reduction over samples from a path tracer. We apply this technique in a rendering solution combining path tracing and photon mapping. The photon emission is automatically guided towards regions where the photons are useful, i.e., provide substantial variance reduction for the currently rendered image. Our method achieves better photon densities with fewer light paths (and thus photons) than emission guiding approaches based on visual importance. In addition, we automatically determine an appropriate number of photons for a given scene, and the algorithm gracefully degenerates to pure path tracing for scenes that do not benefit from photon mapping.

This paper introduces a framebuffer level of detail algorithm for controlling the pixel workload in an interactive rendering application. Our basic strategy is to evaluate the shading in a low resolution buffer and, in a second rendering pass, resample this buffer at the desired screen resolution. The size of the lower resolution buffer provides a trade-off between rendering time and the level of detail in the final shading. In order to reduce approximation error we use a feature-preserving reconstruction technique that more faithfully approximates the shading near depth and normal discontinuities. We also demonstrate how intermediate components of the shading can be selectively resized to provide finer-grained control over resource allocation. Finally, we introduce a simple control mechanism that continuously adjusts the amount of resizing necessary to maintain a target framerate. These techniques do not require any preprocessing, are straightforward to implement on modern GPUs, and are shown to provide significant performance gains for several pixel-bound scenes.
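The abstract does not say how its control mechanism works, so the following Python sketch shows just one plausible feedback rule for holding a target frame time by resizing the shading buffer. The update rule, the clamping limits, and the cost model in `simulated_frame_ms` are all assumptions for illustration, not the authors' controller. The rule assumes a pixel-bound workload whose cost scales with the number of shaded pixels, i.e. with the square of the per-axis scale.

```python
import math


def update_scale(scale, frame_ms, target_ms, min_scale=0.1, max_scale=1.0):
    """Return the per-axis resolution scale to use for the next frame.

    Assumes shading cost grows with the pixel count, i.e. with scale**2,
    so the correction uses the square root of the time ratio.
    """
    corrected = scale * math.sqrt(target_ms / frame_ms)
    return max(min_scale, min(max_scale, corrected))


def simulated_frame_ms(scale, fixed_ms=4.0, shading_ms=36.0):
    """Hypothetical cost model: fixed work plus pixel-bound shading."""
    return fixed_ms + shading_ms * scale * scale


def run_controller(target_ms, frames=50):
    """Feed the measured frame time back into the scale once per frame."""
    scale = 1.0
    for _ in range(frames):
        scale = update_scale(scale, simulated_frame_ms(scale), target_ms)
    return scale
```

With a 60 Hz target, `run_controller(1000.0 / 60.0)` settles on a scale below 1.0 at which the simulated frame time matches the target. With a generous target the scale stays clamped at full resolution. A production controller would likely add damping to avoid visible resolution oscillation.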

Depth-of-Field Rendering by Pyramidal Image Processing   (Cited by: 1 total, 0 self-citations, 1 by others)
We present an image-based algorithm for interactive rendering of depth-of-field effects in images with depth maps. While previously published methods for interactive depth-of-field rendering suffer from various rendering artifacts such as color bleeding and sharpened or darkened silhouettes, our algorithm achieves a significantly improved image quality by employing recently proposed GPU-based pyramid methods for image blurring and pixel disocclusion. For the same reason, our algorithm offers interactive rendering performance on modern GPUs and is suitable for real-time rendering for small circles of confusion. We validate the image quality provided by our algorithm by side-by-side comparisons with results obtained by distributed ray tracing.

Aggregate scattering operators (ASOs) describe the overall scattering behavior of an asset (i.e., an object or volume, or collection thereof) accounting for all orders of its internal scattering. We propose a practical way to precompute and compactly store ASOs and demonstrate their ability to accelerate path tracing. Our approach is modular, avoiding costly and inflexible scene-dependent precomputation. This is achieved by decoupling light transport within and outside of each asset, and precomputing on a per-asset level. We store the internal transport in a reduced-dimensional subspace tailored to the structure of the asset geometry, its scattering behavior, and typical illumination conditions, allowing the ASOs to maintain good accuracy with modest memory requirements. The precomputed ASO can be reused across all instances of the asset and across multiple scenes. We augment ASOs with functionality enabling multi-bounce importance sampling, fast short-circuiting of complex light paths, and compact caching, while retaining rapid progressive preview rendering. We demonstrate the benefits of our ASOs by efficiently path tracing scenes containing many instances of objects with complex inter-reflections or multiple scattering.

Efficient intersection queries are important for ray tracing. However, building and maintaining the acceleration structures is demanding, especially for fully dynamic scenes. In this paper, we propose a quantized intersection framework based on compact voxels to quantize the intersection as an approximation. With high-resolution voxels, the scene geometry can be well represented, which enables more accurate simulation of global illumination, such as detailed glossy reflections. In terms of memory usage in our graphics processing unit implementation, voxels are binarized and compactly encoded in a few 2D textures. We evaluate the rendering quality at various voxel resolutions. Empirically, high-fidelity rendering can be achieved at a voxel resolution of 1K³ or above, which produces images very similar to those of ray tracing. Moreover, we demonstrate the feasibility of our framework for various illumination effects with several applications, including first-bounce indirect illumination, glossy refraction, path tracing, direct illumination, and ambient occlusion.
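To make the idea of binarized, compactly encoded voxels concrete, here is a minimal Python sketch that stores one bit per voxel and answers an approximate ray query by marching through the grid. The packing layout (32 voxels per word along z), the class and method names, and the fixed-step march are assumptions for illustration; the paper's actual 2D texture layout and traversal are not described in the abstract. At one bit per voxel, a 1024³ grid needs 2³⁰ bits, i.e. 128 MiB.

```python
import math


class BitVoxelGrid:
    """Binary occupancy grid of n^3 voxels, 32 voxels per word along z."""

    def __init__(self, n):
        self.n = n
        self.words_per_column = (n + 31) // 32
        self.words = [0] * (n * n * self.words_per_column)

    def _locate(self, x, y, z):
        # Each (x, y) column owns a run of words; z selects the word and bit.
        word = (y * self.n + x) * self.words_per_column + z // 32
        return word, 1 << (z % 32)

    def set(self, x, y, z):
        word, mask = self._locate(x, y, z)
        self.words[word] |= mask

    def get(self, x, y, z):
        word, mask = self._locate(x, y, z)
        return (self.words[word] & mask) != 0

    def first_hit(self, origin, direction, step=0.5):
        """Fixed-step march; returns the first occupied voxel or None.

        A fixed step can skip voxels that the ray only grazes; an exact
        voxel walk (3D DDA) would avoid that at the cost of more code.
        """
        length = math.sqrt(sum(c * c for c in direction))
        d = [c / length for c in direction]
        steps = int(2 * self.n / step)  # enough to cross the grid diagonal
        for k in range(steps + 1):
            t = k * step
            cell = [math.floor(o + t * c) for o, c in zip(origin, d)]
            if all(0 <= c < self.n for c in cell) and self.get(*cell):
                return tuple(cell)
        return None
```

The quantized intersection is simply the voxel returned by `first_hit`, so the hit position is accurate only to the voxel size, which is why the abstract ties rendering quality to voxel resolution.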

Today's PCs incorporate multiple CPUs and GPUs and are easily arranged in clusters for high-performance, interactive graphics. We present an approach based on hierarchical, screen-space tiles to parallelizing rendering with level of detail. Adapt tiles, render tiles, and machine tiles are associated with CPUs, GPUs, and PCs, respectively, to efficiently parallelize the workload with good resource utilization. Adaptive tile sizes provide load balancing while our level of detail system allows total and independent management of the load on CPUs and GPUs. We demonstrate our approach on parallel configurations consisting of both single PCs and a cluster of PCs.

Bidirectional path tracing is known to perform poorly for the rendering of highly occluded scenes. Indeed, the connection strategy between light and eye subpaths does not take the visibility factor into account, so many sampled paths make no contribution. To improve the efficiency of bidirectional path tracing, we propose a new method for adaptive resampling of connections between light and eye subpaths. To this end, we build discrete probability distributions of light subpaths based on a skeleton of the empty space of the scene. In order to demonstrate the efficiency of our algorithm, we compare our method to both standard bidirectional path tracing and a recent importance caching method.

We present a robust, unbiased technique for intelligent light-path construction in path-tracing algorithms. Inspired by existing path-guiding algorithms, our method learns an approximate representation of the scene's spatio-directional radiance field in an unbiased and iterative manner. To that end, we propose an adaptive spatio-directional hybrid data structure, referred to as SD-tree, for storing and sampling incident radiance. The SD-tree consists of an upper part (a binary tree that partitions the 3D spatial domain of the light field) and a lower part (a quadtree that partitions the 2D directional domain). We further present a principled way to automatically budget training and rendering computations to minimize the variance of the final image. Our method does not require tuning hyperparameters, although we allow limiting the memory footprint of the SD-tree. The aforementioned properties, its ease of implementation, and its stable performance make our method compatible with production environments. We demonstrate the merits of our method on scenes with difficult visibility, detailed geometry, and complex specular-glossy light transport, achieving better performance than previous state-of-the-art algorithms.

We present an importance sampling method for the bidirectional scattering distribution function (BSDF) of hair. Our method is based on the multi-lobe hair scattering model presented by Sadeghi et al. [SPJT10]. We reduce noise by drawing samples from a distribution that approximates the BSDF well. Our algorithm is efficient and easy to implement, since the sampling process requires only the evaluation of a few analytic functions, with no significant memory overhead or need for precomputation. We tested our method in a research raytracer and a production renderer based on micropolygon rasterization. We show significant improvements for rendering direct illumination using multiple importance sampling and for rendering indirect illumination using path tracing.

This paper presents a reformulation of bidirectional path-tracing that adequately divides the algorithm into processes efficiently executed in parallel on both the CPU and the GPU. We thus benefit from high-level optimization techniques such as double buffering, batch processing, and asynchronous execution, as well as from the exploitation of most of the CPU, GPU, and memory bus capabilities. Our approach, while avoiding pure GPU implementation limitations (such as limited complexity of shaders, light or camera models, and processed scene data sets), is more than ten times faster than standard bidirectional path-tracing implementations, leading to performance suitable for production-oriented rendering engines.

The most common solutions to the light transport problem rely on either Monte Carlo (MC) integration or density estimation methods, such as uni- & bi-directional path tracing or photon mapping. Recent gradient-domain extensions of MC approaches show great promise; here, gradients of the final image are estimated numerically (instead of the image intensities themselves) with coherent paths generated from a deterministic shift mapping. We extend gradient-domain approaches to light transport simulation based on density estimation. As with previous gradient-domain methods, we detail important considerations that arise when moving from a primal- to gradient-domain estimator. We provide an efficient and straightforward solution to these problems. Our solution supports stochastic progressive density estimation, so it is robust to complex transport effects. We show that gradient-domain photon density estimation converges faster than its primal-domain counterpart, as well as being generally more robust than gradient-domain uni- & bi-directional path tracing for scenes dominated by complex transport.

This paper presents an improvement to stochastic progressive photon mapping (SPPM), a method for robustly simulating complex global illumination with distributed ray tracing effects. Like photon mapping and other particle tracing algorithms, SPPM becomes inefficient when the photons are poorly distributed: an inordinate number of photons is required to reduce the error caused by noise and bias to acceptable levels. In order to optimize the distribution of photons, we propose an extension of SPPM with a Metropolis-Hastings algorithm, effectively exploiting local coherence among the light paths that contribute to the rendered image. A well-designed scalar contribution function is introduced as our Metropolis sampling strategy, targeting specific parts of the image with large error to improve the efficiency of the radiance estimator. Experimental results demonstrate that the new Metropolis sampling based approach maintains the robustness of the standard SPPM method, while significantly improving the rendering efficiency for a wide range of scenes with complex lighting.
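The role of a scalar contribution function in Metropolis-Hastings sampling can be shown on a toy one-dimensional problem. The Python sketch below draws samples with density proportional to a user-supplied contribution, mixing local perturbations with occasional independent "large steps". The 1D domain, the mutation mix, the parameter defaults, and `demo_contribution` are assumptions for illustration only; they are not the paper's photon-path mutations or its contribution function.

```python
import random


def metropolis_samples(contribution, count, seed=0,
                       large_step_prob=0.3, small_step=0.1):
    """Sample x in [0, 1) with density proportional to contribution(x)."""
    rng = random.Random(seed)
    x = rng.random()
    fx = contribution(x)
    while fx <= 0.0:  # find a start state with non-zero contribution
        x = rng.random()
        fx = contribution(x)
    out = []
    for _ in range(count):
        if rng.random() < large_step_prob:
            y = rng.random()  # independent proposal, explores globally
        else:
            # local perturbation with wrap-around, exploits coherence
            y = (x + rng.uniform(-small_step, small_step)) % 1.0
        fy = contribution(y)
        # Both proposals are symmetric, so the acceptance ratio is fy / fx.
        if rng.random() < min(1.0, fy / fx):
            x, fx = y, fy
        out.append(x)
    return out


def demo_contribution(x):
    """Hypothetical target: the upper half matters three times as much."""
    return 3.0 if x >= 0.5 else 1.0
```

Running `metropolis_samples(demo_contribution, 50000)` places about three quarters of the samples in the upper half of the domain. In a renderer the analogous effect is that photons concentrate where the contribution function says they reduce error most.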

Interactive computation of global illumination is a major challenge in current computer graphics research. Global illumination heavily affects the visual quality of generated images. It is therefore a key attribute for the perception of photo-realistic images. Path tracing is able to simulate the physical behaviour of light using Monte Carlo techniques. However, the computational burden of this technique prohibits interactive rendering times at high quality on standard commodity hardware. Trying to solve the Monte Carlo integration with fewer samples results in characteristically noisy images. Global illumination filtering methods take advantage of the fact that the integral for neighbouring pixels may be very similar. Averaging samples of similar characteristics in screen space may approximate the correct integral, but may result in visible outliers. In this paper, we present a novel path tracing pipeline based on an edge-aware filtering method for the indirect illumination which produces visually more pleasing results without noticeable outliers. The key idea is not to filter the noisy path-traced image but to use it as guidance to filter a second image composed from characteristic scene attributes that do not contain noise by default. We show that our approach better approximates the Monte Carlo integral compared to previous methods. Since the computation is carried out completely in screen space, it is applicable to fully dynamic scenes and arbitrary lighting, and it allows for high-quality path tracing at interactive frame rates on commodity hardware.

Ray-traced global illumination (GI) is becoming widespread in production rendering but incoherent secondary ray traversal limits practical rendering to scenes that fit in memory. Incoherent shading also leads to intractable performance with production-scale textures, forcing renderers to resort to caching of irradiance, radiosity, and other values to amortize expensive shading. Unfortunately, such caching strategies complicate artist workflow, are difficult to parallelize effectively, and contend for precious memory. Worse, these caches involve approximations that compromise quality. In this paper, we introduce a novel path-tracing framework that avoids these tradeoffs. We sort large, potentially out-of-core ray batches to ensure coherence of ray traversal. We then defer shading of ray hits until we have sorted them, achieving perfectly coherent shading and avoiding the need for shading caches.
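The deferred, sorted shading described above can be illustrated in a few lines of Python: hit records are sorted so that all hits on the same texture form one contiguous run, and shading then visits each texture exactly once. The record layout `(ray_id, texture_id, u, v)`, the sort key, and the function name are assumptions for this sketch; the abstract does not specify the sort criteria, and a production system would operate on out-of-core batches.

```python
from itertools import groupby
from operator import itemgetter


def shade_deferred(hits, shade):
    """Sort hit records so each texture is visited in one contiguous run.

    hits: iterable of (ray_id, texture_id, u, v) tuples (hypothetical layout)
    shade: callable (texture_id, u, v) -> color
    Returns {ray_id: color} and the order in which textures were opened.
    """
    # Texture first, then texel coordinates, so nearby lookups stay adjacent.
    ordered = sorted(hits, key=itemgetter(1, 2, 3))
    colors = {}
    opened = []
    for texture_id, run in groupby(ordered, key=itemgetter(1)):
        opened.append(texture_id)  # a real renderer would page the texture in here
        for ray_id, _, u, v in run:
            colors[ray_id] = shade(texture_id, u, v)
    return colors, opened
```

For example, four hits alternating between two textures lead to only two entries in `opened`, whereas shading them in arrival order would switch textures four times.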

We investigate the use of two-level nested grids as an acceleration structure for ray tracing of dynamic scenes. We propose a massively parallel, sort-based construction algorithm and show that the two-level grid is one of the structures that is fastest to construct on modern graphics processors. The structure handles non-uniform primitive distributions more robustly than the uniform grid and its traversal performance is comparable to that of other high quality acceleration structures used for dynamic scenes. We propose a cost model to determine the grid resolution and improve SIMD utilization during ray-triangle intersection by employing a hybrid packetization strategy. The build times and ray traversal acceleration provide overall rendering performance superior to previous approaches for real time rendering of animated scenes on GPUs.
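A sort-based grid build reduces construction to three data-parallel steps: emit a (cell, primitive) pair for every cell a primitive overlaps, sort the pairs by cell, and read off each cell's contiguous range. The Python sketch below shows this for a single-level uniform grid only, written sequentially. The function name, the bounding-box input format, and the dictionary of ranges are assumptions for illustration. The two-level structure from the abstract would presumably apply a similar pass inside each top-level cell, which is not shown here.

```python
import math


def build_grid(boxes, cell_size, resolution):
    """Sort-based construction of a uniform grid over axis-aligned boxes.

    boxes: list of ((minx, miny, minz), (maxx, maxy, maxz)) primitive bounds
    Returns (prim_ids, cell_ranges): prim_ids sorted by cell, and for every
    non-empty cell id the half-open range [start, end) into prim_ids.
    """
    def cell_coord(value):
        # Clamp so primitives touching the border stay inside the grid.
        return min(resolution - 1, max(0, math.floor(value / cell_size)))

    # Step 1: one (cell_id, prim_id) pair per overlapped cell.
    pairs = []
    for prim_id, (lo, hi) in enumerate(boxes):
        x0, y0, z0 = (cell_coord(c) for c in lo)
        x1, y1, z1 = (cell_coord(c) for c in hi)
        for z in range(z0, z1 + 1):
            for y in range(y0, y1 + 1):
                for x in range(x0, x1 + 1):
                    cell_id = (z * resolution + y) * resolution + x
                    pairs.append((cell_id, prim_id))

    # Step 2: sort by cell; a GPU build would use a parallel radix sort here.
    pairs.sort()

    # Step 3: extract the contiguous range of each non-empty cell.
    prim_ids = [prim for _, prim in pairs]
    cell_ranges = {}
    for i, (cell_id, _) in enumerate(pairs):
        start, _ = cell_ranges.get(cell_id, (i, i))
        cell_ranges[cell_id] = (start, i + 1)
    return prim_ids, cell_ranges
```

Because every step is a map, a sort, or a scan over flat arrays, the build contains no pointer chasing or per-cell dynamic allocation, which is what makes this style of construction attractive for per-frame rebuilds of animated scenes.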

Ray-based representations can model complex light transport but are limited in modeling diffraction effects that require the simulation of wavefront propagation. This paper provides a new paradigm that has the simplicity of light path tracing and yet provides an accurate characterization of both Fresnel and Fraunhofer diffraction. We introduce the concept of a light field transformer at the interface of transmissive occluders. This generates mathematically sound, virtual, and possibly negative-valued light sources after the occluder. From a rendering perspective the only change is that radiance can be temporarily negative. We demonstrate the correctness of our approach both analytically and by comparing values with standard experiments in physics such as Young's double-slit experiment. Our implementation is a shader program in OpenGL that can generate wave effects on arbitrary surfaces.
