1.

Arsène Pérard‐Gayot Javor Kalojanov Philipp Slusallek 《Computer Graphics Forum》2017,36(2):477-486

We present a spatial index structure to accelerate ray tracing on GPUs. It is a flat, non‐hierarchical spatial subdivision of the scene into axis aligned cells of varying size. In order to construct it, we first nest an octree into each cell of a uniform grid. We then apply two optimization passes to increase ray traversal performance: First, we reduce the expected cost for ray traversal by merging cells together. This adapts the structure to complex primitive distributions, solving the “teapot in a stadium” problem. Second, we decouple the cell boundaries used during traversal for rays entering and exiting a given cell. This allows us to extend the exiting boundaries over adjacent cells that are either empty or do not contain additional primitives. Now, exiting rays can skip empty space and avoid repeating intersection tests. Finally, we demonstrate that in addition to the fast ray traversal performance, the structure can be rebuilt efficiently in parallel, allowing for ray tracing dynamic scenes.

2.

Compact, Fast and Robust Grids for Ray Tracing

Ares Lagae Philip Dutré 《Computer Graphics Forum》2008,27(4):1235-1244

The focus of research in acceleration structures for ray tracing recently shifted from render time to time to image (the sum of build time and render time), and the memory footprint of acceleration structures now also receives more attention. In this paper we revisit the grid acceleration structure in this setting. We present two efficient methods for representing and building a grid. The compact grid method consists of a static data structure for representing a grid with minimal memory requirements, more specifically exactly one index per grid cell and exactly one index per object reference, and an algorithm for building that data structure in linear time. The hashed grid method reduces memory requirements even further, by using perfect hashing based on row displacement compression. We show that these methods are more efficient in both time and space than traditional methods based on linked lists and dynamic arrays. We also present a more robust grid traversal algorithm. We show that, for applications where time to image or memory usage is important, such as interactive ray tracing and rendering large models, the grid acceleration structure is an attractive alternative.
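
The compact grid layout in this abstract (one index per cell, one index per object reference, linear-time build) resembles a counting-sort construction. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: objects are given as axis-aligned boxes, the grid resolution is fixed by the caller, and the names `overlapped_cells`, `build_compact_grid` and `cell_objects` are hypothetical. The offsets array here carries one extra sentinel entry so that each cell's range can be read as a pair of neighbouring offsets.

```python
from itertools import product

def overlapped_cells(box, grid_min, cell_size, res):
    """Yield the linear index of every grid cell an axis-aligned box touches."""
    lo, hi = box
    ranges = []
    for a in range(3):
        first = int((lo[a] - grid_min[a]) / cell_size[a])
        last = int((hi[a] - grid_min[a]) / cell_size[a])
        # Clamp to the grid so boxes on or past the border stay in range.
        first = max(0, min(res[a] - 1, first))
        last = max(0, min(res[a] - 1, last))
        ranges.append(range(first, last + 1))
    for x, y, z in product(*ranges):
        yield (z * res[1] + y) * res[0] + x

def build_compact_grid(boxes, grid_min, cell_size, res):
    """Two-pass build: count references per cell, prefix-sum, then fill."""
    n_cells = res[0] * res[1] * res[2]
    offsets = [0] * (n_cells + 1)
    # Pass 1: count how many object references each cell will hold.
    for box in boxes:
        for c in overlapped_cells(box, grid_min, cell_size, res):
            offsets[c + 1] += 1
    # Prefix sum turns counts into start offsets into the reference array.
    for c in range(n_cells):
        offsets[c + 1] += offsets[c]
    # Pass 2: write each object id into the next free slot of its cells.
    refs = [0] * offsets[n_cells]
    cursor = offsets[:-1]  # slicing copies, so offsets stays intact
    for obj_id, box in enumerate(boxes):
        for c in overlapped_cells(box, grid_min, cell_size, res):
            refs[cursor[c]] = obj_id
            cursor[c] += 1
    return offsets, refs

def cell_objects(offsets, refs, c):
    """Object ids referenced by cell c."""
    return refs[offsets[c]:offsets[c + 1]]
```

Both passes touch each object reference once, so the build cost grows linearly with the number of cells plus the number of references, with no linked lists or per-cell dynamic arrays.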

3.

Performance Comparison of Bounding Volume Hierarchies and Kd‐Trees for GPU Ray Tracing

Marek Vinkler Vlastimil Havran Jiří Bittner 《Computer Graphics Forum》2016,35(8):68-79

We present a performance comparison of bounding volume hierarchies and kd‐trees for ray tracing on many‐core architectures (GPUs). The comparison is focused on rendering times and traversal characteristics on the GPU using data structures that were optimized for very high performance of tracing rays. To achieve low rendering times, we extensively examine the constants used in termination criteria for the two data structures. We show that for a contemporary GPU architecture (NVIDIA Kepler) bounding volume hierarchies have higher ray tracing performance than kd‐trees for simple and moderately complex scenes. On the other hand, kd‐trees have higher performance for complex scenes, in particular for those with high depth complexity. Finally, we analyse the causes of the performance discrepancies using the profiling characteristics of the ray tracing kernels.

4.

Stackless Multi‐BVH Traversal for CPU, MIC and GPU Ray Tracing

Attila T. Áfra László Szirmay‐Kalos 《Computer Graphics Forum》2014,33(1):129-140

Stackless traversal algorithms for ray tracing acceleration structures require significantly less storage per ray than ordinary stack‐based ones. This advantage is important for massively parallel rendering methods, where there are many rays in flight. On SIMD architectures, a commonly used acceleration structure is the MBVH, which has multiple bounding boxes per node for improved parallelism. It scales to branching factors higher than two, for which, however, only stack‐based traversal methods have been proposed so far. In this paper, we introduce a novel stackless traversal algorithm for MBVHs with up to four‐way branching. Our approach replaces the stack with a small bitmask, supports dynamic ordered traversal, and has a low computation overhead. We also present efficient implementation techniques for recent CPU, MIC (Intel Xeon Phi) and GPU (NVIDIA Kepler) architectures.

5.

Ray Classification for Accelerated BVH Traversal

J. Hendrich A. Pospíšil D. Meister J. Bittner 《Computer Graphics Forum》2019,38(4):49-56

For ray tracing based methods, traversing a hierarchical acceleration data structure takes up a substantial portion of the total rendering time. We propose an additional data structure which cuts off large parts of the hierarchical traversal. We use the idea of ray classification combined with the hierarchical scene representation provided by a bounding volume hierarchy. We precompute short arrays of indices to subtrees inside the hierarchy and use them to initiate the traversal for a given ray class. This arrangement is compact enough to be cache‐friendly, preventing the method from negating its traversal gains by excessive memory traffic. The method is easy to use with existing renderers, which we demonstrate by integrating it into the PBRT renderer. The proposed technique reduces the number of traversal steps by 42% on average, saving around 15% of the time spent finding ray‐scene intersections on average.

6.

Sorted Deferred Shading for Production Path Tracing

Christian Eisenacher Gregory Nichols Andrew Selle Brent Burley 《Computer Graphics Forum》2013,32(4):125-132

Ray‐traced global illumination (GI) is becoming widespread in production rendering, but incoherent secondary ray traversal limits practical rendering to scenes that fit in memory. Incoherent shading also leads to intractable performance with production‐scale textures, forcing renderers to resort to caching of irradiance, radiosity, and other values to amortize expensive shading. Unfortunately, such caching strategies complicate artist workflow, are difficult to parallelize effectively, and contend for precious memory. Worse, these caches involve approximations that compromise quality. In this paper, we introduce a novel path‐tracing framework that avoids these tradeoffs. We sort large, potentially out‐of‐core ray batches to ensure coherence of ray traversal. We then defer shading of ray hits until we have sorted them, achieving perfectly coherent shading and avoiding the need for shading caches.
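
The deferred, sorted shading step described in this abstract can be illustrated with a small in-memory Python sketch. It is only a toy under stated assumptions: the paper sorts large, potentially out-of-core batches, whereas this version sorts a plain list; the hit layout `(ray_id, texture_id, u, v)` and the callbacks `load_texture` and `shade` are hypothetical stand-ins supplied by the caller.

```python
from itertools import groupby
from operator import itemgetter

def shade_deferred(hits, load_texture, shade):
    """Shade ray hits only after sorting them by the texture they need.

    hits: list of (ray_id, texture_id, u, v) tuples (hypothetical layout).
    load_texture(texture_id): expensive fetch, called once per texture.
    shade(texture, u, v): returns the shaded value for one hit.
    """
    results = {}
    # Sorting groups all hits that share a texture into one contiguous run.
    ordered = sorted(hits, key=itemgetter(1))
    for texture_id, run in groupby(ordered, key=itemgetter(1)):
        texture = load_texture(texture_id)  # paid once per run, not per hit
        for ray_id, _, u, v in run:
            results[ray_id] = shade(texture, u, v)
    return results
```

Because every texture is fetched exactly once per batch regardless of the order in which rays arrived, the sketch needs no shading cache, which is the trade-off the abstract points to.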

7.

Interactive Glossy Reflections using GPU‐based Ray Tracing with Adaptive LOD

Xuan Yu Rui Wang Jingyi Yu 《Computer Graphics Forum》2008,27(7):1987-1996

We present an interactive GPU‐based algorithm for accurately rendering high‐quality, dynamic glossy reflection effects from both HDR environment maps and local scene objects. Our method uses hardware rasterization to produce primary pixels, and GPU‐based BRDF importance sampling [CK07] to quickly generate reflected rays. We utilize a fast GPU ray tracer proposed by Carr et al. [CHCH06] to compute reflection hits. Our main contribution is an adaptive level‐of‐detail (LOD) control algorithm that greatly improves ray tracing performance during reflection shading. Specifically, we use the solid angle represented by each reflected ray to adaptively pick the level of termination in the BVH traversal step during ray tracing. This leads to a 2–3× speedup over an unmodified implementation of [CHCH06]. Based on the same solid angle measure, we derive a texture filtering formula to reduce reflection aliasing artifacts, taking advantage of hardware MIP mapping. This extends the filtering algorithm presented in [CK07] from environment mapping to local scene reflection. Using our algorithm, we demonstrate interactive rendering rates for several scenes featuring dynamic lighting and material changes, spatially varying BRDF parameters, and rigid‐body object movement.
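
The idea of choosing a BVH termination level from a ray's solid angle can be sketched as follows. The abstract does not give the actual criterion, so everything concrete here is an assumption made for illustration: the ray footprint at distance d is approximated as solid_angle · d², a node is summarised by a bounding-sphere radius, traversal stops once the node's cross-section fits inside the footprint, and node radii are assumed to halve per level. The function names are hypothetical.

```python
import math

def footprint_area(solid_angle, distance):
    """Approximate area covered by a ray cone at the given distance."""
    return solid_angle * distance * distance

def should_terminate(node_radius, distance, solid_angle):
    """Stop descending once the node is no larger than the ray footprint."""
    cross_section = math.pi * node_radius * node_radius
    return cross_section <= footprint_area(solid_angle, distance)

def termination_depth(root_radius, distance, solid_angle, max_depth):
    """Depth at which traversal would stop, assuming radii halve per level."""
    radius = root_radius
    for depth in range(max_depth):
        if should_terminate(radius, distance, solid_angle):
            return depth
        radius *= 0.5
    return max_depth
```

Under these assumptions a wide (glossy) ray stops near the top of the hierarchy while a narrow (mirror-like) ray descends further, which is the qualitative behaviour the adaptive LOD control relies on.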

8.

MSKD: multi-split KD-tree design on GPU

Xin Yang Bing Yang Pengjie Wang Duanqing Xu 《Multimedia Tools and Applications》2016,75(2):1349-1364

We present a novel parallel acceleration structure construction and traversal algorithm designed to efficiently exploit the massively parallel computing cores of the graphics processing unit (GPU) to improve render performance. Our associated data structure is called the multi-split KD-tree, or MSKD, which focuses on quickly generating and efficiently traversing a hierarchy with multiple child nodes in parallel. At build time, we introduce a multi-split node generation method that splits a node along all three axes into eight child nodes at once and quickly yields high-quality child nodes even in the early construction phase. During traversal, we propose a progressive traversal to quickly decide the visiting order for multiple child nodes. Then, we use a dynamic ray transfer to adaptively drive the execution of traversal tasks on the GPU. Our experiments with this hierarchy show improved construction and traversal performance for ray tracing using MSKD compared to previous methods.

9.

gkDtree: A group-based parallel update kd-tree for interactive ray tracing

《Journal of Systems Architecture》2013,59(3):166-175

This paper proposes a new group-based acceleration data structure called gkDtree for interactive ray tracing of dynamic scenes. The main idea of the gkDtree is to construct the acceleration structure with a multi-level hierarchy, and to integrate a parallelization approach to result in a faster update and a more efficient tree traversal. A gkDtree can be viewed as a set of kd-trees, each of which is a local acceleration structure corresponding to a group. For a gkDtree, a scene is divided into several groups based on a scene graph. Only the local acceleration structure of each group involving dynamic primitives is rebuilt. To achieve higher parallelization, dependencies among groups in different levels are removed before rebuilding occurs in parallel. To enhance the scalability of parallelization, a simple and fast load-balancing scheme is introduced. Furthermore, we apply a variety of accurate SAH (surface area heuristic) algorithms to tree generation for both static and dynamic groups. The experimental results show that a gkDtree has a real-time update performance. It has an update performance that is up to 166 times faster than a kd-tree for our test scenes in a six-core hardware system environment. Furthermore, the results also show that tree traversal performance of a gkDtree is competitive with that of a kd-tree.

10.

Quantizing Intersections Using Compact Voxels

Y.‐Y. Chen Y.‐J. Chen S.‐Y. Chien 《Computer Graphics Forum》2017,36(6):76-85

Efficient intersection queries are important for ray tracing. However, building and maintaining the acceleration structures is demanding, especially for fully dynamic scenes. In this paper, we propose a quantized intersection framework based on compact voxels to quantize the intersection as an approximation. With high‐resolution voxels, the scene geometry can be well represented, which enables more accurate simulation of global illumination, such as detailed glossy reflections. In terms of memory usage in our graphics processing unit implementation, voxels are binarized and compactly encoded in a few 2D textures. We evaluate the rendering quality at various voxel resolutions. Empirically, high‐fidelity rendering can be achieved at the voxel resolution of 1 K³ or above, which produces images very similar to those of ray tracing. Moreover, we demonstrate the feasibility of our framework for various illumination effects with several applications, including first‐bounce indirect illumination, glossy refraction, path tracing, direct illumination, and ambient occlusion.
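
Binarized, compactly encoded voxels of the kind this abstract mentions can be illustrated with a bit-packed occupancy grid. The Python sketch below is an assumption-laden stand-in rather than the paper's GPU encoding: it packs one bit per voxel into a `bytearray` instead of 2D textures, and it finds the quantized intersection by fixed-step ray marching with a unit-length direction, which is simpler (and less exact) than a proper grid walk. `BitVoxelGrid` and `first_hit` are hypothetical names.

```python
import math

class BitVoxelGrid:
    """Cubic occupancy grid storing one bit per voxel."""

    def __init__(self, n):
        self.n = n
        self.bits = bytearray((n * n * n + 7) // 8)

    def _index(self, x, y, z):
        return (z * self.n + y) * self.n + x

    def set(self, x, y, z):
        i = self._index(x, y, z)
        self.bits[i >> 3] |= 1 << (i & 7)

    def get(self, x, y, z):
        i = self._index(x, y, z)
        return (self.bits[i >> 3] >> (i & 7)) & 1 == 1

def first_hit(grid, origin, direction, step=0.25):
    """Return the first occupied voxel along the ray, or None.

    Assumes a unit-length direction and voxel-sized units; marching stops
    once the ray has travelled farther than the grid diagonal.
    """
    t = 0.0
    limit = grid.n * 2.0  # longer than the diagonal n * sqrt(3)
    while t <= limit:
        voxel = [int(math.floor(origin[a] + t * direction[a])) for a in range(3)]
        if all(0 <= c < grid.n for c in voxel) and grid.get(*voxel):
            return tuple(voxel)
        t += step
    return None
```

An 8³ grid occupies 64 bytes in this layout; the returned voxel coordinate is the quantized approximation of the true hit point.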

11.

Interactive sound rendering in complex and dynamic scenes using frustum tracing

C. Lauterbach A. Chandak D. Manocha 《IEEE Transactions on Visualization and Computer Graphics》2007,13(6):1672-1679

We present a new approach for simulating real-time sound propagation in complex, virtual scenes with dynamic sources and objects. Our approach combines the efficiency of interactive ray tracing with the accuracy of tracing a volumetric representation. We use a four-sided convex frustum and perform clipping and intersection tests using ray packet tracing. A simple and efficient formulation is used to compute secondary frusta and perform hierarchical traversal. We demonstrate the performance of our algorithm in an interactive system for complex environments and architectural models with tens or hundreds of thousands of triangles. Our algorithm can perform real-time simulation and rendering on a high-end PC.

12.

RBF Volume Ray Casting on Multicore and Manycore CPUs

Aaron Knoll Ingo Wald Paul Navratil Anne Bowen Khairi Reda Michael E. Papka Kelly Gaither 《Computer Graphics Forum》2014,33(3):71-80

Modern supercomputers enable increasingly large N‐body simulations using unstructured point data. The structures implied by these points can be reconstructed implicitly. Direct volume rendering of radial basis function (RBF) kernels in domain‐space offers flexible classification and robust feature reconstruction, but achieving performant RBF volume rendering remains a challenge for existing methods on both CPUs and accelerators. In this paper, we present a fast CPU method for direct volume rendering of particle data with RBF kernels. We propose a novel two‐pass algorithm: first sampling the RBF field using coherent bounding hierarchy traversal, then subsequently integrating samples along ray segments. Our approach performs interactively for a range of data sets from molecular dynamics and astrophysics up to 82 million particles. It does not rely on level of detail or subsampling, and offers better reconstruction quality than structured volume rendering of the same data, exhibiting comparable performance and requiring no additional preprocessing or memory footprint other than the BVH. Lastly, our technique enables multi‐field, multi‐material classification of particle data, providing better insight and analysis.
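
The two-pass structure named in this abstract (sample the RBF field first, then integrate the samples along the ray) can be sketched in Python. This is a simplified illustration under stated assumptions: particles are summed by brute force rather than through a coherent BVH traversal, the kernel is an unnormalised Gaussian exp(−d²/r²), and integration uses a plain absorption-only model; `sample_field` and `integrate_opacity` are hypothetical names.

```python
import math

def sample_field(particles, radius, origin, direction, t0, t1, n):
    """Pass 1: evaluate the summed Gaussian RBF field at n midpoints."""
    dt = (t1 - t0) / n
    inv_r2 = 1.0 / (radius * radius)
    samples = []
    for i in range(n):
        t = t0 + (i + 0.5) * dt
        p = [origin[a] + t * direction[a] for a in range(3)]
        value = 0.0
        for c in particles:  # brute force; a BVH would cull distant particles
            d2 = sum((p[a] - c[a]) ** 2 for a in range(3))
            value += math.exp(-d2 * inv_r2)
        samples.append(value)
    return samples, dt

def integrate_opacity(samples, dt, absorption=1.0):
    """Pass 2: accumulate opacity front to back from the stored samples."""
    transmittance = 1.0
    for s in samples:
        transmittance *= math.exp(-absorption * s * dt)
    return 1.0 - transmittance
```

Separating the passes keeps the field evaluation independent of the classification step, so the same samples could be re-integrated with a different transfer function.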

13.

Performance of Space Subdivision Techniques in Ray Tracing

M. D. J. McNeill B. C. Shah M.-P. Hebert P. F. Lister R. L. Grimsdale 《Computer Graphics Forum》1992,11(4):213-220

Whilst providing images of excellent quality, ray tracing is a computationally intensive task. The first part of this paper compares the speed-up achieved in ray tracing using various space subdivision algorithms and discusses the implications of implementing the algorithms on parallel processing systems. The second part addresses the problem of building the data structure within the rendering process, a situation which occurs when the rendering process is parallelised and dynamic scenes are rendered. Greater performance can be achieved with dynamic structure building compared to creation of the structure prior to rendering. The dynamic building algorithm proposed reduces the building time and storage cost of space subdivision structures, and decreases the data structure creation-render cycle time, thus enhancing image parallelism performance.

14.

Accelerated ray tracing using an nCUBE2 multicomputer

I. J. Grimstead S. Hurley 《Concurrency and Computation》1995,7(6):571-586

Acceleration techniques for rendering a dynamic sequence of frames (animations) and static scenes using ray tracing are presented. The first technique discusses temporal acceleration for dynamic scenes which takes advantage of ray coherence, while the second technique discusses acceleration for complex static scenes based on parallelism. Several practical aspects of the parallel implementation on an nCUBE2 hypercube computer are included.

15.

Interactive fragment tracing

Jan Meseth Michael Guthe Reinhard Klein 《The Visual Computer》2005,21(8-10):591-600

One of the main challenges in real-time rendering is to enable more and more effects that were previously available in offline rendering only. An important effect among these is physically correct reflections of arbitrary objects in curved reflectors like windshields. In this paper we propose fragment tracing on the GPU as a solution to interactively realizing this effect for large scenes as employed in industrial applications. For each rasterized fragment, a ray is traced through an octree representing the original geometry and surface material. By introducing a GPU implementation of an octree traversal, for the first time hierarchical data structures can efficiently be used on the GPU. As a result, the approach allows both handling of large geometries such as those employed in virtual prototyping and accurate rendering. Several examples show the generality and achievable rendering quality of our method.

16.

RTSAH Traversal Order for Occlusion Rays

Thiago Ize Charles Hansen 《Computer Graphics Forum》2011,30(2):297-305

We accelerate the finding of occluders in tree‐based acceleration structures, such as a packetized BVH and a single‐ray kd‐tree, by deriving the ray termination surface area heuristic (RTSAH) cost model for traversing an occlusion ray through a tree and then using the RTSAH to determine which child node a ray should traverse first instead of the traditional choice of traversing the near node before the far node. We further extend RTSAH to handle materials that attenuate light instead of fully occluding it, so that we can avoid superfluous intersections with partially transparent objects. For scenes with high occlusion, we substantially lower the number of traversal steps and intersection tests and achieve up to 2× speedups.

17.

ManyLoDs: Parallel Many‐View Level‐of‐Detail Selection for Real‐Time Global Illumination

Matthias Hollander Tobias Ritschel Elmar Eisemann Tamy Boubekeur 《Computer Graphics Forum》2011,30(4):1233-1240

Level‐of‐Detail structures are a key component for scalable rendering. Built from raw 3D data, these structures are often defined as Bounding Volume Hierarchies, providing coarse‐to‐fine adaptive approximations that are well‐adapted for many‐view rasterization. Here, the total number of pixels in each view is usually low, while the cost of choosing the appropriate LoD for each view is high. This task represents a challenge for existing GPU algorithms. We propose ManyLoDs, a new GPU algorithm to efficiently compute many LoDs from a Bounding Volume Hierarchy in parallel by balancing the workload within and among LoDs. Our approach is not specific to a particular rendering technique, can be used on lazy representations such as polygon soups, and can handle dynamic scenes. We apply our method to various many‐view rasterization applications, including Instant Radiosity, Point‐Based Global Illumination, and reflection/refraction mapping. For each of these, we achieve real‐time performance in complex scenes at high resolutions.

18.

Automatic Hybrid Hierarchy Creation: a Cost-model Based Approach

J. P. Molina Massó P. González López 《Computer Graphics Forum》2003,22(1):5-13

While hierarchical search structures have proven to be one of the most efficient acceleration techniques for rendering complex scenes, the automatic creation of appropriate hierarchies remains unsolved. Well‐known algorithms for automatic creation of bounding volume hierarchies are not enough. Higher performance is achieved by introducing uniform spatial subdivision, although the algorithms proposed so far are not truly automatic, as they need some parameters to be adjusted. In this paper we present a fully automatic hierarchy creation scheme that structures the scene in a hybrid way, combining bounding volumes and voxel grids in the same tree and selecting the search structure that best fits each scene region. It uses no parameters at all. This efficient proposal relies on a new cost model that estimates the goodness of a hybrid hierarchy if used for rendering the scene. ACM CSS: I.3.7 Computer Graphics: Three‐Dimensional Graphics and Realism

19.

CHC+RT: Coherent Hierarchical Culling for Ray Tracing

O. Mattausch J. Bittner A. Jaspe E. Gobbetti M. Wimmer R. Pajarola 《Computer Graphics Forum》2015,34(2):537-548

We propose a new technique for in‐core and out‐of‐core GPU ray tracing using a generalization of hierarchical occlusion culling in the style of the CHC++ method. Our method exploits the rasterization pipeline and hardware occlusion queries in order to create coherent batches of work for localized shader‐based ray tracing kernels. By combining hierarchies in both ray space and object space, the method is able to share intermediate traversal results among multiple rays. We exploit temporal coherence among similar ray sets between frames and also within the given frame. A suitable management of the current visibility state makes it possible to benefit from occlusion culling for less coherent ray types like diffuse reflections. Since large scenes are still a challenge for modern GPU ray tracers, our method is most useful for scenes with medium to high complexity, especially since our method inherently supports ray tracing highly complex scenes that do not fit in GPU memory. For in‐core scenes our method is comparable to CUDA ray tracing and performs up to 5.94× better than pure shader‐based ray tracing.

20.

SATO: Surface Area Traversal Order for Shadow Ray Tracing

Jae‐Ho Nah Dinesh Manocha 《Computer Graphics Forum》2014,33(6):167-177

We present the surface area traversal order (SATO) metric to accelerate shadow ray traversal. Our formulation uses the surface area of each child node to compute the traversal order (TO). In this metric, we give a traversal priority to the child node with the larger surface area to quickly find occluders. Our algorithm reduces the pre‐processing overhead significantly, and is much faster than other metrics. Overall, the SATO is useful for ray tracing large and complex dynamic scenes (e.g. a few million triangles) with shadows.
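
The ordering rule stated in this abstract (visit the child with the larger surface area first when looking for any occluder) can be sketched as a small any-hit BVH traversal in Python. The sketch makes simplifying assumptions that are not from the paper: the hierarchy is binary, leaves are treated as solid axis-aligned boxes instead of holding triangles, and the `Node` class, `occluded` function and optional `visit_log` argument are hypothetical.

```python
class Node:
    """Binary BVH node; a node without children is a leaf occluder."""

    def __init__(self, lo, hi, left=None, right=None, name=""):
        self.lo, self.hi = lo, hi
        self.left, self.right = left, right
        self.name = name

def surface_area(node):
    dx, dy, dz = (node.hi[a] - node.lo[a] for a in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def hits_box(lo, hi, origin, direction, t_max):
    """Slab test for the ray segment [0, t_max] against an axis-aligned box."""
    t0, t1 = 0.0, t_max
    for a in range(3):
        if direction[a] == 0.0:
            if not (lo[a] <= origin[a] <= hi[a]):
                return False
            continue
        inv = 1.0 / direction[a]
        ta, tb = (lo[a] - origin[a]) * inv, (hi[a] - origin[a]) * inv
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return False
    return True

def occluded(root, origin, direction, t_max, visit_log=None):
    """Return True as soon as any leaf blocks the shadow ray."""
    stack = [root]
    while stack:
        node = stack.pop()
        if not hits_box(node.lo, node.hi, origin, direction, t_max):
            continue
        if visit_log is not None:
            visit_log.append(node.name)
        if node.left is None:
            return True  # leaf box treated as a solid occluder
        a, b = node.left, node.right
        # Push the smaller child first so the larger one is popped first.
        if surface_area(a) >= surface_area(b):
            stack.extend((b, a))
        else:
            stack.extend((a, b))
    return False
```

Since a shadow ray only needs some occluder rather than the nearest one, testing the larger child first tends to end traversal earlier, and the ordering needs only the node bounds that the hierarchy already stores.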