18 similar documents found (search time: 156 ms)
1.
2.
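Objective: In GPUs built on a tile-based rendering (TBR) architecture, triangle rasterization speed has a large effect on chip performance. Traditional rasterization methods generate many redundant pixels and cannot exploit the advantages of TBR. Method: An efficient triangle rasterization algorithm for this architecture is proposed. It takes full advantage of tiling: a preprocessing step computes the triangle's drawing parameters within each tile, derives the positional relationship between the triangle and the tile boundaries, and writes both to memory together with the triangle's tiling information. The rasterization stage uses the Bresenham algorithm: the generated triangle edges yield the horizontal scanlines inside each tile, and every pixel on each scanline is then generated. Results: Theoretical analysis shows that the rasterization efficiency of the algorithm exceeds 83% and can approach 100%. Its function and performance were verified on an FPGA prototype verification system. Conclusion: The proposed triangle rasterization algorithm fits the TBR architecture. The measured pixel fill rate is comparable to that of an ATI M9 running at twice the clock frequency, so the algorithm achieves high rasterization efficiency.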
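The scanline step described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes integer vertex coordinates, walks each full edge with Bresenham rather than using precomputed per-tile parameters, and the names `bresenham` and `tile_spans` are hypothetical.

```python
def bresenham(x0, y0, x1, y1):
    """Integer points on the line from (x0, y0) to (x1, y1), all octants."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points


def tile_spans(tri, tile_x, tile_y, tile_size):
    """Horizontal spans {y: (x_min, x_max)} of a triangle inside one tile.

    Assumption: boundary pixels produced by Bresenham count as covered.
    """
    rows = {}
    for i in range(3):
        (x0, y0), (x1, y1) = tri[i], tri[(i + 1) % 3]
        for x, y in bresenham(x0, y0, x1, y1):
            lo, hi = rows.get(y, (x, x))
            rows[y] = (min(lo, x), max(hi, x))
    spans = {}
    for y in range(tile_y, tile_y + tile_size):
        if y not in rows:
            continue
        lo = max(rows[y][0], tile_x)                  # clip to left tile border
        hi = min(rows[y][1], tile_x + tile_size - 1)  # clip to right tile border
        if lo <= hi:
            spans[y] = (lo, hi)
    return spans
```

Because every span is clipped to the tile before any pixel is emitted, no pixel outside the triangle's extent on a scanline is generated, which is the kind of saving the abstract attributes to working per tile.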
3.
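A hardware implementation algorithm for graphics rasterization on embedded platforms is proposed. The pixels inside a triangle's bounding box are divided into regular pixel blocks, and scan conversion, pixel interpolation, and perspective correction are performed block by block. After extensive optimization, the algorithm was implemented and verified on an FPGA (field-programmable gate array). Compared with traditional rasterization algorithms, the proposed algorithm raises the pixel hit rate, reduces computational complexity, and lowers hardware cost. Verification results show that the rendered image quality matches OpenGL ES 1.1 rendering; the rendering speed reaches 30 frames per second in typical scenes, which meets real-time requirements; and synthesis on a Xilinx FPGA Vertex2P xc2vp30-7ff89 uses 5,545 slices, a small hardware footprint.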
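As a rough illustration of block-based scan conversion, the following Python sketch splits the bounding box into square blocks, rejects blocks that lie entirely outside one edge, and tests pixel centers in the remaining blocks. It is an assumption-laden sketch, not the paper's hardware algorithm: the edge-function test, the 4×4 block size, and the name `rasterize_blocks` are illustrative choices, and interpolation and perspective correction are omitted.

```python
import math


def edge(ax, ay, bx, by, px, py):
    """Edge function: >= 0 when P lies on or to the left of the edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize_blocks(tri, block=4):
    """Return (covered pixels, number of pixel tests performed)."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    if edge(x0, y0, x1, y1, x2, y2) < 0:      # force counter-clockwise order
        (x1, y1), (x2, y2) = (x2, y2), (x1, y1)
    edges = [(x0, y0, x1, y1), (x1, y1, x2, y2), (x2, y2, x0, y0)]
    min_x, max_x = math.floor(min(x0, x1, x2)), math.ceil(max(x0, x1, x2))
    min_y, max_y = math.floor(min(y0, y1, y2)), math.ceil(max(y0, y1, y2))
    covered, tested = [], 0
    for by in range(min_y, max_y, block):
        for bx in range(min_x, max_x, block):
            corners = [(bx, by), (bx + block, by),
                       (bx, by + block), (bx + block, by + block)]
            # The edge function is linear, so a block whose four corners are
            # all outside one edge cannot contain any covered pixel.
            if any(all(edge(*e, cx, cy) < 0 for cx, cy in corners) for e in edges):
                continue
            for y in range(by, min(by + block, max_y)):
                for x in range(bx, min(bx + block, max_x)):
                    tested += 1
                    if all(edge(*e, x + 0.5, y + 0.5) >= 0 for e in edges):
                        covered.append((x, y))
    return covered, tested
```

The ratio `len(covered) / tested` gives a simple measure of the pixel hit rate for a given triangle and block size.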
4.
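To improve grating subdivision accuracy in high-precision measurement, an FPGA-based method for grating signal subdivision and direction discrimination is adopted. First, Matlab is used to analyze the two raw signals output by the reading head and the same signals after filtering and removal of the DC component. A subdivision algorithm is constructed from the processed waveforms, which verifies both that the algorithm can achieve 1024-fold subdivision and that it can be realized in hardware. Then, building on the Matlab analysis, a circuit based on amplitude-sampling subdivision is designed to subdivide the grating signal and discriminate direction. The subdivision hardware consists mainly of an 8-fold subdivision circuit, which divides each signal period into 8 parts, and a fine subdivision circuit, which further subdivides each 1/8 period. Test results show that the circuit achieves 1024-fold subdivision of the grating, meeting the goal of high-ratio subdivision.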
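The idea of coarse and fine subdivision can be sketched in Python under strong assumptions: the two signals are taken to be ideal, DC-free quadrature signals (sine and cosine of the grating phase), and the phase is recovered with `atan2` instead of the amplitude-sampling circuit the abstract describes. The names `subdivide`, `coarse_octant`, and `direction` are hypothetical.

```python
import math


def subdivide(sin_v, cos_v, n=1024):
    """Map one signal period onto n counts (0 .. n-1)."""
    phase = math.atan2(sin_v, cos_v) % (2 * math.pi)
    return int(phase / (2 * math.pi) * n) % n


def coarse_octant(count, n=1024):
    """The 8-fold subdivision: which 1/8 of the period the count falls in."""
    return count // (n // 8)


def direction(prev, cur, n=1024):
    """+1 for forward motion, -1 for backward, 0 for no change.

    Assumes the position moves by less than half a period between samples.
    """
    d = (cur - prev + n // 2) % n - n // 2
    return (d > 0) - (d < 0)
```

With n = 1024, each octant holds 128 fine counts, which matches the split into an 8-fold stage followed by a fine stage.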
5.
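A programmable triangle setup engine for embedded platforms is proposed to replace the dedicated fixed-function hardware engine in a graphics processor. The engine uses a 3-way parallel SIMD architecture, an S15.16 fixed-point data path, an 8-stage pipeline with bypassing, and high-precision special function units. These features raise computation speed and lower hardware cost. Experimental results show that, with custom rasterization algorithms implemented in software on the engine, synthesis on a Xilinx FPGA Vertex2P xc2vp30-7ff89 reaches 78 MHz and uses 3,354 slices. The hardware cost is low, and the engine meets the needs of real-time rendering in embedded environments.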
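For readers unfamiliar with the S15.16 format (1 sign bit, 15 integer bits, 16 fractional bits), the following Python sketch shows the basic arithmetic. It is only an illustration of the number format: the helper names are hypothetical, 32-bit overflow is not modeled, and the rounding behavior of the engine in the paper is not known from the abstract.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS          # the value 1.0 in S15.16


def to_fixed(x):
    """Convert a float to S15.16 (rounded to the nearest step of 1/65536)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert an S15.16 value back to a float."""
    return f / ONE


def fx_mul(a, b):
    """Multiply two S15.16 values; the double-width product is shifted back."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two S15.16 values; Python's // rounds toward negative infinity."""
    return (a << FRAC_BITS) // b
```

A triangle setup step such as an edge slope then becomes, for example, `fx_div(to_fixed(3), to_fixed(4))`, which represents 0.75.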
6.
7.
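Using an FPGA to accelerate the key algorithms of application software is an effective way to raise the computing speed of a computer system. Acceleration is achieved by implementing in hardware the inherently parallel parts of high-performance computing algorithms. This paper discusses using an FPGA to accelerate the BLAS math library: the most time-consuming routine, dgemm, is accelerated, and an FPGA-based acceleration system is designed.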
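For reference, dgemm computes C ← αAB + βC on double-precision matrices. The following pure-Python sketch shows a blocked form of that computation; the blocking is a generic illustration of how the work splits into independent tiles, not the FPGA design from the paper, and the function signature is simplified compared with the real BLAS routine.

```python
def dgemm(alpha, A, B, beta, C, block=2):
    """Return alpha * A @ B + beta * C for row-major lists of lists."""
    m, k, n = len(A), len(B), len(B[0])
    out = [[beta * C[i][j] for j in range(n)] for i in range(m)]
    for i0 in range(0, m, block):            # tile over rows of A
        for j0 in range(0, n, block):        # tile over columns of B
            for p0 in range(0, k, block):    # tile over the inner dimension
                for i in range(i0, min(i0 + block, m)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = 0.0
                        for p in range(p0, min(p0 + block, k)):
                            acc += A[i][p] * B[p][j]
                        out[i][j] += alpha * acc
    return out
```

Each (i0, j0) tile of the output depends only on one row band of A and one column band of B, which is the parallelism that a hardware implementation can exploit.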
8.
9.
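Several approaches to implementing image processing algorithms on FPGAs are discussed. Based on an analysis of the median filter algorithm, an optimized algorithm is proposed that both suits a pipelined hardware implementation and gives a clear gain in efficiency. The design uses an FPGA as the hardware platform and implements the optimized median filter in Verilog. A comparison with a software median filter shows the efficiency advantage of the hardware implementation and the feasibility of the algorithm.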
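One well-known pipeline-friendly way to compute a 3×3 median is to sort each column, then take the median of (largest of the column minima, median of the column medians, smallest of the column maxima). The Python sketch below shows this technique as an assumption about what such an optimization can look like; the abstract does not say which optimization the paper uses, and the names `sort3` and `median9` are hypothetical.

```python
def sort3(a, b, c):
    """Sort three values with three compare-and-swap steps."""
    if a > b:
        a, b = b, a
    if b > c:
        b, c = c, b
    if a > b:
        a, b = b, a
    return a, b, c


def median9(window):
    """Median of a 3x3 window given as three rows of three values."""
    cols = [sort3(window[0][j], window[1][j], window[2][j]) for j in range(3)]
    lo = max(c[0] for c in cols)                        # largest column minimum
    mid = sort3(cols[0][1], cols[1][1], cols[2][1])[1]  # median of column medians
    hi = min(c[2] for c in cols)                        # smallest column maximum
    return sort3(lo, mid, hi)[1]
```

In a streaming design the sorted columns can be reused as the window slides one pixel to the right, so only one new column has to be sorted per output pixel.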
10.
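The electronic system of the polarization grating navigation sensor adopts a hardware and software architecture suited to image processing, which lets it process full-sky polarization information in real time with higher resolution and accuracy. To verify the feasibility of this new architecture and to provide a platform for studying algorithms that extract heading information from images, a design for the electronic system is proposed after describing the working principle of the sensor. The electronic system is a system on a programmable chip (SOPC) built on a field-programmable gate array (FPGA). It acquires polarization detection image data from the front end over the Cameralink protocol, computes the heading azimuth with hardware algorithm modules and a Nios II processor, and sends the result to the host computer over a serial communication protocol. Tests show that the electronic system can acquire and process polarization images, with an output data update rate of up to 20 Hz and a computed accuracy better than ±0.01°, which facilitates hardware verification of algorithms that extract heading information from images.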
11.
Tomas Akenine‐Möller, Robert Toth, Jacob Munkberg, Jon Hasselgren 《Computer Graphics Forum》2012,31(1):3-a18
For depth of field (DOF) rasterization, it is often desired to have an efficient tile versus triangle test, which can conservatively compute which samples on the lens need to execute the sample‐in‐triangle test. We present a novel test for this, which is optimal in the sense that the region on the lens cannot be further reduced. Our test is based on removing half‐space regions of the (u, v)‐space on the lens, from where the triangle definitely cannot be seen through a tile of pixels. We find the intersection of all such regions exactly, and the resulting region can be used to reduce the number of sample‐in‐triangle tests that need to be performed. Our main contribution is that the theory we develop provides a limit for how efficient a practical tile versus defocused triangle test ever can become. To verify our work, we also develop a conceptual implementation for DOF rasterization based on our new theory. We show that the number of arithmetic operations involved in the rasterization process can be reduced. More importantly, with a tile test, multi‐sampling anti‐aliasing can be used which may reduce shader executions and the related memory bandwidth usage substantially. In general, this can be translated to a performance increase and/or power savings.
12.
We present a novel framework for real-time multi-perspective rendering. While most existing approaches are based on ray-tracing, we present an alternative approach by emulating multi-perspective rasterization on the classical perspective graphics pipeline. To render a general multi-perspective camera, we first decompose the camera into piecewise linear primitive cameras called the general linear cameras or GLCs. We derive the closed-form projection equations for GLCs and show how to rasterize triangles onto GLCs via a two-pass rendering algorithm. In the first pass, we compute the GLC projection coefficients of each scene triangle using a vertex shader. The linear raster on the graphics hardware then interpolates these coefficients at each pixel. Finally, we use these interpolated coefficients to compute the projected pixel coordinates using a fragment shader. In the second pass, we move the pixels to their actual projected positions. To avoid holes, we treat neighboring pixels as triangles and re-render them onto the GLC image plane. We demonstrate our real-time multi-perspective rendering framework in a wide range of applications including synthesizing panoramic and omnidirectional views, rendering reflections on curved mirrors, and creating multi-perspective faux animations. Compared with the GPU-based ray tracing methods, our rasterization approach scales better with scene complexity and it can render scenes with a large number of triangles at interactive frame rates.
13.
We present an efficient algorithm for object‐space proximity queries between multiple deformable triangular meshes. Our approach uses the rasterization capabilities of the GPU to produce an image‐space representation of the vertices. Using this image‐space representation, inter‐object vertex‐triangle distances and closest points lying under a user‐defined threshold are computed in parallel by conservative rasterization of bounding primitives and sorted using atomic operations. We additionally introduce a similar technique to detect penetrating vertices. We show how mechanisms of modern GPUs such as mipmapping, Early‐Z and Early‐Stencil culling can optimize the performance of our method. Our algorithm is able to compute dense proximity information for complex scenes made of more than a hundred thousand triangles in real time, outperforming a CPU implementation based on bounding volume hierarchies by more than an order of magnitude.
14.
A SIMD-efficient 14 instruction shader program for high-throughput microtriangle rasterization
Jordi Roca, Victor Moya, Carlos Gonzalez, Vicente Escandell, Albert Murciego, Agustin Fernandez, Roger Espasa 《The Visual computer》2010,26(6-8):707-719
This paper shows that breaking the barrier of 1 triangle/clock rasterization rate for microtriangles in modern GPU architectures in an efficient way is possible. The fixed throughput of the special purpose culling and triangle setup stages of the classic pipeline limits the GPU scalability to rasterize many triangles in parallel when these cover very few pixels. In contrast, the shader core counts and increasing GFLOPs in modern GPUs clearly suggest parallelizing this computation entirely across multiple shader threads, making use of the powerful wide-ALU instructions. In this paper, we present a very efficient SIMD-like rasterization code targeted at very small triangles that scales very well with the number of shader cores and has higher performance than traditional edge equation based algorithms. We have extended the ATTILA GPU shader ISA (del Barrio et al. in IEEE International Symposium on Performance Analysis of Systems and Software, pp. 231–241, 2006) with two fixed point instructions to meet the rasterization precision requirement. This paper also introduces a novel subpixel Bounding Box size optimization that adjusts the bounds much more finely, which is critical for small triangles, and doubles the 2×2-pixel stamp test efficiency. The proposed shader rasterization program can run on top of the original pixel shader program in such a way that selected fragments are rasterized, attribute interpolated and pixel shaded in the same pass. Our results show that our technique yields better performance than a classic rasterizer at 8 or more shader cores, with speedups as high as 4× for 16 shader cores.
15.
When rendering effects such as motion blur and defocus blur, shading can become very expensive if done in a naïve way, i.e. shading each visibility sample. To improve performance, previous work often decouples shading from visibility sampling using shader caching algorithms. We present a novel technique for reusing shading in a stochastic rasterizer. Shading is computed hierarchically and sparsely in an object‐space texture, and by selecting an appropriate mipmap level for each triangle, we ensure that the shading rate is sufficiently high so that no noticeable blurring is introduced in the rendered image. Furthermore, with a two‐pass algorithm, we separate shading from reuse and thus avoid GPU thread synchronization. Our method runs at real‐time frame rates and is up to 3× faster than previous methods. This is an important step forward for stochastic rasterization in real time.
16.
Several algorithms have been introduced to render motion blur in real time by solving the visibility problem in the spatial-temporal domains. However, some algorithms render at interactive frame rates but have artifacts or noise. Therefore, we propose a new algorithm that renders real-time motion blur using extruded triangles. Our method uses two triangles, one in the previous frame and one in the current frame, to make an extruded triangle, which is then sent to rasterization. By using the standard rasterization, visibility determination is performed efficiently. To solve the occlusion between extruded triangles for a given pixel, we introduce a combined solution using front-to-back sorting and bitwise operations in the spatial-temporal dimensions. This solution ensures that only non-occluded extruded triangles are shaded. We further improve the performance of our algorithm using a coverage map.
17.
We present a method for analytically calculating an anti‐aliased rasterization of arbitrary polygons or fonts bounded by Bézier curves in 2D as well as oriented triangle meshes in 3D. Our algorithm rasterizes multiple resolutions simultaneously using a hierarchical wavelet representation and is robust to degenerate inputs. We show that using the simplest wavelet, the Haar basis, is equivalent to applying a box filter to the rasterized image. Because we evaluate wavelet coefficients through line integrals in 2D, we are able to derive analytic solutions for polygons that have Bézier curve boundaries of any order, and we provide solutions for quadratic and cubic curves. In 3D, we compute the wavelet coefficients through analytic surface integrals over triangle meshes and show how to do so in a computationally efficient manner.
18.
We present an image processing method that converts a raster image to a simplicial two‐complex which has only a small number of vertices (base mesh) plus a parametrization that maps each pixel in the original image to a combination of the barycentric coordinates of the triangle it is finally mapped into. Such a conversion of a raster image into a base mesh plus parametrization can be useful for many applications such as segmentation, image retargeting, multi‐resolution editing with arbitrary topologies, edge preserving smoothing, compression, etc. The goal of the algorithm is to produce a base mesh such that it has a small colour distortion as well as high shape fairness, and a parametrization that is globally continuous visually and numerically. Inspired by multi‐resolution adaptive parametrization of surfaces and quadric error metric, the algorithm converts pixels in the image to a dense triangle mesh and performs error‐bounded simplification jointly considering geometry and colour. The eliminated vertices are projected to an existing face. The implementation is iterative and stops when it reaches a prescribed error threshold. The algorithm is feature‐sensitive, i.e. salient feature edges in the images are preserved where possible and it takes colour into account thereby producing a better quality triangulation.