1.

Image stylization with enhanced structure on GPU

LI Ping SUN HanQiu SHENG Bin SHEN JianBing 《中国科学:信息科学(英文版)》2012,(5):1093-1105

2.

Density‐enhanced perceptual mosaic on GPU

Ping Li Hanqiu Sun 《Computer Animation and Virtual Worlds》2016,27(3-4):241-249

Image mosaic effects are widely applied in print media, domestic decoration, and many image beautification applications. However, the current image mosaic methods are mostly based on fixed‐size image tiles, simple color adjustment, and irregular image segmentation, which are inaccurate and very time‐consuming. In this paper, we present a graphics processing unit‐accelerated perceptual mosaic using density tiles replacement and brightness lighting optimization, keeping original image structure details and providing more expressive visual effects. Automatic density replacement map segmentation and color‐based region tiles replacement are performed to facilitate the mosaic. Delicate brightness optimization and perceptual color correction are further applied to enhance expressive lighting effects. We also consider the salience perception of images and similarity correlation among neighboring tiles for our perceptual mosaic. The experimental results have shown the efficiency and high‐quality performance of our density‐enhanced perceptual mosaic on graphics processing unit. Copyright © 2016 John Wiley & Sons, Ltd.

3.

An extended GPU radiosity solver

Günter Wallner 《The Visual computer》2009,25(5-7):529-537

In this paper we present an extended GPU progressive radiosity solver which integrates ideal diffuse as well as specular transmittance and reflection. The solver is capable of handling multiple specular reflections with correct mirror–object–mirror occlusions. The use of graphics hardware allows us to consider attenuation of radiation due to reflections and/or transmissions on a per-pixel basis, enabling us to handle multiple specular triangles with different reflection coefficients at once. Alpha masks are used to replace complex geometry in certain cases to reduce computation times. Furthermore, the inclusion of ambient overshooting into the radiosity solver is discussed.

4.

An efficient GPU version of the preconditioned GMRES method

Aliaga José I. Dufrechou Ernesto Ezzatti Pablo Quintana-Ortí Enrique S. 《The Journal of supercomputing》2019,75(3):1455-1469

In a large number of scientific applications, the solution of sparse linear systems is the stage that concentrates most of the computational effort. This situation...

5.

An improved GPU virtualization implementation method

陈志佳 朱元昌 邸彦强 冯少冲 《计算机工程与科学》2015,37(5):901-906

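In current virtual desktop implementations, the conflict between end users' ever-increasing demands for 3D graphics processing and the GPU processing capability of virtual machines is becoming increasingly apparent. To address this problem, typical GPU virtualization implementation methods are studied. Based on an analysis of these virtualization techniques, an improved virtualization scheme based on the device-exclusive method and the API remoting method is introduced. The hypervisor creates virtual machines in two modes: one parent virtual machine (GVM) and multiple child virtual machines (DVMs). The GVM has exclusive ownership of the physical GPU, while the DVMs have no direct interaction with the physical GPU. The two kinds of virtual machines share GPU memory and a command channel: GPU calls issued in a DVM are passed to the GVM, which quickly invokes the physical GPU and returns the results to the shared memory space, from which they are presented to the user. Finally, the improved GPU virtualization method is compared with typical virtualization methods, their advantages and disadvantages are summarized, and future research priorities are outlined.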

6.

An improved GPU-based ray casting algorithm

张阿关 蒋慧琴 马岭 杨晓鹏 刘玉敏 《计算机工程与科学》2017,39(1):145-150

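To address the heavy computation, low speed, and difficulty of real-time reconstruction without hardware acceleration in the traditional ray casting algorithm, a ray casting algorithm based on GPU programming that quickly computes resampling point values is proposed. First, a GPU program is designed to determine the end point and direction of each cast ray. Second, an accelerated step-size sampling method is used to determine the positions of the resampling points, and a fast composite interpolation method is used to compute their color values. Finally, early opacity termination is used to further accelerate reconstruction. Experimental results show that the method has low computational complexity and high execution efficiency. While preserving the quality of the reconstructed image, reconstruction is 6 times faster than existing CPU-based ray casting algorithms and 2 times faster than the traditional GPU-based ray casting algorithm.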
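
The early opacity termination step mentioned in this abstract can be illustrated with a minimal sketch. It assumes standard front-to-back compositing of scalar (color, alpha) samples along a single ray; the function name `composite_ray` and the `opacity_cutoff` parameter are hypothetical and are not taken from the paper, whose actual GPU program is not shown in the listing.

```python
def composite_ray(samples, opacity_cutoff=0.95):
    """Front-to-back compositing of (color, alpha) samples along one ray.

    Returns (color, accumulated_alpha, samples_used). The loop stops as
    soon as the accumulated opacity reaches opacity_cutoff, because the
    remaining samples can contribute at most (1 - alpha) to the pixel.
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for sample_color, sample_alpha in samples:
        # Standard front-to-back "over" operator.
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
        used += 1
        if alpha >= opacity_cutoff:
            break  # early opacity termination
    return color, alpha, used
```

With ten identical half-transparent samples, the ray stops after five of them when the cutoff is 0.95, which is where the saving comes from in dense regions of a volume.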

7.

An optimized approach to histogram computation on GPU

Juan Gómez-Luna José María González-Linares José Ignacio Benavides Nicolás Guil 《Machine Vision and Applications》2013,24(5):899-908

A histogram is a compact representation of the distribution of data in an image with a wide range of applications in diverse fields. Histogram generation is an inherently sequential operation where every pixel votes in a reduced set of bins. This makes finding efficient parallel implementations very desirable but challenging, because on graphics processing units thousands of threads may be atomically updating a small number of histogram bins. Under these circumstances, collisions among threads will be very frequent and such collisions will serialize thread execution, seriously damaging the performance. In this paper we propose a highly optimized approach to histogram calculation, which tackles such performance bottlenecks. It uses histogram replication for eliminating position conflicts, padding to reduce bank conflicts, and an improved access to input data called interleaved read access. Our so-called $\mathcal{R}$-per-block approach to histogram calculation has been successfully compared to the main state-of-the-art works using four histogram-based image processing kernels and two real image databases. Results show that our proposal is between 1.4 and 15.7 times faster than every previous implementation for histograms of up to 4,096 bins.
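
The idea of histogram replication can be sketched sequentially. The following is only an illustration under stated assumptions, not the paper's kernel: each simulated thread `i` votes into private copy `i % replicas`, so neighbouring threads rarely touch the same counter, and the copies are summed at the end. The function name and the `replicas` parameter are hypothetical; padding and interleaved read access are not modelled.

```python
def replicated_histogram(pixels, num_bins, replicas=4):
    """Histogram built from several private sub-histograms.

    On a GPU the sub-histograms would typically live in shared memory and
    be updated atomically; here the threads are simulated by a plain loop.
    """
    subs = [[0] * num_bins for _ in range(replicas)]
    for i, value in enumerate(pixels):
        # Simulated thread i votes into its own replica.
        subs[i % replicas][value] += 1
    # Final reduction: add the replicas bin by bin.
    return [sum(sub[b] for sub in subs) for b in range(num_bins)]
```

The result is identical to a single shared histogram; replication only changes where the intermediate votes are stored, which is what reduces collisions on real hardware.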

8.

A real-time fluid scene simulation algorithm on the GPU (total citations: 2, self-citations: 0, citations by others: 2)

陈曦 王章野 何戬 延诃 彭群生 《计算机辅助设计与图形学学报》2010,22(3)

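To simulate realistic large-scale fluid scenes in real time, an algorithm for fluid scene simulation based on smoothed particle hydrodynamics (SPH) is proposed. First, a new level-of-detail function is introduced as the basis for non-uniform sampling, reducing the number of particles needed in the simulation and increasing its speed. Then, a 3D spatial grid partitioning algorithm and an improved parallel radix sort are introduced to accelerate the search for neighboring particles and boundaries and the computation of their interactions. Finally, using the latest NVIDIA CUDA architecture, all SPH simulation computation is assigned to the GPU stream processors, fully exploiting the GPU's high parallelism and programmability so that SPH fluid computation and simulation run in real time. Experimental results show that the algorithm simulates fluid scenes in real time with fairly realistic results. Compared with existing CPU-based SPH fluid simulation methods, the speedup exceeds two orders of magnitude, and compared with existing GPU SPH methods, it reproduces richer detail.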

9.

An image generator based on neural networks in GPU

Silva Thiago W. Reis Halamo Melcher Elmar U. K. Lima Antonio M. N. Brito Alisson V. 《Multimedia Tools and Applications》2022,81(25):36353-36374

Existing image databases contain little diversity of images. Likewise, there is no specific image base available in other situations, leading to the need to...

10.

A GPU-based scalar-field-driven physical deformation algorithm

伍潇潇 梁晓辉 徐启迪 赵沁平 《计算机研究与发展》2010,47(11)

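Scalar-field-based deformation is one of the research hotspots in computer graphics, but its real-time performance has never been satisfactorily addressed. Building on the advantages of the adaptively sampled distance field representation and of physically based modeling, a GPU-based scalar-field-driven physical deformation algorithm is proposed. Octree-based adaptively sampled distance fields (ADFs) are constructed on the GPU to represent the model. A mass-spring physical model is combined with the ADFs, and the ADFs are controlled directly according to the principles of physical dynamics to deform the model, with the ADFs adjusted adaptively and dynamically during deformation. To avoid physical inhomogeneity caused by the irregular structure, the stiffness of the non-uniform springs is adjusted according to the local spatial resolution of the ADFs. Experimental results show that the algorithm is efficient in both time and space, runs an order of magnitude faster than the CPU algorithm, and can be used effectively in dynamic applications such as physically based interactive sculpting.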

11.

An Infrastructure for Tackling Input-Sensitivity of GPU Program Optimizations

Xipeng Shen Yixun Liu Eddy Z. Zhang Poornima Bhamidipati 《International journal of parallel programming》2013,41(6):855-869

Graphics processing units (GPUs) have become increasingly adopted for the enhancement of computing throughput. However, the development of a high-quality GPU application is challenging, due to the large optimization space and complex unpredictable effects of optimizations on GPU program performance. Many recent efforts have been employing empirical search-based auto-tuners to tackle the problem, but few of them have concentrated on the influence of program inputs on the optimizations. In this paper, based on a set of CUDA and OpenCL kernels, we report some evidence on the importance for auto-tuners to adapt to program input changes, and present a framework, G-ADAPT+, to address the influence by constructing cross-input predictive models for automatically predicting the (near-)optimal configurations for an arbitrary input to a GPU program. G-ADAPT+ is based on source-to-source compilers, specifically, Cetus and ROSE. It supports the optimizations of both CUDA and OpenCL programs.

12.

An attribute reduction method based on the enhanced positive region

史博文 李国和 吴卫江 洪云峰 周晓明 《计算机应用研究》2017,34(1)

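Attribute reduction is one of the core topics of rough set theory. After a comparison of several reduction methods, and with the aim of obtaining better results, a more precise concept of the enhanced positive region is defined on top of the traditional reduction method based on attribute dependency. By precisely partitioning the boundary region, the enhanced dependency degree between each condition attribute and the decision attribute is derived, and a top-down heuristic search algorithm is used to obtain the reduction result. The enhanced-positive-region reduction method, REPR, is tested on UCI standard datasets; the decision trees built on the reduced data are small and have high classification accuracy. Experimental results show that REPR reduces the attributes of a decision table more effectively than classical methods.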

13.

Secrets from the GPU

Eric Mahé Jean-Marie Chauvet 《Journal in Computer Virology》2014,10(3):205-210

In the current controversial context caused by the disclosure of classified details of several top-secret United States and British government mass surveillance programs to the press by former NSA contractor Edward Snowden, issues of data privacy, anonymity, unlinkability, forward secrecy and deniability have risen to public prominence. In this work we investigate how an alternate usage of state-of-the-art yet ubiquitous computing platforms might help sovereign, citizen and general public recovery of control over privacy. These goals are notoriously difficult to achieve on the Internet today due to the insufficient public-key infrastructure at the user level. Our approach leverages modern multi-core processors and general-purpose computing on graphics processing units, both as a source of true random entropy pools and computational engines for very fast elliptic curve cryptography (ECC). Such autonomous, high-frequency Diffie–Hellman-ready agents reside in a breadth of devices ranging from smartphones and tablets, to laptops and high-end servers in datacenters. In contrast to the current circumstance, this suggested infrastructure enables generalized symmetric exchanges with the Vernam cipher without compromising ease of use or requiring revolutionary changes in today’s well-grounded ECC theory.

14.

An efficient solution to the subset‐sum problem on GPU

V. V. Curtis C. A. A. Sanches 《Concurrency and Computation》2016,28(1):95-113

We present an algorithm to solve the subset‐sum problem (SSP) of capacity c and n items with weights w_i,1≤i≤n, spending O(n(m − w_min)/p) time and O(n + m − w_min) space in the Concurrent Read/Concurrent Write (CRCW) PRAM model with 1≤p≤m − w_min processors, where w_min is the lowest weight and , improving both upper‐bounds. Thus, when n≤c, it is possible to solve the SSP in O(n) time in parallel environments with low memory. We also show OpenMP and CUDA implementations of this algorithm and, on Graphics Processing Unit, we obtained better performance than the best sequential and parallel algorithms known so far. Copyright © 2015 John Wiley & Sons, Ltd.
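
For orientation, the subset-sum problem itself can be stated in a few lines of code. The sketch below is the classic sequential dynamic program, written with a Python integer as a bitset; it is not the parallel algorithm of the paper, and the function name is hypothetical. Each item updates all reachable sums at once, and those per-sum updates are independent of each other, which is the kind of work a parallel version can spread across processors.

```python
def subset_sum(weights, capacity):
    """Return the largest subset sum that does not exceed capacity.

    Bit s of `reachable` is set when some subset of the items seen so far
    has total weight exactly s. Weights are assumed to be positive integers.
    """
    reachable = 1  # only the empty sum 0 is reachable at the start
    mask = (1 << (capacity + 1)) - 1  # discard sums above the capacity
    for w in weights:
        # Every reachable sum s also makes s + w reachable.
        reachable |= (reachable << w) & mask
    # The highest set bit is the best achievable sum.
    return reachable.bit_length() - 1
```

For example, `subset_sum([3, 5, 9], 13)` returns 12, obtained from the items of weight 3 and 9.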

15.

An GPU accelerated finite difference method for heat transfer simulation

ZHOU Yi HE Fazhi QIU Yimin 《计算机辅助绘图.设计与制造(英文版)》2013,(1):27-31

Mathematical models of heat transfer are widely used in the iron and steel industry. Many computational models that represent this physical process are based on finite difference methods. The simulation of these phenomena demands a high computational cost. In this paper we employ the GPU to develop an algorithm for a two-dimensional heat transfer problem with finite difference methods. The performance is evaluated, and the CPU and GPU implementations are compared. The experimental results show that the GPU solves this problem more efficiently when the computational domain must be divided into a large number of meshes.
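
A minimal sketch of one finite difference time step for the two-dimensional heat equation is given here for illustration. It assumes the explicit (forward-time, centered-space) scheme on a uniform grid with fixed boundary values; the abstract does not say which scheme the authors used, and the names `heat_step`, `alpha`, `dt` and `dx` are hypothetical. Each interior cell depends only on its four neighbours from the previous step, so on a GPU one thread per cell is a natural mapping.

```python
def heat_step(grid, alpha, dt, dx):
    """Advance a 2D temperature grid by one explicit time step.

    grid is a list of rows; boundary cells are left unchanged.
    The explicit scheme is stable only when alpha * dt / dx**2 <= 0.25.
    """
    r = alpha * dt / (dx * dx)
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so all reads use the old step
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            laplacian = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]
                         - 4.0 * grid[i][j])
            new[i][j] = grid[i][j] + r * laplacian
    return new
```

A uniform grid is a fixed point of this update, which makes a quick sanity check before moving the loop body into a GPU kernel.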

16.

An Analysis of Region Clustered BVH Volume Rendering on GPU

D. Ganter M. Manzke 《Computer Graphics Forum》2019,38(8):13-21

We present a Direct Volume Rendering method that makes use of newly available Nvidia graphics hardware for Bounding Volume Hierarchies. Using BVHs for DVR has been overlooked in recent research due to build times potentially impeding interactive rates. We indicate that this is not necessarily the case, especially when a clustering algorithm is applied before the BVH build to reduce leaf‐node complexity. Our results show substantial render time improvements for full‐resolution DVR on GPU in comparison to a recent state‐of‐the‐art approach for empty‐space‐skipping. Furthermore, the use of a BVH for DVR allows seamless integration into popular surface‐based path‐tracing technologies like Nvidia's OptiX.

17.

Analysis of test suite reduction with enhanced tie-breaking techniques

Jun-Wei Lin Chin-Yu Huang 《Information and Software Technology》2009,51(4):679-690

Test suite minimization techniques try to remove redundant test cases of a test suite. However, reducing the size of a test suite might reduce its ability to reveal faults. In this paper, we present a novel approach for test suite reduction that uses an additional testing criterion to break the ties in the minimization process. We integrated the proposed approach with two existing algorithms and conducted experiments for evaluation. The experiment results show that our approach can improve the fault detection effectiveness of reduced suites with a negligible increase in the size of the suites. Besides, under specific conditions, the proposed approach can also accelerate the process of minimization.

18.

Mass-spring systems on the GPU

Joachim Georgii Rüdiger Westermann 《Simulation Modelling Practice and Theory》2005,13(8):693-702

We present and analyze different implementations of mass-spring systems for interactive simulation of deformable surfaces on graphics processing units (GPUs). For the number of springs we target, numerical time integration of spring displacements needs to be accelerated and the transfer of displaced point positions for rendering must be avoided. To fulfill these requirements, we exploit features of recent graphics accelerators to simulate spring elongation and compression on the GPU, saving displaced point masses in graphics memory, and then sending these positions through the GPU again to render the deformed surface. Two different simulation algorithms implementing scattering and gathering operations on the GPU are compared with respect to performance and numerical accuracy. We discuss GPU-specific issues to be considered in simulation techniques showing similar computation and memory access patterns to mass-spring systems.

19.

An enhanced lognormal selection procedure

E. Jack Chen 《Discrete Event Dynamic Systems》2011,21(2):205-218

John and Chen (IEEE Trans Reliab 55(1):135–148, 2006) propose an exact two-stage solution based on the Least Favorable Configuration (LFC) to the ranking and selection problem of determining the best system from k lognormal populations. Lognormal density is commonly used to model certain lifetimes in reliability and survival analysis. It is known that selection procedures that are developed based on the LFC are conservative and become inefficient when the number of systems is large. We propose to take into account the differences of sample values within the parameter of interest when determining sample sizes, which can significantly increase the efficiency of the selection procedures. We also sequentialize the selection procedure and provide a procedure for estimating the constant needed to apply the solution. An experimental performance evaluation demonstrates the validity and efficiency of the enhanced lognormal selection procedure.

20.

An improved fuzzy PD controller (total citations: 2, self-citations: 1, citations by others: 2)

苏玉鑫 郑春红 段宝岩 《控制与决策》2004,19(2):175-178

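By fully exploiting the ability of the nonlinear tracking differentiator to produce high-quality differential signals, the tracking differentiator is combined with a traditional simple fuzzy PD controller to obtain a simple, high-performance improved fuzzy PD controller. The most notable features of the improved fuzzy controller are its strong robustness to measurement noise and its ease of engineering implementation. Numerical simulations demonstrate its effectiveness and efficiency.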