Similar literature
 Found 20 similar documents; search took 15 ms
1.
In this paper we present a new radiosity algorithm, based on the notion of a well distributed ray set (WDRS). A WDRS is a set of rays, connecting mutually visible points and patches, that forms an approximate representation of the radiosity operator and the radiosity distribution. We propose an algorithm that constructs an optimal WDRS for a given accuracy and mesh. The construction is based on discrete importance sampling as in previously proposed stochastic radiosity algorithms, and on quasi Monte Carlo sampling. Quasi Monte Carlo sampling leads to faster convergence rates and the fact that the sampling is deterministic makes it possible to represent the well distributed ray set very efficiently in computer memory. Like previously proposed stochastic radiosity algorithms, the new algorithm is well suited for computing the radiance distribution in very complex diffuse scenes, when it is not feasible to explicitly compute and store form factors as in classical radiosity algorithms. Experiments show that the new algorithm is often more efficient than previously proposed Monte Carlo radiosity algorithms by half an order of magnitude and more.
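
The remark that deterministic sampling lets the ray set be stored compactly can be illustrated with a small sketch. The abstract does not say which low-discrepancy sequence is used; the Halton sequence below is an assumption chosen for illustration, and the names `radical_inverse` and `halton_point` are hypothetical. The point is that a sample (and hence a ray parameterized by it) can be regenerated from its integer index alone, so only the index needs to be kept in memory.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-`base` digits of
    `index` around the radix point, giving a value in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index, bases=(2, 3)):
    """Return the `index`-th point of a Halton sequence (one coordinate per base).

    Assumption: a 2D sample like this could parameterize a ray, e.g. a
    position on a patch or a direction; that mapping is not shown here.
    """
    return tuple(radical_inverse(index, base) for base in bases)


if __name__ == "__main__":
    # Regenerate the first few samples purely from their indices.
    for i in range(1, 5):
        print(i, halton_point(i))
```

Because the sequence is deterministic, storing the pair (first index, count) would be enough to reproduce a whole run of samples, which is one plausible reading of the memory saving the abstract mentions.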

2.
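A new parallel incremental support vector machine (SVM) algorithm is proposed for classifying large-scale datasets on graphics processing units (GPUs). SVMs and related kernel methods can build accurate classification models, but training requires large amounts of memory and time. The least squares SVM (LS-SVM) method of Suykens and Vandewalle is extended to build an incremental and parallel algorithm. The new algorithm uses GPUs to obtain high performance at low cost. Experiments on UCI and Delve datasets show that the GPU-based parallel incremental algorithm is 130 times faster than its CPU implementation and far faster (more than 2,500 times) than existing algorithms such as LibSVM, SVM-perf, and CB-SVM.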

3.
Since wavelets were introduced in the radiosity algorithm [5], surprisingly little research has been devoted to higher order wavelets and their use in radiosity algorithms. A previous study [13] has shown that wavelet radiosity, and especially higher order wavelet radiosity, did not bring significant improvements over hierarchical radiosity and incurred a very large extra memory cost, thus prohibiting any effective computation. In this paper, we present a new implementation of wavelets in the radiosity algorithm that is substantially different from previous implementations in several key areas (refinement oracle, link storage, resolution algorithm). We show that, with this implementation, higher order wavelets do bring an improvement over standard hierarchical radiosity and lower order wavelets.

4.
Galerkin radiosity solves the integral rendering equation by projecting the illumination functions into a set of higher-order basis functions. This paper presents a Monte Carlo approach for Galerkin radiosity to compute the coefficients of the basis functions. The new approach eliminates the problems with edge singularities between adjacent surfaces present in conventional Galerkin radiosity, reduces the time complexity from $O(K^4)$ to $O(K^2)$ for a $K$-order basis, and allows ideally specular energy transport to be simulated. As in conventional Galerkin radiosity, no meshing is required even for large or curved surfaces, thus reducing memory requirements, and no a posteriori Gouraud interpolation is necessary. The new algorithm is simple and can be parallelized on any parallel computer, including massively parallel systems.

5.
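The support vector machine (SVM) is a supervised learning method widely used for statistical classification and regression analysis. SVM training based on the interior point method (IPM) has a small memory footprint and converges in few iterations, but as training sets grow it still faces the dual challenge of processing speed and storage space. To address this problem, a hybrid parallel mechanism for large-scale SVM training on CPU-GPU heterogeneous systems is proposed. First, the compute-intensive parts of the IPM-based SVM training algorithm are parallelized with the compute unified device architecture (CUDA), and the algorithm is adapted so that it can be implemented with the cuBLAS linear algebra library, which increases training speed. Next, the message passing interface (MPI) is used to distribute the CUDA-accelerated algorithm across a cluster, where distributed storage increases the size of the datasets that can be handled and reduces training time. Finally, page-locked memory, supported on the Fermi architecture, removes the limit that insufficient GPU device memory places on dataset size. The results show that the hybrid MPI and CUDA programming model, together with the page-locked memory storage strategy, enables efficient parallel SVM training on large-scale datasets on CPU-GPU heterogeneous systems and improves its computational performance and applicability in big-data processing.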

6.
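As data volumes keep growing, the parallel design of support vector machines (SVMs) has become a research hotspot in data mining. To address the slow optimization and large memory footprint of SVM training on large-scale data, a parallel SVM algorithm based on the Spark platform (SP-SVM) is proposed. The method first adjusts the merging strategy and training structure of the Cascade SVM and implements it on the Spark distributed computing framework. It then analyzes the performance of the parallel operators and optimizes the parallel implementation, effectively overcoming the low training efficiency of the cascade model. Experimental results show that, at the cost of a small loss in accuracy, the new parallel training method reduces training time to some extent and improves the learning efficiency of the model.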

7.
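Jiang Xue, Tao Liang, Wang Huabin, Wu Jie. Microcomputer Development (《微机发展》), 2007, 17(11): 92-95
In incremental learning, as the training set grows, SVM training occupies a large amount of memory and optimization becomes very slow. Building on an existing SVM incremental learning algorithm and incorporating the idea of parallel learning, an SVM incremental learning algorithm that filters training samples hierarchically and in parallel is proposed. Theoretical analysis and experimental results show that, compared with the original algorithm, the new algorithm significantly increases training speed while preserving the classification ability of the SVM.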

8.
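The support vector machine (SVM) is one of the most popular classification tools, but on large-scale datasets it requires substantial memory and training time, which usually makes a large parallel cluster necessary. A new parallel SVM algorithm, RF-CCASVM, is proposed that can solve large-scale SVMs with limited computing resources. Using a random Fourier mapping, a low-dimensional explicit feature map uniformly approximates the infinite-dimensional implicit feature map of the Gaussian kernel, so that a linear SVM uniformly approximates the Gaussian kernel SVM. A parallelization method based on consensus center adjustment is proposed. Specifically, the dataset is partitioned into several subsets, and multiple processes independently train SVMs in parallel on their own subsets. When the optimal hyperplanes on the subsets are about to be obtained, the current solution is replaced by the consensus center solution computed from the subsets, and training continues on each subset until the consensus center solution is optimal on every subset. Comparative experiments on standard datasets verify the correctness and effectiveness of RF-CCASVM.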
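
The random Fourier mapping step can be sketched as follows. This is the standard random Fourier feature construction for the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2), not the RF-CCASVM code itself, whose details the abstract does not give. The function names, the parameter `gamma`, and the feature count are assumptions made for illustration.

```python
import math
import random


def make_rff_map(dim, num_features, gamma, seed=0):
    """Build an explicit feature map z(x) with z(x) . z(y) ~= exp(-gamma * ||x - y||^2).

    Frequencies are drawn from N(0, 2 * gamma * I) and phases from U[0, 2*pi],
    which is the usual recipe for the Gaussian kernel.
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def gaussian_kernel(x, y, gamma):
    """Exact Gaussian kernel value, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    fmap = make_rff_map(dim=2, num_features=5000, gamma=0.5, seed=1)
    x, y = [0.3, -0.2], [0.1, 0.4]
    approx = sum(a * b for a, b in zip(fmap(x), fmap(y)))
    print("approximate:", approx, "exact:", gaussian_kernel(x, y, 0.5))
```

Once the data are mapped through `feature_map`, any linear SVM solver can be trained on the mapped vectors, which is what makes the subset-wise parallel training described above practical on limited hardware.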

9.
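To address the sensitivity to redundant data, the difficulty of parameter selection, and the low parallelization efficiency of parallel support vector machine (SVM) algorithms in big-data environments, a parallel SVM algorithm based on the Relief and BFO algorithms, RBFO-PSVM, is proposed. First, a feature weighting strategy, MI-Relief, is designed from mutual information and the Relief algorithm to remove redundant features from the dataset, effectively reducing the interference of redundant data with parallel SVM classification. Next, a MapReduce-based algorithm, MR-HBFO, is proposed to select the optimal SVM parameters in parallel, improving the parameter search capability of the SVM. Finally, a kernel clustering strategy, KCS, is proposed to reduce the size of the dataset used in parallel training, and a cross-fusion cascade parallel SVM, CFCPSVM, with an improved CSVM feedback mechanism is proposed; it trains the SVM in parallel under the MapReduce programming framework and improves the parallelization efficiency of the parallel SVM. Experiments show that RBFO-PSVM classifies large datasets better and is better suited to big-data environments.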

10.
Hierarchical radiosity with clustering has positioned itself as one of the most efficient algorithms for computing global illumination in non-trivial environments. However, using hierarchical radiosity for complex scenes is still problematic due to the necessity of storing a large number of transport coefficients between surfaces in the form of links. In this paper, we eliminate the need for storage of links through the use of a modified shooting method for solving the radiosity equation. By distributing only unshot radiosity in each step of the iteration, the number of links decreases exponentially. Recomputing these links instead of storing them increases computation time, but reduces memory consumption dramatically. Caching may be used to reduce the time overhead. We analyze the error behavior of the new algorithm in comparison with the normal gathering approach for hierarchical radiosity. In particular, we consider the relation between the global error of a hierarchical radiosity solution and the local error threshold for each link.
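
The idea of distributing only unshot radiosity and recomputing transport coefficients on demand can be sketched with plain (non-hierarchical) progressive shooting. This is a simplified illustration, not the paper's modified hierarchical method: there are no clusters or links here, and the function name `shoot_radiosity` and the callback `form_factor(i, j)` (the form factor from patch i to patch j, evaluated when needed rather than stored) are assumptions.

```python
def shoot_radiosity(emission, reflectance, area, form_factor,
                    tolerance=1e-9, max_steps=10000):
    """Solve B_i = E_i + rho_i * sum_j F_ij * B_j by shooting unshot radiosity.

    `form_factor(i, j)` is called on demand, so no coefficient matrix is kept.
    """
    n = len(emission)
    radiosity = list(emission)   # current estimate of B
    unshot = list(emission)      # radiosity not yet distributed
    for _ in range(max_steps):
        # Pick the patch holding the most unshot power.
        shooter = max(range(n), key=lambda k: unshot[k] * area[k])
        if unshot[shooter] * area[shooter] <= tolerance:
            break
        delta = unshot[shooter]
        unshot[shooter] = 0.0
        for receiver in range(n):
            if receiver == shooter:
                continue
            # Only the unshot part of the shooter's radiosity is transported.
            gained = reflectance[receiver] * form_factor(receiver, shooter) * delta
            radiosity[receiver] += gained
            unshot[receiver] += gained
    return radiosity


if __name__ == "__main__":
    # Two facing patches with a made-up constant form factor of 0.5.
    result = shoot_radiosity([1.0, 0.0], [0.5, 0.5], [1.0, 1.0], lambda i, j: 0.5)
    print(result)
```

Each shooting step moves a geometrically shrinking amount of energy, which is the flat-mesh analogue of the decreasing amount of work per iteration that the abstract describes.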

11.
Since a static work distribution does not allow for satisfactory speed-ups of parallel irregular algorithms, there is a need for a dynamic distribution of work and data that can be adapted to the runtime behavior of the algorithm. Task pools are data structures which can distribute tasks dynamically to different processors where each task specifies computations to be performed and provides the data for these computations. This paper discusses the characteristics of task-based algorithms and describes the implementation of selected types of task pools for shared-memory multiprocessors. Several task pools have been implemented in C with POSIX threads and in Java. The task pools differ in the data structures to store the tasks, the mechanism to achieve load balance, and the memory manager used to store the tasks. Runtime experiments have been performed on three different shared-memory systems using a synthetic algorithm, the hierarchical radiosity method, and a volume rendering algorithm. Copyright © 2004 John Wiley & Sons, Ltd.
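
A central task pool of the kind described can be sketched in a few lines. The paper's implementations are in C with POSIX threads and in Java; the Python version below is only an illustration of the concept, using a single shared queue. The names `run_task_pool` and `make_sum_task` are hypothetical, and the convention that a task returns `(value, child_tasks)` is an assumption made for this example.

```python
import queue
import threading


def run_task_pool(initial_tasks, worker_count=4):
    """Run tasks from one shared pool; a task may create further tasks at runtime.

    Each task is a callable returning (value, child_tasks).
    """
    pool = queue.Queue()
    results = []
    results_lock = threading.Lock()
    for task in initial_tasks:
        pool.put(task)

    def worker():
        while True:
            task = pool.get()
            if task is None:          # sentinel: no more work
                pool.task_done()
                return
            try:
                value, children = task()
                for child in children:
                    pool.put(child)   # enqueue before marking the parent done
                if value is not None:
                    with results_lock:
                        results.append(value)
            finally:
                pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    pool.join()                       # all tasks, including children, finished
    for _ in threads:
        pool.put(None)
    for thread in threads:
        thread.join()
    return results


def make_sum_task(lo, hi):
    """Irregular example task: split a range until it is small, then sum it."""
    def task():
        if hi - lo <= 4:
            return sum(range(lo, hi)), []
        mid = (lo + hi) // 2
        return None, [make_sum_task(lo, mid), make_sum_task(mid, hi)]
    return task


if __name__ == "__main__":
    print(sum(run_task_pool([make_sum_task(0, 100)])))
```

A single shared queue is the simplest design and balances load automatically, at the price of contention on the queue; the per-processor pools with task stealing that such studies typically compare against trade that simplicity for less contention.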

12.
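A radiosity implementation method based on GPU hardware-accelerated computation   Cited by: 2 in total (self-citations: 0, citations by others: 2)
A new GPU (graphics processing unit) based radiosity method is proposed. Exploiting the parallel computing power of the programmable GPU, the method carries out the entire form factor computation and the solution of the linear system on programmable graphics hardware. This avoids the CPU involvement required by earlier GPU-based radiosity methods and bypasses the bottleneck of data exchange between main memory and GPU texture memory. For hemicube-based form factor computation and rendering, it solves the problems of GPU-accelerated traversal, classification, and accumulation. In addition, the method adopts a new scheme for storing matrices and vectors on the GPU and uses the GPU to run Jacobi iteration for fast solution of the linear system. Experimental results show that the method computes and renders radiosity quickly and effectively.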
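
The Jacobi iteration mentioned for the linear system can be sketched on the CPU as follows. This is a plain Python illustration of the numerical scheme only; the paper's GPU storage layout and shader code are not described in the abstract and are not reproduced here. The function name `jacobi_radiosity` and the dense list-of-lists form factor matrix are assumptions.

```python
def jacobi_radiosity(emission, reflectance, form_factors, iterations=100):
    """Jacobi iteration for the radiosity system B_i = E_i + rho_i * sum_j F_ij * B_j.

    Every component of the new vector depends only on the previous vector,
    so all components can be updated independently (the property that makes
    the method attractive for data-parallel hardware).
    """
    n = len(emission)
    radiosity = list(emission)
    for _ in range(iterations):
        previous = radiosity
        radiosity = [
            (emission[i]
             + reflectance[i] * sum(form_factors[i][j] * previous[j]
                                    for j in range(n) if j != i))
            / (1.0 - reflectance[i] * form_factors[i][i])
            for i in range(n)
        ]
    return radiosity


if __name__ == "__main__":
    # Two facing patches with made-up form factors; patch 0 is the only emitter.
    F = [[0.0, 0.5], [0.5, 0.0]]
    print(jacobi_radiosity([1.0, 0.0], [0.5, 0.5], F))
```

Compared with Gauss-Seidel, Jacobi usually needs more iterations, but it has no dependency between the updates within one sweep, which is presumably why it suits a GPU implementation.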

13.
We describe a novel algorithm for computing view-independent finite element radiosity solutions on distributed shared memory parallel architectures. Our approach is based on the notion of a subiteration being the transfer of energy from a single source to a subset of the scene's receiver patches. By using an efficient queue based scheduling system to process these subiterations, we show how radiosity solutions can be generated without the need for processor synchronization between iterations of the progressive refinement algorithm. The only significant source of interprocessor communication required by our method is for visibility calculations. We also describe a perceptually driven approach to visibility estimation, which employs an efficient volumetric grid structure and attempts to reduce the amount of interprocessor communication by approximating visibility queries between distant patches. Our algorithm also eliminates the need for dynamic load balancing until the end of the solution process and is shown to achieve a superlinear speedup in many situations.

14.
A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm based on a randomized sampling technique are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is, to the best of our knowledge, the fastest bounded convergence rate so far among all SVM decomposition training algorithms. The fast average convergence bound is achieved by a unique priority based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001), the proposed algorithms work for general linearly nonseparable SVM and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on Karush–Kuhn–Tucker optimality conditions. (3) The proposed algorithms are the first parallel version of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results in a real application show that the proposed algorithm has good scalability. (4) We present demonstrations of the algorithms based on both synthetic data and data obtained from a real-world application. Performance comparisons with SVMlight show that the proposed algorithms may be efficiently implemented.

15.
This paper describes the design and implementation of a shared virtual memory (SVM) system for the nCUBE 2 machine. The SVM system provides the user a single coherent address space across all nodes. It is implemented at the user level in a C programming environment using high level constructs to support data sharing. Shared variables are treated as objects rather than pages. We have improved upon an existing algorithm for maintaining coherency in the SVM system, thus achieving a reduction in the number of internode messages required in coherency maintenance. Detailed timing analysis is conducted to analyze the feasibility of this shared environment. Experimental results indicate that parallel programs running under an SVM system show linear speedup, suggesting that SVM systems could provide an effective programming environment for the next generation of distributed memory parallel computers. The bottleneck of this implementation is associated with the expensive interrupt handling capability of the nCUBE 2.

16.
In this paper we propose a SPMD parallel hierarchical radiosity algorithm relying on a novel partitioning method which may apply to any kind of architectural scene. This algorithm is based on MPI (Message Passing Interface), a communication library which allows the use of either a heterogeneous set of concurrent computers or a parallel computer or both. The database is stored on a common directory and accessed by all the processors (through NFS in case of a network of computers). As the objective is to handle complex scenes such as building interiors, to cope with the problem of memory size, only a subset of the database resides in memory of each processor. This subset is determined with the help of a partitioning into 3D cells, clustering and visibility calculations. A graph expressing visibility between the resulting clusters is determined, partitioned (with a new method based on classification of K-means type) and distributed amongst all the processors. Each processor is responsible for gathering energy (using the Gauss-Seidel method) only for its subset of clusters. In order to reduce the disk transfers due to downloading these subsets of clusters, we use an ordering strategy based on the traveling salesman algorithm. Dynamic load balancing relies on a task stealing approach while termination is detected by configuring the processors into a ring and moving a token around this ring. The parallel iterative resolution is of group iterative type. Its mathematical convergence is proven in the appendix.

17.
Numerical simulation of cities that includes physical phenomena poses highly complex computational challenges. In this paper, we focus on the radiation exchange simulation on an urban scale, considering different types of cities. Observing that the matrix representing the view factors between buildings is sparse, we propose a new numerical model for radiation computation. This solution is based on the radiosity method. We show that the radiosity matrix associated with models composed of up to 140k patches can be stored in main memory, providing a promising avenue for further research. Moreover, a new technique is proposed for estimating the inverse of the radiosity matrix, accelerating the computation of radiation exchange. These techniques could help to consider the characteristics of the environment in building design, as well as assist in defining city regulations related to urban construction.

18.
Radiosity has been a popular method for photorealistic image generation. However, determining form factors between curved patches is the most difficult and time-consuming procedure, and the errors caused by approximating the source patch's radiosity with average values are obvious. In this paper, a radiosity algorithm for rendering parametrically represented curved surfaces is described. The radiosity contributed to a receiving point by differential areas at the four vertices of the source patch is calculated first; the contribution from the inner area of the source patch is then evaluated by interpolating the values at the four corners. Both the difficult problem of determining form factors between curved surfaces and the errors mentioned above are thereby avoided. Experimental results obtained with the new algorithm are compared with those obtained by the traditional method. Some associated techniques, such as the visibility test and adaptive subdivision, are also described.

19.
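To address the large memory footprint and slow training of support vector machines (SVMs) on large samples, this paper summarizes four strategies: divide and conquer, training set reduction, incremental learning, and parallelization. The four strategies follow two directions of improvement, namely combining SVMs with other algorithms and changing the structure of the SVM algorithm, with the ultimate goal of reducing the memory used in SVM training and increasing training speed.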

20.
In this paper, we propose a novel form factor calculation algorithm for accelerating radiosity solutions in complex environments. Our basic algorithm is an improved version of Spencer's (S.N. Spencer, ‘The hemisphere radiosity method: a tale of two algorithms’, in Photorealism in Computer Graphics, Spencer, 1992, pp. 127–135) and Van Wyk's (G.C. Van Wyk Jr., ‘A geometry-based insolation model for computer-aided design,’ Ph.D. Thesis, The University of Michigan, 1998.) methods, which fail to remove hidden surfaces for relatively large patches and cause large discretization errors in form factors. We also demonstrate that our technique is superior to the hemi-cube method in terms of computation time. Moreover, we parallelize our approach on a shared-memory parallel computer and obtain high performance with our radiosity rendering system. Our method divides a hemisphere base into regions and assigns a region to each processor. The approach can be applied to geometrical data generated by CAD systems, and is evaluated in terms of computation time, visual effects, and parallelization performance. © 1998 John Wiley & Sons, Ltd

