共查询到20条相似文献,搜索用时 46 毫秒
1.
基于CUDA的并行粒子群优化算法的设计与实现 总被引:1,自引:0,他引:1
针对处理大量数据和求解大规模复杂问题时粒子群优化(PSO)算法计算时间过长的问题, 进行了在显卡(GPU)上实现细粒度并行粒子群算法的研究。通过对传统PSO算法的分析, 结合目前被广泛使用的基于GPU的并行计算技术, 设计实现了一种并行PSO方法。本方法的执行基于统一计算架构(CUDA), 使用大量的GPU线程并行处理各个粒子的搜索过程来加速整个粒子群的收敛速度。程序充分使用CUDA自带的各种数学计算库, 从而保证了程序的稳定性和易写性。通过对多个基准优化测试函数的求解证明, 相对于基于CPU的串行计算方法, 在求解收敛性一致的前提下, 基于CUDA架构的并行PSO求解方法可以取得高达90倍的计算加速比。 相似文献
2.
针对传统并行计算方法实现结构拓扑优化快速计算的硬件成本高、程序开发效率低的问题,提出了一种基于Matlab和图形处理器(GPU)的双向渐进结构优化(BESO)方法的全流程并行计算策略。首先,探讨了Matlab编程环境中实现GPU并行计算的三种途径的优缺点和适用范围;其次,分别采用内置函数直接并行的方式实现了拓扑优化算法中向量和稠密矩阵的并行化计算,采用MEX函数调用CUSOLVER库的形式实现了稀疏格式有限元方程组的快速求解,采用并行线程执行(PTX)代码的方式实现了拓扑优化中单元敏度分析等优化决策的并行化计算。数值算例表明,基于Matlab直接开发GPU并行计算程序不仅编程效率高,而且还可以避免不同编程语言间的计算精度差异,最终使GPU并行程序可以在保持计算结果不变的前提下取得可观的加速比。 相似文献
3.
GPU通用计算平台上中心差分格式显式有限元并行计算 总被引:3,自引:0,他引:3
显式有限元是解决平面非线性动态问题的有效方法.由于显式有限元算法的条件稳定性,对于大规模的有限元问题的求解需要很长的计算时间.图形处理器(GPU)作为一种高度并行化的通用计算处理器,可以很好解决大规模科学计算的速度问题.统一计算架构(CUDA)为实现GPU通用计算提供了高效、简便的方法.因此,建立了基于GPU通用计算平台的中心差分格式的显式有限元并行计算方法.该方法针对GPU计算的特点,对串行算法的流程进行了优化和调整,通过采用线程与单元或节点的一一映射策略,实现了迭代过程的完全并行化.通过数值算例表明,在保证计算精度一致的前提下,采用NVIDIA GTX 460显卡,该方法能够大幅度提高计算效率,是求解平面非线性动态问题的一种高效简便的数值计算方法. 相似文献
4.
5.
为了研究GPU的通用计算能力和适合SMP集群的编程模型,首次提出MPI+CUDA多粒度混合并行编程的新方法,节点间采用MPI实现粗粒度并行,节点内采用CUDA实现细粒度并行的混合编程方式.利用此方法在搭建的3节点SMP集群环境中,测试了大规模矩阵乘问题的并行计算能力.实验结果表明,该方法能够显著提升并行效率,同时证明MPI+CUDA混合编程模型能够充分发挥SMP集群节点间分布式存储和节点内共享内存的优势,为装有CUDA-enabled GPU的SMP集群提供了一种有效的并行策略. 相似文献
6.
7.
8.
图形处理器(GPU)作为一种高度并行化的处理器架构,已得到越来越多的重视,目前已诞生了以NVIDIA CUDA为代表的各种GPU通用计算技术,同时多GPU并行计算也已有了实际的应用.多GPU并行计算涉及GPU与CPU两者之间的协调和交互,对程序员有着更高的要求.为此,提出一种基于模板的源代码生成技术,通过模板转化来支持单GPU程序的并行化移植.最后通过一个实例表明使用提出的CUDA源代码移植框架能够自动生成与手写程序等价的代码,可以显著降低多GPU下CUDA程序的开发代价,提高CUDA应用程序员的生产效率. 相似文献
9.
基于CUDA的快速图像压缩 总被引:1,自引:0,他引:1
为了进一步提高JPEG编码效率,对JPEG压缩算法进行研究,分析得出JPEG核心步骤可以并行化处理.因此,实现平台宜采用以并行计算为优势的GPU,而不是以串行计算为主的CPU.NVIDIA新推出的CUDA(计算统一设备架构)为此实现提供了软硬件环境.CUDA是基于GPU进行通用计算的开发平台,非常适合大规模的并行数据计算.在GPU流处理器架构下用CUDA技术实现编码并行化,并针对流处理器架构特点进行内存读写等方面的优化,提高了JPEG编码的速度.实验结果表明了CUDA技术在并行处理方面的优越性,JPEG编码效率得到了极大提高. 相似文献
10.
MRRR(Multiple Relatively Robust Representations)算法是求解对称三对角矩阵本征值问题高效、精确的算法之一。在分析MRRR算法及CUDA(Compute Unified Device Architecture)并行体系结构的基础上,针对算法的可并行性,采用单指令多线程并行方式实现了基于CUDA的MRRR算法并行,并从存储结构方面优化算法。实验结果显示,与LAPACK库中串行MRRR实现相比,并行方法在保证精度的基础上获得了20倍的加速比,进而从计算精度和计算时间上说明MRRR算法适合在GPU上并行。 相似文献
11.
S. Shaw 《Journal of Computer Assisted Learning》1993,9(2):93-99
Abstract This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained. 相似文献
12.
European Community policy and the market 总被引:1,自引:0,他引:1
C. Lloyd 《Journal of Computer Assisted Learning》1993,9(2):86-91
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven. 相似文献
13.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。 相似文献
14.
Wayne O’Brien Author Vitae 《Journal of Systems and Software》2008,81(11):1997-2013
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them. 相似文献
15.
This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives. 相似文献
16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what
is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic
sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and
its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of
an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify
robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can
or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest.
This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January
31–February 2, 2008 相似文献
17.
David Poole 《Computational Intelligence》1989,5(2):97-110
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given. 相似文献
18.
Watts S. Humphrey 《Annals of Software Engineering》2002,14(1-4):39-72
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical. 相似文献
19.
基于复小波噪声方差显著修正的SAR图像去噪 总被引:4,自引:1,他引:3
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。 相似文献