Similar literature
 20 similar documents found; search time 0 ms
1.
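As complex networks keep growing in scale, quickly and accurately mining the hidden community structure of large-scale complex networks has become a research hotspot in the field. One of the commonly used community-structure mining algorithms based on the fast Newman algorithm is the general probabilistic framework method. Taking complex networks of ever-increasing scale as the research object, this paper proposes a GPGPU-based parallel algorithm for the general probabilistic framework, which effectively addresses the problem of quickly and accurately mining hidden community structure in large-scale complex networks. Experiments show that, as the number of nodes increases, the parallel algorithm improves running efficiency without loss of accuracy, providing an efficient solution for research on community-structure mining in complex networks.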

2.
Considers the use of massively parallel architectures to execute a trace-driven simulation of a single cache set. A method is presented for the least-recently-used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the EREW (exclusive read, exclusive write) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. We present timings of this algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the least-frequently-used (LFU) and random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C-line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
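
For orientation, the quantity being simulated can be sketched sequentially. The Python below is a plain single-threaded reference for one LRU cache set, not the O(log N) EREW algorithm of the abstract; the function name `simulate_lru_set` and the trace representation (a list of line tags) are assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_lru_set(trace, set_size):
    """Sequentially simulate one cache set of `set_size` lines under LRU.

    `trace` is a sequence of line tags referencing this set.
    Returns (hits, misses).
    """
    lines = OrderedDict()  # keys ordered from least to most recently used
    hits = misses = 0
    for tag in trace:
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)  # mark as most recently used
        else:
            misses += 1
            if len(lines) >= set_size:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return hits, misses
```

A parallel method such as the one described has to reproduce exactly these hit and miss counts, so a sequential version like this is mainly useful as a correctness baseline.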

3.
We discuss the effective implementation of parallel processing for linear prediction-based uniform state sampling (LPUSS). In previous work, we proposed LPUSS as an optimization algorithm for mechanical motions that assures high optimality of the solutions and computational efficiency. In parallel computation, LPUSS requires balanced memory allocation and managed processing timing. In this paper, we propose an effective parallel computing method that assures high optimality and calculation efficiency in parallel processing on a GPU. We conducted two experiments to validate the proposed method. In the first experiment, we compared single-thread processing for LPUSS with the proposed parallel processing; the parallel version was about 4–20 times faster than the single-thread CPU version. In the second experiment, we applied the proposed method to the optimization of a sextuple inverted pendulum, and the method optimized the motion within 40 minutes. According to our survey, no other optimization method is applicable to inverted pendulum models higher than quadruple with standard constraints.

4.
In this paper, the recent developments of lattice kinetic models are discussed, followed by the fundamental concepts of Evolutionary Algorithms and their application to optimization, machine learning, neural networks, and many other areas. Finally, the application of Evolutionary Algorithms to the biological ecosystem is presented.

5.
In this paper we describe how to apply fine-grain parallelism to augmenting path algorithms for the dense linear assignment problem. We demonstrate by implementation that the technique we suggest can be efficiently implemented on commercially available, massively parallel computers. Using n processors, our method reduces the computational complexity from the sequential O(n^3) to a parallel complexity of O(n^2). Exhaustive experiments are performed on a MasPar MP-2 in order to determine which of the algorithmic variants fits this kind of architecture best.

6.
7.
The accuracy of stereo vision has been considerably improved in the last decade, but real-time stereo matching is still a challenge for embedded systems where the limited resources do not permit fast operation of sophisticated approaches. This work presents an evaluation of area-based algorithms used for calculating distance in stereoscopic vision systems, their hardware architectures for implementation on FPGA and the cost of their accuracies in terms of FPGA hardware resources. The results show the trade-off between the quality of such maps and the hardware resources which each solution demands, so they serve as a guide for implementing stereo correspondence algorithms in real-time processing systems.

8.
The Hough Transform (HT) is a digital image processing method for the detection of shapes which has multiple uses today. A disadvantage of this method is its sequential computational complexity, particularly when a single processor is used. An optimized HT algorithm for straight-line detection in an image is presented in this article. The optimization uses a decomposition of the input image recently proposed for the central processing unit (CPU), together with the technique known as segment decomposition. Optimized algorithms improve execution times significantly. In this paper, the optimization is implemented in parallel using graphics processing unit (GPU) programming, allowing a reduction of total run time and achieving a performance more than 20 times better than the sequential method and up to 10 times better than the recently proposed implementation. Additionally, we introduce the concept of Performance Ratio to emphasize how the GPU outperforms the CPUs.
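
As background, the basic sequential voting step of the Hough Transform for straight lines can be sketched as follows. This is the textbook accumulator in pure Python, not the image or segment decomposition described in the abstract; the function name `hough_lines`, the point-list input, and the one-degree default angular resolution are assumptions for illustration.

```python
import math


def hough_lines(points, width, height, n_theta=180):
    """Vote in (rho, theta) space for each foreground pixel.

    `points` holds (x, y) pixel coordinates with 0 <= x < width and
    0 <= y < height. Returns (acc, rho_offset), where
    acc[rho + rho_offset][t] counts the pixels lying on the line
    x*cos(theta) + y*sin(theta) = rho with theta = t * pi / n_theta.
    """
    rho_max = int(math.ceil(math.hypot(width, height)))
    acc = [[0] * n_theta for _ in range(2 * rho_max + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = t * math.pi / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + rho_max][t] += 1  # one vote per pixel per angle
    return acc, rho_max
```

Each pixel votes independently of the others, which is why the accumulation step lends itself to the kind of GPU parallelization the abstract reports.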

9.
10.
S. G. Akl 《Computing》1984,32(1):1-11
Nonlinear equations are considered, where some input parameters are subject to errors. By a class of monotone enclosing methods, sequences of intervals are constructed that contain, for each value of the perturbation parameter, at least one zero of the problem. In finite-dimensional spaces, concrete realizations are given, e.g., of Newton-, Regula falsi- and Jacobi-Newton-type.

11.
12.
General Purpose Graphic Processing Unit (GPGPU) computing with CUDA has been effectively used in scientific applications, where huge accelerations have been achieved. However, while today’s traditional GPGPU can reduce the execution time of parallel code by many times, it comes at the expense of significant power and energy consumption. In this paper, we propose a ubiquitous parallel computing approach for the construction of decision trees on a GPU. In our approach, we exploit the parallelism of the well-known ID3 algorithm for decision tree learning at two levels: at the outer level of building the tree node by node, and at the inner level of sorting data records within a single node. Thus, our approach not only accelerates the construction of the decision tree via GPU computing, but also does so while taking care of the power and energy consumption of the GPU. Experimental results show that our approach outperforms a purely GPU-based implementation and a CPU-based sequential implementation by several times.
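
The per-node work in ID3 is choosing the attribute with the highest information gain. The Python below is a minimal sequential sketch of that selection step only; it does not model the two-level GPU parallelism or the energy management of the abstract, and the helper names (`entropy`, `information_gain`, `best_attribute`) and the dict-per-record data layout are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the records on `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Attribute ID3 would split on at this node (highest gain)."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

Because the gain of each attribute and the split at each node can be evaluated independently, both are natural units of parallel work.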

13.
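Research on a GPU-based parallel method for computing correlation coefficients of multiple data streams*   Total citations: 1, self-citations: 1, citations by others: 1
To meet the real-time requirements of multi-data-stream processing, this paper proposes a four-layer sliding-window model that spans the PCIe bus and a GPU-based parallel processing framework for multiple data streams. Under this framework, statistics for a very large number of sliding real-time data streams can be maintained in parallel, and the correlation coefficient between any two of the streams is computed in parallel using an exact method. A comparison with a CPU-only method in the same experimental environment verifies that the new method significantly improves real-time computing performance.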
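
The statistic maintained per pair of streams can be illustrated with a small CPU-only sketch. The Python below keeps running sums over a single sliding window and computes the exact Pearson correlation from them; it is not the four-layer window model or the GPU framework of the abstract, and the class name `SlidingCorrelation` and its interface are assumptions.

```python
import math
from collections import deque


class SlidingCorrelation:
    """Exact Pearson correlation of two streams over the last `window` samples."""

    def __init__(self, window):
        self.window = window
        self.pairs = deque()
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x, y):
        """Add one sample pair and drop the oldest once the window is full."""
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        """Return the correlation, or None if either stream has no variance."""
        n = len(self.pairs)
        cov = n * self.sxy - self.sx * self.sy
        vx = n * self.sxx - self.sx * self.sx
        vy = n * self.syy - self.sy * self.sy
        if vx <= 0 or vy <= 0:
            return None
        return cov / math.sqrt(vx * vy)
```

Each update touches only five running sums, so the cost per sample is constant; with many streams, every pair can be updated independently of the others.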

14.
Hopfield network for stereo vision correspondence   Total citations: 5, self-citations: 0, citations by others: 5
An optimization approach is used to solve the correspondence problem for a set of features extracted from a pair of stereo images. A cost function is defined to represent the constraints on the solution, which is then mapped onto a two-dimensional Hopfield neural network for minimization. Each neuron in the network represents a possible match between a feature in the left image and one in the right image. Correspondence is achieved by initializing (exciting) each neuron that represents a possible match and then allowing the network to settle down into a stable state. The network uses the initial inputs and the compatibility measures between the matched points to find a stable state.

15.
The computation of a scalar correspondence error is the fundamental step in most stereo correspondence algorithms. The quality of the results obtained by the reconstruction algorithm directly depends on the characteristics of such error. We have developed a procedure to evaluate different methods proposed for the computation of the correspondence error. The evaluation is based on exploring the shape of the error surface generated and testing it for uniqueness, isolation and compatibility. The scheme presented makes it possible to recognise the known characteristics of the tested methods for the computation of a correspondence error from the results of the evaluations. Our results show that, for the tested scenes, the evaluation scheme allows us to identify the most appropriate method to compute the correspondence error.

16.
The graphics processing unit (GPU) is used to solve large linear systems derived from partial differential equations. The differential equations studied are strongly convection-dominated, of various sizes, and common to many fields, including computational fluid dynamics, heat transfer, and structural mechanics. The paper presents comparisons between GPU and CPU implementations of several well-known iterative methods, including Kaczmarz’s, Cimmino’s, component averaging, conjugate gradient normal residual (CGNR), symmetric successive overrelaxation-preconditioned conjugate gradient, and conjugate-gradient-accelerated component-averaged row projections (CARP-CG). Computations are performed with dense as well as general banded systems. The results demonstrate that our GPU implementation outperforms CPU implementations of these algorithms, as well as previously studied parallel implementations on Linux clusters and shared memory systems. While the CGNR method had begun to fall out of favor for solving such problems, for the problems studied in this paper, the CGNR method implemented on the GPU performed better than the other methods, including a cluster implementation of the CARP-CG method.
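
For readers unfamiliar with CGNR, the method applies conjugate gradients to the normal equations A^T A x = A^T b without forming A^T A explicitly. The Python below is a minimal dense, CPU-only sketch of the standard iteration, not the GPU implementation evaluated in the abstract; the function names, the list-of-lists matrix format, and the stopping tolerance are assumptions.

```python
def matvec(A, v):
    """Compute A @ v for a dense matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rmatvec(A, v):
    """Compute A^T @ v without building the transpose."""
    return [sum(A[i][j] * v[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cgnr(A, b, tol=1e-12, max_iter=100):
    """Solve A x = b in the least-squares sense by CG on the normal equations."""
    x = [0.0] * len(A[0])
    r = list(b)           # residual b - A x for the starting guess x = 0
    z = rmatvec(A, r)     # normal-equations residual A^T r
    p = list(z)
    zz = dot(z, z)
    for _ in range(max_iter):
        if zz <= tol:     # squared norm of A^T r is small enough
            break
        w = matvec(A, p)
        alpha = zz / dot(w, w)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = rmatvec(A, r)
        zz_new = dot(z, z)
        p = [zi + (zz_new / zz) * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x
```

The per-iteration cost is dominated by one product with A and one with A^T, which are the operations a GPU implementation would offload.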

17.
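Based on the finite element model used to compute the wind-induced static response of a large double-layer cylindrical reticulated shell, a GPU-based fast parallel computing platform is built in MATLAB, enabling fast solution of the wind-induced static displacement response of multi-degree-of-freedom structures under the CUDA framework. Numerical results show that, compared with traditional serial CPU computation, GPU-based operations on large matrices such as inversion, multiplication, and division are substantially faster, with a maximum speedup of 23 times for the displacement computation. An error comparison also shows that the GPU-based results meet engineering accuracy requirements.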

18.
Ring, torus and hypercube architectures/algorithms for parallel computing   Total citations: 1, self-citations: 0, citations by others: 1
This paper provides a survey of both architectural and algorithmic aspects of solving problems using parallel processors with ring, torus and hypercube interconnection.

19.
Cooperation of multi-domain massively parallel processor systems in a computing grid environment provides new opportunities for multisite job scheduling. At the same time, in the area of co-allocation, heterogeneity, network adaptability and scalability raise the challenge for the international design of multisite job scheduling models and algorithms. This paper presents a multisite job scheduling schema through the introduction of a multisite job scheduling model and a performance model under the grid environment. It introduces two multisite cooperative job scheduling models and algorithms built around optimal and greedy-heuristic resource selection strategies. Compared with the single-site and multisite cooperative scheduling models and algorithms introduced by Sabin, Yahyapour and others, the validity and advancement of the scheduling model and the performance model presented here are demonstrated.

20.
The Journal of Supercomputing - The emergence of GPU-CPU heterogeneous architecture has led to a significant paradigm shift in parallel programming. How to effectively implement Parallel Genetic...
