期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

李建中都薇《软件学报》1998,9(4):256-262

并行数据库在多处理机之间的分布方法(简称数据分布方法)对并行数据操作算法的性能影响很大.如果在设计并行数据操作算法时充分利用数据分布方法的特点,可以得到十分有效的并行算法.本文研究如何充分利用数据分布方法的特点,设计并行数据操作算法的问题,提出了基于CMD多维数据分布方法的并行CMD－Join算法.理论分析和实验结果表明,并行CMD－Join算法的效率高于其它并行Join算法. 相似文献

2.

基于快速多极边界元法的沥青混凝土弹性模量预测

朱兴一陈伟球赵兴刚陈龙《计算机辅助工程》2011,20(1):66-69

由于对沥青混凝土材料的研究无论是试验法还是经验法均建立在宏观层面上,无法与其细观结构建立本质的联系,因此利用快速多极边界元法(Fast Multipole Boundary Element Method,FMBEM),结合数字图像处理技术,实现沥青混凝土二维几何建模及弹性模量预测.通过数字图像处理技术识别拍摄得到的原始... 相似文献

3.

数据不规则问题并行计算的负载平衡策略的研究 总被引：2，自引：0，他引：2

刘鑫陆林生《计算机应用》2004,24(10):108-111

讨论以边缘通信为特征的数据不规则问题并行计算的静态负载平衡策略。从图论的角度讨论了静态负载平衡问题，给出三个优化目标，即点集等分，最短通路和通信量最小。对于以边缘通信为特征的一般数值计算问题，论述了二维问题正方形划分总通信量最小、并行效率最高，三维问题立方体划分总通信量最小、并行效率最高的结论。基于以上结论和实际课题特点，提出一种一维优先的规则分块算法和基于自动重分块的不规则分块算法相结合的方法。实验证明，该方法实现简单，能够处理不同规模的数据不规则问题，达到较优的负载平衡和较高的通信效率，提高并行程序的整体效率．相似文献

4.

Numerical methods for heat equation with variable coefficients

《国际计算机数学杂志》2012,89(9):1612-1623

In this paper, two methods are developed for linear parabolic partial differential equation with variable coefficients, which are based on rational approximation to the matrix exponential functions. These methods are L-stable, third-order accurate in space and time. In the development of these methods, second-order spatial derivatives are approximated by third-order finite-difference approximations, which give a system of ordinary differential equations whose solution satisfies a recurrence relation that leads to the development of algorithms. These algorithms are tested on heat equation with variable coefficients, subject to homogeneous and/or time-dependent boundary conditions, and no oscillations are observed in the experiments. The method is also modified for a nonlinear problem. All these methods do not require complex arithmetic, and based on partial fraction technique, which is very useful for parallel processing. 相似文献

5.

基于K-D树的多维数据分布方法

王碧霍红卫《计算机工程》2003,29(3):105-107

系统地介绍了基于并行化k-d树的多维数据分布方法。给出了几种构造k-d树的策略和相应算法，并从理论上分析和比较了各种策略的通信花费及其应用范围。相似文献

6.

A parallel Broyden approach to the Toeplitz inverse eigenproblem

Jesús Peinado Antonio M. Vidal 《Concurrency and Computation》2004,16(6):587-610

In this work we show a portable sequential and a portable parallel algorithm for solving the inverse eigenproblem for real symmetric Toeplitz matrices. Both algorithms are based on Broyden's method for solving nonlinear systems. We reduced the computational cost for some problem sizes, and furthermore we managed to reduce spatial cost considerably, compared in both cases with parallel algorithms proposed by other authors and by us, although sometimes quasi‐Newton methods (as Broyden) do not reach convergence in all the test cases. We have implemented the parallel algorithm using the parallel numerical linear algebra library SCALAPACK based on the MPI environment. Experimental results have been obtained using two different architectures: a shared memory multiprocessor, the SGI PowerChallenge, and a cluster of Pentium II PCs connected through a myrinet network. The algorithms obtained are scalable in all the cases. Copyright © 2004 John Wiley & Sons, Ltd. 相似文献

7.

A CFD-based high-order Discontinuous Galerkin solver for three dimensional electromagnetic scattering problems

《Advances in Engineering Software》2015

In this paper, a CFD (Computational Fluid Dynamics) based DG (Discontinuous Galerkin) method is introduced to solve the three-dimensional Maxwell’s equations for complex geometries on unstructured grids. In order to reduce the computing expense, both the quadrature-free implementation method and the parallel computing based on domain decomposition are employed. On the far-field boundary, the non-reflecting boundary condition is implemented. Numerical integration rather than the quadrature-free implementation is used over the faces on the solid boundary to implement the electromagnetic solid boundary condition for perfectly conducting objectives. Both benchmark examples and complex geometry case are tested with the CFD-based DG solver. Numerical results indicate that highly accurate results can be obtained when using high order even on coarse grid and the present method is very suitable for complex geometries. Furthermore, the costs of CPU time and the speedup of the parallel computation are also evaluated. 相似文献

8.

基于核方法的并行模糊聚类算法 总被引：1，自引：0，他引：1

彭秋生魏文红《计算机工程与设计》2008,29(8):1881-1883

介绍并分析了模糊C-均值聚类算法、基于核方法的模糊C-均值聚类算法以及硬聚类算法.将硬聚类算法和模糊聚类算法结合起来,利用硬聚类算法初始化聚类中心,有效的减少模糊聚类算法的迭代次数.针对海量数据处理问题,将改进后的算法并行化,有效地提高了数据处理速度和效率,并在分布式互联PC环境下进行了性能测试.测试结果表明,基于核方法的并行模糊聚类算法具有很好的规模增长性和加速比. 相似文献

9.

Run-length chain coding and scalable computation of a shape''s moments using reconfigurable optical buses

Chin-Hsiung Wu Shi-Jinn Horng 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2004,34(2):845-855

The main contribution of this paper is the design of several efficient algorithms for modified run-length chain coding and for computing a shape's moments on arrays with reconfigurable optical buses. The proposed algorithms are based on the boundary representation of an object. Instead of using chain code, the boundary can be represented by a modified run-length chain code, where each entity represents a line segment (two adjacent corner pixels). The sequential nature of the chain code makes it difficult to be parallelized. We first propose two constant time algorithms for boundary extraction and run-length chain coding. To the authors' knowledge, these are the most time efficient algorithms yet published. Based on the modified run-length chain coding, and the advantages of both optical transmission and electronic computation, a constant time parallel algorithm for computing a shape's moments using N x N processors is proposed. Additionally, instead of using N x N processors, a scalable moment algorithm using r x r processors is also derived, where r < N. Based on the product of time and the number of processors used, both proposed parallel algorithms are time and cost optimal. 相似文献

10.

基于Level Set方法的点采样曲面测地线计算及区域分解 总被引：10，自引：1，他引：10

肖春霞冯结青缪永伟郑文庭彭群生《计算机学报》2005,28(2):250-258

点采样物体的几何处理是当前造型领域中的研究热点之一,如何有效地对点采样曲面进行区域分解是几何处理的基础性工作．该文首先提出了一种基于Level Set方法的点采样曲面上两点间最短路径的计算方法,用以解决区域分解中的边界曲线生成问题．为了保证求解Level Set微分方程的稳定性,文章采用移动最小平方(MLS)方法对点采样曲面进行均匀重采样和去噪音处理．在此基础上,又进一步提出了一个基于Level Set方法的点采样曲面区域拾取算法．最后给出了上述算法在点采样物体的几何处理中的应用实例．实验结果表明该文提出的算法稳定、快速且容易实现。相似文献

11.

A full multigrid method for the Steklov eigenvalue problem

《国际计算机数学杂志》2012,89(12):2371-2386

ABSTRACT

This paper introduces a kind of parallel multigrid method for solving Steklov eigenvalue problem based on the multilevel correction method. Instead of the common costly way of directly solving the Steklov eigenvalue problem on some fine space, the new method contains some boundary value problems on a series of multilevel finite element spaces and some steps of solving Steklov eigenvalue problems on a very low dimensional space. The linear boundary value problems are solved by some multigrid iteration steps. We will prove that the computational work of this new scheme is truly optimal, the same as solving the corresponding linear boundary value problem. Besides, this multigrid scheme has a good scalability by using parallel computing technique. Some numerical experiments are presented to validate our theoretical analysis. 相似文献

12.

Parallel solution of American option derivatives on GPU clusters

Lilia Ziane Khodja Ming Chau Raphaël Couturier Jacques Bahi Pierre Spitéri 《Computers & Mathematics with Applications》2013,65(11):1830-1848

This paper deals with the numerical solution of financial applications, more specifically the computation of American option derivatives modeled by nonlinear boundary values problems. In such applications we have to solve large-scale algebraic systems. We concentrate on synchronous and asynchronous parallel iterative algorithms carried out on CPU and GPU networks. The properties of the operators arising in the discretized problem ensure the convergence of the parallel iterative synchronous and asynchronous algorithms. Computational experiments performed on CPU and GPU networks are presented and analyzed. 相似文献

13.

基于二维结构化网格的可压缩流体并行算法研究

皇甫永硕刘杰龚春叶《计算机工程与科学》2017,39(9):1602-1609

基于二维/轴对称高精度可压缩多相流计算流体力学方法 MuSiC-CCASSIM的结构化网格部分,设计了区域并行分解方法;针对各处理器边界数据的通信,设计了阻塞式通信与非阻塞式通信并行算法;为了减少通信开销,设计了MPI/OpenMP混合并行优化算法。在天河二号超级计算机上进行了测试,每个核固定网格规模为625*250,最多调用8 192核。测试数据表明,采用MPI/OpenMP混合并行算法、纯MPI非阻塞式通信并行算法和纯MPI阻塞式通信并行算法的程序的平均并行效率分别达到86%、83%和77%,三种算法都具有良好的可扩展性。相似文献

14.

Ternary Sparse Matrix Representation for Volumetric Mesh Subdivision and Processing on GPUs

下载免费PDF全文

J. S. Mueller‐Roemer C. Altenhofen A. Stork 《Computer Graphics Forum》2017,36(5):59-69

In this paper, we present a novel volumetric mesh representation suited for parallel computing on modern GPU architectures. The data structure is based on a compact, ternary sparse matrix storage of boundary operators. Boundary operators correspond to the first‐order top‐down relations of k‐faces to their (k ? 1)‐face facets. The compact, ternary matrix storage format is based on compressed sparse row matrices with signed indices and allows for efficient parallel computation of indirect and bottom‐up relations. This representation is then used in the implementation of several parallel volumetric mesh algorithms including Laplacian smoothing and volumetric Catmull‐Clark subdivision. We compare these algorithms with their counterparts based on OpenVolumeMesh and achieve speedups from 3× to 531×, for sufficiently large meshes, while reducing memory consumption by up to 36%. 相似文献

15.

基于PRAM并行模型最大值查找的方法与改进

贺成施华君《计算机系统应用》2019,28(10):138-144

随着多处理器的出现,并行技术受到了广泛的关注,成为了加速处理问题速度的重要技术.但是使用并行技术在加速计算的同时也带来了对处理器数量需求的急剧提升,并行成本的显著增加.针对这一问题,通过研究基于PRAM （Parallel Random Access Machine）下的3种最大值查找并行算法中的不足,提出了一种比平衡树算法,快速查找法,双对数深度树方法并行成本（cost）更优的基于数据划分方法的最大值查找并行算法.基于数据划分方法的最大值查找算法有效的解决了现有并行方法中处理器工作量分配不均,对处理器需求过大,实现条件苛刻等问题.为此后类似并行算法降低并行成本提供一个方向. 相似文献

16.

求解半线性椭圆型方程组的并行算法

徐焱宋君强《计算机工程与科学》1999,21(2):51-55

本文结合区域分裂技术、多重网格方法、加速Ｓｃｈｗａｒｚ收敛方法、高低解方法、非线性Ｊａｃｏｂｉ迭代方法和Ｎｅｗｔｏｎ线性化迭代方法,设计了三种求解半线性椭圆型方程（组）的并行算法：并行Ｎｅｗｔｏｎ多重网格算法、并行非线性多重网格算法和并行加速Ｓｃｈｗａｒｚ收敛算法。数值试验说明这三种算法的并行计算是可行的。相似文献

17.

GPU-based parallel algorithms for sparse nonlinear systems

V. Galiano H. Migallón V. Migallón J. Penadés 《Journal of Parallel and Distributed Computing》2012

In this work we describe some parallel algorithms for solving nonlinear systems using CUDA (Compute Unified Device Architecture) over a GPU (Graphics Processing Unit). The proposed algorithms are based on both the Fletcher–Reeves version of the nonlinear conjugate gradient method and a polynomial preconditioner type based on block two-stage methods. Several strategies of parallelization and different storage formats for sparse matrices are discussed. The reported numerical experiments analyze the behavior of these algorithms working in a fine grain parallel environment compared with a thread-based environment. 相似文献

18.

Comparison of thinning algorithms on a parallel processor 总被引：1，自引：0，他引：1

CJ Hilditch 《Image and vision computing》1983,1(3):115-132

A number of parallel algorithms for thinning elongated shapes are contrasted and compared on a Clip 4 parallel processor. These algorithms all work in rectangular tessellation and they thin elongated objects to lines one pixel thick while retaining their connectivity. Existing algorithms for use on binary pictures are considered first and new algorithms are proposed which produce more satisfactory results, but are more expensive in terms of speed and space requirements. Two methods of extending these algorithms to grey pictures are then considered. In one method, binary algorithms are used but are directed by the grey pixels; in the other the binary algorithms are generalized to the grey case. Both methods result in arcs which are not wholly determined by the original boundary of the object but lie along darker ridges. The former is faster and produces results which are easier to interpret, but the results from the latter contain more information. 相似文献

19.

Comparative investigation of GPU-accelerated triangle-triangle intersection algorithms for collision detection

Xiao Lei Mei Gang Cuomo Salvatore Xu Nengxiong 《Multimedia Tools and Applications》2022,81(3):3165-3180

Efficient collision detection is critical in 3D geometric modeling. In this paper, we first implement three parallel triangle-triangle intersection algorithms on a GPU and then compare the computational efficiency of these three GPU-accelerated parallel triangle-triangle intersection algorithms in an application that detects collisions between triangulated models. The presented GPU-based parallel collision detection method for triangulated models has two stages: first, we propose a straightforward and efficient parallel approach to reduce the number of potentially intersecting triangle pairs based on AABBs, and second, we conduct intersection tests with the remaining triangle pairs in parallel based on three triangle-triangle intersection algorithms, i.e., the Möller’s algorithm, Devillers’ and Guigue’s algorithm, and Shen’s algorithm. To evaluate the performance of the presented GPU-based parallel collision detection method for triangulated models, we conduct four groups of benchmarks. The experimental results show the following: (1) the time required to detect collisions for the triangulated model consisting of approximately 1.5 billion triangle pairs is less than 0.5 s; (2) the GPU-based parallel collision detection method speedup over the corresponding serial version is 50x - 60x, and (3) Devillers’ and Guigue’s algorithm is comparatively and comprehensively the best of the three GPU-based parallel triangle-triangle intersection algorithms. The presented GPU-accelerated method is capable of efficiently detecting the potential collisions of triangulated models. Overall, the GPU-accelerated parallel Devillers’ and Guigue’s triangle-triangle intersection algorithm is recommended when performing practical collision detections between large triangulated models.

相似文献

20.

Behavioral clusters in dynamic graphs

《Parallel Computing》2015

This paper contributes a method for combining sparse parallel graph algorithms with dense parallel linear algebra algorithms in order to understand dynamic graphs including the temporal behavior of vertices. Our method is the first to cluster vertices in a dynamic graph based on arbitrary temporal behaviors. In order to successfully implement this method, we develop a feature based pipeline for dynamic graphs and apply Nonnegative Matrix Factorization (NMF) to these features. We demonstrate these steps with a sample of the Twitter mentions graph as well as a CAIDA network traffic graph. We contribute and analyze a parallel NMF algorithm presenting both theoretical and empirical studies of performance. This work can be leveraged by graph/network analysts to understand the temporal behavior cluster structure and segmentation structure of dynamic graphs. 相似文献