首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We develop an optimal algorithm for the numerical solution of semi-coercive variational inequalities by combining dual-primal FETI algorithms with recent results for bound and equality constrained quadratic programming problems. The discretized version of the model problem, obtained by using the FETI-DP methodology, is reduced by the duality theory of convex optimization to a quadratic programming problem with bound and equality constraints, which is solved by a new algorithm with a known rate of convergence given in terms of the spectral condition number of the quadratic problem. We present convergence bounds that guarantee the scalability of the algorithm. These results are confirmed by numerical experiments.  相似文献   

2.
为了对提花织物图像进行矢量化,针对其颜色少、色块大的特点,提出了改进的Potrace图像矢量化算法。原始的Potrace算法只能实现对二值图像的矢量化,改进后的算法将位图中的色块逐个分解生成一个个的闭合路径,之后将这些闭合路径按照其各自分布拼接成树状结构并矢量化,最终生成一个完整的矢量图形。该算法在实际的应用中取得较好效果。  相似文献   

3.
提出了一种新的带状线性方程组的分布式并行算法(New Distributed Parallel Algorithm for Banded Linear Equations,简称为NDPAB算法)。当带状线性方程组的系数矩阵满足对角占优时,算法在运行过程中不会中断,算法的加速比接近于处理器数目。给出了基于局域网的MPI异构环境下数值实验结果,数值实验结果表明算法是高效的。  相似文献   

4.
During recent years, the Internet has witnessed rapid advancement in peer-to-peer (P2P) media streaming. In these applications, an important issue has been the block scheduling problem, which deals with how each node requests the media data blocks from its neighbors. In most streaming systems, peers are likely to have heterogeneous upload/download bandwidths, leading to the fact that different peers probably perceive different streaming quality. Layered (or scalable) streaming in P2P networks has recently been proposed to address the heterogeneity of the network environment. In this paper, we propose a novel block scheduling scheme that is aimed to address the P2P layered video streaming. We define a soft priority function for each block to be requested by a node in accordance with the block’s significance for video playback. The priority function is unique in that it strikes good balance between different factors, which makes the priority of a block well represent the relative importance of the block over a wide variation of block size between different layers. The block scheduling problem is then transformed to an optimization problem that maximizes the priority sum of the delivered video blocks. We develop both centralized and distributed scheduling algorithms for the problem. Simulation of two popular scalability types has been conducted to evaluate the performance of the algorithms. The simulation results show that the proposed algorithm is effective in terms of bandwidth utilization and video quality.  相似文献   

5.
《国际计算机数学杂志》2012,89(1-4):153-158
Motivated by the recursive partitioning algorithm of Evans [2], we present a new algorithm for inverting tridiagonal matrices. Our derivation of the algorithm is different but elementary. The present algorithm has potential for its vector and parallel implementation.  相似文献   

6.
为提高可伸缩视频编码(SVC)在丢包的网络传输环境下的抗误码性能,提出了一种基于自适应遗传算法的SVC非均等错误保护算法。首先针对可伸缩视频编码的网络抽象层单元数据包头的特点,设计了一种新的网络抽象层单元的封装方案。然后将前向纠错编码的校验位在各层的分配转化为多约束条件下的优化问题,再引入惩罚函数将多约束优化问题转化为无约束优化问题,进而采用自适应遗传算法进行求解。仿真实验结果表明,与目前典型的非均等错误保护算法相比,该算法使重建的可伸缩视频编码的峰值信噪比的平均值提高了0.8dB~1.95dB,并有效提高了可伸缩视频编码在接收端的解码速度和重建质量。  相似文献   

7.
Linear Temporal Logic (LTL) Model Checking is a very important and popular technique for the automatic verification of safety-critical hardware and software systems, aiming at ensuring their quality. However, it is well known that LTL model checking suffers from the state explosion problem, often leading to insurmountable scalability problems when applying it to real-world systems. While there has been work on distributed algorithms for explicit on-the-fly LTL model checking, these are not sufficiently scalable and capable of tolerating faults during computation, significantly limiting their usefulness in huge cluster environments. Moreover, implementing these algorithms is generally viewed as a very challenging, error-prone task. In this paper, we instead rely on Pregel, a simple yet powerful model for distributed computation on large graphs. Pregel has from the start been designed for efficient, scalable and fault tolerant operation on clusters of thousands of computers, including large cloud setups. To harness Pregel’s power, we propose a new vertex centric distributed algorithm for explicit LTL model checking of concurrent systems. Experimental results illustrate feasibility and scalability of the proposed algorithm. Compared with other distributed algorithms, our algorithm is more scalable, reliable and efficient.  相似文献   

8.
This paper gives a theoretical foundation for using parallel processing to search an alpha/beta minimax game tree. A mathematical discussion predicts the behavior of a particular parallel algorithm, Principle Variation Splitting (PVS), along with an enhancement that improves performance for most test cares. After the theoretical discussion, some practical results obtained by running two implementations of the PVS algorithm on a 30-processor tightly coupled shared memory machine are given to show the speedup obtained by parallel processing.  相似文献   

9.
In this paper, we introduce a new numerical scheme, based on the ADI (alternating direction implicit) method, to price American put options with a stochastic volatility model. Upon applying a front-fixing transformation to transform the unknown free boundary into a known and fixed boundary in the transformed space, a predictor-corrector finite difference scheme is then developed to solve for the optimal exercise price and the option values simultaneously. Based on the local von Neumann stability analysis, a stability requirement is theoretically obtained first and then tested numerically. It is shown that the instability introduced by the predictor can be damped, to some extent, by the ADI method that is used in the corrector. The results of various numerical experiments show that this new approach is fast and accurate, and can be easily extended to other types of financial derivatives with an American-style exercise.Another key contribution of this paper is the proposition of a set of appropriate boundary conditions, particularly in the volatility direction, upon realizing that appropriate boundary conditions in the volatility direction for stochastic volatility models appear to be controversial in the literature. A sound justification is also provided for the proposed boundary conditions mathematically as well as financially.  相似文献   

10.
11.
The problem of maximal clique enumeration (MCE) is to enumerate all of the maximal cliques in a graph. Once enumerated, maximal cliques are widely used to solve problems in areas such as 3-D protein structure alignment, genome mapping, gene expression analysis, and detection of social hierarchies. Even the most efficient serial MCE algorithms require large amounts of time to enumerate the maximal cliques in networks arising from these problems that contain hundreds, thousands, or larger numbers of vertices. The previous attempts to provide practical solutions to the MCE problem through parallel implementation have had limited success, largely due to a number of challenges inherent to the nature of the MCE combinatorial search space. On the one hand, MCE algorithms often create a backtracking search tree that has a highly irregular and hard-or-impossible to predict structure; therefore, almost any static decomposition of the search tree by parallel processors results in highly unbalanced processor execution times. On the other hand, the data-intensive nature of the MCE problem often makes naive dynamic load distribution strategies that require extensive data movement prohibitively expensive. As a result, good scaling of the overall execution time of parallel MCE algorithms has been reported for only up to a couple hundred processors. In this paper, we propose a parallel, scalable, and memory-efficient MCE algorithm for distributed and/or shared memory high performance computing architectures, whose runtime scales linearly for thousands of processors on real-world application graphs with hundreds and thousands of nodes. Its scalability and efficiency are attributed to the proposed: (a) representation of the search tree decomposition to enable parallelization; (b) parallel depth-first backtracking search to both constrain the search space and minimize memory requirement; (c) least stringent synchronization to minimize data movement; and (d) on-demand work stealing intelligently coupled with work stack splitting to minimize computing elements’ idle time. To the best of our knowledge, the proposed parallel MCE algorithm is the first to achieve a linear scaling runtime using up to 2048 processors on Cray XT machines for a number of real-world biological networks.  相似文献   

12.
We present an optimal distributed algorithm to record a global state of a distributed system with causally ordered message delivery. The message complexity of our algorithm is O(n) bits where n is the number of processes in the system.  相似文献   

13.
In this paper,an optimum tactic of multi-grid parallel algorithm with virtual boundary forecast method is disscussed,and a two-stage implementation is presented.The numerical results of solving a non-linear heay transfer equation show that the optimum implementation is much better than the non-optimum one.  相似文献   

14.
并行最短路径搜索算法的设计与实现   总被引:3,自引:0,他引:3       下载免费PDF全文
针对串行最短路径搜索算法本身固有的局限性,难以随着网络规模的增大而提高搜索速度的问题,设计并实现了一种基于并行Dijkstra思想的并行最短路径搜索算法,使算法复杂度由ON2)减少到ON2/p+N*(p-1)),提高了算法的效率。实验结果表明,该算法搜索速度快且性能稳定,当结点数目相当庞大时,算法的优越性更加明显。  相似文献   

15.
Based on a fourth-order compact difference formula for the spatial discretization, which is currently proposed for the one-dimensional (1D) steady convection–diffusion problem, and the Crank–Nicolson scheme for the time discretization, a rational high-order compact alternating direction implicit (ADI) method is developed for solving two-dimensional (2D) unsteady convection–diffusion problems. The method is unconditionally stable and second-order accurate in time and fourth-order accurate in space. The resulting scheme in each ADI computation step corresponds to a tridiagonal matrix equation which can be solved by the application of the 1D tridiagonal Thomas algorithm with a considerable saving in computing time. Three examples supporting our theoretical analysis are numerically solved. The present method not only shows higher accuracy and better phase and amplitude error properties than the standard second-order Peaceman–Rachford ADI method in Peaceman and Rachford (1959) [4], the fourth-order ADI method of Karaa and Zhang (2004) [5] and the fourth-order ADI method of Tian and Ge (2007) [23], but also proves more effective than the fourth-order Padé ADI method of You (2006) [6], in the aspect of computational cost. The method proposed for the diffusion–convection problems is easy to implement and can also be used to solve pure diffusion or pure convection problems.  相似文献   

16.
This paper is concerned with an existing compact finite difference ADI method, published in the paper by Liao et al. (2002) [3], for solving systems of two-dimensional reaction-diffusion equations with nonlinear reaction terms. This method has an accuracy of fourth-order in space and second-order in time. The existence and uniqueness of its solution are investigated by the method of upper and lower solutions, without any monotone requirement on the nonlinear reaction terms. The convergence of the finite difference solution to the continuous solution is proved. An efficient monotone iterative algorithm is presented for solving the resulting discrete system, and some techniques for the construction of upper and lower solutions are discussed. An application using a model problem gives numerical results that demonstrate the high efficiency and advantages of the method.  相似文献   

17.
The Journal of Supercomputing - In this work, we propose a highly scalable parallel double binned ghost particle (DBGP) algorithm for direct-forcing immersed boundary spectral element method for...  相似文献   

18.
生物序列比对是生物信息领域的重要课题,比对结果的合理性和正确性关系到基于比对结果研究的正确性。在保证正确性的前提下利用并行计算充分挖掘计算潜力对提高比对效率有重要意义。针对双序列的全局比对问题,提出了基于蚁群算法的双序列比对并行化方案。对耗时最多的搜索比对路径和信息素更新两个步骤给出了基于共享内存模型的并行化方法。"天河二号"上OpenMP实验结果表明,8线程并行情况下,加速比可达5.03,且序列越长性能越高。  相似文献   

19.
基于递归耦合方法的三对角线性方程组分布式并行算法   总被引:1,自引:2,他引:1  
方蓉  赵瑛 《计算机工程与设计》2006,27(4):670-671,687
提出了一种在分布式计算机上用递归倍增方法解三对角线性方程组的并行算法。通过研究算法中的额外开销达到优化标量算法的执行和通讯,并减少了存储开销。当三对角线性方程组的系数矩阵满足对角占优时,该算法在运行过程中不会中断。最后,在采用消息传递编程模型的基于局域网MPI并行环境下对算法进行了评价。数值实验结果表明,该算法是高效的。  相似文献   

20.
A scalable parallel algorithm has been designed to perform multimillion-atom molecular dynamics (MD) simulations, in which first principles-based reactive force fields (ReaxFF) describe chemical reactions. Environment-dependent bond orders associated with atomic pairs and their derivatives are reused extensively with the aid of linked-list cells to minimize the computation associated with atomic n-tuple interactions (n?4 explicitly and ?6 due to chain-rule differentiation). These n-tuple computations are made modular, so that they can be reconfigured effectively with a multiple time-step integrator to further reduce the computation time. Atomic charges are updated dynamically with an electronegativity equalization method, by iteratively minimizing the electrostatic energy with the charge-neutrality constraint. The ReaxFF-MD simulation algorithm has been implemented on parallel computers based on a spatial decomposition scheme combined with distributed n-tuple data structures. The measured parallel efficiency of the parallel ReaxFF-MD algorithm is 0.998 on 131,072 IBM BlueGene/L processors for a 1.01 billion-atom RDX system.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号