Similar literature
20 similar documents found.
1.
With the growth of data and necessity for distributed optimization methods, solvers that work well on a single machine must be re-designed to leverage distributed computation. Recent work in this area has been limited by focusing heavily on developing highly specific methods for the distributed environment. These special-purpose methods are often unable to fully leverage the competitive performance of their well-tuned and customized single machine counterparts. Further, they are unable to easily integrate improvements that continue to be made to single machine methods. To this end, we present a framework for distributed optimization that both allows the flexibility of arbitrary solvers to be used on each (single) machine locally and yet maintains competitive performance against other state-of-the-art special-purpose distributed methods. We give strong primal–dual convergence rate guarantees for our framework that hold for arbitrary local solvers. We demonstrate the impact of local solver selection both theoretically and in an extensive experimental comparison. Finally, we provide thorough implementation details for our framework, highlighting areas for practical performance gains.

2.
The design, development, and application of Traceview, a general-purpose trace-visualization tool that implements the trace-management and I/O features usually found in special-purpose trace-analysis systems, are described. The aspects of trace visualization that can be incorporated into a reusable tool are identified. The tradeoff between general-purpose design and semantically based, detailed trace-data analysis is evaluated. Display methods and Traceview applications are discussed.

3.
A Microsoft Windows-based indoor air quality (IAQ) simulation software package has been developed and has completed a small-scale beta test and quality assurance review. Tentatively named Simulation Tool Kit for Indoor Air Quality and Inhalation Exposure, or STKi for short, this package complements and supplements existing IAQ simulation packages and is designed mainly for advanced users. STKi Version 1 consists of a general-purpose simulation program and four stand-alone, special-purpose programs. The general-purpose program performs multi-zone, multi-pollutant simulations and allows gas-phase chemical reactions. With a large collection of models for sources, sinks, and air filters/cleaners, it can perform simulations for a wide range of indoor air pollution scenarios. The special-purpose programs implement fundamentally based models, which are often excluded from existing IAQ simulation programs despite their improved performance over statistical models. In addition to performing conventional IAQ simulation, which generates time–concentration profiles, STKi can estimate the adequate ventilation rate when certain air quality criteria are given, a unique feature useful for product stewardship and risk management. STKi will be developed in a cumulative manner. More special-purpose simulation programs will be added to the package. Key numerical methods used in STKi are discussed. Ways to convert the STKi programs into language-independent simulation modules that can be used by multi-pathway exposure models are also being explored.

4.
《Real》2000,6(4):313-324
This paper discusses the main architectural issues of a challenging application of real-time image processing: the vision-based automatic guidance of road vehicles. Two algorithms for lane detection and obstacle localization, currently implemented on the ARGO autonomous vehicle developed at the University of Parma, are used as examples to compare two different computing engines (a massively parallel special-purpose SIMD architecture and a general-purpose system), while future trends in this field are proposed, based on the experience of the ARGO project.

5.
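Engineering database language EDL/3   Total citations: 1, self-citations: 1, citations by others: 0
Based on the requirements of the architectural CAD domain, and with that domain as background, the engineering database language EDL/3 was designed and implemented. The language includes not only the general-purpose functions found in typical engineering database management systems but also special functions for architectural CAD. In overall functionality it is superior to CAD*I/EDL and EDL/2.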

6.
Noran Engineering, Inc. has recently added two new solvers, Vector Sparse Solver (VSS) and Vector Iterative Solver (VIS), to its general-purpose finite element analysis engine, NE/Nastran. One solver uses a direct approach while the other uses an iterative Preconditioned Conjugate Gradient (PCG) approach. Both solvers are fully sparse and store and operate only on nonzero matrix elements. This paper looks at the effect these solvers have on the performance of NE/Nastran for various finite element model and solution types. In many cases performance has increased by a factor of 10, thus allowing jobs that took days to be solved in minutes.
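
The iterative PCG approach named in this abstract can be illustrated with a minimal sketch. This is not the VIS solver: the Jacobi (diagonal) preconditioner, the list-of-dicts sparse storage, and all function names below are assumptions chosen only to show the idea of operating on nonzero entries alone.

```python
from math import sqrt

def sparse_matvec(rows, x):
    """Multiply a sparse matrix (one {column: value} dict per row) by a dense vector."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def pcg(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A using
    conjugate gradients with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / rows[i][i] for i in range(n)]  # assumed preconditioner: M^-1 = diag(A)^-1
    x = [0.0] * n
    r = list(b)                                      # residual b - A x for the initial guess x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]       # preconditioned residual
    p = list(z)                                      # first search direction
    rz = dot(r, z)
    b_norm = sqrt(dot(b, b)) or 1.0
    for _ in range(max_iter):
        if sqrt(dot(r, r)) <= tol * b_norm:          # relative residual test
            break
        ap = sparse_matvec(rows, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Hypothetical test problem: 1D Laplacian (tridiagonal, SPD) with 5 unknowns.
n = 5
A = [{} for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0
b = [1.0] * n
x = pcg(A, b)
```

Only the nonzero entries of `A` are stored and visited, which is the property the abstract attributes to both solvers; a production solver would use a compressed format and a stronger preconditioner.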

7.
The parallelization of sophisticated applications has dramatically increased in recent years. As machine capabilities rise, greater emphasis on modeling complex phenomena can be expected. Many of these applications require the solution of large sparse matrix equations which approximate systems of partial differential equations (PDEs). Therefore we consider parallel iterative solvers for large sparse non-symmetric systems and issues related to parallel sparse matrix software. We describe a collection of parallel iterative solvers which use a distributed sparse matrix format that facilitates the interface between specific applications and a variety of Krylov subspace techniques and multigrid methods. These methods have been used to solve a number of linear and non-linear PDE problems on a 1024-processor NCUBE 2 hypercube. Over 1 Gflop sustained computation rates are achieved with many of these solvers, demonstrating that high performance can be attained even when using sparse matrix data structures.

8.
Robust registration of 2D and 3D point sets   Total citations: 3, self-citations: 0, citations by others: 3
This paper introduces a new method of registering point sets. The registration error is directly minimized using general-purpose non-linear optimization (the Levenberg–Marquardt algorithm). The surprising conclusion of the paper is that this technique is comparable in speed to the special-purpose Iterated Closest Point algorithm, which is most commonly used for this task. Because the routine directly minimizes an energy function, it is easy to extend it to incorporate robust estimation via a Huber kernel, yielding a basin of convergence that is many times wider than existing techniques. Finally, we introduce a data structure for the minimization based on the chamfer distance transform, which yields an algorithm that is both faster and more robust than previously described methods.
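
The Huber kernel mentioned in this abstract can be sketched in a few lines. This is an illustrative definition only, not the paper's implementation; the function name `huber`, the threshold `delta`, and the residual values are assumptions introduced here.

```python
def huber(residual, delta=1.0):
    """Huber kernel: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)

# Made-up residuals: three well-matched points and one gross outlier.
residuals = [0.1, -0.2, 0.05, 10.0]
squared_cost = sum(0.5 * r * r for r in residuals)  # plain least-squares energy
robust_cost = sum(huber(r) for r in residuals)      # Huber energy
```

Because the outlier contributes linearly rather than quadratically, it dominates the robust energy far less than the plain squared-error energy, which is one way to see why a robust kernel can help a general-purpose minimizer tolerate bad correspondences.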

9.
Johnson K.T., Hurson A.R., Shirazi B. 《Computer》1993,26(11):20-31
The extension of systolic array architecture from fixed- or special-purpose architectures to general-purpose, SIMD (single-instruction stream, multiple-data stream), MIMD (multiple-instruction stream, multiple-data stream) architectures, and hybrid architectures that combine both commercial and FPGA (field-programmable gate array) technologies is chronicled. The authors present a taxonomy for systolic organizations, discuss each architecture's methods of exploiting concurrencies, and compare performance attributes of each. The authors also describe a number of implementation issues that determine a systolic array's performance efficiency, such as algorithms and mapping, system integration through memory subsystems, cell granularity, and extensibility to a wide variety of topologies.

10.
In this paper, a class of linear and nonlinear nth-order initial value problems (IVPs) is considered. The solutions of these IVPs are obtained by the homotopy-perturbation method (HPM). The HPM can be considered as one of the new methods belonging to the general classification of perturbation methods. Generally, the HPM deals with exact solvers for linear differential equations and approximative solvers for nonlinear equations. Several test cases are chosen to demonstrate the efficiency of HPM.

11.
A processor architecture for 3D graphics   Total citations: 1, self-citations: 0, citations by others: 1
The DLX/3DCP architecture is described. It uses a method of parallel processing on 3-D vectors to overcome the large number of floating-point operations required in 3-D graphics, which limits the performance of graphics systems. The architecture's design offers general-purpose programmability from the high-level object-oriented language C++ and generates performance expected only from dedicated special-purpose hardware. Results that show the architecture's performance on graphics operations are presented and compared to the performance of other RISC processors.

12.
The Laplace–Beltrami system of nonlinear, elliptic, partial differential equations has utility in the generation of computational grids on complex and highly curved geometry. Discretization of this system using the finite-element method accommodates unstructured grids, but generates a large, sparse, ill-conditioned system of nonlinear discrete equations. The use of the Laplace–Beltrami approach, particularly in large-scale applications, has been limited by the scalability and efficiency of solvers. This paper addresses this limitation by developing two nonlinear solvers based on the Jacobian-Free Newton–Krylov (JFNK) methodology. A key feature of these methods is that the Jacobian is not formed explicitly for use by the underlying linear solver. Iterative linear solvers such as the Generalized Minimal RESidual (GMRES) method do not technically require the stand-alone Jacobian; instead its action on a vector is approximated through two nonlinear function evaluations. The preconditioning required by GMRES is also discussed. Two different preconditioners are developed, both of which employ existing Algebraic Multigrid (AMG) methods. Further, the most efficient preconditioner, overall, for the problems considered is based on a Picard linearization. Numerical examples demonstrate that these solvers are significantly faster than a standard Newton–Krylov approach; a speedup factor of approximately 26 was obtained for the Picard preconditioner on the largest grids studied here. In addition, these JFNK solvers exhibit good algorithmic scaling with increasing grid size.
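
The Jacobian-free idea in this abstract, approximating the action of the Jacobian on a vector from two nonlinear function evaluations, can be sketched as a forward difference, J(u)v ≈ (F(u + εv) − F(u)) / ε. The step-size heuristic, the toy system `F`, and all names below are assumptions for illustration; the paper's own discretization and preconditioners are not reproduced here.

```python
from math import sqrt

def jacobian_vector_product(F, u, v, Fu=None):
    """Approximate J(u) @ v by a forward difference, without forming J.

    Uses two evaluations of F: one at u (reusable via Fu) and one at u + eps*v.
    """
    norm_v = sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = sqrt(sum(ui * ui for ui in u))
    # Assumed heuristic: scale sqrt(machine epsilon) by the sizes of u and v.
    eps = sqrt(2.2e-16) * (1.0 + norm_u) / norm_v
    if Fu is None:
        Fu = F(u)
    F_perturbed = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(F_perturbed, Fu)]

def F(u):
    """Hypothetical 2x2 nonlinear system used only to check the approximation."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

u = [1.0, 2.0]
v = [0.5, -1.0]
approx = jacobian_vector_product(F, u, v)

# Analytic Jacobian of the toy system is [[2x, 1], [1, 2y]], so J(u) v is:
exact = [2.0 * u[0] * v[0] + v[1], v[0] + 2.0 * u[1] * v[1]]
```

A Krylov method such as GMRES needs only this product, so a routine like `jacobian_vector_product` can stand in for the matrix inside the linear solve of each Newton step.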

13.
The authors concentrate on the simulation study of two distributed task allocation procedures: the load balancing and the LOCO procedure. The first is widely used in general-purpose processing. The second was recently introduced and analyzed in the processing environment corresponding to the complex multitask jobs typical of some supercomputing and artificial-intelligence-oriented systems. Both procedures are simulated and compared in realistic situations, where both processing resources and interconnection network may represent the system bottleneck. Both were tested using carrier-sense multiple-access communications protocols. The CSMA/CD protocol performed better than TDMA with load balancing, but no difference was found with the LOCO procedure. For a large number of special-purpose processing resources and a large number of jobs in the system, LOCO produced better results than load balancing.

14.
Efficient algorithms for the solution of partial differential equations on parallel computers are often based on domain decomposition methods. Schwarz preconditioners combined with standard Krylov space solvers are widely used in this context, and such a combination is shown here to perform very well in the case of the Wilson-Dirac equation in lattice QCD. In particular, with respect to even-odd preconditioned solvers, the communication overhead is significantly reduced, which allows the computational work to be distributed over a large number of processors with only small parallelization losses.

15.
In this paper we present a product quadrature rule for Volterra integral equations with weakly singular kernels based on the generalized Adams methods. The formulas represent numerical solvers for fractional differential equations, which inherit the linear stability properties already known for the integer order case. The numerical experiments confirm the valuable properties of this approach.

16.
Data redundancy methods evaluate the output of a program on a given input by examining the outputs produced by the same program on additional inputs. This paper explores the use of data redundancy to detect and/or tolerate failures in differential equation solvers. Our first goal is to show that data redundancy techniques are applicable to a wide class of differential equations. Our second goal is to identify circumstances in which an independence model of the sort used in program checking can be exploited to build highly reliable solvers from moderately reliable components. We conclude with illustrative examples of applying various data redundancy techniques to a standard differential equation solver. The method has potential for critical systems in which the application’s control laws are specified as sets of differential equations.

17.
The integration of software into special-purpose systems (e.g. for gene sequence analysis) can be a difficult task. We describe a general-purpose software integration tool, the BCE program, that facilitates assembly of VAX-based software into application systems and provides an easy-to-use, intuitive user interface. We describe the use of BCE to integrate a heterogeneous collection of sequence analysis tools. Many BCE design features are generally applicable and can be implemented in other language or hardware environments.

18.
The standard BDDC (balancing domain decomposition by constraints) preconditioner is shown to be equivalent to a preconditioner built from a partially subassembled finite element model. This results in a system of linear algebraic equations which is much easier to solve in parallel than the fully assembled model; the cost is then often dominated by that of the problems on the subdomains. An important role is also played, both in theory and practice, by an averaging operator; in addition, exact Dirichlet solvers are used on the subdomains in order to eliminate the residual in the interior of the subdomains. The use of inexact solvers for these problems and even the replacement of the Dirichlet solvers by a trivial extension are considered. It is established that one of the resulting algorithms has the same eigenvalues as the standard BDDC algorithm, and the connection of another with the FETI-DP algorithm with a lumped preconditioner is also considered. Multigrid methods are used in the experimental work and, under certain assumptions, it is established that the iteration count essentially remains the same as when exact solvers are used, while considerable gains in the speed of the algorithm can be realized since the cost of the exact solvers grows superlinearly with the size of the subdomain problems while the multigrid methods are linear.

19.
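Chemical oscillating reactions are real, typical examples in nonlinear science, yet analyzing and simulating them by programming in traditional computer languages is tedious and difficult. This paper proposes using MATLAB's powerful built-in functions, such as solve, jacobian, eig, ode, and plot, to carry out theoretical analysis and numerical simulation of two typical chemical oscillating reactions, Lotka-Volterra and Belousov-Zhabotinsky. The results show that MATLAB can reliably and conveniently locate the oscillatory dynamic region and numerically integrate the chemical kinetic equations, and that it can become a powerful tool for theoretical research on chemical oscillations.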
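
The numerical integration step described in this abstract can be sketched outside MATLAB as well. The following is a minimal illustration, not the paper's code: it assumes the textbook Lotka-Volterra form dx/dt = ax − bxy, dy/dt = bxy − cy, uses a hand-written fixed-step fourth-order Runge-Kutta integrator in place of MATLAB's ode functions, and picks the rate constants and initial concentrations arbitrarily.

```python
from math import log

def lotka_volterra(state, a=1.0, b=1.0, c=1.0):
    """Right-hand side of the assumed Lotka-Volterra kinetics."""
    x, y = state
    return (a * x - b * x * y, b * x * y - c * y)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )

def integrate(f, state, dt, steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(f, state, dt)
        trajectory.append(state)
    return trajectory

def invariant(state, a=1.0, b=1.0, c=1.0):
    """Quantity conserved along exact solutions of the assumed model."""
    x, y = state
    return b * x - c * log(x) + b * y - a * log(y)

# Arbitrary illustrative initial concentrations, away from the steady state (c/b, a/b).
trajectory = integrate(lotka_volterra, (2.0, 1.0), dt=0.001, steps=20000)
drift = abs(invariant(trajectory[-1]) - invariant(trajectory[0]))
```

For this assumed model the Jacobian at the steady state (c/b, a/b) has purely imaginary eigenvalues ±i√(ac), which is the kind of check the abstract assigns to jacobian and eig; the near-zero `drift` of the conserved quantity is a simple sanity test that the integrator is tracking the closed oscillatory orbit.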

20.
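A mathematical model of a typical gas-liquid absorption process with first- and second-order chemical reactions was established. MATLAB's dsolve and ode functions were then used to solve the diffusion-reaction equation in the liquid film and the absorption equation for the bulk liquid phase. Some of the computed results were also compared with experimental values. The results show that MATLAB can simulate the dynamic process of chemical absorption quickly, efficiently, and accurately.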
