1.
Structural characterization of deformed crystals by analysis of common atomic neighborhood (Cited: 1; self-citations: 0, external: 1)
Simulations of crystal deformation and structural transformation may generate complex datasets involving networks with millions to billions of chemical bonds, which makes local structure analysis a challenge. An ideal analysis method must recognize perfect crystal structures, such as face-centered cubic, body-centered cubic and hexagonal close packed, and differentiate structural defects such as dislocations, stacking faults, grain boundaries, cracks and surfaces. Currently a few methods are used for this purpose, e.g., the Common Neighbor Analysis (CNA) and the Centrosymmetry Parameter (CSP). This paper proposes an alternative method based on the calculation of a single parameter that depends on the common atomic neighborhood. We validate the method by characterizing local structures in complex molecular-dynamics datasets, clarifying its advantages over the CNA and the CSP methods.
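For context, the baseline this paper compares against can be made concrete. A minimal C++ sketch of the standard centrosymmetry parameter with greedy opposite-neighbor pairing; this illustrates the existing CSP method, not the paper's new single-parameter analysis:

```cpp
// Baseline centrosymmetry parameter (CSP) with greedy opposite-neighbor
// pairing; the existing method the paper compares against, not its new one.
#include <array>
#include <cstdio>
#include <vector>

using Vec3 = std::array<double, 3>;

double csp(std::vector<Vec3> nbr) {            // neighbor vectors relative to the atom
    double p = 0.0;
    while (nbr.size() >= 2) {
        const Vec3 a = nbr.back();
        nbr.pop_back();
        size_t best = 0;
        double bestN = 1e300;
        for (size_t j = 0; j < nbr.size(); ++j) {  // most nearly opposite neighbor
            double n = 0.0;
            for (int k = 0; k < 3; ++k) {
                const double s = a[k] + nbr[j][k];
                n += s * s;
            }
            if (n < bestN) { bestN = n; best = j; }
        }
        p += bestN;                            // |R_i + R_j|^2 for the chosen pair
        nbr.erase(nbr.begin() + best);
    }
    return p;                                  // ~0 at a centrosymmetric (perfect) site
}

int main() {
    std::vector<Vec3> fcc;                     // the 12 ideal fcc neighbor vectors
    for (int s1 : {-1, 1})
        for (int s2 : {-1, 1}) {
            fcc.push_back({0.5 * s1, 0.5 * s2, 0.0});
            fcc.push_back({0.5 * s1, 0.0, 0.5 * s2});
            fcc.push_back({0.0, 0.5 * s1, 0.5 * s2});
        }
    std::printf("CSP(ideal fcc) = %g\n", csp(fcc));
}
```

For an ideal FCC site the twelve neighbor vectors cancel pairwise, so the CSP is zero; defects break the cancellation and raise the value.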
2.
An events based algorithm for distributing concurrent tasks on multi-core architectures (Cited: 1; self-citations: 0, external: 1)
In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism, including message passing and scatter-gather, can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a “hungry” pull strategy. The technique is demonstrated on a simple finite difference solver and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near-linear speed-up, with an efficiency of 85% on a 24-core machine. It is noted that the H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware.
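The "hungry" pull idea can be illustrated independently of the authors' implementation. A minimal sketch, assuming only a shared atomic task cursor; this is a generic pull loop, not H-Dispatch itself:

```cpp
// Pull-based ("hungry") task dispatch: idle workers fetch the next
// spatial-decomposition task themselves instead of having a master
// scatter work. Illustrative analogue only, not the authors' H-Dispatch.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const int nTasks = 1000, nWorkers = 4;
    std::vector<double> cell(nTasks, 1.0);     // one value per spatial cell (stand-in state)
    std::atomic<int> next{0};                  // shared task cursor

    auto worker = [&]() {
        for (;;) {
            const int i = next.fetch_add(1);   // "hungry" pull: grab the next undone task
            if (i >= nTasks) return;
            cell[i] *= 0.5;                    // stand-in for a finite-difference update
        }
    };
    std::vector<std::thread> pool;
    for (int w = 0; w < nWorkers; ++w) pool.emplace_back(worker);
    for (auto& t : pool) t.join();             // barrier: all tasks done before the next step
    std::printf("cell[0] = %g\n", cell[0]);
}
```

Because workers pull work only when free, load balances itself across cores without a central scheduler pushing tasks.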
3.
4.
Grid computing is distributed computing performed transparently across multiple administrative domains. Grid middleware, which is meant to enable access to grid resources, is currently widely seen as being too heavyweight and, in consequence, unwieldy for general scientific use. Its heavyweight nature, especially on the client side, has severely restricted the uptake of grid technology by computational scientists. In this paper, we describe the Application Hosting Environment (AHE) which we have developed to address some of these problems. The AHE is a lightweight, easily deployable environment designed to allow the scientist to quickly and easily run legacy applications on distributed grid resources. It provides a higher-level abstraction of a grid than is offered by existing grid middleware schemes such as the Globus Toolkit. As a result, the computational scientist does not need to know the details of any particular underlying grid middleware and is isolated from any changes to it on the distributed resources. The functionality provided by the AHE is ‘application-centric’: applications are exposed as web services with a well-defined, standards-compliant interface. This allows the computational scientist to start and manage application instances on a grid in a transparent manner, thus greatly simplifying the user experience. We describe how a range of computational science codes have been hosted within the AHE and how the design of the AHE allows us to implement complex workflows for deployment on grid infrastructure.
5.
We present a generally applicable method for the modeling of covalent amorphous networks. The algorithm proceeds by generating random close packings of anions, followed by an optimal placement of the cations. As examples, we apply the algorithm to a-SiO2, a-Si3N4, a-SiO3/2N1/3, and a-B2O3.
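The first stage can be illustrated with a deliberately simplified generator. A sketch of random sequential insertion of non-overlapping anion spheres in a periodic box; the paper's actual packing procedure and the optimal cation placement are more sophisticated:

```cpp
// Simplified first stage only: random sequential insertion of non-overlapping
// anion spheres in a periodic box (minimum-image convention). The paper's
// close-packing and cation-placement steps go beyond this.
#include <array>
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    const double L = 10.0, d = 1.0;            // box edge, anion diameter
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, L);
    std::vector<std::array<double, 3>> anions;

    for (int attempt = 0; attempt < 200000; ++attempt) {
        std::array<double, 3> p{u(rng), u(rng), u(rng)};
        bool ok = true;
        for (const auto& q : anions) {         // reject any overlap with placed spheres
            double r2 = 0.0;
            for (int k = 0; k < 3; ++k) {
                double dx = p[k] - q[k];
                dx -= L * std::round(dx / L);  // periodic minimum image
                r2 += dx * dx;
            }
            if (r2 < d * d) { ok = false; break; }
        }
        if (ok) anions.push_back(p);
    }
    std::printf("placed %zu anions\n", anions.size());
}
```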
6.
7.
A fitting procedure for a kinetic model with one trap and one recombination centre is described. The procedure uses a grid in parameter space obtained by changing each parameter back and forth and calculating robust cost functions on the surfaces of this grid. The lengths of the changes are determined empirically. The best set of parameters is given by the projection onto the grid surface with the smallest cost function. The procedure is analyzed for fits of one, two and three parameters of the kinetic model. In all cases the optimization shows reliable fitting within a feasible processing time.
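The back-and-forth grid idea reads roughly as a coordinate-wise search with shrinking steps. A hedged sketch with a placeholder cost function; the actual robust cost and the trap/recombination kinetic model are not reproduced:

```cpp
// Back-and-forth grid search of the kind described: each parameter is
// perturbed in both directions, the cost is evaluated at the grid points,
// and the best projection is kept; the grid is refined when stuck.
// The cost function here is a placeholder, not the kinetic model.
#include <cmath>
#include <cstdio>
#include <vector>

double cost(const std::vector<double>& p) {    // placeholder robust cost
    double c = 0.0;
    for (size_t i = 0; i < p.size(); ++i) c += std::fabs(p[i] - double(i + 1));
    return c;
}

int main() {
    std::vector<double> p{5.0, 5.0, 5.0};      // initial guess for 3 parameters
    std::vector<double> step{1.0, 1.0, 1.0};   // empirically chosen step lengths
    for (int it = 0; it < 100; ++it) {
        bool improved = false;
        for (size_t i = 0; i < p.size(); ++i) {
            for (double dir : {-1.0, 1.0}) {   // change parameter i back and forth
                std::vector<double> q = p;
                q[i] += dir * step[i];
                if (cost(q) < cost(p)) { p = q; improved = true; }
            }
        }
        if (!improved)                         // refine the grid when no move helps
            for (auto& s : step) s *= 0.5;
    }
    std::printf("fit: %g %g %g (cost %g)\n", p[0], p[1], p[2], cost(p));
}
```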
8.
In this paper we present an inversion algorithm for ill-posed problems arising in atmospheric remote sensing. The proposed method is an iterative Runge-Kutta type regularization method. These methods are better known for solving differential equations; we adapted them for solving inverse ill-posed problems. The numerical performance of the algorithm is studied by means of simulations concerning the retrieval of aerosol particle size distributions from lidar observations.
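For orientation, the lowest-order member of this family is the explicit Euler step applied to the asymptotic regularization ODE x'(t) = A^T (y - A x(t)), which reduces to the classical Landweber iteration; the paper's Runge-Kutta schemes are higher-order refinements of the same idea. A toy sketch on a 2x2 ill-conditioned system:

```cpp
// Explicit Euler on x'(t) = A^T (y - A x(t)), i.e. the Landweber iteration,
// as the simplest case of Runge-Kutta type regularization. Early stopping
// acts as the regularization; the paper adds higher-order RK steps and a
// data-driven stopping rule.
#include <array>
#include <cstdio>

int main() {
    const double A[2][2] = {{1.0, 0.99}, {0.99, 0.98}};  // ill-conditioned toy operator
    const std::array<double, 2> y = {1.99, 1.97};        // data for true x = (1, 1)
    std::array<double, 2> x = {0.0, 0.0};
    const double tau = 0.25;                             // step size, needs tau < 2/||A||^2

    for (int k = 0; k < 5000; ++k) {                     // stopping early regularizes
        std::array<double, 2> r = {0.0, 0.0};            // residual y - A x
        for (int i = 0; i < 2; ++i)
            r[i] = y[i] - (A[i][0] * x[0] + A[i][1] * x[1]);
        for (int j = 0; j < 2; ++j)                      // x += tau * A^T r
            x[j] += tau * (A[0][j] * r[0] + A[1][j] * r[1]);
    }
    std::printf("x = (%g, %g)\n", x[0], x[1]);           // the stable components converge first
}
```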
9.
In a common approach for parallel processing applied to simulations of many-particle systems with short-ranged interactions and uniform density, the cubic simulation box is partitioned into domains of equal shape and size, each of which is assigned to one processor. We compare the commonly used simple-cubic (SC) domain shape to domain shapes chosen as the Voronoi cells of BCC, FCC, and HCP sphere packings. The latter three are found to result in superior partitionings with respect to communication overhead. Scaling of the domain shape is used to extend the range of applicability of these partitionings to a large set of processor numbers. The higher efficiency with BCC and FCC partitionings is demonstrated in simulations of the sillium model for amorphous silicon.
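The communication argument can be checked with standard formulas: halo traffic scales with domain surface area, and at unit volume both the truncated octahedron (the BCC Voronoi cell) and the rhombic dodecahedron (the FCC Voronoi cell) have less surface than the cube. A quick computation:

```cpp
// Surface area at unit volume for the three candidate domain shapes, using
// the textbook formulas (edge length a = 1): cube S = 6, V = 1; truncated
// octahedron S = 6 + 12*sqrt(3), V = 8*sqrt(2); rhombic dodecahedron
// S = 8*sqrt(2), V = 16*sqrt(3)/9. Lower normalized area = less halo traffic.
#include <cmath>
#include <cstdio>

int main() {
    auto normArea = [](double S, double V) {   // rescale surface area to unit volume
        return S / std::pow(V, 2.0 / 3.0);
    };
    const double cube       = normArea(6.0, 1.0);
    const double truncOct   = normArea(6.0 + 12.0 * std::sqrt(3.0), 8.0 * std::sqrt(2.0));
    const double rhombDodec = normArea(8.0 * std::sqrt(2.0), 16.0 * std::sqrt(3.0) / 9.0);
    std::printf("cube %.3f, trunc. oct. (BCC) %.3f, rhombic dodec. (FCC) %.3f\n",
                cube, truncOct, rhombDodec);   // roughly 6.000, 5.315, 5.345
}
```

This is consistent with the abstract's finding that BCC and FCC Voronoi partitionings beat the simple-cubic shape on communication overhead.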
10.
Given the resurgent attractiveness of single-instruction-multiple-data (SIMD) processing, it is important for high-performance computing applications to be SIMD-capable. The Hartree-Fock SCF (HF-SCF) application, in its canonical form, cannot fully exploit SIMD processing. Prior attempts to implement Electron Repulsion Integral (ERI) sorting functionality to essentially “SIMD-ify” the HF-SCF application have been frustrated by the low throughput of the sorting functionality. With greater awareness of computer architecture, we discuss how the sorting functionality may be practically implemented to provide high performance. Overall system performance analysis, including memory locality analysis, is also conducted, and further emphasises that a system with ERI sorting is capable of very high throughput. We discuss two alternative implementation options, with one immediately accessible software-based option discussed in detail. The impact of workload characteristics on expected performance is also discussed, and it is found that, in general, as basis set size increases the potential performance of the system also increases. Consideration is given to conventional CPUs, GPUs, FPGAs, and the Cell Broadband Engine architecture.
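The sorting idea amounts to bucketing integral tasks so that each bucket follows one uniform code path. A sketch with illustrative names (Quartet, classId, and the kernel call are not from the paper):

```cpp
// "SIMD-ifying" ERI evaluation by sorting: shell quartets are bucketed by
// integral class so each bucket can be handed to one branch-free, batched
// kernel. Names here are illustrative, not from the paper.
#include <cstdio>
#include <map>
#include <vector>

struct Quartet { int la, lb, lc, ld; };        // angular momenta of the four shells

int classId(const Quartet& q) {                // quartets of equal class share a code path
    return ((q.la * 8 + q.lb) * 8 + q.lc) * 8 + q.ld;
}

int main() {
    std::vector<Quartet> work = {{0,0,0,0}, {1,0,0,0}, {0,0,0,0}, {1,1,0,0}, {1,0,0,0}};
    std::map<int, std::vector<Quartet>> bucket;
    for (const auto& q : work) bucket[classId(q)].push_back(q);

    for (const auto& [cls, batch] : bucket) {  // each batch -> one uniform SIMD kernel
        std::printf("class %d: batch of %zu quartets\n", cls, batch.size());
        // launch_batched_eri_kernel(batch);   // hypothetical vectorized kernel
    }
}
```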
11.
S. Almehed, Ch. Driouichi, P. Eerola, U. Mjörnmark, O. Smirnova, Ch. Zacharatou Jarlskog, T. Åkesson, Computer Physics Communications 2002, 145(3): 341-350
A simulation study to evaluate the required computing resources for a research exploitation of the Large Hadron Collider (LHC) has been performed. The evaluation was done as a case study, assuming the existence of a Nordic regional centre and using the requirements for performing a specific physics analysis as a yardstick. Other input parameters were: assumptions about the distribution of researchers across the institutions involved, an analysis model, and two different functional structures of the computing resources.
12.
Electron Repulsion Integrals (ERIs) are a common bottleneck in ab initio computational chemistry. It is known that sorted/reordered execution of ERIs results in efficient SIMD/vector processing. This paper shows that reconfigurable computing and heterogeneous processor architectures can also benefit from a deliberate ordering of ERI tasks. However, realizing these benefits as net speedup requires a very rapid sorting mechanism. This paper presents two such mechanisms. Included in this study are analytical, simulation-based, and experimental benchmarking approaches to consider five use cases for ERI sorting, i.e. SIMD processing, reconfigurable computing, limited address spaces, instruction cache exploitation, and data cache exploitation. Specific consideration is given to existing cache-based processors, FPGAs, and the Cell Broadband Engine processor. It is proposed that the analyses conducted in this work should be built upon to aid the development of software autotuners which will produce efficient ab initio computational chemistry codes for a variety of computer architectures.
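Since net speedup hinges on a very rapid sorting mechanism, and the number of integral classes is small and bounded, an O(n) counting sort is a natural candidate; a sketch, not necessarily one of the paper's two mechanisms:

```cpp
// O(n) counting sort over a small, bounded set of integral class IDs:
// returns task indices grouped by class, stably, in a single pass plus a
// prefix sum. A cheap-sort sketch, not necessarily the paper's mechanism.
#include <cstdio>
#include <numeric>
#include <vector>

std::vector<int> countingSortByClass(const std::vector<int>& cls, int nClasses) {
    std::vector<int> count(nClasses + 1, 0);
    for (int c : cls) ++count[c + 1];          // histogram, shifted by one
    std::partial_sum(count.begin(), count.end(), count.begin()); // bucket offsets
    std::vector<int> order(cls.size());
    for (int i = 0; i < (int)cls.size(); ++i)
        order[count[cls[i]]++] = i;            // stable placement of task i
    return order;
}

int main() {
    std::vector<int> cls = {2, 0, 1, 0, 2, 1, 0};
    for (int i : countingSortByClass(cls, 3)) std::printf("%d ", i);
    std::printf("\n");                         // prints 1 3 6 2 5 0 4 (grouped by class)
}
```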
13.
14.
Ioannis G. Tsoulos, Computer Physics Communications 2006, 174(2): 152-159
A new stochastic method for locating the global minimum of a multidimensional function inside a rectangular hyperbox is presented. A sampling technique is employed that makes use of the procedure known as grammatical evolution. The method can be considered as a “genetic” modification of the Controlled Random Search procedure due to Price. The user may code the objective function either in C++ or in Fortran 77. We offer a comparison of the new method with others of similar structure, by presenting results of computational experiments on a set of test functions.
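The underlying Price move can be sketched as follows: a simplified Controlled Random Search step on a placeholder objective; the grammatical-evolution sampling layer that GenPrice adds is omitted, and bound checks are left out for brevity:

```cpp
// Classical Controlled Random Search (Price) trial move: pick n points,
// reflect a further point through their centroid, and replace the worst
// population member if the trial improves on it. Placeholder objective;
// the paper's grammatical-evolution sampling is not reproduced.
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

double f(const std::vector<double>& x) {       // placeholder objective (sphere function)
    double s = 0.0;
    for (double v : x) s += v * v;
    return s;
}

int main() {
    const int n = 2, N = 25;                   // dimension, population size
    std::mt19937 rng(1);
    std::uniform_real_distribution<double> u(-5.0, 5.0);
    std::vector<std::vector<double>> P(N, std::vector<double>(n));
    for (auto& p : P) for (auto& v : p) v = u(rng);

    std::uniform_int_distribution<int> pick(0, N - 1);
    for (int it = 0; it < 20000; ++it) {
        std::vector<double> g(n, 0.0), trial(n);
        for (int k = 0; k < n; ++k) {          // centroid of n random points
            const auto& p = P[pick(rng)];
            for (int j = 0; j < n; ++j) g[j] += p[j] / n;
        }
        const auto& r = P[pick(rng)];          // reflect a further point through it
        for (int j = 0; j < n; ++j) trial[j] = 2.0 * g[j] - r[j];

        auto worst = std::max_element(P.begin(), P.end(),
            [](const auto& a, const auto& b) { return f(a) < f(b); });
        if (f(trial) < f(*worst)) *worst = trial;  // controlled acceptance
    }
    auto best = *std::min_element(P.begin(), P.end(),
        [](const auto& a, const auto& b) { return f(a) < f(b); });
    std::printf("best f = %g\n", f(best));
}
```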
Program summary
Title of program: GenPrice
Catalogue identifier: ADWP
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADWP
Program available from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: the tool is designed to be portable to all systems running the GNU C++ compiler
Installation: University of Ioannina, Greece
Programming language used: GNU-C++, GNU-C, GNU Fortran-77
Memory required to execute with typical data: 200 KB
No. of bits in a word: 32
No. of processors used: 1
Has the code been vectorized or parallelized?: no
No. of lines in distributed program, including test data, etc.: 13 135
No. of bytes in distributed program, including test data, etc.: 78 512
Distribution format: tar.gz
Nature of physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances in which a local optimum does not correspond to the desired physical solution, and hence the search for a better solution is required. Local optimization techniques are frequently trapped in local minima; global optimization is hence the appropriate tool. For example, when solving a nonlinear system of equations via optimization with a “least squares” type of objective, one may encounter many local minima that do not correspond to solutions, i.e. minima with values far from zero.
Method of solution: Grammatical Evolution is used to accelerate the process of finding the global minimum of a multidimensional, multimodal function, in the framework of the original “Controlled Random Search” algorithm.
Typical running time: Depends on the objective function.
15.
Flavius Guiaş, Mathematics and Computers in Simulation 2010, 81(4): 820-836
In this paper a scheme for approximating solutions of convection-diffusion-reaction equations by Markov jump processes is studied. The general principle of the method of lines reduces evolution partial differential equations to semi-discrete approximations consisting of systems of ordinary differential equations. Our approach is to apply to the resulting system a stochastic scheme which is essentially a direct simulation of the corresponding infinitesimal dynamics. This automatically provides time adaptivity and, in one space dimension, stable approximations of diffusion operators on non-uniform grids and the possibility of using moving cells for the transport part, all within the framework of an explicit method. We present several results in one space dimension, including free boundary problems, but the general algorithm is simple and flexible, and on uniform grids it can be formulated for general evolution partial differential equations in arbitrary space dimensions.
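The core of such a scheme, reduced to pure diffusion, is a Gillespie-style direct simulation of particles hopping between grid cells at rate D/h². A minimal sketch; convection and reaction events would be added as further event channels:

```cpp
// Direct (Gillespie-style) simulation of the Markov jump approximation of
// u_t = D u_xx on M cells: each particle hops to a neighboring cell at rate
// D/h^2 per direction. Diffusion only, with reflecting boundaries; the
// paper's scheme adds transport and reaction channels the same way.
#include <cstdio>
#include <random>
#include <vector>

int main() {
    const int M = 50;
    const double D = 1.0, h = 1.0 / M, T = 0.005;
    std::vector<long> n(M, 0);
    n[M / 2] = 10000;                          // all mass in the middle cell
    const long total = n[M / 2];               // hops conserve the particle number
    const double rate = D / (h * h);           // per-particle hop rate per direction

    std::mt19937 rng(7);
    double t = 0.0;
    while (t < T) {
        const double R = 2.0 * rate * total;   // total event rate (two directions)
        t += std::exponential_distribution<double>(R)(rng);  // waiting time to next jump
        long r = std::uniform_int_distribution<long>(0, total - 1)(rng);
        int i = 0;
        while (r >= n[i]) r -= n[i++];         // pick a cell with probability n[i]/total
        const int j = std::uniform_int_distribution<int>(0, 1)(rng) ? i + 1 : i - 1;
        if (j < 0 || j >= M) continue;         // reflecting boundary: hop rejected
        --n[i];
        ++n[j];
    }
    std::printf("t = %g, center cell holds %ld particles\n", t, n[M / 2]);
}
```

Because the waiting times come from the current total rate, the time step adapts automatically to the dynamics, which is the property the abstract highlights.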
16.
In this paper we present a compact library for the analysis of nuclear spectra. The library consists of sophisticated functions for background elimination, smoothing, peak searching, deconvolution, and peak fitting. The functions can process one- and two-dimensional spectra. The software comprises a number of conventional as well as newly developed methods needed to analyze experimental data.
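As an example of the library's ingredients, the 1-D SNIP clipping loop on which the background estimation is based is only a few lines. A minimal version; the library adds 2-D variants, decreasing windows, and other options:

```cpp
// Minimal 1-D SNIP background clipping: repeatedly replace each channel by
// the smaller of itself and the average of its p-th neighbors, growing the
// window p. Peaks are clipped away while the smooth background survives.
#include <algorithm>
#include <cstdio>
#include <vector>

std::vector<double> snipBackground(std::vector<double> v, int m) {
    const int N = (int)v.size();
    for (int p = 1; p <= m; ++p) {             // window half-width grows each pass
        std::vector<double> w = v;
        for (int i = p; i < N - p; ++i)
            w[i] = std::min(v[i], 0.5 * (v[i - p] + v[i + p]));
        v = std::move(w);
    }
    return v;                                  // estimated background
}

int main() {
    std::vector<double> spec(64, 10.0);
    for (int i = 28; i < 36; ++i) spec[i] = 100.0;   // a fake peak on a flat background
    auto bg = snipBackground(spec, 8);
    std::printf("background under peak ~ %g\n", bg[31]);  // close to 10, not 100
}
```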
Program summary
Program title: SpecAnalysLib 1.1
Catalogue identifier: AEDZ_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDZ_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 42 154
No. of bytes in distributed program, including test data, etc.: 2 379 437
Distribution format: tar.gz
Programming language: C++
Computer: Pentium 3 PC 2.4 GHz or higher; Borland C++ Builder v. 6. A precompiled Windows version is included in the distribution package
Operating system: Windows, 32-bit versions
RAM: 10 MB
Word size: 32 bits
Classification: 17.6
Nature of problem: The demand for advanced, highly effective experimental data analysis functions is enormous. The library package represents one approach to giving physicists the possibility to use advanced routines simply by calling them from their own programs. SpecAnalysLib is a collection of functions for the analysis of one- and two-parameter γ-ray spectra, but they can be used for other types of data as well. The library consists of sophisticated functions for background elimination, smoothing, peak searching, deconvolution, and peak fitting.
Solution method: The algorithms of background estimation are based on the Sensitive Non-linear Iterative Peak (SNIP) clipping algorithm. The smoothing algorithms are based on the convolution of the original data with several types of filters and on algorithms based on discrete Markov chains. The peak searching algorithms use smoothed second differences and can search for peaks of general form. The deconvolution (decomposition, unfolding) functions use the Gold iterative algorithm, its improved high-resolution version, and the Richardson-Lucy algorithm. In the peak fitting algorithms we have implemented two approaches. The first is based on the algorithm without matrix inversion (AWMI), which allows fitting of large blocks of data and large numbers of parameters. The other is based on solving the system of linear equations using the Stiefel-Hestenes method; it converges faster than AWMI but is not suitable for fitting a large number of parameters.
Restrictions: The dimensionality of the analyzed data is limited to two.
Unusual features: A dynamically loadable library (DLL) of processing functions that users can call from their own programs.
Running time: Most processing routines execute interactively or in a few seconds. Computationally intensive routines (deconvolution, fitting) take longer, depending on the number of iterations specified and the volume of the processed data.
17.
The simulation of fabrics, clothes, and flexible materials is an essential topic in computer animation of realistic virtual humans and dynamic sceneries. New emerging technologies, such as interactive digital TV and multimedia products, make it necessary to develop powerful tools that can perform real-time simulations. Parallelism is one such tool. When analyzing fabric simulations computationally, we found these codes to belong to the complex class of irregular applications. Frequently this kind of code includes reduction operations in its core, so that an important fraction of the computational time is spent on such operations. In fabric simulators these operations appear when evaluating forces, giving rise to the equation system to be solved. For this reason, this paper discusses only this phase of the simulation. This paper analyzes and evaluates different irregular reduction parallelization techniques on ccNUMA shared memory machines, applied to a real, physically-based fabric simulator we have developed. Several issues are taken into account in order to achieve high code performance, such as exploitation of data access locality and parallelism, as well as careful use of memory resources (memory overhead). In this paper we use the concept of data affinity to develop various efficient algorithms for reduction parallelization exploiting data locality.
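The baseline technique in this family is easy to state: give each thread a private copy of the reduction array, scatter into it without synchronization, and combine the copies at the end. A minimal sketch; the paper's data-affinity, locality-aware variants go further:

```cpp
// Baseline irregular-reduction parallelization: per-thread private copies of
// the force array (no locks on shared data), combined after the parallel
// phase. The paper's locality-aware algorithms refine this basic idea.
#include <cstdio>
#include <thread>
#include <vector>

struct Edge { int a, b; };                     // a spring between two particles

int main() {
    const int nP = 8, nT = 2;
    std::vector<Edge> edges = {{0,1},{1,2},{2,3},{3,4},{4,5},{5,6},{6,7},{7,0}};
    std::vector<double> force(nP, 0.0);
    std::vector<std::vector<double>> local(nT, std::vector<double>(nP, 0.0));

    auto work = [&](int t) {                   // each thread owns a slice of the edges
        for (size_t e = t; e < edges.size(); e += nT) {
            const double f = 1.0;              // stand-in for the spring force magnitude
            local[t][edges[e].a] += f;         // scatter into the private buffer
            local[t][edges[e].b] -= f;
        }
    };
    std::vector<std::thread> pool;
    for (int t = 0; t < nT; ++t) pool.emplace_back(work, t);
    for (auto& th : pool) th.join();

    for (int t = 0; t < nT; ++t)               // final combine of the private buffers
        for (int i = 0; i < nP; ++i) force[i] += local[t][i];
    std::printf("force[0] = %g\n", force[0]);
}
```

The trade-off the paper studies is exactly the one visible here: replication removes synchronization but costs memory and a combine pass, which is why data affinity and locality matter on ccNUMA machines.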
18.
Rui P.S. Fartaria, Pedro C.R. Rodrigues, Fernando M.S. Silva Fernandes, Computer Physics Communications 2006, 175(2): 116-121
A time-saving algorithm for the Metropolis Monte Carlo method is presented. The technique is tested with different potential models and numbers of particles. The coupling of the method with neighbor lists, linked lists, Ewald sum and reaction field techniques is also analyzed. It is shown that the proposed algorithm is particularly suitable for computationally heavy intermolecular potentials.
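For context, the move that any such time-saving scheme wraps is the standard Metropolis acceptance step; a one-particle toy version (the paper's actual speed-up technique is not reproduced here):

```cpp
// Standard Metropolis step: propose a displacement, compute the energy
// change, accept with probability min(1, exp(-dE/kT)). Baseline move only;
// the paper's time-saving algorithm wraps this with cheaper bookkeeping.
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(3);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const double kT = 1.0;
    double x = 0.0;                            // 1-D particle in the potential U(x) = x^2
    auto U = [](double x) { return x * x; };

    int accepted = 0;
    const int nSteps = 100000;
    for (int s = 0; s < nSteps; ++s) {
        const double xNew = x + (u(rng) - 0.5);        // trial displacement
        const double dE = U(xNew) - U(x);
        if (dE <= 0.0 || u(rng) < std::exp(-dE / kT)) { x = xNew; ++accepted; }
    }
    std::printf("acceptance ratio %.2f\n", double(accepted) / nSteps);
}
```

For expensive intermolecular potentials the dominant cost is the energy evaluation inside this loop, which is where a time-saving algorithm pays off.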
19.
Stereo mini-jet cells will be indispensable components of a future e+e− linear collider central tracker such as JLC-CDC. There is, however, no official Geant4 solid available at present to describe such geometrical objects, which had been a major obstacle to developing a full Geant4-based simulator with stereo cells built in. We have thus extended Geant4 to include a new solid (TwistedTubs), which consists of three kinds of surfaces: two end planes, inner and outer hyperboloidal surfaces, and two so-called twisted surfaces that form slant and twisted φ-boundaries. The design philosophy and its realization in the Geant4 framework are described together with algorithmic details. We have implemented stereo cells with the new solid and tested them using geantinos and Pythia events (e+e− → ZH at GeV). The performance was found to be reasonable: the stereo cells consumed only 25% more CPU time than ordinary axial cells.
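The solid described here corresponds to what now ships in Geant4 as G4TwistedTubs. A construction sketch with illustrative parameter values; the exact constructor overload should be checked against the Geant4 version in use:

```cpp
// Constructing a stereo drift cell with G4TwistedTubs. Parameter values are
// illustrative assumptions, not the JLC-CDC geometry; verify the constructor
// signature against the Geant4 release being used.
#include "G4SystemOfUnits.hh"
#include "G4TwistedTubs.hh"

G4VSolid* MakeStereoCell() {
    // name, twist angle, end-plane inner/outer radii, half z-length, phi width
    return new G4TwistedTubs("stereoCell",
                             10.0 * deg,      // stereo twist angle
                             20.0 * cm,       // inner radius at the end planes
                             21.0 * cm,       // outer radius at the end planes
                             115.0 * cm,      // half length along z
                             15.0 * deg);     // phi extent of one cell
}
```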
20.
The Particle Flow Analysis (PFA) is currently under intense study as the most promising way to achieve the precision jet energy measurements required at a future linear e+e− collider. In order to optimize detector configurations and to tune the PFA, it is crucial to identify factors that limit PFA performance and to clarify the fundamental limits on jet energy resolution that remain even with a perfect PFA and an infinitely granular calorimeter. This necessitates a tool to connect each calorimeter hit in a particle shower to its parent charged track, if any, and eventually all the way back to its corresponding primary particle, while identifying possible interactions and decays along the way. In order to realize this within a realistic memory space, we have developed a set of C++ classes that facilitates history keeping of particle tracks within the framework of Geant4. This software tool, hereafter called J4HistoryKeeper, comes in handy in particular when one needs to stop the history keeping, for memory economy, at multiple geometrical boundaries beyond which a particle shower is expected to start. In this paper this software tool is described and applied to a generic detector model to demonstrate its functionality.
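The bookkeeping idea can be sketched with the standard Geant4 user-action hooks; J4HistoryKeeper itself is more elaborate, in particular in its ability to stop recording at chosen geometrical boundaries to bound memory use:

```cpp
// Sketch of the underlying idea using standard Geant4 hooks: record each
// track's parent ID so a calorimeter hit can be walked back to its primary.
// J4HistoryKeeper goes further (boundary-aware truncation of the history).
#include <map>
#include "G4Track.hh"
#include "G4UserTrackingAction.hh"

class HistoryAction : public G4UserTrackingAction {
public:
    void PreUserTrackingAction(const G4Track* trk) override {
        parent_[trk->GetTrackID()] = trk->GetParentID();  // parent ID 0 marks a primary
    }
    int PrimaryOf(int trackID) const {                    // walk the chain to the primary
        while (true) {
            auto it = parent_.find(trackID);
            if (it == parent_.end() || it->second == 0) return trackID;
            trackID = it->second;
        }
    }
private:
    std::map<int, int> parent_;                           // trackID -> parent trackID
};
```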