Similar literature
Found 20 similar documents (search time: 46 ms)
1.
We present a new algorithm, called linked neighbour list (LNL), useful to substantially speed up off-lattice Monte Carlo simulations of fluids by avoiding the computation of the molecular energy before every attempted move. We introduce a few variants of the LNL method targeted to minimise memory footprint or augment memory coherence and cache utilisation. Additionally, we present a few algorithms which drastically accelerate neighbour finding. We test our methods on the simulation of a dense off-lattice Gay-Berne fluid subjected to periodic boundary conditions observing a speedup factor of about 2.5 with respect to a well-coded implementation based on a conventional link-cell. We provide several implementation details of the different key data structures and algorithms used in this work.

2.
Density estimation techniques such as the photon map method rely on a particle transport simulation to reconstruct indirect illumination, which is proportional to the particle density. In the photon map framework, particles are usually located using nearest‐neighbour methods due to their generality. However, these methods have an inherent tradeoff between local bias and noise in the reconstructed illumination, which depends on the density estimate bandwidth. This paper presents a bias compensating operator for nearest‐neighbour density estimation which adapts the bandwidth according to the estimated bias in the reconstructed illumination. ACM CSS: I.3.7 Three‐Dimensional Graphics and Realism Raytracing

3.
A concept of vectorization of molecular dynamics Fortran programs for use on the Cyber 205 machine is presented. It is shown that for calculations with larger particle systems the program runs faster on the 205 than on the Cray-1 by about a factor of two. Compared with conventional computers like the Cyber 175, an acceleration by a factor of 10–15 is expected. A bit control vector is used instead of a neighbour list, which in principle allows calculations with up to 6912 particles given the memory capacity of the Cyber 205. However, because the application of the bit vector requires computation times which grow proportionally to N^2, the CPU time for particle numbers of more than 2048 becomes prohibitively large.

4.
This paper discusses techniques for the computation of global illumination in environments with a participating medium using a Monte Carlo simulation of the particle model of light. Efficient algorithms and data structures for tracking the particles inside the volume have been developed. The necessary equation for computing the illumination along any given direction has been derived for rendering a scene with a participating medium. A major issue in any Monte Carlo simulation is the uncertainty in the final simulation results. Various steps of the algorithm have been analysed to identify major sources of uncertainty. To reduce the uncertainty, suitable modifications to the simulation algorithm have been suggested using variance reduction methods of forced collision, absorption suppression and particle divergence. Some sample scenes showing the results of applying these methods are also included.

5.
We present a scalable dissipative particle dynamics simulation code, fully implemented on Graphics Processing Units (GPUs) using a hybrid CUDA/MPI programming model, which achieves 10–30 times speedup on a single GPU over 16 CPU cores and almost linear weak scaling across a thousand nodes. A unified framework is developed within which the efficient generation of the neighbor list and maintaining particle data locality are addressed. Our algorithm generates strictly ordered neighbor lists in parallel, while the construction is deterministic and makes no use of atomic operations or sorting. Such a neighbor list leads to optimal data loading efficiency when combined with a two-level particle reordering scheme. A faster in situ generation scheme for Gaussian random numbers is proposed using precomputed binary signatures. We designed custom transcendental functions that are fast and accurate for evaluating the pairwise interaction. The correctness and accuracy of the code are verified through a set of test cases simulating Poiseuille flow and spontaneous vesicle formation. Computer benchmarks demonstrate the speedup of our implementation over the CPU implementation as well as strong and weak scalability. A large-scale simulation of spontaneous vesicle formation consisting of 128 million particles was conducted to further illustrate the practicality of our code in real-world applications.

6.
We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP type model. We first describe a simple version which requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. We then outline an improved version that requires, with high probability, only r ≤ (4k+6) log(2/3p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 | ln^(i+1) n ≤ (2/3p)^(2i+1)}. Note that k is an extremely small number. For n and p ≥ 4, the value of k is at most 2. Hence, for a given number of processors, p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 1,500,000 and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. For n ≤ 10^(10^100) and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 118; and we conjecture that the actual number of communication rounds required will not exceed 50. Our algorithm has a considerably smaller number of communication rounds than the list ranking algorithm used in Reid-Miller’s empirical study of parallel list ranking on the Cray C-90.(1) To our knowledge, Reid-Miller’s algorithm(1) was the fastest list ranking implementation so far. Therefore, we expect that our result will have considerable practical relevance.

7.
Finding the set of nearest images of a point in a simulation cell with periodic (torus) boundary conditions is of central importance for molecular dynamics algorithms. To compute all pairwise distances closer than a given cutoff in linear time requires region-based neighbor-listing algorithms. Available algorithms encounter increasing difficulties when the cutoff distance exceeds half the shortest cell length. This work provides details on two ways to directly and efficiently generate region–region interaction lists in n-dimensional space, free from the minimum image restriction. The solution is based on a refined version of existing algorithms solving the closest vector problem. A self-contained discussion of lattice reduction methods for efficient higher-dimensional searches is also provided. In the MD setting, these reduction criteria provide useful guidelines for lattice compaction.
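To make the difficulty mentioned in this abstract concrete, here is a minimal Python sketch of what happens once the cutoff exceeds half the cell length: a pair of points can then interact through more than one periodic image. The sketch assumes an orthorhombic box and simply enumerates image shifts by brute force; it is not the lattice-reduction or closest-vector method of the paper, and the function name `images_within_cutoff` is hypothetical.

```python
import itertools
import math


def images_within_cutoff(ri, rj, box, rc):
    """Return every displacement from ri to a periodic image of rj shorter than rc.

    Assumes an orthorhombic box given as a tuple of edge lengths (an
    assumption of this sketch; the abstract treats general cells).
    """
    # Minimum-image displacement: each component lies in [-L/2, L/2].
    d0 = [(b - a) - L * round((b - a) / L) for a, b, L in zip(ri, rj, box)]
    # Number of image shells per axis that can still fall inside the cutoff.
    nmax = [int(math.ceil(rc / L)) for L in box]
    found = []
    for shift in itertools.product(*(range(-n, n + 1) for n in nmax)):
        d = tuple(x + s * L for x, s, L in zip(d0, shift, box))
        if math.sqrt(sum(c * c for c in d)) < rc:
            found.append(d)
    return found
```

With a unit box and a cutoff equal to the box length, a pair separated by 0.25 along x is seen twice (directly and through the neighbouring image), whereas a cutoff below half the box length yields a single image, which is the regime where the usual minimum image convention suffices.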

8.
To generate the structure and parameters of a fuzzy rule base automatically, a particle swarm optimization algorithm with different length of particles (DLPPSO) is proposed in this paper. The main finding is that the structure and parameters of a fuzzy rule base can be generated automatically by the proposed PSO. In this method, the best fitness (fgbest) and the number (Ngbest) of active rules of the best particle in the current generation, the best fitness (fpbesti) which the ith particle has achieved so far, and the number (Npbesti) of its active rules when that best position emerged are used to determine the active rules of the ith particle in each generation. To increase the diversity of structure, a mutation operator is used to change the number of active rules for particles. Compared with some other PSOs with different length of particles, the algorithm has good adaptive performance. To demonstrate the effectiveness of the given algorithm, a nonlinear function and two time series are used in the simulation experiments. Simulation results demonstrate that the proposed method can approximate the nonlinear function and forecast the time series efficiently.

9.
Obtaining reliable estimates of the statistical properties of complex macromolecules by computer simulation is a task that requires high computational effort as well as the development of highly efficient simulation algorithms. We present here an algorithm combining local moves, the pivot algorithm, and an adjustable simulation lattice box for simulating dilute systems of bottle-brush polymers with a flexible backbone and flexible side chains under good solvent conditions. Applying this algorithm to the bond fluctuation model, very precise estimates of the mean square end-to-end distances and gyration radii of the backbone and side chains are obtained, and the conformational properties of such a complex macromolecule are studied. Varying the backbone length (from Nb=67 to Nb=1027), side chain length (from N=0 to N=24 or 48), the scaling predictions for the backbone behavior as well as the side chain behavior are checked. We are also able to give a direct comparison of the structure factor between experimental data and the simulation results.

10.
11.
In this paper we describe some of the salient features of our search program for finding good lattices. The reciprocals of these lattices are used in lattice integration rules, of which number theoretic rules form a major subset. We describe algorithms for ρ(Λ), the Zaremba index (or figure of merit) of an integer lattice Λ. We describe a search algorithm that finds ρ(N), the maximum of ρ(Λ) over lattices of order N. One feature of our search is that it can exploit the symmetry of ρ without significantly slowing down the program to list symmetric copies. We have also developed other interactions between the search algorithm and the algorithm for ρ(Λ) that have a significant effect on the speed of the program. The paper is theoretical, providing the mathematical basis for these algorithms. However, we give a list of all the three-dimensional good lattices of order not exceeding N=4,000. This list has 68 entries, 40 of which are new.

12.
Simulating the evolution of dust particles provides an important way to analyze the impact of dust on human beings and the environment. The Kinetic Monte Carlo (KMC) method is one of the important approaches to dynamic simulation of particle motion. Based on the KMC method, a simulation algorithm for the evolution of dust particles under the influence of natural and human factors in a virtual campus is proposed. The experimental results demonstrate the accuracy and effectiveness of the simulation algorithm by comparison with actual results. The simulation and visualization results will provide a reference for city planning and pollution prediction.

13.
Atomistic simulations of thin film deposition, based on the lattice Monte Carlo method, provide insights into the microstructure evolution at the atomic level. However, large-scale atomistic simulation is limited on a single computer due to memory and speed constraints. Parallel computation, although promising in memory and speed, has not been widely applied in these simulations because of the intimidating overhead. The key issue in achieving optimal performance is, therefore, to reduce communication overhead among processors. In this paper, we propose a new parallel algorithm for the simulation of large-scale thin film deposition incorporating two optimization strategies: (1) domain decomposition with sub-domain overlapping and (2) asynchronous communication. This algorithm was implemented both on message-passing-processor systems (MPP) and on cluster computers. We found that both architectures are suitable for parallel Monte Carlo simulation of thin film deposition in either a distributed memory mode or a shared memory mode with message-passing libraries.

14.
The linked cell algorithm is an essential part of molecular simulation software, both molecular dynamics and Monte Carlo. Though it scales linearly with the number of particles, there has been a constant interest in increasing its performance, because a large part of CPU time is spent identifying the interacting particles. Several recent publications proposed improvements to the algorithm and investigated their efficiency by applying them to particular setups. Here we develop a general method to evaluate the efficiency of these algorithms which is mostly independent of the parameters of the simulation, and test it for a number of linked cell algorithms. We also propose a combination of linked cell reordering and interaction sorting that performs well for a broad range of simulation setups.
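For readers unfamiliar with the linked cell idea this abstract refers to, the following Python sketch shows the basic, unoptimised form: particles are binned into cells at least one cutoff wide, and only particles in the same or adjacent cells are tested. It assumes a cubic periodic box of edge `L` and a cutoff `rc` no larger than `L/2`; the function names are hypothetical, and none of the reordering or interaction-sorting improvements discussed in the abstract are included.

```python
import itertools


def min_image_dist2(a, b, L):
    """Squared minimum-image distance between points a and b in a cubic box."""
    d2 = 0.0
    for x, y in zip(a, b):
        d = x - y
        d -= L * round(d / L)
        d2 += d * d
    return d2


def linked_cell_pairs(pos, L, rc):
    """Return the set of index pairs (i, j), i < j, closer than rc.

    Assumes a cubic periodic box of edge L and rc <= L / 2.
    """
    nc = max(1, int(L // rc))  # cells per axis, each at least rc wide
    cells = {}
    for i, p in enumerate(pos):
        key = tuple(int((x % L) / L * nc) % nc for x in p)
        cells.setdefault(key, []).append(i)
    pairs = set()
    for key, members in cells.items():
        # A set removes duplicate neighbour cells when nc < 3.
        neighbours = {tuple((k + s) % nc for k, s in zip(key, shift))
                      for shift in itertools.product((-1, 0, 1), repeat=3)}
        for nkey in neighbours:
            for i in members:
                for j in cells.get(nkey, ()):
                    if i < j and min_image_dist2(pos[i], pos[j], L) < rc * rc:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(pos, L, rc):
    """Reference O(N^2) pair search used to check the cell-based result."""
    return {(i, j) for i, j in itertools.combinations(range(len(pos)), 2)
            if min_image_dist2(pos[i], pos[j], L) < rc * rc}
```

Because each cell holds a bounded number of particles at fixed density, the cell-based search does work proportional to the number of particles, while the brute-force reference grows quadratically; comparing the two on random positions is a convenient correctness check.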

15.
An image encryption algorithm based on hyper-chaos and DNA sequence   (Total citations: 2; self-citations: 0; citations by others: 2)
A novel image encryption algorithm making use of hyper-chaos and DNA sequence is presented here. A four-dimensional hyper-chaos system is used to generate the pseudo-random sequence, which is transformed into a biologic DNA sequence to diffuse the image blocks. A circular permutation is performed on the plain-image when it is in DNA status. Together with the classical structure of permutation plus diffusion, the simulation results show that the proposed image encryption algorithm has a satisfactory performance. Moreover, our method can resist known-plaintext and chosen-plaintext attacks because four parameters r_i (i = 1, 2, 3, 4) depend on the plain-image. These parameters generate different key streams for different plain-images even if the initial conditions are the same.

16.
We propose a novel algorithm, called REGGAE, for the generation of momenta of a given sample of particle masses, evenly distributed in Lorentz-invariant phase space and obeying energy and momentum conservation. In comparison to other existing algorithms, REGGAE is designed for use in multiparticle production in hadronic and nuclear collisions where many hadrons are produced and a large part of the available energy is stored in the form of their masses. The algorithm uses a loop simulating multiple collisions which lead to production of configurations with reasonably large weights.

Program summary

Program title: REGGAE (REscattering-after-Genbod GenerAtor of Events)
Catalogue identifier: AEJR_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEJR_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 1523
No. of bytes in distributed program, including test data, etc.: 9608
Distribution format: tar.gz
Programming language: C++
Computer: PC Pentium 4, though no particular tuning for this machine was performed.
Operating system: Originally designed on Linux PC with g++, but it has also been compiled and run successfully on OS X with g++ and MS Windows with Microsoft Visual C++ 2008 Express Edition.
RAM: This depends on the number of particles which are generated. For 10 particles, as in the attached example, it requires about 120 kB.
Classification: 11.2
Nature of problem: The task is to generate momenta of a sample of particles with given masses which obey energy and momentum conservation. Generated samples should be evenly distributed in the available Lorentz-invariant phase space.
Solution method: In general, the algorithm works in two steps. First, all momenta are generated with the GENBOD algorithm. There, particle production is modeled as a sequence of two-body decays of heavy resonances. After all momenta are generated this way, they are reshuffled. Each particle undergoes a collision with some other partner such that in the pair center of mass system the new directions of momenta are distributed isotropically. After each particle collides only a few times, the momenta are distributed evenly across the whole available phase space. Starting with GENBOD is not essential for the procedure but it improves the performance.
Running time: This depends on the number of particles and number of events one wants to generate. On a Linux PC with a 2 GHz processor, generation of 1000 events with 10 particles each takes about 3 s.
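The reshuffling step described under "Solution method" can be illustrated with a short Python sketch of a single pairwise collision: boost the pair to its centre-of-mass frame, pick a new isotropic direction for the back-to-back momenta, and boost back. This is only an illustration under stated assumptions (natural units with c = 1, four-momenta stored as `(E, px, py, pz)` tuples, massive pairs); the function names `boost` and `collide` are hypothetical and this is not the distributed C++ code.

```python
import math
import random


def boost(p, beta):
    """Lorentz-boost four-vector p = (E, px, py, pz) by velocity beta (c = 1)."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return tuple(p)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    e, px, py, pz = p
    bp = beta[0] * px + beta[1] * py + beta[2] * pz
    coef = (gamma - 1.0) * bp / b2 + gamma * e
    return (gamma * (e + bp),
            px + coef * beta[0],
            py + coef * beta[1],
            pz + coef * beta[2])


def collide(p1, p2, rng=random):
    """Redistribute the momenta of one pair isotropically in its CM frame."""
    total = tuple(a + b for a, b in zip(p1, p2))
    beta = tuple(c / total[0] for c in total[1:])  # velocity of the pair CM frame
    minus_beta = tuple(-b for b in beta)
    q1 = boost(p1, minus_beta)  # particle 1 seen from the CM frame
    q2 = boost(p2, minus_beta)  # particle 2 seen from the CM frame
    pmag = math.sqrt(q1[1] ** 2 + q1[2] ** 2 + q1[3] ** 2)
    # Isotropic direction: uniform in cos(theta) and in phi.
    cos_t = rng.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    d = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
    # Same momentum magnitude and energies, new back-to-back direction.
    n1 = (q1[0],) + tuple(pmag * x for x in d)
    n2 = (q2[0],) + tuple(-pmag * x for x in d)
    return boost(n1, beta), boost(n2, beta)
```

Because only the direction of the centre-of-mass momenta changes, each collision preserves both particle masses and the pair's total four-momentum, so repeating it over many pairs keeps the whole event on the energy-momentum conserving surface.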

17.
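To address the low filtering accuracy and particle degeneracy of the extended Kalman particle filter, the Markov chain Monte Carlo (MCMC) method is combined with the extended Kalman particle filter and applied to target tracking. The algorithm uses the extended Kalman filter to construct the proposal distribution of the particle filter, so that the proposal distribution incorporates the latest observations and yields a posterior probability distribution closer to the true state. The MCMC method is also introduced to optimize the chosen proposal distribution, making the sampled particles more diverse. Simulation results show that the algorithm effectively resolves the particle impoverishment problem and improves filtering accuracy.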

18.
One of the most efficient non-perturbative methods for the calculation of thermal properties of quantum systems is the Hybrid Monte Carlo algorithm, as evidenced by its use in large-scale lattice quantum chromodynamics calculations. The performance of this algorithm is determined by the speed at which the fermion operator is applied to a given vector, as it is the central operation in the preconditioned conjugate gradient iteration. We study a simple implementation of these operations for the fermion matrix of the Hubbard model in d+1 spacetime dimensions, and report a performance comparison between a 2.66 GHz Intel Xeon E5430 CPU and an NVIDIA Tesla C1060 GPU using double-precision arithmetic. We find speedup factors ranging between 30 and 350 for d=1, and in excess of 40 for d=3. We argue that such speedups are of considerable impact for large-scale simulational studies of quantum many-body systems.

19.
The application of a variable time-step and unstructured adaptive mesh refinement in the parallel three-dimensional Direct Simulation Monte Carlo (DSMC) method is presented. A variable time-step method using the conservation of particle fluxes (mass, momentum and energy) across the cell interface is implemented to reduce the number of simulated particles and the number of iterations of the transient period towards steady state, without sacrificing the solution accuracy. In addition, a three-dimensional h-refined unstructured adaptive mesh with simple but effective mesh-quality control, obtained from a preliminary parallel DSMC simulation, is used to increase the accuracy of the DSMC solution. The completed code is then applied to compute several external and internal flows, and compared with previous results wherever available.

20.
The list marking problem involves marking the nodes of an ℓ-node linked list stored in the memory of a (p, n)-PRAM, when only the position of the head of the list is initially known, while the remaining list nodes are stored in arbitrary memory locations. Under the assumption that cells containing list nodes bear no distinctive tags distinguishing them from other cells, we establish an Ω(min{ℓ, n/p}) randomized lower bound for ℓ-node lists and present a deterministic algorithm whose running time is within a logarithmic additive term of this bound. Such a result implies that randomization cannot be exploited in any significant way in this setting. For the case where list cells are tagged in a way that differentiates them from other cells, the above lower bound still applies to deterministic algorithms, while we establish a tight bound for randomized algorithms. Therefore, in the latter case, randomization yields a better performance for a wide range of parameter values.
