Similar Literature
20 similar documents found (search time: 31 ms)
1.
The Transient Reactor Analysis Code (TRAC), which features a two-fluid treatment of thermal-hydraulics, is designed to model transients in water reactors and related facilities. One of the major computational costs associated with TRAC and similar codes is calculating constitutive coefficients. Although the formulations for these coefficients are local, the costs are flow-regime- or data-dependent; i.e., the computations needed for a given spatial node often vary widely as a function of time. Consequently, a fixed, uniform assignment of nodes to parallel processors will result in degraded computational efficiency due to the poor load balancing. A standard method for treating data-dependent models on vector architectures has been to use gather operations (or indirect addressing) to sort the nodes into subsets that (temporarily) share a common computational model. However, this method is not effective on distributed-memory data-parallel architectures, where indirect addressing involves expensive communication overhead. Another serious problem with this method involves software engineering challenges in the areas of maintainability and extensibility. For example, an implementation that was hand-tuned to achieve good computational efficiency would have to be rewritten whenever the decision tree governing the sorting was modified. Using an example based on the calculation of the wall-to-liquid and wall-to-vapor heat-transfer coefficients for three nonboiling flow regimes, we describe how the Fortran 90 WHERE construct and automatic inlining of functions can be used to ameliorate this problem while improving both efficiency and software engineering. Unfortunately, a general automatic solution to the load-balancing problem associated with data-dependent computations is not yet available for massively parallel architectures.
We discuss why developers should either wait for such solutions or consider alternative numerical algorithms, such as a neural network representation, that do not exhibit load-balancing problems.
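The masked, uniform-control-flow style of computation that the WHERE construct enables can be sketched in C. This is a hypothetical illustration only: the regime labels and the constant coefficient models below are invented for the sketch and are not taken from TRAC.

```c
#include <stddef.h>

/* Hypothetical flow regimes for the sketch (not TRAC's actual set). */
enum { LAMINAR = 0, TRANSITION = 1, TURBULENT = 2 };

/* Evaluate a regime-dependent heat-transfer coefficient for every node
   with a single uniform loop, analogous to masked WHERE assignment:
   every node executes the same body, and the regime test selects which
   model contributes.  The models are placeholder formulas. */
void wall_htc(const int *regime, const double *re, double *h, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        double lam = 3.66;            /* placeholder laminar model    */
        double tur = 0.023 * re[i];   /* placeholder turbulent model  */
        h[i] = (regime[i] == LAMINAR)   ? lam
             : (regime[i] == TURBULENT) ? tur
             : 0.5 * (lam + tur);     /* blended transition model     */
    }
}
```

Because every node takes the same path through the loop, no gather/scatter or indirect addressing is needed, which is the property the abstract highlights for data-parallel architectures.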

2.
Yu Rui, Zhao Qiang. Atomic Energy Science and Technology (《原子能科学技术》), 2015, 49(10): 1833-1838
The method of characteristics (MOC) is currently one of the main methods for solving the reactor neutron transport equation. This paper develops an OpenMP-based parallel program for the characteristics solution of the neutron transport equation to improve its computational efficiency. OpenMP is a parallel programming model for shared-memory architectures; it uses a fork-join execution model and is well suited to SMP shared-memory multiprocessor systems and multi-core processor architectures. Verification against relevant benchmark problems shows that the program achieves good accuracy in the effective multiplication factor and the relative neutron flux (normalized pin-cell power) distribution, and that OpenMP yields good speedup, significantly reducing the computation time.
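A minimal C sketch of the fork-join pattern the abstract describes: a single `parallel for` forks a team of threads over independent characteristic lines, then joins before the result is used. The flat arrays and the first-order attenuation update are illustrative, not the program's actual MOC kernel.

```c
#include <stddef.h>

/* Sweep all characteristic lines in parallel.  Each line is independent,
   so the loop body needs no synchronisation; the implicit barrier at the
   end of the parallel for is the "join" of the fork-join model.  The
   pragma is ignored (and the code runs serially) without OpenMP support. */
void sweep_lines(const double *sigma_t, const double *seg_len,
                 double *line_flux, size_t n_lines)
{
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n_lines; ++i) {
        double atten = sigma_t[i] * seg_len[i];
        line_flux[i] *= 1.0 - atten;   /* illustrative attenuation step */
    }
}
```

With `schedule(static)` each thread receives a contiguous block of lines, which is the simplest load-sharing choice when all lines cost roughly the same.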

3.
In this paper we consider the implementation of particle-tracking Monte Carlo on two different types of parallel computer architecture, namely (i) the AMT DAP-610 array of processors, which is a SIMD machine, and (ii) the transputer-based Meiko Computing Surface, which is a MIMD machine. An analogue, research-level, fixed-source particle transport Monte Carlo code for studying the attenuation and leakage of gamma-ray photons in simple multilayer shielding configurations, originally written for serial computers, is modified and rewritten to run efficiently on these two different types of parallel machine. The philosophy adopted and the algorithms developed in transferring the code to the parallel machines are described. Two illustrative problems are solved using realistic cross-section data, involving a 9 MeV source of gamma photons in a lead-void-water sphere and slab. Integral quantities, such as the fractions of particles absorbed and escaped, and differential quantities, such as flux distributions and leakage spectra, computed by the parallel codes are presented in tables and graphs. For equivalent calculations, the CPU times on the DAP and a number of serial computers (e.g. ICL-3900, CRAY XMP/28 in scalar mode, SUN 4, and VAX 11/750) are compared and the resulting speedup factors quoted. For the Meiko Computing Surface, the performance obtained as the number of transputers is varied is tabulated. Finally, the performance of the two parallel machines is compared.

4.
This paper describes and analyzes various distributed processor architectures using commercially available CAMAC components. The general orientation is toward distributed control systems using Digital Equipment Corporation LSI-11 processors in a CAMAC environment. The paper describes in detail software tools available to simplify the development of applications software and to provide a high-level runtime environment both at the host and the remote processors. Discussion focuses on techniques for downloading of operating systems from a large host and applications tasks written in high-level languages. It also discusses software tools which enable tasks in the remote processors to exchange messages and data with tasks in the host in a simple and elegant way.

5.
Complex physical phenomena usually consist of several tightly coupled physical processes, and their numerical simulation is likewise performed by tightly coupling multiple parallel application codes, each suited to a different process. How to design the coupling algorithms between these processes is a problem worth studying: the algorithms must ensure efficient data transfer between the codes, efficient execution of each code and of the overall simulation, and independent development of each code. Based on the multiphysics parallel numerical simulation of radiation hydrodynamics and neutron transport widely used in high-temperature, high-pressure physics research, this paper proposes two coupling algorithms on unstructured meshes: a fully loose coupling algorithm and a two-level tight coupling algorithm. The former emphasizes efficient independent execution and independent development of the codes; the latter builds on the former and additionally balances efficient data transfer against overall simulation efficiency. Communication-complexity analysis and numerical experiments on several hundred processors of two parallel machines show that both algorithms are effective and can be extended to other multiphysics parallel simulations. In particular, the two-level tight coupling algorithm is efficient and scalable, achieving near-optimal parallel performance.

6.
This paper describes a Data Acquisition System which has been specifically designed to take advantage of modern operating systems. It is modular, structured as a set of independent tasks communicating via a shared data area. The design is based on the concept of circular buffers with associated data producer and (parallel) consumer tasks. By using privileged tasks in time critical areas, a fast and efficient system has been obtained: interrupt latency of less than 100 microseconds, and data transfer speeds essentially limited by hardware (CAMAC DMA or magnetic tape recording). The tasks may be distributed over different processors. For example, 16/32 bit multi-processors with a shared multi-port memory are used to implement systems where powerful data reduction and/or monitoring tasks are required. The system is in use at over 25 high energy and nuclear physics experiments at CERN and in other European laboratories.

7.
Fusion experiments place high demands on real-time control systems. Within the fusion community two modern framework-based software architectures have emerged as powerful tools for developing algorithms for real-time control of complex systems while maintaining the flexibility required when operating a physics experiment. The two frameworks are known as DCS (Discharge Control System), from ASDEX Upgrade, and MARTe (Multithreaded Application Real-Time executor), originally from JET. Based on the success of DCS and MARTe, ITER has chosen to develop a framework architecture for its Plasma Control System which will adopt major design concepts from both the existing frameworks.
This paper describes a coupling of the two existing frameworks, which was undertaken to explore the degree of similarity and compliance between the concepts, and to extend their capabilities. DCS and MARTe operate in parallel with synchronised state machines and a common message logger. Configuration data is exchanged before the real-time phase. During the real-time phase, structured data is exchanged via shared memory and an existing DCS algorithm is replicated within MARTe. The coupling tests the flexibility and identifies the respective strengths of the two frameworks, providing a well-informed basis on which to move forward and design a new ITER real-time framework.

8.
In order to obtain diagnostic data with physical meaning, the acquired raw data must be processed through a series of physical formulas or processing algorithms. Some diagnostic data are acquired and processed by the diagnostic systems themselves. These data processing programs are specific and usually run manually, and the processed results are stored on local disks, which is unshared and unsafe. Thus, it is necessary to integrate all the specific processing programs and build an automatic, unified data analysis system with shareable data storage. This paper introduces the design and implementation of the online analysis system. Based on the MDSplus event mechanism, the system synchronizes the operation of the different processing programs. According to the computational complexity and real-time requirements, combined with the programmability of parallel algorithms and hardware costs, OpenMP parallel processing technology is applied to the EAST analysis system, significantly enhancing the processing efficiency.

9.
Multiprocessing is the most exciting development in computer technology in recent years, opening the door for enormous computational power limited only by the imagination and patience of the hardware developer, and the wit and perseverance of the algorithm designer. The potential benefit of this innovation was immediately recognized by researchers in the nuclear field, among others, who explored and reported their experience with concurrent implementation of algorithms, standard and new, to solve neutron diffusion and transport problems. The accumulated experience in, as well as the current status of, this rapidly changing area is reviewed. Collectively these illustrate the strong coupling between parallel algorithms and their target architectures to the extent that standard sequential schemes in nuclear applications are best suited to coarse grained, e.g. shared memory supercomputers, and medium grained, e.g. a few hundred networked powerful CPUs, platforms. The full potential of parallel computing will be better realized by novel, non-traditional, solution algorithms that best utilize all the components of a multiprocessor machine.

10.
An in-house development of an Advanced Telecommunications Computing Architecture (ATCA) board for fast control and data acquisition, with Input/Output (IO) processing capability, is presented. The architecture, compatible with the ATCA (PICMG 3.4) and ATCA eXtensions for Instrumentation (AXIe) specifications, comprises a passive Rear Transition Module (RTM) for IO connectivity to ease hot-swap maintenance and simultaneously to increase cabling life cycle. The board complies with ITER Fast Plant System Controller (FPSC) guidelines for rear IO connectivity and redundancy, in order to provide high levels of reliability and availability to the control and data acquisition systems of nuclear fusion devices with long duration plasma discharges.
Simultaneously digitized data from all Analog to Digital Converters (ADC) of the board can be filtered/decimated in a Field Programmable Gate Array (FPGA), decreasing the data throughput and increasing resolution, and then sent through Peripheral Component Interconnect (PCI) Express to multi-core processors in the ATCA shelf hub slots. Concurrently, the multi-core processors can update the board's Digital to Analog Converters (DAC) in real-time. Full-duplex point-to-point communication links between the FPGAs of all peer boards inside the shelf allow the implementation of distributed algorithms and Multi-Input Multi-Output (MIMO) systems.
Support for several timing and synchronization solutions is also provided. Some key features are onboard ADC or DAC modules with galvanic isolation, a Xilinx Virtex 6 FPGA, standard Dual Data Rate (DDR) 3 SODIMM memory, a standard CompactFlash memory card, an Intelligent Platform Management Controller (IPMC), two PCI Express x4 (generation 2) ATCA Fabric channels (dual-star topology), eleven Xilinx Aurora x1 (or other ATCA compatible communications protocol) ATCA fabric channels (full-mesh topology) and two Fast Ethernet (Precision Time Protocol – PTP IEEE1588-V2 and Lan eXtensions for Instrumentation – LXI compatible) ATCA base channels (dual-star topology).
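The filter/decimate stage described above can be modeled in a few lines of C: averaging r consecutive ADC samples divides the output data rate by r while the averaging suppresses noise, increasing effective resolution. This is a host-side sketch of the general technique, not the board's firmware.

```c
#include <stddef.h>

/* Boxcar filter/decimator: average each block of r input samples into
   one output sample.  Returns the number of output samples (n / r);
   any trailing partial block is discarded. */
size_t boxcar_decimate(const int *in, size_t n, size_t r, double *out)
{
    size_t m = n / r;                  /* output length: rate reduced by r */
    for (size_t k = 0; k < m; ++k) {
        double acc = 0.0;
        for (size_t j = 0; j < r; ++j)
            acc += in[k * r + j];
        out[k] = acc / (double)r;      /* averaging adds effective bits    */
    }
    return m;
}
```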

11.
When applied to full-core three-dimensional transport calculations, the method of characteristics faces long computation times and large memory requirements, and large-scale parallelism is the most effective remedy. The traditional parallel strategy is spatial domain decomposition, but when the geometric size of the problem is small, its degree of parallelism is limited and the parallel resources cannot be fully utilized. Building on the spatial domain decomposition of the high-fidelity physics code NECP-X, this paper studies three-level parallelism over space, angle, and characteristic lines. To achieve load balance in the angular parallelism, a weighted greedy algorithm is adopted as the angular parallel strategy; to save memory, characteristic-line parallelism is implemented with a dynamically scheduled allocation scheme under a shared-memory parallel model. Numerical results show that the angular and characteristic-line parallelism in NECP-X is efficient and, on top of the spatial domain decomposition, can further enlarge the parallel scale and improve the computational speed.
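A weighted greedy assignment of the kind the abstract credits for angular load balance can be sketched as follows. The function names, the load model, and the omission of the usual descending sort of the weights are all simplifications for illustration, not NECP-X source code.

```c
#include <stddef.h>

/* Assign each angle (weighted by its tracking workload) to the processor
   with the smallest accumulated load so far.  The classic variant sorts
   weights in descending order first, which tightens the balance bound. */
void greedy_assign(const double *weight, size_t n_angles,
                   double *load, int *owner, int n_procs)
{
    for (int p = 0; p < n_procs; ++p) load[p] = 0.0;
    for (size_t a = 0; a < n_angles; ++a) {
        int best = 0;
        for (int p = 1; p < n_procs; ++p)
            if (load[p] < load[best]) best = p;
        owner[a] = best;            /* lightest processor takes the angle */
        load[best] += weight[a];
    }
}
```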

12.
The Nuclear Structure Research Laboratory at the University of Rochester is developing a VAX-11/750 computer system for use in a data acquisition and analysis system. The system consists of the VAX networked to two LSI-11/23s, which are in turn connected to CAMAC branch drivers. The CAMAC branch drivers operate both parallel and 5 MHz byte-serial highways. The network is a high-speed DMA interface and is identical to the system implemented at the W.K. Kellogg Radiation Laboratory [1]. The data acquisition system has been designed to allow the user to select standard program modules as building blocks to construct a system suited to the particular needs of the experiment. The division of the real-time analysis between the VAX and the LSI-11s is flexible. The LSI-11s are equipped with small array processors to permit high-speed analysis on the satellite processors.

13.
When applied to full-core three-dimensional transport calculations, the method of characteristics faces long computation times and large memory requirements, and large-scale parallelism is the most effective remedy. The rapid development of supercomputers in China is gradually making large-scale parallel computing possible, and developing the corresponding parallel algorithms has become a pressing task. Based on the numerical-reactor physics code NECP-X, this paper studies a multi-level parallel strategy for the method of characteristics over space, angle, and characteristic lines. For efficient parallelism, the spatial parallelism uses domain decomposition; to account for load balance in the angular parallelism, a greedy-algorithm angular domain decomposition is adopted; and to save memory and improve efficiency, a dynamically scheduled characteristic-line parallel scheme under a shared-memory parallel model is applied and analyzed. Numerical results show that the spatial, angular, and characteristic-line parallelism in NECP-X is efficient and can fully utilize parallel resources to achieve large-scale parallelism.

14.
The existing parallel computing schemes for Monte Carlo criticality calculations suffer from a low efficiency when applied on many processors. We suggest a new fission matrix based scheme for efficient parallel computing. The results are derived from the fission matrix that is combined from all parallel simulations. The scheme allows for a practically ideal parallel scaling as no communication among the parallel simulations is required, and inactive cycles are not needed.

15.
The paper investigates a new technique for predicting error rates in microprocessor-based digital architectures. Three case studies are presented, concerning three different processors; two of them are included in the instruments of a satellite project. The actual space applications of these two instruments were implemented using the capabilities of a dedicated system. Results of fault-injection and radiation-testing experiments, and a discussion of the potential of this technique, are presented.

16.
High data rates in nuclear spectroscopy can be achieved by using digital shaping techniques. Currently, the use of field-programmable gate arrays as digital processors is increasing rapidly, especially in real-time applications. In this paper, we deal with the problem of minimizing the computational burden, in order to use the least possible hardware, which in turn minimizes the power dissipation, size, etc., of the processing machine. To take full advantage of spatial computing in programmable devices, the data-path structures of temporal computing techniques have been revised. Among the improvements following from the optimization of the architectures, we address three topics: reduction of processing time, resource saving, and adaptive dynamic management of digital filter length for increased resolution.

17.
Conclusion. Variational-synthesis methods have found application in many areas of research on reactor physics. These methods are used successfully to describe both the steady states of reactors and the dynamics of the neutron field [56, 57]. Despite the differences in the classes of approximation of the solution and in the algorithms of the equations, variational-synthesis methods are characterized by a compact representation of the basic data arrays and a saving of computing time (by a factor of no less than 7–8) in comparison with iterative methods. The synthesis methods are especially effective in solving reactor problems with many independent variables, since they permit a considerable reduction of the volume of data stored and processed by a computer. In view of the above, variational-synthesis methods can be considered computational instruments for the expeditious study of neutron-physical processes in reactors. Another important area of application of these methods is the optimization of the neutron-physical parameters of reactors, considered as a constituent part of the comprehensive optimization of the technical and economic indicators of an atomic power plant. In this area the application of variational-synthesis methods is determined primarily by the speed of the programs. Because of their economy, these methods can be used successfully in mathematical models of existing reactors for their operational servicing. Concurrently with the development of computing technology, improvements are being made in the mathematical methods of solving reactor problems. The new computing technology that has now appeared takes qualitatively new approaches to the execution of computations, making it possible to view existing mathematical methods in a different way.
The use of multiprocessor computers, as well as computers with processors that include matrix modules, will open up the possibilities of variational-synthesis methods much more widely, since the idea of carrying out parallel computations corresponds to the approach of synthesis methods, which separate the initial problem into several connected subproblems with a matrix structure. Translated from Atomnaya Énergiya, Vol. 58, No. 5, pp. 360–369, May, 1985.

18.
The possible application of algorithms derived from neural networks to the D0 experiment is discussed. The D0 data acquisition system is based on a large farm of MicroVaxes, each independently performing real-time event filtering. Advanced multiport memories in each MicroVAX node will enable special function processors to have direct access to event data. An exploratory study of back-propagation neural networks, such as might be configured in the nodes, for more efficient event filtering is described

19.
The Monte Carlo machine Monte-4 has been developed to realize high-performance computing for Monte Carlo particle-transport codes. Particle tracking in a complex geometry requires (1) classification of particles by region type using multi-way conditional branches, and (2) determination of whether intersections of particle paths with the surfaces of the regions lie on the boundaries of the regions, using nested conditional branches. However, these procedures require scalar operations or unusual vector operations, so the speedup ratios achieved in vector processing of Monte Carlo particle-transport codes on conventional vector processors have been low, i.e. only about a factor of two. Monte-4 has been equipped with special hardware called Monte Carlo pipelines to process these procedures with high performance. Additionally, Monte-4 has been equipped with enhanced load/store pipelines to realize fast transfer of indirectly addressed data, resolving the imbalance between the performance of data transfers and arithmetic operations in vector processing of Monte Carlo codes on conventional vector processors. Finally, Monte-4 has a parallel processing capability with four processors to multiply the performance of vector processing. We have evaluated the effective performance of Monte-4 using production-level Monte Carlo codes such as vectorized KENO-IV and MCNP. In the performance evaluation, speedup ratios of nearly ten were obtained compared with scalar processing of the original codes.

20.
EURATOM/CIEMAT and the Technical University of Madrid (UPM) are involved in the development of an FPSC (fast plant system control) prototype for ITER based on the PXIe form factor. The FPSC architecture includes a GPU-based real-time high-performance computing service which has been integrated under EPICS (Experimental Physics and Industrial Control System). In this work we present the design of this service and its performance evaluation with respect to other solutions based on multi-core processors. Plasma pre-processing algorithms, illustrative of the type of tasks that could be required for both control and diagnostics, are used during the performance evaluation.
