Pure data-parallel languages such as High Performance Fortran version 1 (HPF) do not allow efficient expression of mixed task/data-parallel computations or the coupling of separately compiled data-parallel modules. In this paper, we show how these common parallel program structures can be represented, with only minor extensions to the HPF model, by using a coordination library based on the Message Passing Interface (MPI). This library allows data-parallel tasks to exchange distributed data structures using calls to simple communication functions. We present microbenchmark results that characterize the performance of this library and that quantify the impact of optimizations that allow reuse of communication schedules in common situations. In addition, results from two-dimensional FFT, convolution, and multiblock programs demonstrate that the HPF/MPI library can provide performance superior to that of pure HPF. We conclude that this synergistic combination of two parallel programming standards represents a useful approach to task parallelism in a data-parallel framework, increasing the range of problems addressable in HPF without requiring complex compiler technology.
One of the key components of a multiuser multimedia-on-demand system is the data server. Digitization of traditionally analog data such as video and audio, and the feasibility of obtaining network bandwidths above the gigabit-per-second range, are two important advances that have made possible the realization, in the near future, of interactive distributed multimedia systems. Secondary-to-main memory I/O technology has not kept pace with advances in networking, main memory, and CPU processing power. Consequently, the performance of the server has a direct bearing on the overall performance of such a system. In this paper, we present a high-performance solution to the I/O retrieval problem in a distributed multimedia system. We develop a model for the architecture of a server for such a system. Parallelism of data retrieval is achieved by striping the data across multiple disks. We present the algorithms for server operation when servicing a constant number of streams, as well as the admission control policy for accepting requests for new streams. The performance of any server ultimately depends on the data access patterns. Two modifications of the basic retrieval algorithm are presented to exploit data access patterns in order to improve system throughput and response time. Finally, we present preliminary performance results of these algorithms on the IBM SP1 and Intel Paragon parallel computers.
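The abstract says only that data are striped across multiple disks; it does not give the layout. As a minimal sketch, assuming a plain round-robin placement of fixed-size blocks (the function names `locate_block` and `blocks_per_disk` are hypothetical and not taken from the paper), the mapping from a logical block to a disk can be written as:

```python
def locate_block(block_index, num_disks):
    """Map a logical block of a stream to (disk, slot on that disk).

    Assumes round-robin striping: block i lives on disk i mod D,
    in slot i div D. This layout is an illustrative assumption.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if block_index < 0:
        raise ValueError("block_index must be non-negative")
    return block_index % num_disks, block_index // num_disks


def blocks_per_disk(first_block, count, num_disks):
    """Group a run of consecutive logical blocks by the disk that holds them."""
    plan = {disk: [] for disk in range(num_disks)}
    for block in range(first_block, first_block + count):
        disk, slot = locate_block(block, num_disks)
        plan[disk].append(slot)
    return plan
```

Under this assumed layout, a request for consecutive blocks touches every disk about equally, which is what lets the reads proceed in parallel.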
Back break is an undesirable phenomenon in mines, caused by rock conditions, blast geometry, the explosive and the initiation system. It prevents the formation of a smooth high wall and a clean free face for the next blast because it leaves cracks, overhang and under-hang. The cracks left in the in situ rock mass at the perimeter can cause rockfall during drilling. An improper free face left by the previous blast, together with loose strata in the face, increases the overall cost of production. Predicting and subsequently optimising back break should therefore reduce these problems to some extent. In this paper, an attempt is made to predict back break using the random forest method. The variables used in the study were the burden-to-spacing ratio, the stemming-to-hole-depth ratio, the P-wave velocity and the density of the explosive. The random forest model gave R² = 0.9791 and a root mean square error (RMSE) of 0.87899, whereas linear regression gave R² = 0.824 and an RMSE of 0.72. From the field trials, it was evident that the use of low-density emulsion can help reduce back break and optimise the overall cost of the blasting process. The same results were validated using the random forest method, for which the model R² was 0.9791 and the RMSE was 0.8799.
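The two figures of merit quoted above, R² and RMSE, can be computed from observed and predicted values as sketched below. This uses the standard definitions only; the function names are hypothetical, and nothing here reproduces the study's data or model.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

R² is dimensionless, whereas RMSE carries the units of the predicted quantity, so the two measures need not rank a pair of models in the same order.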
Microsystem Technologies - Due to fast technological development, human beings generally depend upon computers and other digital equipment in different areas of concern/applications. Therefore,...
Parallel sorting algorithms are widely studied nowadays. Since the introduction of parallel processors such as the graphics processing unit (GPU) and easy-to-use parallel programming languages such as CUDA and OpenCL, the literature on parallel sorting algorithms has grown vast and rich with new ideas and techniques applied to the famous problem of sorting. This paper presents a survey of GPU-based sorting algorithms. Four sorting algorithms have been selected for this survey: radix sort, merge sort, sample sort and quick sort. The methods used in those algorithms are described in brief. The performance of these algorithms as claimed by their authors is also presented. A comparative analysis based on the literature is provided.
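For orientation, the sketch below is a sequential least-significant-digit radix sort for non-negative integers. It is not taken from any of the surveyed papers. It is written as three phases per pass (digit histogram, exclusive prefix sum, stable scatter), and it is an assumption here that this decomposition is a useful mental model for parallel variants; `bits_per_pass` is an illustrative parameter.

```python
def radix_sort(keys, bits_per_pass=4):
    """Sort non-negative integers with LSD radix sort; returns a new list."""
    if any(k < 0 for k in keys):
        raise ValueError("only non-negative integers are supported")
    radix = 1 << bits_per_pass
    mask = radix - 1
    out = list(keys)
    max_key = max(out, default=0)
    shift = 0
    while (max_key >> shift) > 0:
        # Phase 1: histogram of the current digit.
        counts = [0] * radix
        for k in out:
            counts[(k >> shift) & mask] += 1
        # Phase 2: exclusive prefix sum gives each digit's start offset.
        offsets = [0] * radix
        total = 0
        for digit in range(radix):
            offsets[digit] = total
            total += counts[digit]
        # Phase 3: stable scatter into the output buffer.
        scattered = [0] * len(out)
        for k in out:
            digit = (k >> shift) & mask
            scattered[offsets[digit]] = k
            offsets[digit] += 1
        out = scattered
        shift += bits_per_pass
    return out
```

Each pass is stable, which is what makes it correct to process the digits from least to most significant.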
Tensile flow behaviour of P9 steel with different silicon content has been examined in the framework of Hollomon, Ludwik, Swift, Ludwigson and Voce relationships for a wide temperature range (300–873 K) at a strain rate of 1.3 × 10⁻³ s⁻¹. The Ludwigson equation described true stress (σ)–true plastic strain (ε) data most accurately in the range 300–723 K. At high temperatures (773–873 K), the Ludwigson equation reduces to the Hollomon equation. The variations of instantaneous work hardening rate (θ = dσ/dε) and θσ with stress indicated two-stage work hardening behaviour. True stress–true plastic strain, flow parameters, θ vs. σ and θσ vs. σ with respect to temperature exhibited three distinct temperature regimes and displayed anomalous behaviour due to dynamic strain ageing at intermediate temperatures. Rapid decrease in flow stress and flow parameters, and rapid shift in θ–σ and θσ–σ towards lower stresses with increase in temperature indicated dominance of dynamic recovery at high temperatures.
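For reference, the five flow relationships named above are commonly written as follows. The abstract does not give the exact forms or symbols used in the paper, so the parameter names here (K, n, σ₀, ε₀, K₁, n₁, K₂, n₂, σ_s, σ_I, n_V) follow common textbook notation and are an assumption.

```latex
\begin{align}
  \text{Hollomon:}  \quad & \sigma = K\,\varepsilon^{n} \\
  \text{Ludwik:}    \quad & \sigma = \sigma_0 + K\,\varepsilon^{n} \\
  \text{Swift:}     \quad & \sigma = K\,(\varepsilon_0 + \varepsilon)^{n} \\
  \text{Ludwigson:} \quad & \sigma = K_1\,\varepsilon^{n_1} + \exp(K_2 + n_2\,\varepsilon) \\
  \text{Voce:}      \quad & \sigma = \sigma_s - (\sigma_s - \sigma_I)\exp(n_V\,\varepsilon)
\end{align}
```

In this notation, the Ludwigson form reduces to the Hollomon form when the additional term \(\exp(K_2 + n_2\varepsilon)\) becomes negligible, which matches the high-temperature behaviour reported above.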
Thin films of Fe3O4 have been deposited on single crystal MgO(1 0 0) and Si(1 0 0) substrates using pulsed laser deposition. Films grown on the MgO substrate are epitaxial with c-axis orientation, whereas films on the Si substrate are highly 〈1 1 1〉 oriented. Film thicknesses are 150 nm. These films have been irradiated with 200 MeV Ag ions. We study the effect of the irradiation on the structural and electrical transport properties of these films. The irradiation fluence has been varied in the range of 5 × 10¹⁰ ions/cm² to 1 × 10¹² ions/cm². We compare the irradiation-induced modifications of various physical properties between the c-axis oriented epitaxial film and the non-epitaxial but 〈1 1 1〉 oriented film. The pristine film on the Si substrate shows a Verwey transition (T_V) close to 125 K, which is higher than generally observed in single crystals (121 K). After irradiation at a fluence of 5 × 10¹⁰ ions/cm², T_V shifts to 122 K, closer to the single crystal value. However, at the higher fluence (1 × 10¹² ions/cm²), T_V again shifts to 125 K.