20 similar documents found.
1.
R.K. Böck 《Computer Physics Communications》1975,9(4):221-229
This paper discusses the computing problems in high-energy physics and reviews the portability achieved in software in the field. The techniques employed at CERN for developing programs are described, and suggestions are put forward for the future solution of portability problems.
2.
Classical molecular dynamics simulation for atomistic systems is implemented in OpenCL and benchmarked on a variety of hardware platforms. Varying the number of particles and the system size in the study provides insight into the characteristics of parallel compute platforms, where latency, data transfer, memory-access characteristics and compute-intensive work can be identified as fingerprints in benchmark runs. Data layouts are compared, with the structure-of-arrays layout showing the best performance in most cases. It is demonstrated that function portability can be achieved straightforwardly with OpenCL, while performance portability lags behind, as the various architectures strongly depend on architecture-specific vectorisation optimisations.
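To make the data-layout comparison concrete, here is a minimal, hedged sketch (in Python/NumPy rather than OpenCL, with illustrative names) of the array-of-structures versus structure-of-arrays distinction the benchmark measures:

```python
import numpy as np

n = 100_000  # number of particles

# Array-of-structures (AoS): the fields of each particle are interleaved in memory.
aos = np.zeros(n, dtype=[("x", "f4"), ("y", "f4"), ("z", "f4"), ("mass", "f4")])

# Structure-of-arrays (SoA): one contiguous array per field.
soa_x = np.zeros(n, dtype=np.float32)
soa_vx = np.random.rand(n).astype(np.float32)

dt = np.float32(1e-3)

# A position update touches only coordinate data. With SoA the accesses are
# unit-stride and vectorise trivially; with AoS every access strides over the
# unused mass field, wasting memory bandwidth.
soa_x += soa_vx * dt        # contiguous, SIMD-friendly
aos["x"] += soa_vx * dt     # strided view over interleaved records
```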
3.
The areas in which programs are most unlikely to be portable are discussed. Attention is paid to programming languages, operating systems, file systems, I/O device characteristics, machine architecture and documentation. Pitfalls are indicated and in some cases solutions are suggested.
4.
5.
《Expert systems with applications》2014,41(2):655-662
Data mining is most commonly used in attempts to induce association rules from transaction data. In the past, we used fuzzy and GA concepts to discover both useful fuzzy association rules and suitable membership functions from quantitative values. The evaluation of fitness values was, however, quite time-consuming. Due to dramatic increases in available computing power and concomitant decreases in computing costs over the last decade, learning or mining by applying parallel processing techniques has become a feasible way to overcome the slow-learning problem. In this paper, we thus propose a parallel genetic-fuzzy mining algorithm based on the master–slave architecture to extract both association rules and membership functions from quantitative transactions. The master processor uses a single population, as a simple genetic algorithm does, and distributes the task of fitness evaluation to the slave processors. The evolutionary operations, such as crossover, mutation and reproduction, are performed by the master processor. It is thus natural and efficient to run the proposed algorithm on the master–slave architecture. The time complexities of both the sequential and the parallel genetic-fuzzy mining algorithms are analyzed, with the results showing the advantage of the proposed approach. When the number of generations is large, the speed-up can be nearly linear, a point the experimental results confirm. Applying the master–slave parallel architecture to speed up the genetic-fuzzy data mining algorithm is thus a feasible way to overcome the low-speed fitness evaluation of the original algorithm.
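As a hedged illustration of the master–slave pattern the abstract describes (not the authors' implementation; the fitness function and GA operators below are placeholder stand-ins), the costly evaluations can be fanned out to worker processes while the master keeps a single population and applies the evolutionary operators itself:

```python
import random
from multiprocessing import Pool

def fitness(chromosome):
    # Placeholder for the expensive evaluation (e.g., scanning the
    # transaction database to score candidate membership functions).
    return sum(chromosome)

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(c, rate=0.05):
    return [random.random() if random.random() < rate else g for g in c]

def evolve(pop_size=40, length=16, generations=50, workers=4):
    pop = [[random.random() for _ in range(length)] for _ in range(pop_size)]
    with Pool(workers) as slaves:              # slave processors
        for _ in range(generations):
            scores = slaves.map(fitness, pop)  # only evaluation is distributed
            # The master performs selection, crossover and mutation itself.
            ranked = [c for _, c in sorted(zip(scores, pop), reverse=True)]
            parents = ranked[: pop_size // 2]
            pop = parents + [
                mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(pop_size - len(parents))
            ]
    return pop

if __name__ == "__main__":   # guard required by multiprocessing on some OSes
    final_pop = evolve()
```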
6.
Strategies for supporting application portability
A range of system-design strategies that can be used to support portability, and the ways in which these strategies have been employed by past and present systems, are examined. The strategies are grouped into three categories: (1) strategies that maintain identical execution-time interfaces by porting the system components that form the interface, (2) strategies that maintain identical or nearly identical interfaces for different system components by adhering to appropriate standards, and (3) strategies that assist in the adaptation of programs to a target environment. The principal emphasis is on operating-system issues. User-interface portability, dynamic portability in a network, and international exchange of programs are briefly considered.
7.
Discusses the value of reuse from a system house's point of view and outlines the industry challenges caused by shorter time frames and increased complexity. The author has spent close to 30 years dedicated to the design and use of silicon and related design methodologies. He pioneered standard-cell libraries for silicon design and introduced the customer-owned tooling (COT) flow to enable component reuse and control the movement of intellectual property.
8.
9.
10.
Hochong Park, R.T. Chin 《IEEE transactions on pattern analysis and machine intelligence》1994,16(3):304-313
A morphological operation using a large structuring element can be decomposed equivalently into a sequence of recursive operations, each using a smaller structuring element. However, an optimal decomposition of arbitrarily shaped structuring elements is yet to be found. In this paper, we derive an optimal decomposition of a specific class of structuring elements (convex sets) for a specific type of machine: 4-connected parallel array processors. The cost of a morphological operation on 4-connected parallel array processors is the total number of 4-connected shifts required by the set of structuring elements. First, the original structuring element is decomposed into a set of prime factors, and then their locations are determined while minimizing the cost function. Proofs are presented to show the optimality of the decomposition. Examples of optimal decomposition are given and compared to an existing decomposition reported by Xu (1991).
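As a small, hedged illustration of the idea (a special case, not the paper's optimal algorithm): a convex structuring element such as a discrete diamond factors into repeated dilations by the elementary 4-connected cross, so one expensive dilation becomes a sequence of cheap 4-connected shifts:

```python
import numpy as np

CROSS = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=bool)   # elementary 4-connected element

def dilate(img, se):
    # Naive binary dilation: OR of img shifted by every offset in se.
    out = np.zeros_like(img)
    cy, cx = se.shape[0] // 2, se.shape[1] // 2
    for dy, dx in zip(*np.nonzero(se)):
        out |= np.roll(np.roll(img, dy - cy, axis=0), dx - cx, axis=1)
    return out

def diamond(r):
    # Convex structuring element: all points with |y| + |x| <= r.
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return (np.abs(y) + np.abs(x)) <= r

img = np.zeros((32, 32), dtype=bool)
img[16, 16] = True

once = dilate(img, diamond(3))   # one pass with the radius-3 diamond
seq = img
for _ in range(3):               # three passes with the unit cross
    seq = dilate(seq, CROSS)
assert np.array_equal(once, seq) # the Minkowski-sum decomposition holds
```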
11.
J. Larmouth 《Software》1981,11(10):1071-1117
This paper is the result of a study of the portability problems which might arise in the use of the Fortran 77 language. The study involved both an examination of the text of the Standard and discussions with a number of compiler-writing teams. The paper identifies areas of Fortran 77 which are likely to cause problems, either because of an incomplete language specification, difficulties in compilation, or user inattention. It will be of interest not only to users writing large packages in Fortran and to Fortran compiler writers, but also to those involved in language standardization and portability in general.
12.
《Computer Speech and Language》2005,19(3):345-363
As core speech recognition technology improves, opening up a wider range of applications, genericity and portability are becoming important issues. Most of today's recognition systems are still tuned to a particular task, and porting the system to a new task (or language) requires a substantial investment of time and money, as well as human expertise. This paper addresses issues in speech recognizer portability and in the development of generic core speech recognition technology. First, the genericity of wide-domain models is assessed by evaluating their performance on several tasks of varied complexity. Then, techniques aimed at enhancing the genericity of these wide-domain models are investigated. Multi-source acoustic training is shown to reduce the performance gap between task-independent and task-dependent acoustic models, and for some tasks to outperform task-dependent acoustic models. Transparent methods for porting generic models to a specific task are also explored. Transparent unsupervised acoustic model adaptation is contrasted with supervised adaptation, and incremental unsupervised adaptation of both the acoustic and linguistic models is investigated. Experimental results on a dialog task show that with the proposed scheme, a transparently adapted generic system can perform nearly as well (about a 1% absolute gap in word error rate) as a task-specific system trained on several tens of hours of manually transcribed data.
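The adaptation scheme itself is not detailed in the abstract; the sketch below shows only the general shape of incremental unsupervised ("transparent") adaptation, in which the recognizer's own hypotheses serve as pseudo-transcripts. All function names here are hypothetical placeholders, stubbed so the sketch runs:

```python
# Hypothetical stubs; a real system would supply these.
def decode(audio, am, lm): return f"hypothesis for {audio}"
def adapt_acoustic(am, labels): return am      # e.g. MLLR-style re-estimation
def adapt_linguistic(lm, hyps): return lm      # e.g. LM interpolation

def transparent_adaptation(am, lm, utterances, adapt_every=100):
    """Incremental unsupervised adaptation: no manual transcripts are used;
    the system adapts on its own output while serving recognition requests."""
    pseudo_labels = []
    for audio in utterances:
        hyp = decode(audio, am, lm)           # recognise with current models
        yield hyp                             # normal system output
        pseudo_labels.append((audio, hyp))    # hypotheses kept as pseudo-truth
        if len(pseudo_labels) >= adapt_every:
            am = adapt_acoustic(am, pseudo_labels)
            lm = adapt_linguistic(lm, [h for _, h in pseudo_labels])
            pseudo_labels.clear()

for h in transparent_adaptation("am", "lm", [f"utt{i}" for i in range(5)]):
    print(h)
```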
13.
14.
One of the major purposes of a high-level language is to provide a large measure of machine-independence in the specification of algorithms. Definitions of languages such as FORTRAN IV and ALGOL 60 encourage compatibility between various implementations. Language specifications are inadequate in that they normally underdefine a language. In particular, the specifications do not normally demand a response to a language violation. The freedom normally given to an implementor to decide the degree and nature of error detection and response hinders portability, and may lead to unexpected results when moving code from one machine to another, or even when changing implementations on the same machine. To support the contention that languages should specify a response to violations, an analysis of four FORTRAN IV implementations and a FORTRAN IV verifier was conducted. The study showed that different implementations often produce different results for the same illegal program. A study of programmers also revealed that they cannot be relied upon to avoid language violations without compiler aids.
15.
In the opinion of the authors, a fully portable interactive graphical program should be able to take advantage of the target display terminal's hardware features where these benefit the efficiency of man-machine interaction. By analysing the FORTRAN code of an interactive graphical program designed to operate on both refresh displays and storage tubes, a program design approach has evolved which attempts to reduce the effort required to achieve this aim.
16.
The paper is devoted to the problem of portability of applications between different software-hardware platforms. A survey of approaches to solving this problem is given, and an analysis of their advantages and disadvantages is presented. Application domains of the existing approaches are discussed.
17.
Research and implementation of embedded browser portability
Existing embedded browsers are developed for specific applications in particular domains and have poor portability; even for those products that are portable to some degree, the user-interface code must be rewritten when porting. Most of the work of porting an embedded browser lies in porting the interface. Addressing this problem, and taking the open-source Mozilla code as a basis, a set of window-abstraction-layer interfaces is designed and implemented on top of the embedded graphics library MiniGUI by studying the structure of the window abstraction layer; this can greatly improve the portability of embedded browsers.
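The abstract describes the design only at a high level; as a hedged sketch of what a window abstraction layer looks like (in Python for brevity; the actual Mozilla/MiniGUI work is native code, and all names here are illustrative):

```python
from abc import ABC, abstractmethod

class NativeWindow(ABC):
    """The browser core talks only to this interface; each port supplies a
    backend, so moving to a new graphics library means rewriting only the
    backend rather than the interface code."""

    @abstractmethod
    def create(self, x: int, y: int, width: int, height: int) -> None: ...

    @abstractmethod
    def invalidate(self) -> None:
        """Request a repaint of the window contents."""

    @abstractmethod
    def dispatch_event(self, event: object) -> None:
        """Route an input event from the toolkit to the browser core."""

class MiniGUIWindow(NativeWindow):
    # Illustrative backend: each method would wrap the corresponding
    # MiniGUI call in a real port.
    def create(self, x, y, width, height):
        print(f"MiniGUI window at ({x},{y}), size {width}x{height}")

    def invalidate(self):
        print("MiniGUI repaint requested")

    def dispatch_event(self, event):
        print(f"event {event!r} forwarded to browser core")

win = MiniGUIWindow()
win.create(0, 0, 640, 480)
```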
18.
An effective speedup metric for measuring productivity in large-scale parallel computer systems
As parallel computer systems scale up, the measure of system quality demands a shift from traditional "high performance" to "high productivity." This brings a new challenge: defining a synthetic, yet meaningful, measure of multiple productivity variables, namely computing performance, reliability, energy consumption, parallel software development, and so on. Traditional measures for large-scale parallel computer systems focus on computing performance alone, and are incapable of measuring the multiple productivity variables simultaneously in an effective manner. A recently proposed market-related money model, which pursues a high utility/cost ratio, relies on money as the measure with which to consider the multiple productivity variables. Differing from these previous models, this paper proposes a novel system-productivity speedup metric for large-scale parallel computer systems. The metric uses speedup instead of money to unify the measures of the multiple productivity variables. Finally, we propose a trade-off productivity measurement that weights the different productivity variables to address different design targets. The measurement can facilitate system evaluation, expose future technique tendencies, and guide future system design.
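The abstract does not reproduce the metric's definition. As context, classical speedup captures computing performance only; a weighted composite over normalised productivity variables is one hedged illustration (not the paper's formula) of how several variables might be folded into a single speedup-style index:

```latex
S_p = \frac{T_1}{T_p} \qquad \text{(classical speedup: performance only)}

S_{\mathrm{prod}} = \prod_i X_i^{\,w_i},
\qquad X_i \in \{\text{performance},\ \text{reliability},\ \text{energy efficiency},\ \dots\},
\qquad \sum_i w_i = 1
```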
19.
Stéphane Bressan, Alfredo Cuzzocrea, Panagiotis Karras, Xuesong Lu, Sadegh Heyrani Nobari 《Journal of Parallel and Distributed Computing》2013
The widespread usage of random graphs has been highlighted in the context of database applications for several years. This is because such data structures turn out to be very useful in a large family of database applications, ranging from simulation to sampling, and from the analysis of complex networks to the study of randomized algorithms. Amongst others, the Erdős–Rényi Γv,p model is the most popular model for obtaining and manipulating random graphs. Unfortunately, it has been demonstrated that classical algorithms for generating Erdős–Rényi random graphs do not scale well to large instances and, in addition, fail to make use of the parallel processing capabilities of modern hardware. Motivated by this, in this paper we propose and experimentally assess a novel parallel algorithm, called PPreZER, for generating random graphs under the Erdős–Rényi model, designed and implemented on a Graphics Processing Unit (GPU). We demonstrate the merits of our solution via a succession of intermediary algorithms, both sequential and parallel, which show the limitations of classical approaches and the benefits of the PPreZER algorithm. Finally, our comprehensive experimental assessment reveals a significant average speedup of PPreZER over the baseline algorithms.
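The abstract does not spell the algorithms out; a key sequential building block in this line of work (the ZER/PreZER family) replaces one Bernoulli trial per potential edge with geometric skips over non-edges. A hedged Python sketch of that sequential idea (not the GPU-parallel PPreZER itself):

```python
import math
import random

def er_skip_sampler(v, p):
    """Sample G(v, p) by drawing the gap to the next present edge from a
    geometric distribution instead of testing every potential edge."""
    assert 0.0 < p < 1.0
    n_slots = v * (v - 1) // 2        # potential edges, indexed 0..n_slots-1
    log_q = math.log(1.0 - p)
    edges, i = [], -1
    while True:
        u = random.random()
        i += 1 + int(math.log(1.0 - u) / log_q)  # geometric skip over non-edges
        if i >= n_slots:
            return edges
        # Map the linear index i back to a vertex pair (a, b) with a > b.
        a = int((1 + math.sqrt(1 + 8 * i)) / 2)
        b = i - a * (a - 1) // 2
        edges.append((a, b))

print(len(er_skip_sampler(1000, 0.01)))  # expected ≈ p * v(v-1)/2 ≈ 4995
```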
20.
Television produces massive amounts of video daily. Digital video is unfortunately an unstructured document in which it is very difficult to find any information. Television streams nevertheless have a strong and stable, but hidden, structure, which we want to discover by detecting repeating objects in the video stream. This paper shows that television streams are in fact highly redundant and that detecting repeats can be an effective way to uncover the underlying structure of the video. A method for detecting these repetitions is presented here, with an emphasis on the efficiency of the search in a large video corpus. Very good results are obtained both in terms of effectiveness (98% in recall and precision) and of efficiency, since one day of video is queried against a three-week dataset in only 1 s.
Patrick Gros