Similar literature
Found 20 similar documents; search took 15 ms.
1.
2.
Compared to Beowulf clusters and shared-memory machines, GPUs and FPGAs are emerging alternative architectures that provide massive parallelism and great computational capability. These architectures can be used to run compute-intensive algorithms that analyze ever-growing datasets and to provide scalability. In this paper, we present four implementations of the K-means data clustering algorithm for different high-performance computing platforms: a CUDA implementation for GPUs, a Mitrion C implementation for FPGAs, an MPI implementation for Beowulf compute clusters, and an OpenMP implementation for shared-memory machines. We present comparative analyses of the cost of each platform, the difficulty of programming each platform, and the performance of each implementation.
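
For orientation, the algorithm being ported can be sketched sequentially. The following is a minimal illustration of Lloyd's iteration for K-means, not any of the four implementations the abstract describes; the function names, the random initialization, and the iteration cap are assumptions made for this sketch.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=50, seed=0):
    """Sequential Lloyd's iteration; returns the final list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random input points
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break  # no centroid moved, so the assignment is stable
        centroids = updated
    return centroids
```

The assignment step treats every point independently, which is why it is the natural candidate for data-parallel execution on any of the platforms listed above.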

3.
Prolog is becoming a popular language in AI applications, particularly in the implementation of knowledge-based expert systems. We have identified three different uses of Prolog: (1) building expert systems directly in ordinary Prolog, (2) using Prolog as the implementation language for a higher level of interpretation, and (3) extending Prolog with suitable features and using it directly. In this paper, we define the three uses in more detail, compare them, and cite some concrete examples.

4.
5.
Prolog-ELF, which incorporates fuzzy logic and several useful functions into Prolog, has been implemented as a basic language for building knowledge systems with uncertainty or fuzziness. Prolog-ELF inherits all the desirable basic features of Prolog. In addition to assertions with truth-values between 1.0 and 0.5 (0 for exceptional cases), fuzzy sets can be manipulated very easily. An application to a fuzzy logical database is illustrated.

6.
An extension of Prolog, based on the model elimination theorem-proving procedure, would permit production of a logically complete Prolog technology theorem prover capable of performing inference operations at a rate approaching that of Prolog itself.

7.

Introduction

The multiply-accumulate operation is the most fundamental operation in digital signal processing for image processing, robotics, and automatic control. In this paper, a novel architecture for a dynamically reconfigurable fused multiply-adder (FMA) is proposed.

Methods

Dynamic reconfiguration is a method that changes the circuit configuration without stopping operation. The proposed circuit provides the following four calculation modes by dynamic reconfiguration: (1) complex number FMA mode, (2) real number FMA mode, (3) complex number parallel calculation mode, and (4) real number parallel calculation mode. The data format is the single-precision floating-point format of IEEE 754. The proposed circuit was designed in Verilog-HDL, simulated with a logic circuit simulator, and implemented on an FPGA.
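
The relation between the real and complex FMA modes can be illustrated arithmetically: a complex multiply-add x*y + z decomposes into four real multiply-adds. The sketch below shows only that arithmetic and is not a model of the proposed circuit; the function names are hypothetical, and Python floats are double precision rather than the single precision the abstract specifies.

```python
def real_fma(a, b, c):
    """Real multiply-add a*b + c.

    Note: this rounds twice (after the multiply and after the add);
    a true fused multiply-add rounds only once.
    """
    return a * b + c

def complex_fma(x, y, z):
    """Complex x*y + z from four real multiply-adds.

    Each argument is a (real, imaginary) pair:
    (a + bi)(c + di) + (e + fi) = (ac - bd + e) + (ad + bc + f)i
    """
    (a, b), (c, d), (e, f) = x, y, z
    re = real_fma(-b, d, real_fma(a, c, e))  # ac + e, then subtract bd
    im = real_fma(b, c, real_fma(a, d, f))   # ad + f, then add bc
    return re, im
```

For example, `complex_fma((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))` evaluates (1 + 2i)(3 + 4i) + (5 + 6i).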

Result

Circuit synthesis confirmed a reduction in resource usage for the proposed circuit. Furthermore, logic simulation and an FPGA experiment confirmed correct results for each calculation mode.

Conclusion

The proposed circuit provides the four calculation modes by dynamic reconfiguration. We confirmed a reduction in resource usage and correct results for each calculation mode.

8.
In this work, a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme takes advantage of a scaling factor in the velocity fields that improves the performance of the method and the vibration analysis in heterogeneous media. Moreover, the scheme has been extended to simulate both propagation in porous media and lossy solid materials. To reproduce the interaction of fluids and solids accurately in FDTD, both the time step and the spatial step must be reduced compared with the setup used in acoustic FDTD problems. This implies bigger grids and hence more time and memory resources. To reduce simulation time, the FDTD code has been adapted to exploit the resources available in modern parallel architectures. For CPUs, the implicit usage of the Advanced Vector Extensions (AVX) in multi-core CPUs has been considered. In addition, the computation has been distributed across the available cores by means of OpenMP directives. Graphics Processing Units have also been considered, and the improvement achieved with this parallel architecture has been compared with the highly tuned CPU scheme by means of the relative speedup. The parallel versions were up to 3 (AVX and OpenMP) and 40 (CUDA) times faster than the best sequential CPU version, which also uses OpenMP with auto-vectorization techniques but does not implicitly include vector instructions. Results obtained with both parallel approaches demonstrate that massively parallel programming techniques are mandatory in solid-vibration problems with FDTD.
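
As background, the kind of update loop that such parallelization targets can be sketched for the simplest case. The following is a minimal one-dimensional acoustic FDTD leapfrog scheme on a staggered grid with rigid walls. It is not the unified solid/fluid scheme of the abstract and does not include its velocity scaling factor; all parameter values, the initial pulse, and the function name are assumptions chosen for illustration.

```python
def fdtd_1d(n_cells=200, n_steps=100, c=340.0, rho=1.2, dx=0.01, courant=0.5):
    """1D acoustic FDTD: pressure at cell centers, velocity at cell faces."""
    dt = courant * dx / c          # time step chosen from an assumed Courant number
    p = [0.0] * n_cells            # pressure p[i] in cell i
    v = [0.0] * (n_cells + 1)      # velocity v[i] on the face between cells i-1 and i
    p[n_cells // 2] = 1.0          # initial pressure pulse in the middle cell
    for _ in range(n_steps):
        # Velocity update from the pressure gradient.
        # v[0] and v[n_cells] stay zero, which models rigid walls.
        for i in range(1, n_cells):
            v[i] -= dt / (rho * dx) * (p[i] - p[i - 1])
        # Pressure update from the velocity divergence.
        for i in range(n_cells):
            p[i] -= rho * c * c * dt / dx * (v[i + 1] - v[i])
    return p
```

Within one time step, every iteration of each inner loop is independent of the others, which is the property that vector instructions, OpenMP threads, and GPU kernels all rely on.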

9.
Prolog-X is an implemented portable interactive sequential Prolog system in which clauses are incrementally compiled for a virtual machine called the ZIP Machine. At present, the ZIP Machine is emulated by software, but it has been designed to permit easy implementation in microcode or hardware. Prolog-X running on the software-based emulator provides performance comparable with existing Prolog interpreters. To demonstrate its efficiency, compatibility, and comprehensiveness of implementation, Prolog-X has been used to compile and run several large applications programs. Several novel techniques are used in the implementation, particularly in the areas of the representation of the recordx database, the selection of clauses, and the compilation of arithmetic expressions.

10.
The concept of set abstraction is introduced by analogy with lambda abstraction in the lambda calculus. Set abstraction is concerned with two extensions to Prolog language features: “set expression” and “predicate variable.” It has been argued in the literature that the set expression extension to Prolog does really contribute to the power of the language, while the extension of predicate variables does not add anything to Prolog. Combining these two extensions, we define “set abstraction” as the set expression in which predicate variables are allowed as data objects. In other words, set abstraction involves higher-order predicate logic. Application examples demonstrate that, with the help of predicate variables, set abstractions can nicely handle issues of second-order predicate logic. Further, implementation programs written in Prolog and Concurrent Prolog are presented.

11.
Multiwalled carbon nanotube (MWCNT)-polyimide (PI) nanocomposites were prepared with different MWCNT concentrations and characterized for their piezoresistive response. The morphology and mechanical behavior of the nanocomposite were investigated by scanning electron microscopy and force–displacement spectroscopy, respectively. The surface conductivity of the nanocomposite was determined by atomic force microscopy in current mode. The studies reveal that this nanocomposite will be useful as a strain-sensing element in piezoresistive pressure sensor applications based on micro electro mechanical systems (MEMS) and nano electro mechanical systems (NEMS). The study shows that the nanocomposite with 2 % MWCNT content is a unique piezoresistive sensing element for MEMS/NEMS pressure sensors.

12.
This paper presents a parallel logic programming language named P-Prolog, which is being developed as a logic programming language featuring both and- and or-parallelism. Unlike other parallel logic programming languages, P-Prolog does not use syntactic constructs such as read-only annotation [6], mode declaration [2], and communication constraints [7]. A new concept introduced in P-Prolog is the exclusive relation of guarded Horn clauses. Advances included in P-Prolog are:
  1. The synchronization mechanism can determine the direction of data flow dynamically.
  2. Guarded Horn clauses can be interpreted as either don’t-care nondeterminism or don’t-know nondeterminism.
A prototype interpreter of P-Prolog has been implemented in C-Prolog. We are now implementing a P-Prolog interpreter in the C language.

13.
In this paper we study the compilation of Prolog by making hidden operations (especially unification) visible and then optimizing them using well-known partial evaluation techniques. Inspection of straightforward partially evaluated unification algorithms suggests how to design special abstract machine instructions, which later form the target language of our compilation. We handle typical compiler problems, such as the representation of terms, explicitly. This work gives a logical reconstruction of abstract Prolog machine code and represents an approach to constructing a correct compiler from Prolog to such code. As an example, we explain the unification principles of Warren’s New Prolog Engine within our framework.

14.
In A Subset of Concurrent Prolog and Its Interpreter (1983), E. Y. Shapiro introduces the language Concurrent Prolog. In his presentation, the problem of guaranteeing bounded-waiting during a merge operation is used as a programming example. Solutions are proposed for binary and n-ary merges. The solutions are, however, completely dependent on specific operational characteristics of a Concurrent Prolog machine or interpreter. This paper presents an alternate approach in which the property of bounded-waiting is incorporated into the semantics of the programs, demonstrable given only the computational model of the language. The solution strategy is to utilize the familiar systems programming techniques of block-on-input and busy-wait. This approach requires that the language be augmented with a metalogical predicate analogous to the var(_) predicate of Sequential Prolog. The resultant programs are interesting and illustrative examples of Concurrent Prolog as a programming language.

15.
MapReduce has been demonstrated to be a promising alternative for simplifying parallel programming with high performance on a single multicore machine. Compared to the cluster version, MapReduce on a single multicore machine does not have bottlenecks in disk and network I/O, and it is more sensitive to the characteristics of workloads. A single execution flow may be inefficient for many classes of workloads. For example, the fixed execution flow of the MapReduce program structure can impose significant overheads for workloads that inherently have only one emitted value per key; these overheads are mainly caused by the unnecessary reduce phase. In this paper, we refine the workload characterization from Phoenix++ according to the attributes of key-value pairs, and demonstrate that the refined workload characterization model covers all classes of MapReduce workloads. Based on the model, we propose a new MapReduce system with a workload-customizable execution flow. The system, named Peacock, is implemented on top of Phoenix++. Experiments with four different classes of benchmarks on a 16-core Intel-based server show that Peacock achieves better performance than Phoenix++ for workloads that inherently have only one emitted value per key (up to a speedup of 3.6×) while performing identically for other classes of workloads.
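
The overhead described above can be made concrete with a toy, single-threaded sketch. It is not the API of Peacock or Phoenix++; the function `map_reduce` and its convention of passing `reducer=None` for one-value-per-key workloads are hypothetical, introduced only to contrast the two execution flows.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer=None):
    """Toy MapReduce with two execution flows.

    mapper(record) yields (key, value) pairs.
    reducer(key, values) combines all values emitted for one key.
    Passing reducer=None asserts that each key is emitted exactly once.
    """
    if reducer is None:
        # Customized flow: store each value directly.
        # No per-key grouping and no reduce phase are needed.
        output = {}
        for record in records:
            for key, value in mapper(record):
                output[key] = value
        return output
    # Fixed flow: group every emitted value by key, then reduce each group.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}
```

A word count needs the fixed flow because a key can be emitted many times, whereas a job that maps each input to a unique key can take the shorter path.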

16.
Several attempts have been made to design a production system using Prolog. To construct a forward reasoning system, the rule interpreter is often written in Prolog, but its execution is slow. To develop an efficient production system, we propose a rule translation method in which production rules are translated into a Prolog program and forward reasoning is done by the translated program. To translate the rules, we adopted the technique developed in BUP, the bottom-up parsing system in Prolog. Man-machine dialogue functions were added to the production system, showing the potential of our method for application to expert systems.
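
For readers unfamiliar with forward reasoning, the interpreted baseline can be sketched as a loop that repeatedly fires rules until no new fact appears. This is the naive interpreter style, not the rule translation method the abstract proposes; the rule representation (a set of condition facts paired with one conclusion) and the function name are assumptions for this sketch.

```python
def forward_chain(facts, rules):
    """Naive forward reasoning over propositional production rules.

    rules is a list of (conditions, conclusion) pairs, where conditions
    is a set of facts that must all be known for the rule to fire.
    Returns the set of all facts derivable from the initial facts.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # Fire the rule if all conditions hold and it adds something new.
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                changed = True
    return known
```

Every pass rescans all rules, which illustrates why an interpreted rule loop is slow and why compiling rules into directly executable code is attractive.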

17.
Process migration provides many benefits for parallel environments, including dynamic load balancing, data access locality, and fault tolerance. This paper describes an in-memory application-level checkpoint-based migration solution for MPI codes that uses the Hierarchical Data Format 5 (HDF5) to write the checkpoint files. The main features of the proposed solution are transparency for the user, achieved through the use of CPPC (ComPiler for Portable Checkpointing); portability, as the application-level approach makes the solution adequate for any MPI implementation and operating system, and the use of the HDF5 file format enables the restart on different architectures; and high performance, by saving the checkpoint files to memory instead of to disk through the use of HDF5 in-memory files. Experimental results show that the in-memory approach significantly reduces the I/O cost of the migration process.

18.
Radio frequency identification (RFID) is an important technique used for automatic identification and data capture. In recent years, low-cost RFID tags have been used in many open-loop applications beyond supply chain management, such as the tagging of medicine, clothes, and belongings after the point of sale. At the same time, with the development of the semiconductor industry, handheld terminals and mobile phones are becoming RFID-enabled. Unauthorized mobile RFID readers could be abused by malicious hackers or curious members of the public. Even for authorized RFID readers, the ownership of the reader can be transferred, and the owners of an authorized mobile reader may not always be reliable. The authorization and authentication of mobile RFID readers need stronger security measures to address the privacy and security issues that may arise in the emerging open-loop applications. In this paper, the security demands of RFID tags in emerging open-loop applications are summarized, and two example protocols for authorization, authentication, and key establishment based on symmetric cryptography are presented. The proposed protocols adopt a timed-session-based authorization scheme, and all reader-to-tag operations are authorized by a trusted third party using a newly defined class of timed sessions. The output of the tags is randomized to prevent unauthorized tracking of the RFID tags. An instance of protocol A is implemented in 0.13-μm CMOS technology, and the functions are verified on a field-programmable gate array. The baseband consumes 44.0 μW at 1.08 V and 1.92 MHz, and it has 25,067 gate equivalents. The proposed protocols can successfully resist most security threats to open-loop RFID systems except physical attacks. The timing and scalability of the two protocols are discussed in detail.

19.
This paper suggests a general method for compiling OR-parallelism into AND-parallelism. An interpreter for an AND/OR-parallel language written in the AND-parallel subset of the language induces a source-to-source transformation from the full language into the AND-parallel subset. This transformation can be identified and implemented as a special purpose compiler or applied using a general purpose partial evaluator. The method is demonstrated to compile a variant of Concurrent Prolog into an AND-parallel subset of the language called Flat Concurrent Prolog (FCP). It is also shown applicable to the compilation of OR-parallel Prolog to FCP. The transformation identified is simple and efficient. The performance of the method is discussed in the context of programming examples. These compare well with conventionally compiled Prolog programs.

20.
We present a method for preprocessing Prolog programs so that their operational semantics will be given by the first-order predicate calculus. Most Prolog implementations do not use a full unification algorithm, for efficiency reasons. The result is that it is possible to create terms having loops in them, whose semantics is not adequately described by first-order logic. Our method finds places where such loops may be created, and adds tests to detect them. This should not appreciably slow down the execution of most Prolog programs.
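
The loops mentioned above arise when a variable is bound to a term that contains the variable itself, which full unification prevents with the occurs check. The sketch below shows that run-time check inside a small unifier; it does not reproduce the paper's preprocessing method, which decides statically where such tests are needed. The term encoding (capitalized strings as variables, tuples as compound terms with the functor first) and all function names are assumptions for this illustration.

```python
def is_var(term):
    """Variables are strings starting with an uppercase letter (assumed encoding)."""
    return isinstance(term, str) and term[:1].isupper()

def walk(term, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    """True if var appears anywhere inside term under the current bindings."""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False

def unify(a, b, subst=None, occurs_check=True):
    """Return a substitution dict unifying a and b, or None on failure."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        if occurs_check and occurs(a, b, subst):
            return None  # binding a to b would create a cyclic term
        return {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst, occurs_check)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b) and a[0] == b[0]:
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst, occurs_check)
            if subst is None:
                return None
        return subst
    return None
```

With `occurs_check=False`, unifying `"X"` with `("f", "X")` succeeds and binds X to a term containing X, which is the kind of loop the abstract refers to; with the check enabled, the same call fails.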
