Similar literature
 Found 20 similar documents; search took 31 ms
1.
Genetic programming (GP) has been used successfully as a technique for constructing robot control programs. Depending on the number of evaluations and the cost of each evaluation, however, GP may require a substantial amount of processing time to find a feasible solution. The advent of parallel GP has brought the execution time of GP to a more acceptable level. This paper investigates parallel GP applied to a mobile robot navigation problem. The parallel implementations are based on a coarse-grained model. A technique for distributing the task of serial GP is proposed. In particular, this technique shows that the total amount of work can be reduced while maintaining the quality of the solutions. Asynchronous and synchronous implementations are examined. We compare their performance in terms of both solution quality and execution time. A timing analysis gives insight into the behavior of the parallel implementations. The results show that the parallel algorithm with asynchronous migration using 10 processors is 33 times faster than the serial algorithm. This work was presented in part at the 5th International Symposium on Artificial Life and Robotics, Oita, Japan, January 26–28, 2000.

2.
The goal of this article is to compare optimised implementations on current high-performance platforms in order to highlight architectural trends in the field of embedded architectures and to estimate what the components of a next-generation vision system should be. We present implementations of robust motion detection algorithms on three architectures: a general-purpose RISC processor (the PowerPC G4); a parallel artificial retina dedicated to low-level image processing (Pvlsar34); and the Associative Mesh, a specialized architecture based on associative nets. To handle the different aspects and constraints of embedded systems, the execution time and power consumption of these architectures are compared.
Alain Mérigot

3.
Genetic Programming (GP) is a computationally intensive technique which is also highly parallel in nature. In recent years, significant performance improvements have been achieved over a standard GP CPU-based approach by harnessing the parallel computational power of many-core graphics cards which have hundreds of processing cores. This enables both fitness cases and candidate solutions to be evaluated in parallel. However, this paper will demonstrate that by fully exploiting a multi-core CPU, similar performance gains can also be achieved. This paper will present a new GP model which demonstrates greater efficiency whilst also exploiting the cache memory. Furthermore, the model presented in this paper will utilise Streaming SIMD Extensions to gain further performance improvements. A parallel version of the GP model is also presented which optimises multiple thread execution and cache memory. The results presented will demonstrate that a multi-core CPU implementation of GP can yield performance levels that match and exceed those of the latest graphics card implementations of GP. Indeed, a performance gain of up to 420-fold over standard GP is demonstrated and a threefold gain over a graphics card implementation.

4.
Evolutionary circuit design is an approach that allows engineers to realize computational devices. Evolved computational devices represent a distinctive class of devices that exhibits a specific combination of properties not previously observed or studied in computational devices. Devices that belong to this class show the required behavior; however, in general, we do not understand how and why they perform the required computation. The reason is that evolution can utilize, in addition to the “understandable composition of elementary components”, material-dependent constructions and properties of the environment (such as temperature and electromagnetic fields) and, furthermore, unknown physical behaviors to establish the required functionality. Therefore, nothing is known about the mapping between an abstract computational model and its physical implementation. The standard notion of computation and implementation developed in computer science as well as in cognitive science has become very problematic with the existence of evolved computational devices. According to the common understanding, the evolved devices cannot be classified as computing mechanisms.
Lukáš Sekanina, URL: http://www.fit.vutbr.cz/~sekanina

5.
A low-power wireless video sensor node for distributed object detection   Total citations: 2, self-citations: 0, citations by others: 2
In this paper we propose MicrelEye, a wireless video node for cooperative distributed video processing applications that involve image classification. The node is equipped with a low-cost VGA CMOS image sensor, a reconfigurable processing engine (FPGA, microcontroller, SRAM) and a Bluetooth 100-m transceiver. It has a size of a few cubic centimeters and its typical power consumption is approximately ten times less than that of typical commercial DSP-based solutions. As regards classification, a highly optimized hardware-oriented support vector machine-like (SVM-like) algorithm called ERSVM is proposed and implemented. We describe our hardware and software architecture and its performance and power characteristics. The case study considered in this paper is people detection. The obtained results suggest that the present technology allows for the design of simple intelligent video nodes capable of performing classification tasks locally.
Luca Benini

6.
Engineering optimization techniques are computationally intensive and can challenge implementations on tightly constrained embedded systems. Particle Swarm Optimization (PSO) is a well-known bio-inspired algorithm that is adopted in various applications, such as transportation, robotics, and energy. In this paper, a high-speed PSO hardware processor is developed with a focus on outperforming similar state-of-the-art implementations. In addition, the investigation comprises the development of an analytical framework that captures a wide range of characteristics of optimization algorithm implementations, in hardware and software, using key simple and combined heterogeneous indicators. The framework proposes a combined Optimization Fitness Indicator that can classify the performance of PSO implementations when targeting different evaluation functions. The two targeted processing systems are Field Programmable Gate Arrays for hardware implementations and a high-end multi-core computer for software implementations. The investigation confirms the successful development of a PSO processor with appealing performance characteristics that outperforms recently presented implementations. The proposed hardware implementation attains an execution-time improvement ratio of 23,300 with an elliptic evaluation function. In addition, a speedup of 1777 times is achieved with a Shifted Schwefel's function. Indeed, the developed framework successfully classifies PSO implementations according to multiple and heterogeneous properties for a variety of benchmark functions.
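
For readers unfamiliar with PSO, the following is a minimal software sketch of the textbook global-best variant, not the hardware processor described in the abstract. The `sphere` objective, the coefficients `w`, `c1`, `c2`, the search bounds and the fixed seed are all illustrative assumptions; the abstract's elliptic and Shifted Schwefel's functions are not reproduced here.

```python
import random


def sphere(x):
    """Toy objective (an assumption for illustration): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, particles=20, iterations=100,
        lo=-5.0, hi=5.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; all parameter defaults are illustrative."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp the new position to the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best_pos, best_val = pso(sphere)
print(best_pos, best_val)
```

The inner update over particles and dimensions is the part that a hardware implementation could evaluate in parallel, which is presumably where speedups of the kind reported above originate.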

7.
8.
Algorithmic aspects of area-efficient hardware/software partitioning   Total citations: 1, self-citations: 0, citations by others: 1
Area efficiency is one of the major considerations in the constraint-aware hardware/software partitioning process. This paper focuses on the algorithmic aspects of hardware/software partitioning with the objective of minimizing area utilization under constraints on execution time and power consumption. An efficient heuristic algorithm running in O(n log n) time is proposed by extending the method devised for solving the 0-1 knapsack problem. Also, an exact algorithm based on dynamic programming is proposed to produce the optimal solution for small-sized problems. Simulation results show that the proposed heuristic algorithm yields very good approximate solutions while dramatically reducing the execution time.
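
The contrast between an O(n log n) heuristic and an exact dynamic program can be seen on the plain 0-1 knapsack problem that the abstract builds on. The sketch below is the classic textbook pair (density-ordered greedy and capacity-indexed DP), not the paper's partitioning algorithm; the task names, the reading of weight as hardware area and value as time saved, and the capacity of 50 are hypothetical.

```python
def greedy_knapsack(items, capacity):
    """Density-ordered greedy heuristic; the sort makes it O(n log n)."""
    chosen, total_value, remaining = [], 0, capacity
    by_density = sorted(items, key=lambda it: it[2] / it[1], reverse=True)
    for name, weight, value in by_density:
        if weight <= remaining:  # take the item only if it still fits
            chosen.append(name)
            remaining -= weight
            total_value += value
    return total_value, chosen


def exact_knapsack(items, capacity):
    """Exact dynamic program over integer capacities; O(n * capacity)."""
    best = [0] * (capacity + 1)  # best[c] = max value achievable within capacity c
    for _, weight, value in items:
        # iterate capacities downward so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Hypothetical tasks: (name, weight = hardware area, value = time saved)
tasks = [("fft", 10, 60), ("filter", 20, 100), ("codec", 30, 120)]
greedy_value, greedy_choice = greedy_knapsack(tasks, 50)
optimal_value = exact_knapsack(tasks, 50)
print(greedy_value, greedy_choice, optimal_value)  # 160 ['fft', 'filter'] 220
```

On this small instance the greedy choice falls short of the optimum, which shows why an exact method is still useful as a reference for small problems even when a fast heuristic is available.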

9.
A wide variety of real-time applications (e.g. multimedia and communication) require implementations that meet tight timing constraints. This work introduces a novel high-performance FPGA architecture capable of efficiently implementing any time-critical application. The fundamental contribution of the proposed reconfigurable architecture is the design of a highly efficient (in performance and power consumption) interconnection structure, taking into consideration the statistical and spatial data extracted from applications implemented on Virtex FPGAs. The derived architecture is software-supported by the MEANDER design framework. Using a number of real-time applications, an extensive comparison study in terms of several design parameters demonstrates the effectiveness of the proposed architecture compared with the Virtex architecture. More specifically, the proposed architecture achieves performance improvement and power savings of up to 20% and 16%, respectively. Moreover, compared to a Virtex architecture with the same power budget, our architecture improves performance by 42%.
Dimitrios Soudris (Corresponding author)

10.
Passive radio-frequency identification (RFID) tags have long been thought to be too weak to implement public-key cryptography: It is commonly assumed that the power consumption, gate count and computation time of full-strength encryption exceed the capabilities of RFID tags. In this paper, we demonstrate that these assumptions are incorrect. We present two low-resource implementations of a 1,024-bit Rabin encryption variant called WIPR, one in embedded software and one in hardware. Our experiments with the software implementation show that the main performance bottleneck of the system is not the encryption time but rather the air interface and that the reader’s implementation of the electronic product code Class-1 Generation-2 RFID standard has a crucial effect on the system’s overall performance. Next, using a highly optimized hardware implementation, we investigate the trade-offs between speed, area and power consumption to derive a practical working point for a hardware implementation of WIPR. Our recommended implementation has a data-path area of 4,184 gate equivalents, an encryption time of 180 ms and an average power consumption of 11 μW, well within the established operating envelope for passive RFID tags.

11.
This paper presents the implementation and scaling of a neocortex inspired cognitive model on a Cray XD1. Both software and reconfigurable logic based FPGA implementations of the model are examined. This model belongs to a new class of biologically inspired cognitive models. Large scale versions of these models have the potential for significantly stronger inference capabilities than current conventional computing systems. These models have large amounts of parallelism and simple computations, thus allowing highly efficient hardware implementations. As a result, hardware-acceleration of these models can produce significant speedups over fully software implementations. Parallel software and hardware-accelerated implementations of such a model are investigated for networks of varying complexity. A scaling analysis of these networks is presented and utilized to estimate the throughput of both hardware-accelerated and software implementations of larger networks that utilize the full resources of the Cray XD1. Our results indicate that hardware-acceleration can provide average throughput gains of 75 times over software-only implementations of the networks we examined on this system.
Christopher N. Vutsinas

12.
Many-core processors are accelerating the performance of contemporary high-performance systems. Managing power consumption within these systems demands low-power architectures to increase power savings. One of the promising solutions offered today by microprocessor architects is asymmetric microprocessors that integrate different core architectures on a single die. This paper presents analytical models based on scaled power metrics to analyze the impact of various architectural design choices on scaled performance and power savings. The power consumption implications of different processing schemes and various chip configurations were also analyzed. Analysis shows that by choosing the optimal chip configuration, energy efficiency and energy savings can be increased considerably.

13.
In our state-of-the-art study, we improve neural network-based models for predicting energy consumption in buildings by parallelizing the CHC adaptive search algorithm. We compared the sequential implementation of the evolutionary algorithm with the new parallel version to obtain predictors and found that this new version of our software tool halved the execution time of the sequential version. New predictors based on various classes of neural networks have been developed and the obtained results support the validity of the proposed approaches with an average improvement of 75% of the average execution time in relation to previous sequential implementations.

14.
The design of embedded systems radically differs from pure software design in that it should take into account not only the functional, but also extra-functional specifications regarding the use of resources of the execution platform such as processing time, memory, and energy. Meeting extra-functional specifications is essential for the design of embedded systems. It requires predictability of the impact of design choices on the overall behavior of the designed system. It also implies a deep understanding of the interaction between application software and the underlying execution platform. We currently lack approaches for modeling mixed hardware–software systems. There are currently no established rigorous techniques for deriving global models of a given system from models of its application software and its execution platform. However, many researchers and industry practitioners are now working in this area and proposing solutions. The Rigorous Embedded Design (Red) workshop, which took place at EUROSYS11, provided a unique opportunity to discuss several new methodologies for the rigorous design of embedded systems. Through a series of invited talks, the workshop appraised some of the challenges and emerging approaches in the area. A series of design flows was presented, and the workshop discussions focused on performance analysis, correctness (high confidence and security), code generation, and modeling aspects (including timed scheduling and software/hardware interactions). These concepts were illustrated with examples from the aeronautic, automotive, and robotic areas. The aim of this introductory paper is to briefly present the challenges for embedded system design surveyed by Red.

15.
Search-based software testing is the application of metaheuristic search techniques to generate software tests. The test adequacy criterion is transformed into a fitness function, and a set of solutions in the search space is evaluated with respect to the fitness function using a metaheuristic search technique. The application of metaheuristic search techniques for testing is promising because exhaustive testing is infeasible considering the size and complexity of software under test. Search-based software testing has been applied across the spectrum of test case design methods; this includes white-box (structural), black-box (functional) and grey-box (combination of structural and functional) testing. In addition, metaheuristic search techniques have also been applied to test non-functional properties. The overall objective of undertaking this systematic review is to examine existing work on non-functional search-based software testing (NFSBST). We are interested in the types of non-functional testing targeted using metaheuristic search techniques, the different fitness functions used in different types of search-based non-functional testing, and the challenges in the application of these techniques. The systematic review is based on a comprehensive set of 35 articles, published between 1996 and 2007, obtained after a multi-stage selection process. The results of the review show that metaheuristic search techniques have been applied for non-functional testing of execution time, quality of service, security, usability and safety. A variety of metaheuristic search techniques are found to be applicable for non-functional testing, including simulated annealing, tabu search, genetic algorithms, ant colony methods, grammatical evolution, genetic programming (and its variants including linear genetic programming) and swarm intelligence methods. The review reports on the different fitness functions used to guide the search for each of the categories of execution time, safety, usability, quality of service and security, along with a discussion of possible challenges in the application of metaheuristic search techniques.

16.
In model-driven development of safety-critical systems (like automotive, avionics or railways), well-formedness of models is repeatedly validated in order to detect design flaws as early as possible. In many industrial tools, validation rules are still often implemented by a large amount of imperative model traversal code which makes those rule implementations complicated and hard to maintain. Additionally, as models are rapidly increasing in size and complexity, efficient execution of validation rules is challenging for the currently available tools. Checking well-formedness constraints can be captured by declarative queries over graph models, while model update operations can be specified as model transformations. This paper presents a benchmark for systematically assessing the scalability of validating and revalidating well-formedness constraints over large graph models. The benchmark defines well-formedness validation scenarios in the railway domain: a metamodel, an instance model generator and a set of well-formedness constraints captured by queries, fault injection and repair operations (imitating the work of systems engineers by model transformations). The benchmark focuses on the performance of query evaluation, i.e. its execution time and memory consumption, with a particular emphasis on reevaluation. We demonstrate that the benchmark can be adapted to various technologies and query engines, including modeling tools and relational, graph and semantic databases. The Train Benchmark is available as an open-source project with continuous builds from https://github.com/FTSRG/trainbenchmark.

17.
Warp processors are a novel architecture capable of autonomously optimizing an executing application by dynamically re-implementing critical kernels within the software as custom hardware circuits in an on-chip FPGA. Previous research on warp processing focused on low-power embedded systems, incorporating a low-end ARM processor as the main software execution resource. We provide a thorough analysis of the scalability of warp processing by evaluating several possible warp processor implementations, from low-power to high-performance, and by evaluating the potential for parallel execution of the partitioned software and hardware. We further demonstrate that even considering a high-performance 1 GHz embedded processor, warp processing provides the equivalent performance of a 2.4 GHz processor. By further enabling parallel execution between the processes and FPGA, the parallel warp processor execution provides the equivalent performance of a 3.2 GHz processor.

18.
Noise removal from color images   Total citations: 1, self-citations: 0, citations by others: 1
The noise effects in color images are studied from the human perception and machine perception points of view. Three justifiable observations are made to illustrate problems related to individual color signal processing. To minimize the noise effects, two solutions are studied: one is a rental scheme and the other is a vector signal processing technique. The rental scheme applies filters originally developed for grey scale images to color images. A set of heuristic criteria is defined to reconstruct an output with minimum artifacts. The vector signal processing technique utilizes a median vector filter based on the well-developed median filter for grey scale images. Since the output of the filter does not have the same physical meaning as the median defined in one-dimensional space, the search for a vector median is treated as a minimization problem. The output is guaranteed to be one of the inputs. Both approaches are shown to be very effective in removing speckle noise. Results from real and synthetic images are obtained and compared.
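
The vector median idea mentioned above can be sketched in a few lines. This is the commonly used definition (pick the input vector with the smallest summed distance to all other vectors in the window), which may differ in detail from the filter in the paper; the Euclidean distance and the sample RGB window are assumptions chosen for illustration.

```python
import math


def vector_median(window):
    """Return the pixel in `window` with the smallest summed Euclidean distance to all pixels."""
    def total_distance(p):
        return sum(math.dist(p, q) for q in window)
    # min() over the inputs themselves, so the result is always one of the inputs
    return min(window, key=total_distance)


# Hypothetical window of RGB pixels with one outlier (a magenta speckle)
window = [(10, 12, 11), (12, 10, 13), (11, 11, 12), (255, 0, 255), (9, 13, 10)]
result = vector_median(window)
print(result)  # (11, 11, 12)
```

Because the search is a minimization over the input vectors themselves, the filter cannot produce a color that was not already present in the window, which matches the guarantee stated in the abstract.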

19.
The I/O subsystem has become a major source of energy consumption in a hard real-time monitoring and control system. To reduce its energy consumption without missing deadlines, a dynamic power management (DPM) policy must carefully consider the power parameters of a device, such as its break-even time and wake-up latency, when switching off idle devices. This problem becomes extremely complicated when dynamic voltage scaling (DVS) is applied to change the execution time of a task. In this paper, we present COLORS, a composite low-power scheduling framework that includes DVS in a DPM policy to maximize the energy reduction on the I/O subsystem. COLORS dynamically predicts the earliest-access time of a device and switches off idle devices. It makes use of both static and dynamic slack time to extend the execution time of a task by DVS, in order to create additional switch-off opportunities. Task workloads, processor profiles, and device characteristics all impact the performance of a low-power real-time algorithm. We also identify a key metric that primarily determines its performance. The experimental results show that, compared with previous work, COLORS achieves additional energy reduction up to 20%, due to the efficient utilization of slack time.
Tei-Wei Kuo
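
To make the role of the break-even time concrete, the sketch below uses a common textbook energy model for device shutdown, not the COLORS policy itself. The parameter names (`p_idle`, `p_sleep`, `e_transition`, `t_transition`) and the numeric values are hypothetical, and the predicted idle length is simply passed in rather than derived from an earliest-access-time prediction.

```python
def break_even_time(p_idle, p_sleep, e_transition, t_transition):
    """Shortest idle period (s) for which switching off uses no more energy than idling.

    Idling for T costs p_idle * T; sleeping costs
    e_transition + p_sleep * (T - t_transition). Setting the two equal and
    solving for T gives the energy balance point. The period can never be
    shorter than the transition time itself.
    """
    if p_idle <= p_sleep:
        raise ValueError("sleeping must draw less power than idling")
    energy_balance = (e_transition - p_sleep * t_transition) / (p_idle - p_sleep)
    return max(t_transition, energy_balance)


def should_switch_off(predicted_idle, p_idle, p_sleep, e_transition, t_transition):
    """Switch off only if the predicted idle period exceeds the break-even time."""
    return predicted_idle > break_even_time(p_idle, p_sleep, e_transition, t_transition)


# Hypothetical device: 1.0 W idle, 0.1 W asleep, 2.0 J and 0.5 s per off/on cycle
t_be = break_even_time(1.0, 0.1, 2.0, 0.5)
print(round(t_be, 3))                             # 2.167
print(should_switch_off(3.0, 1.0, 0.1, 2.0, 0.5))  # True
print(should_switch_off(1.0, 1.0, 0.1, 2.0, 0.5))  # False
```

Under this model, stretching a task's execution time with DVS changes when devices are next accessed, and therefore which idle periods exceed the break-even time, which is one way to read the interaction between DVS and DPM described in the abstract.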

20.