1.
Sequential Monte Carlo (SMC) represents a principal statistical method for tracking objects in video sequences by on-line estimation of the state of a non-linear dynamic system. The performance of individual stages of the SMC algorithm is usually data-dependent, making the prediction of the performance of a real-time capable system difficult and often leading to grossly overestimated and inefficient system designs. Also, the considerable computational complexity is a major obstacle when implementing SMC methods on purely CPU-based, resource-constrained embedded systems. In contrast, heterogeneous multi-cores present a more suitable implementation platform. We use hybrid CPU/FPGA systems, as they can efficiently execute both the control-centric sequential and the data-parallel parts of an SMC application. However, even with hybrid CPU/FPGA platforms, determining the optimal HW/SW partitioning is challenging in general, and even impossible with a design-time approach. Thus, we need self-adaptive architectures and system software layers that are able to react autonomously to varying workloads and changing input data while preserving real-time constraints and area efficiency. In this article, we present a video tracking application modeled on top of a framework for implementing SMC methods on CPU/FPGA-based systems such as modern platform FPGAs. Based on a multithreaded programming model, our framework allows for an easy design space exploration with respect to the HW/SW partitioning. Additionally, the application can adaptively switch between several partitionings during run-time to react to changing input data and performance requirements. Our system utilizes two variants of an add/remove self-adaptation technique for task partitioning inside this framework that achieve soft real-time behavior while trying to minimize the number of active cores. To evaluate its performance and area requirements, we demonstrate the application and the framework on a real-life video tracking case study and show that partial reconfiguration can be effectively and transparently used for realizing adaptive real-time HW/SW systems.
2.
The authors present a real-time kernel developed to support a distributed multisensor system encountered in robotics applications. To ensure predictability, the kernel provides services with bounded worst-case execution times. In addition, the kernel allows the programmer to specify timing constraints for process execution and interprocess communication. The kernel uses these timing constraints both for scheduling processes and for scheduling communications. To illustrate the kernel, the authors describe a multisensor system being developed on their distributed real-time system. They present the measured performance of kernel primitives along with conclusions and remarks regarding distributed real-time systems.
3.
Rafael Rodríguez-Sánchez José Luis Martínez Gerardo Fernández-Escribano José Luis Sánchez José Manuel Claver 《Multimedia Tools and Applications》2013,66(3):361-381
The AVC video coding standard adopts variable block sizes for inter frame coding to increase compression efficiency, among other new features. As a consequence of this, an AVC encoder has to employ a complex mode decision technique that requires high computational complexity. Several techniques aimed at accelerating the inter prediction process have been proposed in the literature in recent years. Recently, with the emergence of many-core processors or accelerators, a new way of supporting inter frame prediction has presented itself. In this paper, we present a step forward in the implementation of an AVC inter prediction algorithm in a graphics processing unit, using Compute Unified Device Architecture. The results show a negligible drop in rate distortion with a time reduction, on average, of over 98.8 % compared with full search and fast full search, and of over 80 % compared with UMHexagonS search.
4.
Saadatmand Faezeh Sadat Rohbani Nezam Baharvand Farshad Farbeh Hamed 《The Journal of supercomputing》2021,77(2):1939-1957
The Journal of Supercomputing - Technology scaling has exacerbated the aging impact on the performance and reliability of integrated circuits. By entering into nanotechnology era in recent years,...
5.
The Journal of Supercomputing - In embedded systems such as automotive systems, multi-core processors are expected to improve performance and reduce manufacturing cost by integrating multiple...
6.
Real-Time Systems - Predictable execution time upon accessing shared memories in multi-core real-time systems is a stringent requirement. A plethora of existing works focus on the analysis of...
7.
Balarin F. Lavagno L. Murthy P. Sangiovanni-Vincentelli A. 《Design & Test of Computers, IEEE》1998,15(1):71-82
The authors review several approaches to control-oriented and dataflow-oriented software scheduling to determine whether a given technique can satisfy deadlines, throughput, and other constraints for embedded real-time systems.
8.
Arslan Munir Ann Gordon-Ross Sanjay Ranka Farinaz Koushanfar 《Journal of Parallel and Distributed Computing》2014
With Moore’s law supplying billions of transistors on-chip, embedded systems are undergoing a transition from single-core to multi-core to exploit this high transistor density for high performance. However, the optimal layout of these multiple cores along with the memory subsystem (caches and main memory) to satisfy power, area, and stringent real-time constraints is a challenging design endeavor. The short time-to-market constraint of embedded systems exacerbates this design challenge and necessitates the architectural modeling of embedded systems to reduce the time-to-market by expediting target applications to device/architecture mapping. In this paper, we present a queueing theoretic approach for modeling multi-core embedded systems that provides a quick and inexpensive performance evaluation both in terms of time and resources as compared to the development of multi-core simulators and running benchmarks on these simulators. We verify our queueing theoretic modeling approach by running SPLASH-2 benchmarks on the SuperESCalar simulator (SESC). Results reveal that our queueing theoretic model qualitatively evaluates multi-core architectures accurately with an average difference of 5.6% as compared to the architectures’ evaluations from the SESC simulator. Our modeling approach can be used for performance per watt and performance per unit area characterizations of multi-core embedded architectures, with varying number of processor cores and cache configurations, to provide a comparative analysis.
9.
Jiayin Li Zhong Ming Meikang Qiu Gang Quan Xiao Qin Tianzhou Chen 《Journal of Systems Architecture》2011,57(9):840-849
Multi-core technologies are widely used in embedded systems, and resource allocation is vital to guarantee Quality of Service (QoS) requirements for applications on multi-core platforms. For heterogeneous multi-core systems, the statistical characteristics of execution times on different cores play a critical role in the resource allocation, and the differences between the actual execution time and the estimated execution time may significantly affect the performance of resource allocation and cause the system to be less robust. In this paper, we present an evaluation method to study the impact of inaccurate execution time information on the performance of resource allocation. We propose a systematic way to measure the robustness degradation of the system and evaluate how inaccurate probability parameters may affect the performance of resource allocations. Furthermore, we use simulations to compare the performance of three widely used greedy heuristics when they are given inaccurate information.
10.
Guowei Wu Zichuan Xu 《Journal of Systems and Software》2010,83(12):2579-2590
High temperature will affect the stability and performance of multi-core processors. A temperature-aware scheduling algorithm for soft real-time multi-core systems is proposed in this paper, namely LTCEDF (Low Thermal Contribution Early Deadline First). According to the core temperature and thread thermal contribution, LTCEDF performs thread migration and exchange to avoid thermal saturation and to keep temperature equilibrium among all the cores. The core temperature calculation method and the thread thermal contribution prediction method are presented. LTCEDF is simulated on the ATMI simulator platform. Simulation results show that LTCEDF can not only minimize the thermal penalty, but also meet real-time guarantees. Moreover, it can create a more uniform power density map than other thermal-aware algorithms, and significantly reduce thread migration frequency.
11.
Hussein EL Ghor Maryline Chetto Rafic Hage Chehade 《Computers & Electrical Engineering》2011,37(4):498-510
Real-time scheduling refers to the problem in which there is a deadline associated with the execution of a task. In this paper, we address the scheduling problem for a uniprocessor platform that is powered by a renewable energy storage unit and uses a recharging system such as photovoltaic cells. First, we describe our model where two constraints need to be studied: energy and deadlines. Since executing tasks requires a certain amount of energy, classical task scheduling like earliest deadline is no longer convenient. We present an on-line scheduling scheme, called earliest deadline with energy guarantee (EDeg), that jointly accounts for characteristics of the energy source, capacity of the energy storage as well as energy consumption of the tasks, and time. In order to demonstrate the benefits of our algorithm, we evaluate it by means of simulation. We show that EDeg outperforms energy non-clairvoyant algorithms in terms of both deadline miss rate and size of the energy storage unit.
12.
Describes a fault-tolerant algorithm which uses a time-value scheduling approach to detect faults, sustain high processor utilization, and ensure timely execution of critical tasks.
13.
14.
《Control Engineering Practice》2007,15(3):363-375
This paper describes the architecture and design framework for a multiprocessor system on chip (SoC) solution that is being developed for adaptive, high-performance, embedded real-time control applications. Most of the design-to-implementation stages are automated by software tools avoiding most of the error-prone programming tasks and hardware-related issues. Therefore, the work presented here minimises the interdisciplinary design efforts typical to mechatronic systems design, allowing control engineers to focus mainly on the control laws development. The performance achieved by the proposed architecture allows for a straightforward addressing of implementation requirements for a variety of embedded applications, including micro-electromechanical systems.
15.
Jakob Engblom Andreas Ermedahl Mikael Sjödin Jan Gustafsson Hans Hansson 《International Journal on Software Tools for Technology Transfer (STTT)》2003,4(4):437-455
In this article we give an overview of the worst-case execution time (WCET) analysis research performed by the WCET group of the ASTEC Competence Centre at Uppsala University. Knowing the WCET of a program is necessary when designing and verifying real-time systems. The WCET depends both on the program flow, such as loop iterations and function calls, and on hardware factors, such as caches and pipelines. WCET estimates should be both safe (no underestimation allowed) and tight (as little overestimation as possible). We have defined a modular architecture for a WCET tool, used both to identify the components of the overall WCET analysis problem, and as a starting point for the development of a WCET tool prototype. Within this framework we have proposed solutions to several key problems in WCET analysis, including representation and analysis of the control flow of programs, modeling of the behavior and timing of pipelines and other low-level timing aspects, integration of control flow information and low-level timing to obtain a safe and tight WCET estimate, and validation of our tools and methods. We have focussed on the needs of embedded real-time systems in designing our tools and directing our research. Our long-term goal is to provide WCET analysis as a part of the standard tool chain for embedded development (together with compilers, debuggers, and simulators). This is facilitated by our cooperation with the embedded systems programming-tools vendor IAR Systems.
16.
Martin Humenberger Christian Zinner Michael Weber Wilfried Kubinger Markus Vincze 《Computer Vision and Image Understanding》2010,114(11):1180-1202
In this paper, the challenge of fast stereo matching for embedded systems is tackled. Limited resources, e.g. memory and processing power, and most importantly real-time capability on embedded systems for robotic applications, do not permit the use of most sophisticated stereo matching approaches. The strengths and weaknesses of different matching approaches have been analyzed and a well-suited solution has been found in a Census-based stereo matching algorithm. The novelty of the algorithm used is the explicit adaption and optimization of the well-known Census transform with respect to embedded real-time systems in software. The most important change in comparison with the classic Census transform is the usage of a sparse Census mask which halves the processing time with nearly unchanged matching quality. This is due to the fact that large sparse Census masks perform better than small dense masks with the same processing effort. The evidence of this assumption is given by the results of experiments with different mask sizes. Another contribution of this work is the presentation of a complete stereo matching system with its correlation-based core algorithm, the detailed analysis and evaluation of the results, and the optimized high speed realization on different embedded and PC platforms. The algorithm handles difficult areas for stereo matching, such as areas with low texture, very well in comparison to state-of-the-art real-time methods. It can successfully eliminate false positives to provide reliable 3D data. The system is robust, easy to parameterize and offers high flexibility. It also achieves high performance on several, including resource-limited, systems without losing the good quality of stereo matching. A detailed performance analysis of the algorithm is given for optimized reference implementations on various commercial off-the-shelf (COTS) platforms, e.g. a PC, a DSP and a GPU, reaching a frame rate of up to 75 fps for 640 × 480 images and 50 disparities. The matching quality and processing time are compared to other algorithms on the Middlebury stereo evaluation website, reaching a middle quality and top performance rank. Additional evaluation is done by comparing the results with a very fast and well-known sum of absolute differences algorithm using several Middlebury datasets and real-world scenarios.
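To make the dense versus sparse Census mask idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`census`, `hamming`, `mask_size`), the `step` parameter, and the "every other neighbour" sampling pattern are assumptions chosen for illustration, and the abstract does not specify the actual sparse pattern.

```python
def census(img, y, x, radius=2, step=1):
    """Census bit string for pixel (y, x).

    Each sampled neighbour contributes one bit: 1 if it is darker than
    the centre pixel, else 0. step=1 samples every neighbour (dense mask);
    step=2 samples every other row and column (a hypothetical sparse mask).
    The caller must keep (y, x) at least `radius` pixels from the border.
    """
    centre = img[y][x]
    bits = 0
    for dy in range(-radius, radius + 1, step):
        for dx in range(-radius, radius + 1, step):
            if dy == 0 and dx == 0:
                continue  # the centre is never compared with itself
            bits = (bits << 1) | (1 if img[y + dy][x + dx] < centre else 0)
    return bits


def hamming(a, b):
    """Matching cost between two Census bit strings."""
    return bin(a ^ b).count("1")


def mask_size(radius, step):
    """Number of comparisons per pixel for a given mask."""
    n = len(range(-radius, radius + 1, step))
    centre_sampled = radius % step == 0
    return n * n - (1 if centre_sampled else 0)
```

With `radius=2`, the dense mask needs 24 comparisons per pixel and this particular sparse pattern needs 8, which is the kind of effort reduction the abstract attributes to sparse masks; the matching cost between a left and a right pixel is then the Hamming distance of their bit strings.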
17.
Many embedded computing systems are distributed systems: communicating processes executing on several CPUs/ASICs. This paper describes a performance analysis algorithm for a set of tasks executing on a heterogeneous distributed system. Tight bounds are essential to the synthesis and verification of application-specific distributed systems, such as embedded computing systems. Our bounding algorithms are valid for a general problem model: The system can contain several tasks with hard real-time deadlines and different periods; each task is partitioned into a set of processes related by data dependencies. The periods of tasks and the computation times of processes are not necessarily constant and can be specified by a lower bound and an upper bound. Such a model requires a more sophisticated algorithm, but leads to more accurate results than previous work. Our algorithm both provides tighter bounds and is faster than previous methods.
18.
19.
Linwei Niu 《Real-Time Systems》2011,47(2):75-108
While the dynamic voltage scaling (DVS) techniques are efficient in reducing the dynamic energy consumption for the processor, varying voltage alone becomes less effective for the overall energy reduction as the static power is growing rapidly. On the other hand, Quality of Service (QoS) is also a primary concern in the development of today’s pervasive computing systems. In this paper, we propose a dynamic approach to minimize the overall energy consumption for soft real-time systems while ensuring the QoS-guarantee. The QoS requirements are deterministically quantified with the window-constraints, which require that at least m out of each non-overlapped window of k consecutive jobs of a task meet their deadlines. Necessary and sufficient conditions for checking the feasibility of task sets with arbitrary service times and periods are developed to ensure that the window-constraints can be guaranteed in the worst case. And efficient scheduling techniques based on pattern variation and dynamic slack reclaiming extensions are proposed to combine the task procrastination and dynamic slowdown to minimize the energy consumption. In contrast to the previous leakage-aware dynamic reclaiming work which never scales the job speed below the critical speed, we will show that it can be more energy efficient to reclaim the slack with speed lower than the critical speed when necessary. Through extensive simulations, our experiment results demonstrate that the proposed techniques significantly outperformed the previous research in both overall and idle energy reduction.
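The window-constraint defined in this abstract can be checked on a recorded job history with a few lines of Python. This is only a sketch of the definition (at least m of every non-overlapped window of k consecutive jobs meet their deadlines), not the paper's feasibility test or scheduler; the function name `windows_satisfied` and the choice to ignore a trailing incomplete window are assumptions.

```python
def windows_satisfied(met, m, k):
    """Check an (m, k) window-constraint on a job history.

    met: list of booleans, one per consecutive job of a task
         (True means the job met its deadline).
    Windows are non-overlapping blocks of k jobs: jobs 0..k-1, k..2k-1, ...
    A trailing block with fewer than k jobs is not checked (assumption).
    """
    for start in range(0, len(met) - k + 1, k):
        if sum(met[start:start + k]) < m:
            return False
    return True
```

For example, with m = 2 and k = 3, the history met, missed, met, met, met, missed satisfies the constraint, because each block of three jobs contains at least two met deadlines.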
20.
Mohammad Ashjaei Nima Khalilzad Saad Mubeen Moris Behnam Ingo Sander Luis Almeida Thomas Nolte 《Real-Time Systems》2017,53(6):916-956
Contemporary distributed embedded systems in many domains have become highly complex due to ever-increasing demand on advanced computer controlled functionality. The resource reservation techniques can be effective in lowering the software complexity, ensuring predictability and allowing flexibility during the development and execution of these systems. This paper proposes a novel end-to-end resource reservation model for distributed embedded systems. In order to support the development of predictable systems using the proposed model, the paper provides a method to design resource reservations and an end-to-end timing analysis. The reservation design can be subjected to different optimization criteria with respect to runtime footprint, overhead or performance. The paper also presents and evaluates a case study to show the usability of the proposed model, reservation design method and end-to-end timing analysis.