Similar literature
Found 20 similar documents (search time: 0 ms)
1.
Worst Case Execution Time Analysis for a Processor with Branch Prediction   Cited by: 4 (self-citations: 0, by others: 4)
Colin, Antoine; Puaut, Isabelle. Real-Time Systems, 2000, 18(2-3): 249-274
The fundamental requirement for hard real-time systems is that task deadlines never be missed. As a consequence, knowing tasks' worst case execution times (WCET) is crucial for such systems. Taking into account modern architectural features makes it possible to determine tighter WCET bounds than with program analysis that ignores such features. While effects of caches and pipelines on WCET analysis have been extensively studied, to our knowledge the effect of branch prediction on WCET evaluation has not been studied yet. This paper describes a method for statically bounding the number of timing penalties due to erroneous branch predictions. The proposed method is based on static program analysis and branch target buffer modelling. It consists of collecting information on branch target buffer evolution by considering all possible execution paths of a program. Collected information can then be used to classify control transfer instructions so that their worst case branching cost can be estimated and incorporated into the program WCET. A method is also given to tightly predict the WCET of loops whose number of iterations depends on counter variables of outer loops. Experimental results show that the timing penalty due to wrong branch predictions estimated by the proposed technique is close to the real one, which demonstrates the practical applicability of our method.

2.
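Giotto is a high-level programming language suited to hard real-time control applications. This paper introduces the Giotto programming method and uses the design of an autonomous model aircraft as an example to show how Giotto is applied in a practical design. Because the Giotto design process separates timing code from functional code, Giotto programs are independent of the specific execution platform, which improves program robustness and stability and makes code more reusable.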

3.
With the increased demand in multimedia applications, the need to provide better system support is greater than ever. Multimedia applications have an added dimension of time in their execution which results in stringent timing requirements. Existing systems incorporate such stringent timing requirements either at the system-level or the application-level. System-level supports are typically operating-system-dependent whereas application-level supports are achieved by building timing controls into the application itself. This lengthens the application development time and fails to take full advantage of the operating system's capabilities. In this paper, we propose a framework that resides between the system-level and application-level support. The framework consists of two layers: an interface layer that incorporates high-level end-to-end timing constraints, and a system layer that implements a host-end scheduling mechanism to support high-level end-to-end timing specifications. Two applications have been developed using this framework. The results indicate that the framework is able to support rapid prototyping of multimedia applications with stringent timing requirements.

4.
Matching an application to an architecture in structure and size is a way of achieving higher computation speed. This paper presents a combination of a compiler and a reconfigurable long instruction word (RLIW) architecture as an approach to the matching problem. Configurations suitable for the execution of different parts of a program are determined by a compiler, and code is generated for both reconfiguring the hardware and performing the computation. The RLIW machine, consisting of multiple processing and global data memory modules, effectively utilizes the fine-grained parallelism detected in programs by a compiler. The long word instructions control the operation of processing and memory modules in the system. To reduce the data transfer between processing modules and data memory modules, we provide reconfigurable interconnections among the processing modules which permit direct communication. The compiler uses new techniques, including region scheduling, generation of code for reconfiguration of the system, and memory allocation techniques, to achieve improved performance. Algorithms for packing operations into long word instructions and techniques for effectively assigning memory modules to the operands required by an instruction are developed. Results of the experiments to evaluate the system indicate that speedups of 60–300% can be obtained for both scientific and nonscientific programs. The reconfigurable architecture is responsible for much of the speedup. Also, the results indicate that the major problem of memory bottleneck faced in designing parallel systems is successfully attacked. This paper represents work done while the author was at the University of Pittsburgh.

5.
This paper presents a set of efficient graph transformations for local instruction scheduling. These transformations to the data-dependency graph prune redundant and inferior schedules from the solution space of the problem. Optimally scheduling the transformed problems using an enumerative scheduler is faster and the number of problems solved to optimality within a bounded time is increased. Furthermore, heuristic scheduling of the transformed problems often yields improved schedules for hard problems. The basic node-based transformation runs in O(ne) time, where n is the number of nodes and e is the number of edges in the graph. A generalized subgraph-based transformation runs in O(n^2 e) time. The transformations are implemented within the GNU Compiler Collection (GCC) and are evaluated experimentally using the SPEC CPU2000 floating-point benchmarks targeted to various processor models. The results show that the transformations are fast and improve the results of both heuristic and optimal scheduling.

6.
Lee, Yann-Hang; Krishna, C. M. Real-Time Systems, 2003, 24(3): 303-317
Power and energy constraints are becoming increasingly prevalent in real-time embedded systems. Voltage-scaling is a promising technique to reduce energy and power consumption: clock speed tends to decrease linearly with supply voltage while power consumption goes down quadratically. We therefore have a tradeoff between the energy consumption of a task and the speed with which it can be completed. The timing constraints associated with real-time tasks can be used to resolve this tradeoff. In this paper, we present two algorithms for voltage-scaling. Assuming that a processor can operate in one of two modes, high voltage and low voltage, we show how to schedule the voltage settings so that deadlines are met while reducing the total energy consumed. We show that significant reductions can be made in energy consumption.
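
The tradeoff in this abstract can be illustrated with a minimal sketch. It assumes a single task with a known cycle count, two hypothetical modes with illustrative frequencies and voltages, energy per cycle proportional to the square of the supply voltage, and no mode-switching overhead. It is not the authors' algorithm; the function name and all parameters are assumptions made for illustration.

```python
def split_cycles(cycles, deadline, f_hi, f_lo, v_hi, v_lo):
    """Split a task's cycles between a low- and a high-voltage mode.

    Assumed model (not taken from the paper): energy per cycle is
    proportional to voltage squared, and switching modes is free.
    Returns (low_cycles, high_cycles, energy).
    """
    if not f_lo < f_hi:
        raise ValueError("the low-voltage mode must be the slower one")
    if cycles / f_hi > deadline:
        raise ValueError("deadline cannot be met even at high voltage")

    # Running x cycles slowly takes x/f_lo + (cycles - x)/f_hi in total.
    # Solve that expression = deadline for x, then cap x at the task length.
    slack = deadline - cycles / f_hi
    x = slack / (1.0 / f_lo - 1.0 / f_hi)
    low = min(float(cycles), x)
    high = cycles - low

    energy = low * v_lo ** 2 + high * v_hi ** 2
    return low, high, energy
```

With 100 cycles, a deadline of 75 time units, frequencies 2 and 1, and voltages 2 and 1, half of the cycles can run in the low-voltage mode, and the energy under this model drops from 400 (all high voltage) to 250.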

7.
We propose timed SCR specifications, which are a generalization of SCR specifications, intended to specify quantitative timing properties of real-time systems. We extend the tabular notation of the SCR method to deal with sporadic and periodic timing constraints. We present a formal semantics for timed SCR specifications by translating them into timed transition systems. A shutdown system in Korean nuclear power plants is used as a case study to illustrate timed SCR specifications.

8.
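Two-level partition scheduling of hard real-time systems under strong partitioning constraints   Cited by: 4 (self-citations: 0, by others: 4)
This paper studies two-level partition scheduling in hard real-time systems under strong partitioning constraints. It establishes a two-level partition scheduling model under these constraints and gives a worst-case condition for deciding whether a partition's task set is schedulable. On this basis, it proposes a partition design method matched to partition utilization and derives the least upper bound on the schedulable utilization of the system under this method. Simulation experiments show that, under strict real-time conditions, the proposed method outperforms existing methods and raises the least upper bound on partition schedulable utilization.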

9.
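Research on an object-based scheduling model for distributed real-time systems   Cited by: 2 (self-citations: 0, by others: 2)
To address allocation and scheduling problems in distributed real-time systems, this paper presents, and describes formally, a general object-based scheduling model for distributed real-time systems. The model includes absolute timing constraints that express deadlines, periodic constraints that express periodicity, relative timing constraints that express various precedence relations and synchronization requirements, and consistency constraints that ensure consistent use of resources. The model also overcomes a shortcoming of earlier models, which could not describe real-time constraints on the logical and functional components of an application system: it allows the required constraints to be described on methods and activities, which reduces the complexity of describing individual constraints. So that existing scheduling algorithms can be used for task scheduling, the paper discusses constraint conversion and gives rules for converting high-level constraints to low-level constraints, together with the corresponding conversion algorithm.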

10.
This paper introduces a new class of applications for constraint programming. This new type of application originates out of a special class of real-time systems, enjoying increasing popularity in areas such as automotive electronics and the aerospace industry. Real-time systems of this kind are time triggered in the sense that their overall behavior is globally controlled by a recurring clock tick. Being able to compute an appropriate pre-runtime schedule automatically is the major challenge for such an architecture. What makes this specific off-line scheduling problem somewhat atypical is that a potentially indefinite, periodic processing has to be mapped onto a single time window. In this article we will show how this problem can be solved by constraint programming and we will describe which techniques from traditional scheduling and real-time computing led to success and which failed when confronted with a large-scale application of this type. The techniques that proved to be the most successful were special global constraints and an elaborate search heuristic from Operations Research. Also, for finding a valid schedule, mere serialization is shown to be sufficient. The actual implementation was done in the concurrent constraint programming language Oz.

11.
Pop, Paul; Eles, Petru; Peng, Zebo. Real-Time Systems, 2004, 26(3): 297-325
We present an approach to static priority preemptive process scheduling for the synthesis of hard real-time distributed embedded systems where communication plays an important role. The communication model is based on a time-triggered protocol. We have developed an analysis for the communication delays with four different message scheduling policies over a time-triggered communication channel. Optimization strategies for the synthesis of communication are developed, and the four approaches to message scheduling are compared using extensive experiments.

12.
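Packet scheduling in Fibre Channel switches under hard real-time constraints   Cited by: 3 (self-citations: 0, by others: 3)
Motivated by the study of Fibre Channel switched network performance under hard real-time constraints, this paper adopts the periodic task model from real-time communication and proposes a load-matched weighted round-robin packet scheduling method. It derives a necessary and sufficient condition for a network message set to be strictly real-time under this method, demonstrates the advantage of the algorithm using the achievable network load ratio under worst-case hard real-time conditions as the performance metric, and verifies the result by simulation.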

13.
Mueller, Frank. Real-Time Systems, 2000, 18(2-3): 217-247
This paper contributes a comprehensive study of a framework to bound worst-case instruction cache performance for caches with arbitrary levels of associativity. The framework is formally introduced, operationally described and its correctness is shown. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The low cache simulation overhead allows interactive use of the analysis tool and scales well with increasing associativity. The approach taken is based on a data-flow specification of the problem and provides another step toward worst-case execution time prediction of contemporary architectures and its use in schedulability analysis for hard real-time systems.

14.
Lee, Minsuk; Min, Sang Lyul; Shin, Heonshik; Kim, Chong Sang; Park, Chang Yun. Real-Time Systems, 1997, 13(1): 47-65
Cache memories have been extensively used to bridge the speed gap between high speed processors and relatively slow main memory. However, they are not widely used in real-time systems due to their unpredictable performance. This paper proposes an instruction prefetching scheme called threaded prefetching as an alternative to instruction caching in real-time systems. In the proposed threaded prefetching, an instruction block pointer called a thread is assigned to each instruction memory block and is made to point to the next block on the worst case execution path that is determined by a compile-time analysis. Also, the thread is not updated throughout the entire program execution to guarantee predictability. This paper also compares the worst case performances of various previous instruction prefetching schemes with that of the proposed threaded prefetching. By analyzing several benchmark programs, we show that the worst case performance of the proposed scheme is significantly better than those of previous instruction prefetching schemes. The results also show that when the block size is large enough the worst case performance of the proposed threaded prefetching scheme is almost as good as that of an instruction cache with 100% hit ratio.

15.
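A real-time scheduling strategy supporting substitution/compensation   Cited by: 1 (self-citations: 0, by others: 1)
This paper proposes a real-time transaction model that supports substitution and compensation. Real-time transactions in this model have strong adaptability and self-correction capabilities, which makes the model suitable for embedded real-time database systems. Based on an analysis of the real-time and value characteristics of compensation tasks, the paper studies when compensation tasks should be scheduled and gives the corresponding scheduling strategy and implementation algorithm.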

16.
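Implementation of the Bootloader (assembly part) for the IXP425 processor   Cited by: 6 (self-citations: 0, by others: 6)
Zheng Qianbin (郑虔斌); Zhu Xutao (朱旭涛). 《微机发展》 (Microcomputer Development), 2005, 15(3): 93-94, 97
The bootloader is the first part of embedded development, and different bootloaders offer different functions; mature ones include RedBoot and U-Boot. When developing a new system, this mature code cannot be used directly, and writing a new bootloader from scratch wastes resources, so porting becomes the first choice. Porting work is concentrated mainly in the assembly code; once the assembly part is understood, little modification of the C part remains. Using the Intel IXP425 processor as an example, this paper walks through the execution of the bootloader's assembly code.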

17.
Dataflow analysis is a well-understood and very powerful technique for analyzing programs as part of the compilation process. Virtually all compilers use some sort of dataflow analysis as part of their optimization phase. However, despite being well-understood theoretically, such analyses are often difficult to code, making it difficult to quickly experiment with variants. To address this, we developed a domain-specific language, Analyzer Generator (AG), that synthesizes dataflow analysis phases for Microsoft's Phoenix compiler framework. AG hides the fussy details needed to make analyses modular, yet generates code that is as efficient as the hand-coded equivalent. One key construct we introduce allows IR object classes to be extended without recompiling. Experimental results on three analyses show that AG code can be one-tenth the size of the equivalent handwritten C++ code with no loss of performance. It is our hope that AG will make developing new dataflow analyses much easier.

18.
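Embedded systems built around embedded processors and embedded operating systems are widely used, but they are not well suited to real-time data processing, whereas DSP processors are. System designs in which an embedded processor and a DSP work together have therefore become one direction of embedded system development. This paper proposes and designs a biochip scanning analyzer. The system is built around a high-performance embedded processor and a DSP processor working together and includes automatic data analysis algorithms, integrating biochip scanning and analysis without manual intervention. It offers high reliability, strong real-time performance, small size, and low power consumption. The paper describes in detail the overall system structure and the implementation of the hardware and software.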

19.
In a SIMD or VLIW machine, conceptual synchronizations are accomplished by using a static code schedule that does not require run-time synchronization. The lack of run-time synchronization overhead makes these machines very effective for fine-grain parallelism, but they cannot execute parallel code structures as general as those executed by MIMD architectures, and this limits their utility. In this paper we present a timing analysis that allows a compiler for a MIMD machine to eliminate a large fraction of the run-time synchronization by making efficient use of static code scheduling. Although these techniques can be adapted to be applied to most MIMD machines, this paper centers on the analysis and scheduling for barrier MIMD machines. Barrier MIMDs are asynchronous multiple instruction stream/multiple data stream architectures capable of parallel execution of variable execution-time instructions and arbitrary control flow (e.g., while loops and calls). However, they also incorporate a special hardware barrier synchronization mechanism that facilitates static scheduling by providing a mechanism which the compiler can use to enforce precise timing constraints. In other words, the compiler tracks relative timing between processors and uses static code scheduling until the timing imprecision becomes too large, at which point the compiler simply inserts a barrier to reduce that timing imprecision to zero (or a small constant). This paper describes new scheduling and barrier placement algorithms for barrier MIMDs that are based loosely on the list scheduling approach employed for VLIWs [Ellis 1985]. In addition, the experimental results from scheduling thousands of synthetic benchmark programs for a parameterized barrier MIMD machine are presented.
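
The idea of inserting a barrier once timing imprecision grows too large can be sketched as follows. This is a simplified illustration, not the scheduling or barrier placement algorithm from the paper: it assumes a single instruction stream in which each instruction has a known minimum and maximum execution time, treats the accumulated difference as the imprecision, and assumes a barrier resets it to zero. The function name and the `max_skew` threshold are hypothetical.

```python
def place_barriers(instr_times, max_skew):
    """Return the indices of instructions after which a barrier is inserted.

    instr_times: list of (min_time, max_time) pairs, one per instruction.
    max_skew: largest timing imprecision tolerated before a barrier
              (an assumed tuning parameter, not from the paper).
    """
    barriers = []
    skew = 0  # accumulated uncertainty since the last barrier
    for index, (t_min, t_max) in enumerate(instr_times):
        skew += t_max - t_min
        if skew > max_skew:
            # Imprecision is too large: synchronize and start over.
            barriers.append(index)
            skew = 0
    return barriers
```

For example, with instruction times `[(1, 1), (2, 4), (1, 3), (5, 5), (1, 4)]` and `max_skew=3`, the accumulated imprecision first exceeds the threshold after the third instruction, so a single barrier is placed after index 2.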

20.
Array redistribution is usually required to execute a data-parallel program more efficiently on distributed-memory multicomputers. When array redistribution is performed in synchronous communication mode, data communications among the processors should be properly arranged to avoid incurring higher data transfer cost. Some efficient communication scheduling methods for Block-Cyclic redistribution have been proposed. The processor mapping technique, on the other hand, can help reduce the data transfer cost of redistribution. To preserve the benefit of this cost reduction, optimal communication schedules must be constructed for redistributions in which the processor mapping technique is applied. In this paper, we present a unified approach to constructing optimal communication schedules for Block-Cyclic redistribution with the processor mapping technique applied. The proposed method is founded on the processor mapping technique and can construct the required communication schedules more efficiently than other optimal scheduling methods.
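
To make concrete what has to be scheduled, the sketch below counts how many array elements each source processor must send to each destination processor when a one-dimensional array moves from one block-cyclic layout to another. It uses the standard block-cyclic ownership rule and treats source and destination processors as separate sets of ranks. It does not implement the processor mapping technique or the scheduling method of the paper; all function names are illustrative.

```python
from collections import Counter


def owner(index, block, procs):
    """Owner of element `index` under a block-cyclic layout
    with the given block size on `procs` processors."""
    return (index // block) % procs


def comm_matrix(n, src_block, src_procs, dst_block, dst_procs):
    """Map (source, destination) pairs to the number of elements
    that must travel between them during redistribution."""
    counts = Counter()
    for i in range(n):
        s = owner(i, src_block, src_procs)
        d = owner(i, dst_block, dst_procs)
        counts[(s, d)] += 1
    return dict(counts)
```

For a 12-element array moving from block size 2 on 2 processors to block size 1 on 3 processors, every source sends exactly 2 elements to every destination. A communication schedule would then arrange these six messages into steps so that no processor sends or receives more than one message per step.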
