1.

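A Survey of Real-Time Microprocessor Architectures (total citations: 1; self-citations: 0; citations by others: 1)

石伟, 张明, 郭御风, 龚锐 《计算机工程与科学》(Computer Engineering & Science) 2015,37(5):857-864

Real-time applications have become a rapidly growing class of embedded applications. As the core component of a real-time system, the real-time microprocessor architecture is an important research direction in the microprocessor field. Unlike general-purpose processors, which pursue maximum throughput, real-time processors require a tight and computable worst-case execution time. Traditional real-time processors tend to adopt relatively simple structures to avoid the execution-time uncertainty that complex structures introduce. As real-time applications demand ever higher processor performance, real-time processors are gradually moving toward multithreaded and multi-core structures. In such processors, contention for shared resources makes real-time systems less deterministic, which poses greater challenges for real-time processor architecture. This paper surveys real-time microprocessor architectures. It first analyzes traditional real-time processors in terms of instruction set, microarchitecture, memory, I/O, and task scheduling; it then analyzes high-performance real-time processors that adopt multithreaded and multi-core structures; finally, it compares several commercial real-time processor architectures and summarizes the current state and future trends of real-time processors.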

2.

Code Mapping from a Real-Time Multitasking Execution Model to VxWorks

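刘晓燕, 张云生, 于立新, 沈嘉权, 李俊昌 《计算机工程与应用》(Computer Engineering and Applications) 2008,44(3):122-123

Based on an analysis of the abstract execution entities of real-time multitasking software and of the VxWorks platform, this paper proposes mapping rules, with code, from a real-time multitasking execution model to system calls within a C program framework on VxWorks. It presents the communication principles between real-time objects and how they are mapped, and gives the mapping code for real-time time management.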

3.

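Real-Time Optimization for Multi-Core Processors: A Real-Time Optimization Method Based on Independent Real-Time Domains

冯华, 卢凯, 王小平 《计算机科学》(Computer Science) 2013,40(9):159-162,189

Multi-core processors offer a good performance-to-power ratio, so their use in real-time embedded systems is a growing trend. Under existing software structures, however, the multi-core nature of these processors does not help improve real-time performance. Worse, resource sharing among cores complicates the factors that affect program execution time, making the worst-case execution time (WCET) of real-time tasks less predictable and harder to control. Using the Chinese-made FeiTeng (飞腾) processor, this paper studies how to build real-time systems and optimize real-time performance on multi-core processors, and proposes a "real-time optimization method based on independent real-time domains". Virtualization is used to divide the processor into a "real-time domain" and a "non-real-time domain", so that real-time and non-real-time tasks run on different cores. This makes full use of every core and schedules both kinds of task efficiently.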

4.

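High-Performance Spin Code Design

李升起, 李高超 《计算机研究与发展》(Journal of Computer Research and Development) 2012,(Z2):160-164

In high-performance multithreaded software systems for multi-core hardware, the performance of spin code has a considerable impact on overall system performance. This paper presents three spin-code designs with simulated performance data, and gives implementation recommendations. The simulation results show that the nanosleep-based scheme performs extremely poorly and should be avoided. The busy-based scheme keeps the processor spinning, which raises its power consumption. The pause-based scheme reduces processor power consumption and performs slightly better than the busy scheme, so it is recommended for high-performance systems.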

5.

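Application of Multi-Core Parallel Technology in Molecular Dynamics Simulation (total citations: 1; self-citations: 0; citations by others: 1)

刘青昆, 滕人达, 刘凤, 宫利东, 张建强 《计算机工程与设计》(Computer Engineering and Design) 2011,32(10):3395-3398

To make full use of multi-core processor resources, a multi-core parallel technique for molecular dynamics simulation is studied. On a multi-core processor, OpenMP is used to create and synchronize threads, to set the scheduling mode of child threads dynamically, and to balance load so as to reduce the time child threads spend waiting. Dynamics models of different molecular systems were tested to measure the parallel computation time for different numbers of child threads, and good speedups were obtained. The experimental results show that OpenMP parallelization effectively improves the time efficiency of the charge-solving process in molecular dynamics simulation, as well as the utilization of multi-core computer resources.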

6.

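Low-Power Multithreaded Compilation Optimization Techniques (total citations: 12; self-citations: 1; citations by others: 12)

赵荣彩, 唐志敏, 张兆庆, Guang R. Gao 《软件学报》(Journal of Software) 2002,13(6):1123-1129

This paper presents a theoretical model and method for effectively reducing power consumption in multithreaded architectures by lowering execution frequency. It first studies a computational model for identifying threads that can run at reduced frequency and for calculating the frequency-reduction factor. It then gives a low-power compilation optimization algorithm and implementation strategy that combine thread partitioning with an analysis of application behavior at compile time. The model and method apply to multiprocessor-style multithreaded architectures whose execution frequency can be adjusted dynamically; they both exploit thread-level parallelism (TLP) and effectively reduce power consumption.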

7.

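A WCET Analysis Method for Multi-Core Processors Based on Multi-Level Coherence Protocols

朱怡安, 史先琛, 姚烨, 李联, 任鹏远, 董威振, 李佳钰 《计算机研究与发展》(Journal of Computer Research and Development) 2023,(1):30-42

Owing to their superior computing performance, multi-core processors are now widely used in embedded real-time systems. Compared with single-core processors, multi-core processors suffer from contention for shared resources and interference between parallel tasks, and in particular from the Cache coherence problem, all of which make the worst-case execution time (WCET) of a task harder to predict. To address these factors, a WCET analysis method for multi-core processors based on multi-level coherence protocols is proposed. Targeting architectures with multi-level coherence protocols, the method introduces the concept of a multi-level coherence domain and divides data accesses in a multi-core processor into two levels: intra-domain accesses and cross-domain accesses. From the Cache read/write policy and the MESI (modified, exclusive, shared, invalid) coherence protocol, it derives Cache state update functions within a coherence domain and across coherence domains, which enables WCET analysis under nested multi-level coherence protocols. Experimental results show that, as Cache configuration parameters change, the method's results follow the same trend as GEM5 simulation results; correlation analysis gives a correlation coefficient of no less than 0.98 between the two. In terms of precision, the method's average overestimation ratio is 1.30, which is 0.78 lower than that of existing methods.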
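As a rough illustration of what a per-line Cache state update function looks like, the sketch below encodes the textbook MESI transitions for one cache line inside a single coherence domain. It is not the paper's multi-level update function: the event names (`local_read`, `local_write`, `remote_read`, `remote_write`) and the `others_have_copy` flag are assumptions made for this example.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def next_state(state, event, others_have_copy=False):
    """Textbook MESI transition for one cache line in one core's cache.

    event is 'local_read', 'local_write', 'remote_read' or 'remote_write'
    (hypothetical names). others_have_copy only matters on a local read
    miss, i.e. when the line is currently Invalid.
    """
    if event == "local_read":
        if state is State.I:
            # Read miss: Shared if another cache holds the line, else Exclusive.
            return State.S if others_have_copy else State.E
        return state  # Read hit leaves M, E and S unchanged.
    if event == "local_write":
        return State.M  # Any local write ends in Modified.
    if event == "remote_read":
        if state in (State.M, State.E):
            return State.S  # Downgrade; a Modified line is also written back.
        return state  # S stays S, I stays I.
    if event == "remote_write":
        return State.I  # Another core takes ownership: invalidate.
    raise ValueError(f"unknown event: {event}")
```

A static analysis would apply a function of this kind to an abstract cache state at every memory access; the cross-domain case described in the abstract would need a second, outer function that this sketch does not attempt.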

8.

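Hash Join Optimization on Shared-Cache Multi-Core Processors (total citations: 1; self-citations: 0; citations by others: 1)

邓亚丹, 景宁, 熊伟 《软件学报》(Journal of Software) 2010,21(6):1220-1232

Targeting today's mainstream multi-core processors, this paper studies the optimization of database hash joins on shared-cache multi-core processors. It first proposes a multithreaded execution framework for hash joins based on the Radix-Join algorithm and uses examples to analyze the factors that affect the performance of multithreaded Radix-Join. On this basis, it optimizes the various threads in the framework and their accesses to the shared Cache, optimizes the memory access of the hash join algorithm during cluster join, and analyzes the speedup of multithreaded cluster partitioning. The proposed framework was implemented on the open-source databases INGRES and EaseDB, and its performance was measured experimentally. The results show that the algorithm effectively resolves shared-Cache access conflicts under multithreading and processor load-balancing problems during hash join execution, and greatly improves hash join performance.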
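The core idea of a radix-partitioned hash join can be sketched in a few lines. The version below is a single-threaded illustration, not the paper's multithreaded framework: both inputs are split on the low bits of the key's hash, and each pair of matching partitions is then joined with an ordinary build-and-probe hash table. The function names and the dict-based row format are assumptions made for this example.

```python
from collections import defaultdict


def radix_partition(rows, key, bits):
    """Split rows into 2**bits partitions using the low bits of the key's hash."""
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for row in rows:
        parts[hash(row[key]) & mask].append(row)
    return parts


def radix_hash_join(left, right, key, bits=2):
    """Equi-join two lists of dict rows on `key`, one partition pair at a time."""
    result = []
    left_parts = radix_partition(left, key, bits)
    right_parts = radix_partition(right, key, bits)
    for lpart, rpart in zip(left_parts, right_parts):
        # Build phase: hash table over the left partition only.
        table = defaultdict(list)
        for lrow in lpart:
            table[lrow[key]].append(lrow)
        # Probe phase: rows with equal keys always land in the same partition.
        for rrow in rpart:
            for lrow in table.get(rrow[key], []):
                result.append((lrow, rrow))
    return result


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 5, "a": "z"}]
right = [{"id": 2, "b": "p"}, {"id": 5, "b": "q"},
         {"id": 5, "b": "r"}, {"id": 7, "b": "s"}]
pairs = radix_hash_join(left, right, "id")
```

Because each partition pair is independent and its hash table is small, the pairs are natural units of work to hand to separate threads, which is where the cache and load-balancing questions studied in the paper arise.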

9.

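OpenCMP: A Multi-Core Processor Simulator Supporting the Transactional Memory Model

何裕南, 安虹, 郭锐, 梁博 《计算机科学》(Computer Science) 2007,34(1):248-254

CPU design is shifting from single-threaded, single-core structures that exploit only instruction-level parallelism to multithreaded, multi-core structures that exploit thread-level parallelism. However, there is still no portable, widely used open-source multi-core processor simulator, which limits high-quality research on such structures. We developed OpenCMP, a multi-core processor architecture simulator, to support current and future research on key technologies for multithreaded multi-core processor architectures. The simulator abstracts the multi-core processor structure appropriately and provides an extensible, flexible simulation framework for research on mainstream multi-core structures. It supports simulation of out-of-order cores, in-order cores, and simultaneous multithreading cores, so that a larger multi-core design space can be studied comparatively. Taking a multi-core simulator that supports the transactional memory model as an example, this paper describes in detail how to design and implement a multi-core processor simulator by abstracting the most basic features and components of multi-core structures and of the transactional memory model, and by extending the single-core simulator SimpleScalar. Preliminary studies show that, compared with existing multi-core processor simulators, this simulator better supports research on the transactional memory model and on multi-core architectures based on it.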

10.

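A Nash-Equilibrium-Based Method for Mapping AUTOSAR Tasks to Multi-Core ECUs

冉正, 罗蕾, 晏华, 李允 《计算机科学》(Computer Science) 2018,45(6):166-171, 182

As automotive electronics applications demand ever more processor performance, the electronic control units (ECUs) in modern automotive electronic systems have been upgraded to multi-core structures. The design, implementation, and integration of AUTOSAR applications on multi-core ECUs face new challenges. One important challenge is ensuring the system's real-time performance while mapping tasks to a multi-core ECU. During AUTOSAR static configuration, the resource constraints and scheduling analysis of real-time systems complicate the problem further. This paper therefore proposes a Nash-equilibrium-based method for mapping AUTOSAR tasks to multi-core ECUs. The method applies task priorities in the game process, which has significant practical value for improving the efficiency of task mapping. Finally, the method is applied to an instance of the AUTOSAR standard. Experimental results show that it performs well in reducing the worst-case response time of the runnables in each task.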

11.

Towards a verified compiler prototype for the synchronous language SIGNAL

Zhibin YANG, Jean-Paul BODEVEIX, Mamoun FILALI, Kai HU, Yongwang ZHAO, Dianfu MA 《Frontiers of Computer Science》2016,10(1):37-53

SIGNAL belongs to the family of synchronous languages, which are widely used in the design of safety-critical real-time systems such as avionics, space systems, and nuclear power plants. This paper reports a compiler prototype for SIGNAL. Compared with the existing SIGNAL compiler, we propose a new intermediate representation (named S-CGA, a variant of clocked guarded actions) so that more synchronous programs can be integrated into our compiler prototype in the future. The front-end of the compiler, i.e., the translation from SIGNAL to S-CGA, is presented. In addition, the proof of semantics preservation is mechanized in the theorem prover Coq. Moreover, we present the back-end of the compiler, including sequential code generation and multithreaded code generation with time-predictable properties. With the rising importance of multi-core processors in safety-critical embedded systems and cyber-physical systems (CPS), there is a growing need for model-driven generation of multithreaded code and for mapping that code onto multi-core platforms. We propose a time-predictable multi-core architecture model in the architecture analysis and design language (AADL), and map the multithreaded code to this model.

12.

T-CREST: Time-predictable multi-core architecture for embedded systems

《Journal of Systems Architecture》2015,61(9):449-471

Real-time systems need time-predictable platforms to allow static analysis of the worst-case execution time (WCET). Standard multi-core processors are optimized for the average case and are hardly analyzable. Within the T-CREST project we propose novel solutions for time-predictable multi-core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared to other processors the WCET performance is outstanding. The T-CREST platform is evaluated with two industrial use cases. An application from the avionic domain demonstrates that tasks executing on different cores do not interfere with respect to their WCET. A signal processing application from the railway domain shows that the WCET can be reduced for computation-intensive tasks when distributing the tasks on several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8 and with 15 cores by a factor of 5.7. The T-CREST project is the result of a collaborative research and development project executed by eight partners from academia and industry. The European Commission funded T-CREST.
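The two reported WCET improvement factors can be put in perspective by dividing each by its core count, which gives the parallel efficiency of the WCET reduction. The helper name `parallel_efficiency` below is illustrative only; the two data points are the ones quoted in the abstract.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup achieved with the given core count."""
    return speedup / cores


# WCET improvement factors quoted in the abstract: {cores: factor}.
reported = {3: 1.8, 15: 5.7}

efficiency = {n: parallel_efficiency(s, n) for n, s in reported.items()}
for n in sorted(efficiency):
    print(f"{n:2d} cores: factor {reported[n]:.1f}, efficiency {efficiency[n]:.2f}")
```

The efficiency falls from about 0.60 at three cores to about 0.38 at fifteen, so the WCET bound keeps improving as cores are added but with diminishing returns per core.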

13.

A scenario- and platform-aware design flow for image-based control systems

《Microprocessors and Microsystems》2020

Image-based control (IBC) systems are increasingly being used in various domains including autonomous driving. The key challenge in IBC is to deal with high computation demand while guaranteeing performance and safety requirements such as stability. While modern industrial heterogeneous platforms, such as NVIDIA Drive, offer the necessary compute power, application development on these platforms with performance and safety guarantees is still challenging. Alternative time-predictable platforms are not yet in widespread use. A typical design flow for IBC systems consists of three distinct elements: (i) mapping tasks onto platform resources; (ii) timing analysis, consisting of task-level worst-case execution time (WCET) analysis and application-level analysis to obtain worst-case performance bounds on aspects such as latency and throughput; (iii) controller design using the obtained performance bounds, ensuring performance and safety. While such a three-step design process is modular in nature, it usually leads to over-dimensioned systems with sub-optimal performance, because task- and/or application-level timing bounds are pessimistic. We present a scenario- and platform-aware design flow for IBC systems that exploits frequently occurring workload scenarios to optimize performance. For industrial platforms, where tight task-level WCET bounds are difficult to obtain, we moreover propose to use frequently occurring task execution times instead of WCET estimates to obtain tight application-level temporal bounds. During controller design, we then optimize performance and guarantee stability by identifying appropriate system scenarios and by designing a switched controller that switches between those scenarios. We illustrate our method considering a predictable multiprocessor system-on-chip platform - CompSOC.
We validate the proposed method using hardware-in-the-loop (HiL) experiments with an industrial heterogeneous multiprocessor platform - NVIDIA Drive PX2 - considering a lane keeping assist system (LKAS). We obtain an improved control performance compared to state-of-the-art IBC design.

14.

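Research on Real-Time Multi-Core Architectures for WCET Analysis

陈芳园, 丁亚军, 张冬松, 吴飞, 任秀江 《计算机工程与科学》(Computer Engineering & Science) 2014,36(3):393-398

With advances in process technology and the ever-expanding scope and rising demands of embedded real-time applications, multi-core processors, with their high performance and low power consumption, are bound to be applied in the embedded real-time domain. However, multi-core processor architectures find it difficult, or even impossible, to satisfy the timing constraints of real-time systems and the requirement that the WCET be predictable. Starting from the shared resources in multi-core processors, this paper analyzes how on-chip shared resources (shared Cache, on-chip interconnect) and off-chip shared resources (off-chip memory) affect WCET analysis, and discusses WCET analysis methods under each kind of interference. It introduces two multi-core WCET analysis models, a static model and a hybrid model, and proposes multi-core design principles for embedded real-time applications.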

15.

Using design space exploration for finding schedules with guaranteed reaction times of synchronous programs on multi-core architecture

《Journal of Systems Architecture》2017

The synchronous model of computation is well suited for real-time systems, because it allows static analysis in order to find and guarantee their reaction times. Multi-core systems are becoming the predominant computing platforms. Synchronous programs are typically compiled into single-threaded code, which makes them unsuitable for exploiting the parallelism of multi-core platforms. Moreover, static timing analysis becomes highly intractable for multi-core systems. This article proposes a novel methodology that aims at finding the mapping and schedule of synchronous programs that statically guarantee reaction times when mapped onto a multi-core system consisting of two types of time-predictable cores. The proposed methodology combines design space exploration based on an evolutionary algorithm with scheduling of parts of synchronous programs. It allows minimizing the resource usage in terms of number of cores by finding the mapping and schedule with the guaranteed reaction time for architectures with different numbers of cores. In particular, we: (a) transform a synchronous program written in synchronous SystemJ to a graph-based model represented with two types of computation nodes suitable for execution on two types of time-predictable cores, (b) perform mapping of computation nodes on a customizable multi-core platform using genetic operations, and (c) generate a resulting static schedule of computation nodes for each mapping as part of the design space exploration. The design flow, from program specification and node mapping to the design space exploration and multi-core scheduling, is completely automated.

16.

Transforming flow information during code optimization for timing analysis

Raimund Kirner, Peter Puschner, Adrian Prantl 《Real-Time Systems》2010,45(1-2):72-105

The steadily growing embedded-systems market comprises many application domains in which real-time constraints must be satisfied. To guarantee that these constraints are met, the analysis of the worst-case execution time (WCET) of software components is mandatory. In general WCET analysis needs additional control-flow information, which may be provided manually by the user or calculated automatically by program analysis. For flexibility and simplicity reasons it is desirable to specify the flow information at the same level at which the program is developed, i.e., at the source level. In contrast, to obtain precise WCET bounds the WCET analysis has to be performed at machine-code level. Mapping and transforming the flow information from the source-level down to the machine code, where flow information is used in the WCET analysis, is challenging, even more so if the compiler generates highly optimized code. In this article we present a method for transforming flow information from source code to machine code. To obtain a mapping that is safe and accurate, flow information is transformed in parallel to code transformations performed by an optimizing compiler. This mapping is not only useful for transforming manual code annotations but also if platform-independent flow information is automatically calculated at the source level. We show that our method can be applied to every type of semantics-preserving code transformation. The precision of this flow-information transformation allows its users to calculate tight WCET bounds.

17.

Worst‐case execution time analysis for a Java processor

Martin Schoeberl, Wolfgang Puffitsch, Rasmus Ulslev Pedersen, Benedikt Huber 《Software》2010,40(6):507-542

In this paper, we propose a solution for a worst‐case execution time (WCET) analyzable Java system: a combination of a time‐predictable Java processor and a tool that performs WCET analysis at Java bytecode level. We present a Java processor, called JOP, designed for time‐predictable execution of real‐time tasks. The execution time of bytecodes, the instructions of the Java virtual machine, is known to cycle accuracy for JOP. Therefore, JOP simplifies the low‐level WCET analysis. A method cache, which fills whole Java methods into the cache, simplifies cache analysis. The WCET analysis tool is based on integer linear programming. The tool performs the low‐level analysis at the bytecode level and integrates the method cache analysis. An integrated data‐flow analysis performs receiver‐type analysis for dynamic method dispatches and loop‐bound analysis. Furthermore, a model checking approach to WCET analysis is presented where the method cache can be exactly simulated. The combination of the time‐predictable Java processor and the WCET analysis tool is evaluated with standard WCET benchmarks and three real‐time applications. The WCET friendly architecture of JOP and the integrated method cache analysis yield tight WCET bounds. Comparing the exact, but expensive, model checking‐based analysis of the method cache with the static approach demonstrates that the static approximation of the method cache is sufficiently tight for practical purposes. Copyright © 2010 John Wiley & Sons, Ltd.

18.

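Platform-Portable Worst-Case Execution Time Analysis

刘育芳, 张立臣 《计算机工程与设计》(Computer Engineering and Design) 2006,27(8):1317-1320

Many current worst-case execution time analysis tools target a specific programming language or a specific compiler; they therefore lack portability across platforms and cannot be widely used. This paper introduces a platform-portable worst-case execution time analysis method based on Java bytecode. The method has two parts: a high-level analysis of the Java bytecode, which extracts the program's data-flow and control-flow information, and a low-level analysis of the Java virtual machine, which yields a timing model of the virtual machine. The two analyses are then combined to obtain the program's worst-case execution time. Directions for future research are also discussed.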

19.

Complete worst-case execution time analysis of straight-line hard real-time programs

《Journal of Systems Architecture》2000,46(4):339-355

In this article, the problem of finding a tight estimate on the worst-case execution time (WCET) of a hard real-time program is addressed. The analysis is focused on straight-line code (without loops and recursive function calls), which is quite commonly found in synthesised code for embedded systems. A comprehensive timing analysis system covering both low-level (assembler instruction level) and high-level (programming language level) aspects is presented. The low-level analysis covers all speed-up mechanisms used in modern superscalar processors: pipelining, instruction-level parallelism and caching. The high-level analysis uses the results from the low-level analysis to compute the final estimate on the WCET. This is done by a heuristic that searches for the longest really executable path in the control flow, based on the functional dependencies between various program parts.
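For loop-free code the control-flow graph is acyclic, so a first WCET estimate is simply the most expensive path through it. The sketch below computes that structural longest path; it deliberately ignores the feasibility check ("really executable") that the article's heuristic adds, so it can only over-estimate relative to that method. The block names and cycle counts are made up for the example.

```python
from functools import lru_cache


def longest_path_cost(cost, succ, entry):
    """Maximum summed block cost over all paths starting at `entry`.

    cost: dict mapping block -> cycles (from some low-level analysis).
    succ: dict mapping block -> list of successor blocks.
    The graph must be acyclic, which holds for straight-line code.
    """
    @lru_cache(maxsize=None)
    def best(node):
        tails = [best(n) for n in succ.get(node, [])]
        return cost[node] + (max(tails) if tails else 0)

    return best(entry)


# Hypothetical diamond-shaped control flow: A branches to B or C, both reach D.
cost = {"A": 5, "B": 10, "C": 3, "D": 4}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
bound = longest_path_cost(cost, succ, "A")  # path A -> B -> D
```

If the condition guarding B could be shown never to hold, the true bound would be the A, C, D path instead; excluding such infeasible paths is the job of the functional-dependency analysis mentioned in the abstract.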

20.

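An Automatic Multithreaded Code Generation Tool for a Synchronous Language

杨志斌, 袁胜浩, 谢健, 周勇, 陈哲, 薛垒, Jean-Paul BODEVEIX, Mamoun FILALI 《软件学报》(Journal of Software) 2019,30(7):1980-2002

As safety-critical systems demand ever more computing performance, multi-core processors, which provide greater computing power while reducing the size, weight, and power consumption of electronic equipment, will be widely used in safety-critical domains. Synchronous languages can express deterministic concurrent behavior and have precise timing semantics, which makes them suitable for modeling and verifying safety-critical software. At present, compilers for the synchronous language SIGNAL mainly support sequential code generation and pay little attention to multithreaded code generation. This paper proposes a multithreaded code generation tool for SIGNAL. The tool first translates a SIGNAL program into an S-CGA intermediate program that has undergone clock calculus. It then converts the S-CGA program into a clock data dependency graph to analyze dependencies. Next, it partitions the graph by topological sorting and, for the resulting partition, proposes an optimization algorithm and a pipeline-based task partitioning method. Finally, it converts the partition into a virtual multithreaded structure and generates executable multithreaded C/Java code. Experiments on a multi-core processor verify the effectiveness of the proposed method.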
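The partitioning step can be pictured as a layered topological sort: nodes whose dependencies are all satisfied form one level, and every node within a level can run in parallel. The sketch below shows only this layering idea on a plain dependency dictionary; the tool's actual clock data dependency graph, optimization algorithm, and pipeline partitioning are not modeled, and the node names are hypothetical.

```python
def topological_levels(deps):
    """Group nodes into levels so each node depends only on earlier levels.

    deps: dict mapping node -> set of nodes it depends on. Every node,
    including those with no dependencies, must appear as a key.
    """
    remaining = {node: set(d) for node, d in deps.items()}
    levels = []
    while remaining:
        # Nodes with no unsatisfied dependency are ready to be scheduled.
        ready = sorted(n for n, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle or unknown node in deps")
        levels.append(ready)
        for node in ready:
            del remaining[node]
        for d in remaining.values():
            d.difference_update(ready)
    return levels


# Hypothetical dependency graph: c needs a and b, d needs c, e needs a.
deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
levels = topological_levels(deps)
```

Here the result is three levels, `a` and `b`, then `c` and `e`, then `d`; a code generator could assign the nodes of one level to different threads and place a synchronization point between levels.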