首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
随着安全关键系统对计算性能要求的日趋提高,能够提供更强计算能力而又减少电子设备的体积、重量和功耗的多核处理器将在安全关键领域得到广泛应用.同步语言能够表达确定性并发行为且具有精确时间语义等特性,适用于安全关键软件的建模和验证.目前,同步语言SIGNAL编译器主要支持串行代码生成,较少关注多线程代码生成.提出一种同步语言SIGNAL多线程代码生成工具.首先将SIGNAL程序转换为经过时钟演算的S-CGA中间程序;之后将S-CGA中间程序转换为时钟数据依赖图以分析依赖关系;然后对时钟数据依赖图进行拓扑排序划分,并针对划分结果提出优化算法和基于流水线方式的任务划分方法;最后将划分结果转换为虚拟多线程结构并进一步生成可执行多线程C/Java代码.通过在多核处理器上的实验,验证了所提方法的有效性.  相似文献   

2.
随着对安全攸关实时系统功能与非功能要求的日益增加,使用多核技术将成为发展趋势.如何在多核平台条件下保证系统运行的可信任性及可靠性是学术上和应用上的关键问题.目前基于形式化方法的系统设计、验证以及自动代码生成已经在单核平台上形成很多研究成果,但在多核平台上的研究仍面临许多科学问题.同步语言SIGNAL是一种被广泛应用于安全攸关实时系统功能设计的形式化方法,适用于对系统确定性并发行为的描述.SIGNAL编译器也支持将同步规范SynchronousSpecification)生成仿真代码,以对其进行验证与分析.然而现有研究较少关注从SIGNAL同步规范到支持跨平台并行代码的生成方法.本文研究面向SIGNAL同步规范的并行自动代码生成方法.提出了方程依赖图EDG的概念,将SIGNAL规范转换为EDG以分析其全局数据依赖关系;研究了对EDG进行任务划分获取规范中可以并行执行部分的算法;最后,以跨平台并行编程API-OpenMP作为对象,结合程序中信号的时钟关系,将并行任务映射到OpenMP并行代码,并进行了实例验证.  相似文献   

3.
This paper presents a simple and safe compiler, called MinSIGNAL, from a subset of the synchronous dataflow language SIGNAL to C, as well as its existing enhancements. The compiler follows a modular architecture, and can be seen as a sequence of source-to-source transformations applied to an intermediate representation which is named Synchronous Clocked Guarded Actions (S-CGA) and translation to sequential imperative code. Objective Caml (OCaml) is used for the implementation of MinSIGNAL. As a modern functional language, OCaml is adapted to symbolic computation and so, particularly suitable for compiler design and implementation of formal analysis tools. In particular, the safety of its type checking allows to skip some verification that would be mandatory with other languages. Additionally, this work is a basis for the formal verification of the compilation of SIGNAL with a theorem prover such as Coq.  相似文献   

4.
能够提供更强计算能力的多核处理器将在安全关键系统中得到广泛应用.但是,由于现代处理器所使用的流水线、乱序执行、动态分支预测、Cache等性能提高机制以及多核之间的资源共享,使得系统的最坏执行时间分析变得非常困难.为此,国际学术界提出时间可预测系统设计的思想,以降低系统的最坏执行时间分析难度.已有研究主要关注硬件层次及其编译方法的调整和优化,而较少关注软件层次,即时间可预测多线程代码的构造方法以及到多核硬件平台的映射.本文提出一种基于同步语言模型驱动的时间可预测多线程代码生成方法,并对代码生成器的语义保持进行证明;提出一种基于AADL(Architecture Analysis and Design Language)的时间可预测多核体系结构模型,作为本文研究的目标平台;最后,给出多线程代码到多核体系结构模型的映射方法,并给出系统性质的分析框架.  相似文献   

5.
The synchronous model of computation is well suited for real-time systems, because it allows static analysis in order to find and guarantee their reaction times. Today’s multi-core systems are becoming the predominant computing platforms. Synchronous programs are typically compiled into single threaded code, which makes them unsuitable for exploiting parallelism of the multi-core platforms. Moreover, static timing analysis becomes highly intractable for multi-core systems. This article proposes a novel methodology that aims at finding the mapping and schedule of synchronous programs that guarantees, statically, reaction times when mapped onto a multi-core system consisting of two types of time-predictable cores. The proposed methodology combines design space exploration based on evolutionary algorithm and scheduling of parts of synchronous programs. It allows minimizing the resource usage in terms of number of cores by finding the mapping and schedule with the guaranteed reaction time for architectures with different number of cores. In particular, we: (a) transform a synchronous program written in synchronous SystemJ to a graph-based model represented with two types of computation nodes suitable for execution on two types of time-predictable cores, (b) perform mapping of computation nodes on a customizable multi-core platform using genetic operations, and (c) generate a resulting static schedule of computation nodes for each mapping as part of the design space exploration. The design flow, from program specification and node mapping to the design space exploration and multi-core scheduling is completely automated.  相似文献   

6.
7.
8.
SIGNAL is a part of the synchronous languages family, which are broadly used in the design of safety-critical real-time systems such as avionics, space systems, and nuclear power plants. There exist several semantics for SIGNAL, such as denotational semantics based on traces (called trace semantics), denotational semantics based on tags (called tagged model semantics), operational semantics presented by structural style through an inductive definition of the set of possible transitions, operational semantics defined by synchronous transition systems (STS), etc. However, there is little research about the equivalence between these semantics. In this work, we would like to prove the equivalence between the trace semantics and the tagged model semantics, to get a determined and precise semantics of the SIGNAL language. These two semantics have several different definitions respectively, we select appropriate ones and mechanize them in the Coq platform, the Coq expressions of the abstract syntax of SIGNAL and the two semantics domains, i.e., the trace model and the tagged model, are also given. The distance between these two semantics discourages a direct proof of equivalence. Instead, we transformthem to an intermediate model, which mixes the features of both the trace semantics and the tagged model semantics. Finally, we get a determined and precise semantics of SIGNAL.  相似文献   

9.
Real-time systems need time-predictable platforms to allow static analysis of the worst-case execution time (WCET). Standard multi-core processors are optimized for the average case and are hardly analyzable. Within the T-CREST project we propose novel solutions for time-predictable multi-core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared to other processors the WCET performance is outstanding.The T-CREST platform is evaluated with two industrial use cases. An application from the avionic domain demonstrates that tasks executing on different cores do not interfere with respect to their WCET. A signal processing application from the railway domain shows that the WCET can be reduced for computation-intensive tasks when distributing the tasks on several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8 and with 15 cores by a factor of 5.7.The T-CREST project is the result of a collaborative research and development project executed by eight partners from academia and industry. The European Commission funded T-CREST.  相似文献   

10.
There is an enormous amount of parallelism exposed to fine-grain multithreaded architectures to cover latencies. It is a demanding task for a multithreading programmer to manage such a degree of parallelism by hand. To use multithreaded architectures efficiently it is essential to have compiler support for automatically partitioning programs into threads. This paper solves a fundamental problem in compiling for multithreaded architectures, automatically partitioning a program into threads. The focus of such partitioning is to overlap the remote communication latency and minimize the total execution time. We first formulate the partitioning problem based on a multithreaded execution cost model. Then, we prove such a formulation is NP-hard. Therefore, we propose two heuristic thread-partitioning methods to solve this problem in practice. The advanced partitioning algorithm is a novel extension of list scheduling, and it takes advantage of the cost model to generate near-optimum partitioning results. The remote-path-based partitioning algorithm is a simplified version of the advanced one but it is easy for compiler implementation. The two partitioning algorithms were implemented respectively in a thread partitioning testbed and a research EARTH-C compiler. The experimental results show that both partitioning algorithms are effective to generate efficient threaded code, and code generated by the compiler is comparable to hand-written code.  相似文献   

11.
The Cell processor is a heterogeneous multi-core processor with one power processing engine (PPE) core and eight synergistic processing engine (SPE) cores. There is a significant amount of ongoing research in programming models and tools that attempts to make it easy to exploit the computation power of the Cell architecture. In our work, we explore supporting OpenMP on the Cell processor. It is attractive to support OpenMP because programmers can continue using their familiar programming model, and existing code can be re-used. We base our work on IBM’s XL compiler, and developed new components in the XL compiler and a new runtime library. Three major issues are addressed: (1) synchronization support on heterogeneous cores; (2) code generation targeting the different instruction sets; (3) data transfers and implement the OpenMP memory model. We present experimental results for some SPEC OMP 2001 and NAS benchmarks to demonstrate the effectiveness of this approach. A visualization tool based on Paraver is also used to provide some insights into actual thread and synchronization behaviors.  相似文献   

12.
基于GCC的VLIW编译系统研究   总被引:1,自引:1,他引:0  
VLIW机器在单个机器周期中同时发射并执行多个的并行操作,从而获得较高的指令级并行度,这些操作之间的依赖分析和调度工作则被完全交给相应的编译器执行,因此VLIW的并行性能能否充分发挥取决于VLIW体系结构相关编译器的质量。GNU开发的GCC是被最广泛使用的编译系统之一,它具有多语言、多平台支持的能力和开放的结构,能够运用各种成熟的常规编译优化技术生成高效的代码。文章分析了VLIW及GCC的结构特点,提出了一种基于GCC的VLIW编译系统设计方案,利用GCC进行RTL中间代码一级的体系结构无关优化和少量体系结构相关优化,在汇编代码一级针对VLIW结构进行体系结构相关的优化,从而充分利用GCC的成熟编译技术快速开发高效的VLIW多语言编译系统。  相似文献   

13.
14.
为最大程度地减少同步数据流语言编译过程中由编译器引入的错误,需要利用形式化方法自动生成代码,保证编译器产生的代码能够应用于核能仪控系统.本研究使用定理证明工具Coq,对同步数据流语言Lustre到Clight的主节点输入结构翻译阶段涉及的语法、语义及翻译算法进行了形式化定义,并完成翻译算法的形式化证明.研究表明这种经过形式化的编译器能够生成与源代码行为一致的可信目标代码,同时生成的目标代码能够很好满足核能仪控系统的执行规范.  相似文献   

15.
Multithreaded architectures provide an opportunity for efficiently executing programs with irregular parallelism and/or irregular locality. This paper presents a strategy that makes use of the multithreaded execution model without exposing multithreading to the programmer. Our approach is to design simple extensions to C, and to provide compiler support that automatically translates high-level C programs into lower-level threaded programs. In this paper we present EARTH-C our extended C language which contains simple constructs for specifying control parallelism, data locality, shared variables and atomic operations. Based on EARTH-C, we describe compiler techniques that are used for translating to lower-level Threaded-C programs for the EARTH multithreaded architecture. We demonstrate our approach with six benchmark programs. We show that even naive EARTH-C programs can lead to reasonable performance, and that more advanced EARTH-C programs can give performance very close to hand-coded threated-C programs. This work supported, in part, by NSERC and FCAR.  相似文献   

16.
Recently, there is a considerable research effort on providing time-predictable architectures for real-time systems. The correct functioning of them depends not only on the logically correct response, but also the time when it is given. To provide such determinism, the use of VLIW (Very Long Instruction Word) architecture with in-order pipelines and predication support became attractive. Predication is an architecture technique which helps the compiler to translate control dependencies into data dependencies reducing control flow overhead and enhancing the worst-case timing analysis. Since hardware predication support does not come for free, the goal of this work is to investigate a low overhead predication system for multi-issue VLIW machines targeting real-time applications. This study was conducted using a VLIW 4-issue prototype based on the VLIW HP ST231 ISA describing the hardware implementation and the benefits on the worst-case performance.  相似文献   

17.
数据流编程模型将程序的计算与通信分离,暴露了应用程序潜在的并行性并简化了编程难度。分布式计算框架利用廉价PC构建多核集群解决了大规模并行计算问题,但多核集群层次性存储结构和处理单元对数据流程序的性能提出了新的挑战。针对数据流程序在分布式架构下所面临的问题,设计并实现了数据流编程模型和分布式计算框架的结合——在COStream的基础上提出了面向Storm的编译优化框架。框架包括两个模块:面向Storm的层次性任务划分与调度,以及面向Storm的层次性软件流水与代码生成。层次性任务划分利用Storm的任务调度机制将程序所有子任务分配到Storm集群节点内的多核上。层次性软件流水与代码生成将子任务构造成集群节点间的软件流水和节点内多核间的软件流水,并生成相应的目标代码。实验以多核集群为目标平台,在集群上搭建Storm分布式架构,选取数字媒体处理领域典型程序作为测试程序,对面向Storm的编译优化后的程序进行实验分析。实验结果表明了结合方法的有效性。  相似文献   

18.
支持多核并行程序确定性重放的高效访存冲突记录方法   总被引:2,自引:0,他引:2  
多核系统中并行程序执行过程的不确定性给程序调试带来了很大的困难.准确记录初始执行中冲突访存的次序是并行程序确定性重放的基础.提出了通过建立精确happens-before关系记录访存冲突的方法.此方法利用简洁高效的地址冲突检测机制确定冲突访存操作在执行中所处happens-before序关系的位置,可以抑制部分记录信息的产生,从而有效减少记录信息.与其他方式方法相比,可以进一步压缩17%的记录条数.采用逻辑向量时钟描述冲突访存操作间的happens-before关系,与采用标量时钟相比,可以避免happens-before关系的误识,降低重放执行时并行度的损失.  相似文献   

19.
低功耗多线程编译优化技术   总被引:12,自引:1,他引:12  
提出了在多线程体系结构中通过降低执行频率有效减小功耗的理论模型和方法.首先研究识别可降频运行的线程的计算模型和降频因子的计算,然后给出在编译过程中基于对应用程序行为的分析,结合线程划分的低功耗编译优化算法和实现策略.该模型和方法可用于具有执行频率可动态调整的多处理器类多线程体系结构,既可开发TLP(thread level parallelism),又可有效减小功率消耗.  相似文献   

20.
In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction level parallelism (ILP), to improving multithreaded application performance by supporting thread level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communications among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communications support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communications depend on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communications support for multithreaded applications. Prepushing is a software controlled data forwarding technique that sends data to destination’s cache before it is needed, eliminating cache misses in the destination’s cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communications by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号