19 similar documents found
1.
Dataflow programming languages are domain-oriented languages that separate computation from communication and expose an application's parallelism. The complexity of the underlying compute, memory, and communication resources in multicore clusters poses new performance challenges for dataflow programs. To address the low resource utilization and poor scalability of dataflow programs executing on multicore clusters, this paper proposes and implements a hierarchical pipeline parallelization method for multicore clusters, using the synchronous dataflow graph as the intermediate representation. The method comprises task partitioning and scheduling, hierarchical pipeline scheduling, and data-locality optimization, and after compiler optimization it generates MPI-based parallel target code. Task partitioning and scheduling exploits the data and task parallelism in the program to map tasks onto compute cores, achieving load balance with low communication and synchronization overhead; hierarchical pipeline scheduling exploits the program's parallelism to construct a low-latency pipeline schedule; data-locality optimization applies memory-oriented optimization to the cache false sharing caused by data accesses. The experiments use a cluster of x86 multicore processors as the platform and typical media-processing algorithms as benchmarks to evaluate the hierarchical pipeline optimization. The results demonstrate the effectiveness of the optimization method.
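The data-locality step above targets cache false sharing. A minimal, hypothetical C++ sketch of what such a memory-oriented fix does (the `StageCounter` type and `kCacheLine` constant are illustrative, not from the paper): per-core data is padded out to cache-line boundaries so that cores stop invalidating each other's lines.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed cache-line size; a real compiler pass would query the target machine.
constexpr std::size_t kCacheLine = 64;

// Without alignment, counters owned by different cores can end up on the same
// cache line, so every increment invalidates the other cores' copies (false
// sharing). alignas pads each counter out to its own line.
struct alignas(kCacheLine) StageCounter {
    std::atomic<long> value{0};
};

int main() {
    const unsigned cores = std::max(2u, std::thread::hardware_concurrency());
    std::vector<StageCounter> counters(cores);   // one padded slot per core

    std::vector<std::thread> workers;
    for (unsigned c = 0; c < cores; ++c)
        workers.emplace_back([&counters, c] {
            for (int i = 0; i < 1000000; ++i)
                counters[c].value.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& t : workers) t.join();
}
```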
2.
Dataflow programming has been widely adopted as a programming model in many fields. However, differences among multicore architectures make dataflow programs hard to port across platforms. X10, a new parallel programming language, provides a unified parallel computing environment across different multicore architectures, and how to use X10's features to improve the efficiency of dataflow programs has become a major difficulty in current research. This paper designs and implements a compiler optimization system targeting X10 with three optimization algorithms: a code-generation optimization for X10 that reduces the amount of generated X10 code; a task-partitioning optimization for synchronous dataflow graphs that, while maintaining load balance, avoids deadlock and reduces communication overhead; and a communication optimization for the underlying hardware that distinguishes and separately optimizes inter-machine communication, inter-thread communication within a machine, and intra-thread communication, further reducing communication overhead. Experimental results show that all three compiler optimization algorithms deliver substantial performance gains.
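As a rough illustration of the load-balancing half of such a task-partitioning pass (this is not the paper's algorithm; the actor weights and the `partitionActors` helper are hypothetical), a greedy assignment of synchronous-dataflow actors to places could look like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// Greedy load-balanced partitioning: assign each actor (heaviest first) to the
// currently least-loaded place. This ignores deadlock avoidance and
// communication cost, which the paper's algorithm also takes into account.
std::vector<int> partitionActors(const std::vector<double>& weights, int places) {
    std::vector<std::size_t> order(weights.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return weights[a] > weights[b]; });

    std::vector<double> load(places, 0.0);
    std::vector<int> placeOf(weights.size(), -1);
    for (std::size_t idx : order) {
        int best = std::min_element(load.begin(), load.end()) - load.begin();
        placeOf[idx] = best;
        load[best] += weights[idx];
    }
    return placeOf;
}

int main() {
    std::vector<double> actorWork = {8.0, 3.5, 2.0, 6.0, 1.5};  // hypothetical actor weights
    for (int p : partitionActors(actorWork, 2)) std::cout << p << ' ';
    std::cout << '\n';
}
```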
3.
Heterogeneous multicore processors with a master-slave, one-sided heterogeneous architecture are widely used to accelerate computation in specialized application domains, for example heterogeneous multicore embedded processors, DSPs, and SoCs; high-performance processors of this class can also be applied to some large-scale scientific and engineering computing problems. Master-slave one-sided heterogeneous processors raise many challenging issues for programming models and compilation techniques, such as the choice of programming model, the design of the programming language, the compiler architecture, and the runtime library. This paper analyzes the structural characteristics and execution model of this class of processors and argues that the function-offload model is the programming model best suited to the architecture; it then analyzes the key design issues of a programming language for the function-offload model, proposes a compiler architecture, and discusses the corresponding runtime-library design.
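A minimal sketch of the function-offload model under discussion; the `offload` helper below is hypothetical and only mimics dispatching a kernel to a slave core while the host keeps a handle for later synchronization:

```cpp
#include <cstddef>
#include <future>
#include <iostream>
#include <vector>

// Hypothetical illustration of the function-offload model: the host enqueues a
// kernel, the "device" (here just another thread) runs it and returns a future.
template <typename Fn>
auto offload(Fn kernel) {
    // std::async stands in for dispatching the kernel to a slave core.
    return std::async(std::launch::async, kernel);
}

int main() {
    std::vector<float> a(1024, 1.0f), b(1024, 2.0f), c(1024);

    // Host keeps running; the offloaded kernel computes c = a + b on the "device".
    auto done = offload([&] {
        for (std::size_t i = 0; i < c.size(); ++i) c[i] = a[i] + b[i];
    });

    done.wait();                 // synchronize with the offloaded kernel
    std::cout << c[0] << '\n';   // prints 3
}
```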
4.
With the rapid development of the Internet, Chinese developers are no longer content to program in English, and Chinese programming languages such as 易语言, 习语言, and O语言 have sprung up like mushrooms, though each has since hit a development bottleneck. By studying typical Chinese programming languages, the author analyzes the factors limiting existing programming software and offers new directions for the future development of Chinese-language programming.
5.
The dataflow programming model combines program design with media processing and has been applied widely in many fields. Many-core processors have become mainstream and an industry standard, and how to exploit the characteristics of many-core architectures to improve the execution performance of stream applications has become a major difficulty in current research. This paper proposes an efficient stream-compilation framework to optimize the execution of stream applications. The framework contains three optimization strategies: an optimal software-pipelining scheduling method; an efficient data storage allocation algorithm; and a sensible mapping strategy across the many cores that reduces communication and synchronization overhead. The compiler framework is implemented on Godson-T, and experimental results show a substantial performance improvement over the unoptimized version.
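The software-pipelining idea can be illustrated with a toy two-stage pipeline whose stages overlap on different cores; the stage bodies and the channel below are illustrative only, not the framework's scheduler:

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

// A two-stage software pipeline: stage 1 produces items while stage 2 consumes
// them, so the two stages overlap in time on different cores.
std::queue<int> channel;
std::mutex m;
std::condition_variable cv;
bool done = false;

void stage1() {                           // e.g. decode
    for (int frame = 0; frame < 8; ++frame) {
        { std::lock_guard<std::mutex> lk(m); channel.push(frame * frame); }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

void stage2() {                           // e.g. filter
    while (true) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return !channel.empty() || done; });
        if (channel.empty()) return;      // producer finished, queue drained
        int v = channel.front(); channel.pop();
        lk.unlock();
        std::cout << "processed " << v << '\n';
    }
}

int main() {
    std::thread t1(stage1), t2(stage2);
    t1.join(); t2.join();
}
```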
6.
Multicore processors are becoming ever more common, and how to maximize the utilization of every CPU core through software techniques has become a hot topic. This paper introduces the multicore parallel programming model Threading Building Blocks (TBB), compares it in detail with raw threads and OpenMP, and analyzes its strengths and weaknesses. It also studies how to combine TBB with MPI to build efficient hybrid parallel computing applications on SMP cluster systems. TBB is found to have clear advantages for multicore programming, and the combination of TBB and MPI provides a hierarchical parallel structure for clusters of multicore nodes, greatly improving cluster performance.
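A minimal sketch of the TBB-inside/MPI-outside structure described above, assuming a simple global sum as the workload: `tbb::parallel_reduce` handles the cores within a node, and `MPI_Allreduce` combines the per-node results.

```cpp
#include <mpi.h>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);               // MPI handles the inter-node level
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::vector<double> local(1 << 20, rank + 1.0);  // this node's slice of the data

    // TBB handles the intra-node level: reduce the local slice across all cores.
    double localSum = tbb::parallel_reduce(
        tbb::blocked_range<std::size_t>(0, local.size()), 0.0,
        [&](const tbb::blocked_range<std::size_t>& r, double acc) {
            for (std::size_t i = r.begin(); i != r.end(); ++i) acc += local[i];
            return acc;
        },
        [](double a, double b) { return a + b; });

    double globalSum = 0.0;
    MPI_Allreduce(&localSum, &globalSum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0) std::printf("sum = %f\n", globalSum);
    MPI_Finalize();
}
```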
7.
This paper studies in depth a hybrid programming model, MPI+OpenMP, for systems whose SMP-cluster compute nodes are multicore processors. Two hybrid parallel matrix-multiplication algorithms are built and compared experimentally, in terms of performance, against a pure MPI algorithm on a multicore cluster platform. The experiments show that hybrid programming delivers better performance.
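A minimal sketch of one possible hybrid matrix multiply (not necessarily either of the paper's two algorithms): MPI assigns each rank a block of rows of A, and OpenMP parallelizes the local block product; the matrix size is illustrative and assumed divisible by the number of ranks.

```cpp
#include <mpi.h>
#include <omp.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, procs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &procs);

    const int n = 512;                 // illustrative size, assumed divisible by procs
    const int rows = n / procs;        // row block owned by this rank

    std::vector<double> A(rows * n, 1.0), B(n * n, 1.0), C(rows * n, 0.0);
    MPI_Bcast(B.data(), n * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);  // all nodes hold full B

    // OpenMP parallelizes the local block product across the node's cores.
    #pragma omp parallel for
    for (int i = 0; i < rows; ++i)
        for (int k = 0; k < n; ++k)
            for (int j = 0; j < n; ++j)
                C[i * n + j] += A[i * n + k] * B[k * n + j];

    std::vector<double> full;
    if (rank == 0) full.resize(n * n);
    MPI_Gather(C.data(), rows * n, MPI_DOUBLE,
               full.data(), rows * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if (rank == 0) std::printf("C[0][0] = %f\n", full[0]);   // expect 512 for all-ones inputs
    MPI_Finalize();
}
```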
8.
This article studies programming models for parallel computing, including MPI and OpenMP, and proposes a hierarchical hybrid programming model. Taking the computation of π as an example, a program under the hybrid model is written in C, and the π programs under all three models are run on a Dawning TC5000 cluster whose nodes are multicore processors; the results are then analyzed and compared in terms of performance. The results show that the hybrid parallel algorithm has better scalability and speedup.
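A minimal sketch of the hybrid π computation by numerical integration of 4/(1+x²) over [0,1]; the step count and the cyclic split across ranks are illustrative choices, not necessarily the article's:

```cpp
#include <mpi.h>
#include <omp.h>
#include <cstdio>

// MPI splits the integration interval across nodes (cyclically by rank);
// OpenMP splits each node's share across its cores.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, procs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &procs);

    const long steps = 100000000L;
    const double h = 1.0 / steps;
    double localSum = 0.0;

    #pragma omp parallel for reduction(+ : localSum)
    for (long i = rank; i < steps; i += procs) {
        double x = (i + 0.5) * h;
        localSum += 4.0 / (1.0 + x * x);
    }

    double pi = 0.0;
    MPI_Reduce(&localSum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) std::printf("pi = %.12f\n", pi * h);
    MPI_Finalize();
}
```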
9.
As hardware development grows day by day, and in order to exploit this hardware fully, this article introduces how dataflow programming languages are applied, together with the flowchart-like style of programming that dataflow languages make full use of. It describes how an intelligent compiler checks the program and presents a programming method that ties in with hardware addresses. The approach greatly eases the difficulty developers face when writing multithreaded programs, while making full use of the efficiency of multicore CPUs.
10.
This article introduces LabVIEW, the well-known graphical dataflow programming package, and its applications, together with a library of special-purpose subroutines the author developed in LabVIEW and the application programs built with that library.
11.
The ability to represent, manipulate, and optimize data placement and movement between processors in a distributed address space machine is crucial in allowing compilers to generate efficient code. Data placement is embodied in the concept of data ownership. Data movement can include not just the transfer of data values but the transfer of ownership as well. However, most existing compilers for distributed address space machines either represent these notions in a language- or machine-dependent manner, or represent data or ownership transfer implicitly. In this paper we describe XDP, a set of intermediate language extensions for representing and manipulating data and ownership transfers explicitly in a compiler. XDP is supported by a set of per-processor structures that can be used to implement ownership testing and manipulation at run-time. XDP provides a uniform framework for translating and optimizing sequential, data parallel, and message-passing programs to a distributed address space machine. We describe analysis and optimization techniques for this explicit representation. Finally, we compare the intermediate languages of some current distributed address space compilers with XDP.
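In the simplest case, per-processor structures for ownership testing could amount to a bitmap consulted before an access; the `OwnershipMap` class below is purely illustrative and not XDP's actual run-time representation:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical per-processor ownership table: one bit per data block, set when
// this processor currently owns the block. An ownership transfer clears the
// bit on the sender and sets it on the receiver; an access first tests the bit.
class OwnershipMap {
public:
    explicit OwnershipMap(std::size_t blocks) : bits_((blocks + 63) / 64, 0) {}

    bool owns(std::size_t block) const {
        return (bits_[block / 64] >> (block % 64)) & 1u;
    }
    void acquire(std::size_t block) { bits_[block / 64] |=  (std::uint64_t{1} << (block % 64)); }
    void release(std::size_t block) { bits_[block / 64] &= ~(std::uint64_t{1} << (block % 64)); }

private:
    std::vector<std::uint64_t> bits_;
};

int main() {
    OwnershipMap mine(1024);
    mine.acquire(42);      // ownership transferred to this processor
    std::cout << mine.owns(42) << ' ' << mine.owns(43) << '\n';   // 1 0
}
```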
12.
This paper presents a model for an automatically parallelizing compiler based on C, which consists of compile-time and run-time parallelizing facilities. The paper also describes a method for finding both intra-object and inter-object parallelism. The parallelism detection is completely transparent to users.
13.
Machine-learning-based iterative compilation can effectively predict the best combination of optimization parameters for a new program when it is iteratively compiled. Existing approaches suffer from inefficient search of the optimization-parameter space during model training, inappropriate program-feature representations, and low prediction accuracy. Machine-learning-based iterative compilation is therefore a research hotspot in the field, and its challenges include the choice of learning algorithm, the search for optimization parameters, and the representation of program features. Building on supervised learning, this paper proposes a method for predicting program optimization parameters. The method first searches the optimization-parameter space with a constrained multi-objective particle swarm optimization algorithm to find the best optimization parameters for the sample functions; it then extracts function features using a representation technique that combines static and dynamic features; finally, it builds a supervised model from the samples formed by function features and optimization parameters and uses it to predict optimization parameters for new programs. Statistical models are built with both k-nearest neighbors and softmax regression, and experiments show that the new method achieves good prediction performance on the NPB benchmark suite and on large scientific computing programs.
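As a rough illustration of the k-nearest-neighbor step (the feature vectors, distance metric, and flag strings below are hypothetical, not the paper's), a 1-NN predictor maps a new program's features to the flag set of its closest training sample:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// One training sample: a program-feature vector and the best flag set found
// for it by the offline search (e.g. the particle-swarm phase).
struct Sample {
    std::vector<double> features;
    std::string bestFlags;
};

// 1-nearest-neighbor prediction under Euclidean distance.
std::string predictFlags(const std::vector<Sample>& training,
                         const std::vector<double>& query) {
    auto dist = [&](const Sample& s) {
        double d = 0.0;
        for (std::size_t i = 0; i < query.size(); ++i) {
            double diff = s.features[i] - query[i];
            d += diff * diff;
        }
        return std::sqrt(d);
    };
    const Sample& nearest = *std::min_element(
        training.begin(), training.end(),
        [&](const Sample& a, const Sample& b) { return dist(a) < dist(b); });
    return nearest.bestFlags;
}

int main() {
    // Hypothetical features: {loop count, average loop depth, memory-op ratio}.
    std::vector<Sample> training = {
        {{12, 2.0, 0.4}, "-O3 -funroll-loops"},
        {{ 3, 1.0, 0.7}, "-O2"},
    };
    std::cout << predictFlags(training, {10, 2.2, 0.5}) << '\n';
}
```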
14.
C# is the new flagship language in the Microsoft .NET platform. C# is an attractive vehicle for language design research not only because it shares many characteristics with Java, the current language of choice for such research, but also because it is likely to see wide use. Language research needs a large investment in infrastructure, even for relatively small studies. This paper describes a new C# compiler designed specifically to provide that infrastructure. The overall design is deceptively simple. The parser is generated automatically from a possibly ambiguous grammar, accepts C# source, perhaps with new features, and produces an abstract syntax tree, or AST. Subsequent phases—dubbed visitors—traverse the AST, perhaps modifying it, annotating it or emitting output, and pass it along to the next visitor. Visitors are specified entirely at compilation time and are loaded dynamically as needed. There is no fixed set of visitors, and visitors are completely unconstrained. Some visitors perform traditional compilation phases, but the more interesting ones do code analysis, emit non-traditional data such as XML, and display data structures for debugging. Indeed, most usage to date has been for tools, not for language design experiments. Such experiments use source-to-source transformations or extend existing visitors to handle new language features. These approaches are illustrated by adding a statement that switches on a type instead of a value, which can be implemented in a few hundred lines. The compiler also exemplifies the value of dynamic loading and of type reflection. Copyright © 2004 John Wiley & Sons, Ltd.
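A toy sketch of the visitor-pipeline idea, written in C++ rather than C# and using made-up classes (`Node`, `PrintVisitor`, `CountVisitor`) rather than the compiler's own: each visitor traverses the same AST in turn, possibly annotating it or emitting output, before the driver hands the tree to the next visitor.

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Minimal AST node.
struct Node {
    std::string kind;
    std::vector<std::unique_ptr<Node>> children;
};

// Each compilation "phase" is a visitor over the tree.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(Node& n) = 0;
};

struct PrintVisitor : Visitor {                 // emits output
    void visit(Node& n) override {
        std::cout << n.kind << '\n';
        for (auto& c : n.children) visit(*c);
    }
};

struct CountVisitor : Visitor {                 // analyzes the tree
    int count = 0;
    void visit(Node& n) override {
        ++count;
        for (auto& c : n.children) visit(*c);
    }
};

int main() {
    Node root{"CompilationUnit", {}};
    root.children.push_back(std::make_unique<Node>(Node{"ClassDecl", {}}));

    // The driver runs visitors in sequence over the same AST.
    std::vector<std::unique_ptr<Visitor>> pipeline;
    pipeline.push_back(std::make_unique<PrintVisitor>());
    auto counter = std::make_unique<CountVisitor>();
    CountVisitor* counterView = counter.get();
    pipeline.push_back(std::move(counter));

    for (auto& v : pipeline) v->visit(root);
    std::cout << "nodes: " << counterView->count << '\n';
}
```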
15.
Computer architects have been constantly looking for new approaches to design high-performance machines. Data flow and VLSI offer two mutually supportive approaches towards a promising design for future super-computers. When very high speed computations are needed, data flow machines may be relied upon as an adequate solution in which extremely parallel processing is achieved. This paper presents a formal analysis for data flow machines. Moreover, the following three machines are considered: (1) MIT static data flow machine; (2) TI's DDP static data flow machine; (3) LAU data flow machine. These machines are investigated by making use of a reference model. The contributions of this paper include: (1) Developing a Data Flow Random Access Machine model (DFRAM), for the first time, to serve as a formal modeling tool. Also, by making use of this model one can calculate the time cost of various static data flow machines, as well as the performance of these machines. (2) Constructing a practical Data Flow Simulator (DFS) on the basis of the DFRAM model. Such a DFS is modular and portable and can be implemented with little sophistication. The DFS is used not only to study the performance of the underlying data flow machines but also to verify the DFRAM model.
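As background for the static machines analyzed above, the dataflow firing rule they implement in hardware can be sketched with a toy interpreter; this is only an illustration, not the paper's DFRAM model or DFS simulator.

```cpp
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Toy static-dataflow interpreter: a node fires as soon as both of its input
// tokens are present, consuming them and sending its result downstream.
struct Node {
    char op;                    // '+' or '*'
    std::vector<int> inputs;    // tokens received so far
    int successor;              // index of the node fed by the result, -1 = output
};

int main() {
    //  n0 = a + b,  n1 = n0 * c
    std::vector<Node> graph = {{'+', {}, 1}, {'*', {}, -1}};

    std::queue<std::pair<int, int>> tokens;   // (destination node, value)
    tokens.push({0, 2}); tokens.push({0, 3}); tokens.push({1, 10});   // a=2, b=3, c=10

    while (!tokens.empty()) {
        auto [dst, val] = tokens.front(); tokens.pop();
        Node& n = graph[dst];
        n.inputs.push_back(val);
        if (n.inputs.size() == 2) {           // firing rule: all operands present
            int r = (n.op == '+') ? n.inputs[0] + n.inputs[1]
                                  : n.inputs[0] * n.inputs[1];
            n.inputs.clear();
            if (n.successor < 0) std::cout << "result: " << r << '\n';   // 50
            else tokens.push({n.successor, r});
        }
    }
}
```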
16.
To address the strong nonlinearity and uncertainty of traffic flow models, a parameter-identification algorithm for traffic flow models based on approximate dynamic programming is proposed. The algorithm is self-learning and adaptive and does not rely on an analytical model of the controlled plant. A rigorous theoretical derivation proves the convergence of the identification scheme, and simulation results verify the effectiveness of the proposed algorithm.
17.
One of the chief difficulties which needs to be overcome during the early design stages of a system is that of establishing a satisfactory design for that system. From the time it was first conceived it was apparent that the Relational Data Base Management System is like a compiler in so far as it takes a succession of user requests for information formulated in an applied predicate calculus and translates each one into a series of calls which access an underlying data base and transform data from that data base into the form the user wishes to see. This paper compares the architecture of the Relational Data Base Management System with that of a compiler, and then demonstrates the use of the architecture when processing a language based on an applied predicate calculus. Finally, the paper describes a number of extensions to that architecture which are required to solve the particular problems raised by the data base system.
19.
Compiling code for the Icon programming language presents several challenges, particularly in dealing with types and goal-directed expression evaluation. In order to produce optimized code, it is necessary for the compiler to know much more about operations than is necessary for the compilation of most programming languages. This paper describes the organization of the Icon compiler and the way it acquires and maintains information about operations. The Icon compiler generates C code, which makes it portable to a wide variety of platforms and also allows the use of existing C compilers for performing routine optimizations on the final code. A specially designed implementation language, which is a superset of C, is used for writing Icon's run-time system. This language allows the inclusion of information about the abstract semantics of Icon operations and their type-checking and conversion requirements. A translator converts code written in the run-time language to C code to provide an object library for linking with the code produced by the Icon compiler. The translation process also automatically produces a database that contains the information the Icon compiler needs to generate and optimize code. This approach allows easy extension of Icon's computational repertoire, alternate computational extensions, and cross compilation.