Found 20 similar documents (search time: 15 ms)
1.
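As the volume of data to be processed grows in every field, data-intensive applications are receiving more and more attention. This paper proposes PSRAM(h), a new parallel program execution model that incorporates information such as the memory-access hierarchy and memory-access conflicts. Because data-intensive applications are dominated by memory accesses, PSRAM(h) reduces program execution time to memory-access time: it predicts the execution time of a serial program by analyzing the memory level and the number of accesses of each program segment, and then predicts the execution time of a parallel program as the maximum of the per-thread execution times. The model is applied to matrix-vector multiplication, the most typical data-intensive application. Tests on two platforms, the Loongson 3A processor and the Intel Xeon E5520 processor, show that the error between the model's predictions and the measured results is below 20% in most cases. For data-intensive applications, PSRAM(h) can therefore not only give a lower bound on program execution time but also predict that time effectively.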
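The prediction rule in the PSRAM(h) abstract above can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the latency table, the level names, and the functions `serial_time` and `parallel_time` are hypothetical, and the sketch ignores the memory-access conflicts that the full model accounts for.

```python
from typing import Dict, List

# Hypothetical per-access latencies in nanoseconds (assumed values, not from the paper).
LATENCY_NS: Dict[str, float] = {"L1": 1.0, "L2": 4.0, "DRAM": 60.0}

# A segment maps a memory level to the number of accesses served at that level.
Segment = Dict[str, int]


def serial_time(segments: List[Segment]) -> float:
    """Approximate serial execution time as the sum of access_count * latency."""
    return sum(
        count * LATENCY_NS[level]
        for segment in segments
        for level, count in segment.items()
    )


def parallel_time(threads: List[List[Segment]]) -> float:
    """Approximate parallel execution time as the slowest thread's serial time."""
    return max(serial_time(segments) for segments in threads)


if __name__ == "__main__":
    thread_a = [{"L1": 100}]
    thread_b = [{"DRAM": 10}]
    print(serial_time(thread_a + thread_b))
    print(parallel_time([thread_a, thread_b]))
```

Taking the maximum over threads reflects the abstract's statement that the parallel estimate is the largest per-thread execution time; real latencies would have to be measured on the target platform.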
2.
This paper describes an implementation of Paralation Lisp for a massively parallel SIMD machine, the MasPar MP-1. The system is based on a byte code interpreter which can emulate as many virtual processors on each physical processor as desired (within the limits of memory). The implementation makes it possible to activate more virtual processors once execution has begun, and this feature can be used to support nested parallelism. Nested parallelism describes the ability to nest data parallel constructs, a feature of Paralation Lisp, Connection Machine Lisp, and NESL; however, the outer parallel forms usually have to be sequentialized, with only the innermost forms being executed in parallel. NESL and a subset of Paralation Lisp have been implemented to fully support nested parallelism by flattening nested structures at compile time. To do this the languages must impose various restrictions on both the data and control structures. There is an overhead associated with the runtime technique described here, but it is very versatile and can execute code in parallel that cannot be "flattened." Hence this technique can be used to effectively support many of the more difficult aspects of Paralation Lisp.
3.
4.
5.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1983, (1): 103-108
This paper describes a technique for predicting the execution behavior of a source program or a software design specification. As a by-product of syntactic analysis, a program graph is constructed which can subsequently be treated as the graph of a finite automaton. The expression for execution behavior is the regular expression of the graph. Several simplification techniques for these expressions are discussed and exemplified. In particular, the substitution of known values for program segments followed by constant folding cannot be done indiscriminately; the allowable situations are characterized. Applications include the prediction of execution time for a program or a software design, other forms of language analysis, and program restructuring.
6.
7.
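Identifying Possibly Parallel Regions Based on Region Average Execution Time and Data Dependence Information. Cited by: 1 (self-citations: 0, citations by others: 1)
As multi-core processors become the new trend in processor development, applications must be executed in parallel if program performance is to keep improving. Traditional automatic parallelization techniques handle the regular loops of scientific computing applications well, but they cannot yet effectively parallelize irregular programs that contain many function calls and pointer references. To address this, the paper proposes a method for identifying possibly parallel regions, based on region average execution time and data dependence information, so that some irregular programs can be parallelized efficiently. The main contributions are: (1) Automatic identification of several kinds of parallelism in a program, covering not only the fine-grained parallelism between loop iterations handled by traditional parallelism analysis, but also the coarse-grained parallelism between loop bodies and function call sites that traditional analysis cannot yet handle effectively. Among the many parallelization opportunities in a program, a profitability analysis based on region average execution time selects the regions worth parallelizing. (2) Automatic identification of the number and type of data dependences between possibly parallel regions, and of the program variables that cause them. Based on these analysis results, the authors used a behavior oriented parallelism (speculative parallelization) system to parallelize four benchmarks from SPEC2006. The parallelized programs achieved average speedups of 300% and 260% on Intel and AMD multi-core processors, respectively.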
8.
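This article is the fifth of six lectures in the serialized series "Learning MISRA-C". Lecture 1, "A 'safety first' C programming standard", gives an overview of MISRA-C. Lecture 2, "Getting past the many pitfalls of data types", introduces disciplined ways of defining and operating on data, focusing on the problems of implicit type conversion. Lecture 3, "Safety rules for pointers, structures, and unions", explains how to use pointers, structures, and unions safely and efficiently. Lecture 4, "Keeping expressions under control", examines the bad habits that MISRA-C identifies in expressions and in function declarations and definitions, in order to minimize potential errors of all kinds. Lecture 5, "Accurate control of program flow", presents disciplined practice for control expressions and program flow control in C. Lecture 6, "Building a safe compilation environment", covers compiler-related coding rules that avoid hazards introduced by the compiler. [Editor's note]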
9.
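Based on a method that combines static dependence analysis with dynamic adjustment, this paper proposes an execution model for logic programs. The model exploits not only AND-parallelism but also a degree of OR-parallelism, and thereby effectively speeds up the execution of logic programs.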
10.
Symbolic execution provides a mechanism for formally proving programs correct. A notation is introduced which allows a concise presentation of rules of inference based on symbolic execution. Using this notation, rules of inference are developed to handle a number of language features, including loops and procedures with multiple exits. An attribute grammar is used to formally describe symbolic expression evaluation, and the treatment of function calls with side effects is shown to be straightforward. Because symbolic execution is related to program interpretation, it is an easy-to-comprehend, yet powerful technique. The rules of inference are useful in expressing the semantics of a language and form the basis of a mechanical verification condition generator.
11.
Alfredo Cristobal-Salas, Andrei Tchernykh, Jean-Luc Gaudiot, Wen-Yen Lin. International Journal of Parallel Programming, 2003, 31(2): 77-105
This paper surveys and demonstrates the power of non-strict evaluation in applications executed on distributed architectures. We present the design, implementation, and experimental evaluation of single assignment, incomplete data structures in a distributed memory architecture and Abstract Network Machine (ANM). Incremental Structures (IS), Incremental Structure Software Cache (ISSC), and Dynamic Incremental Structures (DIS) provide non-strict data access and fully asynchronous operations that make them highly suited for the exploitation of fine-grain parallelism in distributed memory systems. We focus on split-phase memory operations and non-strict information processing under a distributed address space to improve the overall system performance. A novel technique of optimization at the communication level is proposed and described. We use partial evaluation of local and remote memory accesses not only to remove much of the excess overhead of message passing, but also to reduce the number of messages when some information about the input or part of the input is known. We show that split-phase transactions of IS, together with the ability of deferring reads, allow partial evaluation of distributed programs without losing determinacy. Our experimental evaluation indicates that commodity PC clusters with both IS and a caching mechanism, ISSC, are more robust. The system can deliver speedup for both regular and irregular applications. We also show that partial evaluation of memory accesses decreases the traffic in the interconnection network and improves the performance of MPI IS and MPI ISSC applications.
12.
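Based on the bushy tree model, this paper proposes a fragment-based query execution plan. The plan divides the query tree into multiple fragments, each executed in pipelined fashion, and runs the fragments one after another. This reduces the number of I/O operations on intermediate results and makes fuller use of memory resources. An example illustrating how the plan executes is also given.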
13.
14.
The advanced method of symbolic evaluation can be applied to program testing situations with results close to those of formal correctness proofs, but without the high cost.
15.
Methods are developed for transforming sequential programs for iterative computations into parallel-distributed versions which execute in parallel on a cluster of workstation or PC nodes on a local area network. We focus on communication issues and present algorithms for interprocess communication implemented by UNIX TCP/IP socket commands. Results of performance tests on several application problems, such as simulation of neural networks and the Jacobi method for solving linear equations, representative of a large class of application problems, are presented. Analysis indicates that, for problems with rather intensive computation, speedups of better than 2p/3 are possible with an optimal number p of nodes on a single Ethernet bus segment. Preliminary tests on small clusters show efficient speedups even for nonoptimal p.
16.
17.
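A programmable logic controller implements the control functions required by the user by executing the control program cyclically; one pass of this cyclic execution is called a scan cycle. The paper analyzes how this mode of operation affects programming and describes how to use the scan cycle when analyzing and designing programs.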
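To make the scan-cycle idea above concrete, here is a minimal simulation sketch. It assumes the commonly described three-phase cycle (sample inputs, execute the program, refresh outputs), which the abstract itself does not spell out; `run_scans` and `toggle_on_rising_edge` are hypothetical names, and the example is not taken from the paper.

```python
from typing import Callable, Dict, List

Signals = Dict[str, bool]


def run_scans(
    input_frames: List[Signals],
    logic: Callable[[Signals, Signals], Signals],
    state: Signals,
) -> List[Signals]:
    """Run one scan cycle per input frame and return the outputs of each scan."""
    outputs_log: List[Signals] = []
    for frame in input_frames:
        inputs = dict(frame)            # 1. input sampling: freeze a snapshot of the inputs
        outputs = logic(inputs, state)  # 2. program execution against that snapshot
        outputs_log.append(outputs)     # 3. output refresh at the end of the scan
    return outputs_log


def toggle_on_rising_edge(inputs: Signals, state: Signals) -> Signals:
    """Toggle the lamp once per button press, not once per scan while it is held."""
    pressed = inputs["button"]
    if pressed and not state["prev_button"]:
        state["lamp"] = not state["lamp"]
    state["prev_button"] = pressed      # remembered until the next scan
    return {"lamp": state["lamp"]}


if __name__ == "__main__":
    frames = [{"button": b} for b in (False, True, True, False, True)]
    memory = {"prev_button": False, "lamp": False}
    for scan_outputs in run_scans(frames, toggle_on_rising_edge, memory):
        print(scan_outputs)
```

The edge detection is the part that depends on the scan cycle: because the same logic runs on every scan, a held button would otherwise toggle the lamp repeatedly, so the previous scan's input has to be stored in state.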
18.
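Symbolic execution is a practical technique for checking whether a program contains a given class of errors, and it has the advantage of a zero false-positive rate, but mainstream symbolic execution tools do not support the analysis of multithreaded programs. This paper analyzes the existing symbolic execution tools for multithreaded programs and finds the following problems: (1) some tools perform well but do not support external libraries, which makes them impractical; (2) some tools support external library functions but are based on old versions that are hard to update and maintain, and cannot check basic bug types such as subtraction overflow, multiplication overflow, and shift overflow. Building on KLEE, the most widely used symbolic execution tool, the paper designs and implements MTSE (Multi-Thread Symbolic Execution), a symbolic execution tool that supports multithreaded programs. MTSE supports the libc and libc++ libraries. Compared with Cloud9, an existing tool of the same kind, MTSE finds about 50% more program defects and improves both instruction coverage and branch coverage by about 30%.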
19.
A technique for evaluating the execution time of program fragments on superscalar and explicitly parallel processors is described. Rules for the fragmentation and modification of the initial source code in a high-level language are proposed, and examples in C++ are considered. An implementation of the dynamic analysis of programs and the examination of its results with account for side effects caused by specific features of a processor architecture and operating system are considered. Translated from Programmirovanie, Vol. 31, No. 3, 2005. Original Russian Text Copyright © 2005 by Toporkov, Toporkova.
20.
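Starting from an engineering example, this paper analyzes a class of problems common in engineering practice: how to implement multitasking without the support of a multitasking operating system. Taking a monitoring system as the example, it analyzes how this can be implemented in a DOS environment.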
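The abstract above does not say which technique the article uses. As one common way to get multitasking without operating-system support, the sketch below shows cooperative round-robin scheduling, where each task voluntarily yields control after a small unit of work. The names `run_round_robin`, `sample_sensor`, and `refresh_display` are hypothetical, and the code is an illustration rather than the article's DOS implementation.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def run_round_robin(tasks: Iterable[Iterator[str]]) -> List[str]:
    """Run each task one step at a time, in turn, until all tasks have finished."""
    queue: Deque[Iterator[str]] = deque(tasks)
    trace: List[str] = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # run the task until its next yield
            queue.append(task)        # still alive: back to the end of the queue
        except StopIteration:
            pass                      # task finished: drop it
    return trace


def sample_sensor(count: int) -> Iterator[str]:
    """Hypothetical monitoring task: one sensor sample per time slice."""
    for i in range(count):
        yield f"sample {i}"


def refresh_display(count: int) -> Iterator[str]:
    """Hypothetical monitoring task: one display refresh per time slice."""
    for i in range(count):
        yield f"display {i}"


if __name__ == "__main__":
    for step in run_round_robin([sample_sensor(2), refresh_display(3)]):
        print(step)
```

The trade-off of this design is that no task can be preempted: a task that never yields starves all the others, so each step has to be kept short.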