Similar literature
1.
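This paper analyzes the feasibility of hyperblock-level prediction in a hyperblock-based aggressive execution model and presents a predictor design that supports hyperblock-level prediction. It examines how feasible deep prediction is for different applications, the expected prediction depth, and the factors that influence it. Experimental results show that most applications have a high expected prediction depth and are well suited to aggressive execution, although the expected depth varies considerably across applications.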

2.
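Inter-class data dependence analysis is the foundation of inter-class data-flow testing. By analyzing how exception propagation in class-cluster-level testing affects program data dependences, this paper proposes an inter-class data dependence analysis method for C++ programs that covers exception-handling constructs. The method builds the inter-class data dependence graph incrementally according to the relationships between classes, and an algorithm for constructing the graph is given. Finally, the analysis is applied to program slicing. The results show that accounting for the effect of exception propagation on data dependences improves slicing precision.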

3.
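Speculative multithreading exploits the thread-level parallelism of applications through speculative execution in order to improve program performance. The technique generally relies on an execution model to detect possible thread misspeculation at run time and uses an appropriate mechanism to restore correct execution. Prophet, described here, is a hardware-based speculative multithreading execution model. The paper focuses on how Prophet addresses the key issues in execution-model design, including thread state control and a multi-version cache system. Prophet's multi-version cache system buffers speculative data and uses a bus-snooping cache protocol to detect data-dependence violations. Functional and performance results obtained with the Olden benchmarks are also presented, and the analysis indicates that Prophet can effectively exploit thread-level parallelism in applications.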

4.
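Many semantic errors and ambiguities in software process models are caused by data dependences, and the study of data dependences at the instantiation stage has a large effect on process execution efficiency. The paper first introduces the process model SPM, and on that basis proposes the concept of data dependence and defines an activity-data relation. By analyzing this relation, it then proposes basic rules that keep process model instances semantically sound. Finally, it applies finite-state automaton theory to implement an algorithm that enforces these rules.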

5.
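To address inter-process safety analysis in chemical engineering, this paper draws on the data dependence techniques of computer science to propose a new safety analysis approach for chemical processes. Taking a two-tank water level control system as an example, it analyzes the relationships between the process flow and the variables, extracts 9 states, 10 transitions, and the conditions, events and actions of each transition, and builds an extended finite state machine (EFSM) model. By examining variable L2 in transition T8, it analyzes the data dependence paths and determines the positive and negative influence relations, yielding a new data-dependence-based safety analysis method for chemical processes. The method's effectiveness is verified by analyzing variable L2 in transition T4, making EFSM data dependence analysis a new and effective way to carry out chemical process safety analysis by automated computer reasoning.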

6.
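Functional dependency discovery for distributed big data   Total citations: 1, self-citations: 0, citations by others: 1
In relational databases, functional dependency discovery is an important database analysis technique with wide application in knowledge discovery, database semantic analysis, data quality assessment and database design. Existing discovery algorithms mainly target centralized data and are usually suitable only for small data sets. In the big-data setting, discovering functional dependencies in a distributed environment is more challenging. This paper proposes a functional dependency discovery algorithm for big data in a distributed environment. The basic idea is to first discover functional dependencies in parallel at each node using local data and prune the candidate set based on those results; the candidates are then grouped by features of the left-hand side (LHS) of each dependency, and a distributed discovery algorithm is run in parallel on each group to obtain all functional dependencies. The number of candidate dependencies that can be checked under different groupings is analyzed, and both data migration volume and load balancing are taken into account during execution. Experiments on real big data sets show that the proposed algorithm is clearly more efficient than existing methods.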
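The local-pruning step in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `fd_holds` and `prune_candidates`, the row-as-dict representation, and the sample data are all assumptions made for illustration. The idea it shows is that a candidate dependency violated on any single node's local data cannot hold globally, so it can be discarded without moving data; candidates that survive still need a global check.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows.

    rows: iterable of dicts (one dict per tuple); lhs: tuple of attribute
    names; rhs: a single attribute name. Hypothetical helper.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        # The first rhs value seen for a key must match every later one.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def prune_candidates(partitions, candidates):
    """Keep only candidates that hold on every node's local partition.

    A candidate that fails locally anywhere is certainly false globally.
    Survivors are NOT yet confirmed: they still need a global check.
    """
    return [
        (lhs, rhs)
        for lhs, rhs in candidates
        if all(fd_holds(part, lhs, rhs) for part in partitions)
    ]
```

For example, with two node-local partitions in which `name -> city` holds locally on each node but fails once the data are combined, pruning removes only the candidates that fail locally, and the remaining ones must still be verified across nodes.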

7.
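Structural testing generates test cases from the structure of a program and uses test criteria to judge test adequacy. Because program structure is complex, it is hard to guarantee thorough and efficient testing. This paper proposes a testing method based on partitioning program structure using the program dependence graph, namely program block partitioning. The method partitions the program's structure, decomposing a complex program into several program blocks, and derives the semantics of each block from the data dependences between blocks, so that testing can proceed independently at the block level.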

8.
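To improve the efficiency of formula evaluation in Web reporting systems, a model for optimizing formula evaluation performance is established. A method for analyzing dependences between formulas is proposed that adaptively builds the inter-formula dependency graph. On top of this graph, an efficient hierarchical topological sorting algorithm is proposed, which greatly improves formula evaluation efficiency and reduces the total execution time of in-sheet formula evaluation for each report. Theoretical analysis and experimental results show that the model is feasible and the algorithm efficient.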
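A layered topological sort over a formula dependency graph can be sketched as follows. This is a generic Kahn-style algorithm, assumed here for illustration; the abstract does not give the paper's actual algorithm, and the function name `layered_topological_sort` and the cell names are hypothetical. Each layer contains formulas whose inputs all lie in earlier layers, so the formulas within one layer could be evaluated in any order.

```python
from collections import defaultdict


def layered_topological_sort(deps):
    """Group cells into evaluation layers.

    deps maps each formula cell to the list of cells it reads.
    Returns a list of layers (each a sorted list of cell names).
    Raises ValueError if the formulas reference each other in a cycle.
    """
    nodes = set(deps)
    for sources in deps.values():
        nodes.update(sources)

    indegree = {n: 0 for n in nodes}
    dependents = defaultdict(list)
    for cell, sources in deps.items():
        for src in set(sources):  # ignore duplicate references
            indegree[cell] += 1
            dependents[src].append(cell)

    # Layer 0: cells that depend on nothing (plain input values).
    layer = sorted(n for n in nodes if indegree[n] == 0)
    layers, visited = [], 0
    while layer:
        layers.append(layer)
        visited += len(layer)
        next_layer = []
        for n in layer:
            for d in dependents[n]:
                indegree[d] -= 1
                if indegree[d] == 0:
                    next_layer.append(d)
        layer = sorted(next_layer)

    if visited != len(nodes):
        raise ValueError("circular reference between formulas")
    return layers
```

Grouping by layer rather than producing a single linear order makes the independent formulas explicit, which is one plausible reading of "hierarchical" in the abstract.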

9.
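Heavy use of heap memory makes precise extraction of data dependences in Java programs difficult. For dependences on the heap, the usual approach is to name heap locations first and then analyze dependences accordingly. However, this approach cannot perform strong updates across multiple definitions, so its precision is unsatisfactory. To address this, the paper first proposes the concept of must-alias between program points, and then uses it to generate strong updates and relative updates that refine data dependence analysis. Experiments show that, compared with data dependence analysis without strong and relative updates, the new algorithm effectively improves the precision of heap dependence analysis at a relatively small additional time cost.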

10.
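H. Garcia-Molina et al. proposed the Sagas model to solve the long-lived transaction problem. However, compensation in the Sagas model undoes the entire long transaction, and the model requires every subtransaction of a long transaction to have a compensating subtransaction. These two shortcomings considerably limit the execution efficiency and applicability of the Sagas model. By exploiting dependences between tasks and classifying transactions, this paper builds a partially compensating workflow transaction model on top of Sagas. Different compensation strategies are applied to different types of transactions, and when a subtransaction is undone, only the transactions that depend on it are undone rather than the whole transaction flow.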
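The partial-compensation idea can be sketched as a reachability computation over task dependences. This is only an illustration under assumed names (`compensation_set`, the task names, and the dictionary layout are hypothetical), not the model described in the paper: when one subtransaction is undone, only that subtransaction and the subtransactions that transitively depend on it are selected for compensation.

```python
def compensation_set(depends_on, failed):
    """Return the subtransactions to compensate when `failed` is undone.

    depends_on maps each subtransaction to the subtransactions it
    depends on. The result contains `failed` plus everything that
    transitively depends on it; independent work is left untouched.
    """
    # Invert the relation: prerequisite -> tasks that depend on it.
    dependents = {}
    for task, prereqs in depends_on.items():
        for p in prereqs:
            dependents.setdefault(p, []).append(task)

    to_undo, stack = set(), [failed]
    while stack:
        task = stack.pop()
        if task not in to_undo:
            to_undo.add(task)
            stack.extend(dependents.get(task, []))
    return to_undo
```

With tasks `reserve`, `pay` (depends on `reserve`), `ship` (depends on `pay`) and an independent `notify`, undoing `pay` selects only `pay` and `ship`, whereas a classic saga would roll back the whole flow.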

11.
This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture, allows concurrent execution of loop iterations in a pipelined fashion with runtime data-dependence checking and control speculation. The speculative execution combined with the runtime dependence checking allows the parallelization of a variety of program constructs that cannot be parallelized with existing runtime parallelization algorithms. The pipelined execution of loop iterations in this new technique results in lower parallelization overhead than in other existing techniques. We evaluated the performance of this new model using some real applications and a synthetic benchmark. These experiments show that programs with a sufficiently large grain size compared to the parallelization overhead obtain significant speedup using this model. The results from the synthetic benchmark provide a means for estimating the performance that can be obtained from application programs that will be parallelized with this model. The library routines developed for this thread pipelining model are also useful for evaluating the correctness of the codes generated by the superthreaded compiler and in debugging and verifying the simulator for the superthreaded processor.

12.
13.
The schedulability analysis of real-time embedded systems requires worst case execution time (WCET) analysis for the individual tasks. Bounding WCET involves not only language-level program path analysis, but also modeling the performance impact of complex micro-architectural features present in modern processors. In this paper, we statically analyze the execution time of embedded software on processors with speculative execution. The speculation of conditional branch outcomes (branch prediction) significantly improves a program's execution time. Thus, accurate modeling of control speculation is important for calculating tight WCET estimates. We present a parameterized framework to model the different branch prediction schemes. We further consider the complex interaction between speculative execution and instruction cache performance, that is, the fact that speculatively executed blocks can generate additional cache hits/misses. We extend our modeling to capture this effect of branch prediction on cache performance. Starting with the control flow graph of a program, our technique uses integer linear programming to estimate the program's WCET. The accuracy of our method is demonstrated by tight estimates obtained on realistic benchmarks.

14.
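Research on key techniques in high-performance general-purpose microprocessor architecture   Total citations: 1, self-citations: 0, citations by others: 1
The X processor is a high-performance general-purpose microprocessor based on the EPIC approach and designed independently in China. The paper describes its 8-stage pipeline and the OLSM execution model, which overcome the limitations of the basic EPIC model at very low hardware cost. A multi-branch prediction structure is designed that supports parallel execution of multiple branch instructions, and predicated execution is used to reduce the number of branch instructions. A two-level cache is designed, the DTD low-power design method is proposed, and speculative execution is used to hide memory access latency. Finally, trends in high-performance general-purpose microprocessors are discussed.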

15.
The speculative execution of threads in a multithreaded architecture, plus the branch prediction used in each thread execution unit, allows many instructions to be executed speculatively, that is, before it is known whether they are actually needed by the program. In this study, we examine how the load instructions executed on what turn out to be incorrectly executed program paths impact the memory system performance. We find that incorrect speculation (wrong execution) at the instruction and thread levels provides an indirect prefetching effect for the later correct execution paths and threads. By continuing to execute the mispredicted load instructions even after the instruction or thread-level control speculation is known to be incorrect, the cache misses observed on the correctly executed paths can be reduced by 16 to 73 percent, with an average reduction of 45 percent. However, we also find that these extra loads can increase the amount of memory traffic and can pollute the cache. We introduce the small, fully associative wrong execution cache (WEC) to eliminate the potential pollution that can be caused by the execution of the mispredicted load instructions. Our simulation results show that the WEC can improve the performance of a concurrent multithreaded architecture up to 18.5 percent on the benchmark programs tested, with an average improvement of 9.7 percent, due to the reductions in the number of cache misses.

16.
Billion-transistor processors will be much as they are today, just bigger, faster and wider (issuing more instructions at once). The authors describe the key problems (instruction supply, data memory supply and an implementable execution core) that prevent current superscalar computers from scaling up to 16 or 32 instructions per issue. They propose using out-of-order fetching, multi-hybrid branch predictors and trace caches to improve the instruction supply. They predict that replicated first-level caches, huge on-chip caches and data value speculation will enhance the data supply. To provide a high-speed, implementable execution core that is capable of sustaining the necessary instruction throughput, they advocate a large, out-of-order-issue instruction window (2,000 instructions), clustered (separated) banks of functional units and hierarchical scheduling of ready instructions. They contend that the current uniprocessor model can provide sufficient performance and use a billion transistors effectively without changing the programming model or discarding software compatibility.

17.
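Active real-time databases combine timing constraints with active mechanisms, which makes concurrency control of transactions more complex. With active rules, transactions trigger new transactions that can execute under several coupling modes. Traditional real-time concurrency control strategies cannot effectively schedule transactions with such complex execution patterns, while concurrency control mechanisms for active databases do not consider the timing requirements of transactions. By analyzing the real-time requirements of the different coupling modes and the conflict relations between transactions, this paper proposes a new optimistic concurrency control method for active real-time databases. It evaluates the cascading depth of different transactions and uses the timing information of transaction execution to dynamically adjust the serialization order of conflicting transactions. Theoretical analysis and experiments show that the method preserves serializability while reducing the number of unnecessary transaction restarts, and better meets the system's real-time requirements.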

18.
Thread-level speculation has become increasingly attractive for exploiting thread-level parallelism in irregular sequential applications. But it is common for speculative threads to fail to reach the expected parallel performance. The reason is that the performance of speculative threads is hard to predict: it not only suffers from the imprecision of compiler-directed performance estimation due to ambiguous control and data dependences, but also depends on the underlying hardware configuration and program behaviors. Thus, this paper proposes a statically greedy and dynamically adaptive approach for loop-level speculation to dynamically determine the best loop level at runtime. It relies on the compiler to select and optimize all loop candidates greedily, which are then subjected to a cost-benefit analysis of different loop nesting levels to determine the order of loop speculation. Guided by runtime loop execution prediction, we dynamically schedule and update the order of loop speculation, and ensure that the best loop level is always parallelized. Two different policies are also examined to maximize overall performance. Compared with traditional static loop selection techniques, our approach can achieve comparable or better performance.

19.
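The speculative multithreaded architecture (SMA) was developed on the basis of superscalar speculative execution and multithreading, combining the advantages of both. The paper first studies the characteristics of the SMA model and identifies three key performance factors: load imbalance among thread contexts, inter-thread control misspeculation, and inter-thread data misspeculation. To exploit the potential of SMA effectively, several heuristic rules are introduced, a thread-based dynamic profile sampling mechanism is designed, and a prototype continuous optimization framework is implemented on that basis. Simulation of these optimization rules shows that the prototype handles thread optimization well and effectively exploits the performance potential of the SMA architecture.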

20.
Instruction-level traces are widely used for program and hardware analysis. However, program traces for just a few seconds of execution are enormous, up to several terabytes in size, uncompressed. Specialized compression can shrink traces to a few gigabytes, but trace analyzers typically stream the decompressed trace through the analysis engine. Thus, the complexity of analysis depends on the decompressed trace size (even though the decompressed trace is never stored to disk). This makes many global or interactive analyses infeasible. This paper presents a method to compress program traces using binary decision diagrams (BDDs). BDDs intrinsically support operations common to many desirable program analyses and these analyses operate directly on the BDD. Thus, they are often polynomial in the size of the compressed representation. The paper presents mechanisms to represent a variety of trace data using BDDs and shows that BDDs can store, in 1 GB of RAM, the entire data-dependence graph of traces with over 1 billion instructions. This allows rapid computation of global analyses such as heap-object liveness and dynamic slicing.
