Similar Literature (19 results)
1.
A method is discussed that applies a different loop-optimization sequence to each function (routine) in a program. Based on the polyhedral model, the method uses a simplified cache-miss-rate equation to evaluate candidate optimization sequences for each function, and searches via iterative compilation for a distinct loop-optimization sequence per function. This lowers the complexity of applying the transformations and reduces dependence on a particular compiler implementation, while exploiting the differences between functions to obtain better optimization results. Experiments on SPEC2006 show speedups of 1.05 to 1.13 over Open64 -O3 after the per-function tuning.
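The per-function iterative search described above can be sketched as follows. This is a minimal illustration, not the paper's method: the transform names and the cost function are hypothetical stand-ins for the simplified cache-miss-rate equation used for evaluation.

```python
import itertools

# Hypothetical catalog of loop transformations; the real search space
# in the paper is defined over the polyhedral model.
TRANSFORMS = ["interchange", "tiling", "unrolling", "fusion"]

def estimated_cost(function_name, sequence):
    # Stand-in cost model: a real implementation would evaluate a
    # simplified cache-miss-rate equation on the transformed loop nest.
    return sum((hash((function_name, t)) % 100) * (i + 1)
               for i, t in enumerate(sequence))

def best_sequence(function_name, length=2):
    # Iterative compilation in miniature: try candidate sequences and
    # keep the one the cost model scores lowest for this routine.
    candidates = itertools.permutations(TRANSFORMS, length)
    return min(candidates, key=lambda s: estimated_cost(function_name, s))

# Each routine gets its own optimization sequence.
per_function = {f: best_sequence(f) for f in ["dgemm", "stencil"]}
```

The point of the sketch is only the structure: evaluation is done per function, so different routines can end up with different sequences.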

2.
To address the high energy consumption of long-distance natural-gas pipeline networks, a computer simulation model and optimization algorithm based on Aspen Plus were studied. Using the BWRS thermodynamic method, the model accounts for the natural gas consumed by gas turbines and computes fuel consumption with the Calculator module. With minimal compressor-station energy consumption as the objective function, node flow rates, pressures, and compressor speeds as operating variables, and compressor characteristics, node flow and pressure limits, and pipeline pressure limits as constraints, the SQP (sequential quadratic programming) algorithm was used to optimize a looped natural-gas network with 15 nodes, 9 pipe segments, and 6 compressor stations under different pressure-drop correlations, yielding an optimal operating plan for each. Compared with the literature value, the plan reduces the fuel consumption rate by 24.2%. Relative to a network model that ignores gas consumption, the optimized plan lowers energy consumption by 1772.6 kW and the fuel consumption rate by 4.83%. Comparing the pressure-drop correlations, the Weymouth method yields the highest energy consumption, the AGA method the next highest, and the Panhandle B method the lowest.
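A Weymouth-type pressure-drop relation of the kind compared above can be sketched in simplified form. This is an illustrative shape only, assuming consistent units and lumping gas properties into a single constant `c`; it is not the exact correlation or coefficients used in the study.

```python
import math

def weymouth_flow(p1, p2, diameter, length, c=1.0):
    # Simplified Weymouth-type relation: flow grows with the
    # pressure-squared difference and with diameter**(8/3), and
    # falls with the square root of pipe length. The constant c
    # stands in for gas properties and unit conversions.
    return c * diameter ** (8.0 / 3.0) * math.sqrt((p1 ** 2 - p2 ** 2) / length)

q_short = weymouth_flow(70.0, 50.0, 1.0, 100.0)
q_long = weymouth_flow(70.0, 50.0, 1.0, 400.0)  # longer pipe, less flow
```

The qualitative behavior (more flow through shorter, wider pipes at the same pressure drop) is what the different correlations in the study trade off against each other.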

3.
Lan Hao, Li Dexin. Journal of Computer Applications, 2008, 28(1): 181-183
To address the fitting accuracy and fairness of a sequence of discrete data points, a global fairing-approximation algorithm for cubic non-uniform rational B-spline (NURBS) curves is proposed. The algorithm builds an objective function composed of three terms, a least-squares term, the sum of curvatures at the discrete points, and the sum of curvature variations at the discrete points, and solves for the optimal control-point sequence; the weight factors are then adjusted by nonlinear optimization, an approximate expression of the approximation error is established, and an iterative decision procedure combining these steps is proposed. Finally, the fitted curve is displayed and analyzed in UG NX 4.0.

4.
Commonly used feature-extraction methods for sentiment classification include the dictionary-based vector space model (VSM), latent semantic analysis (LSA), unsupervised word embeddings (word2vec), and random word vectors, all of which operate on individual words. In this work, a collection of Douban review data was first labeled with semantic roles using the HIT (Harbin Institute of Technology) language-cloud tool; a modified hidden Markov model (MHMM) then builds features over word-pair vectors, which are fed as sequence segments into a long short-term memory (LSTM) network, and a softmax function classifies the sequence output by the dynamic recurrent network. Cross-entropy is used as the loss function and minimized with stochastic gradient descent. Experimental results show that this method achieves better sentiment classification on the Douban movie-review data.
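The classification head and loss the abstract attaches to the LSTM output can be written out in a few lines. This is a generic pure-Python sketch of softmax plus cross-entropy, not the paper's implementation.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, label):
    # Loss is small when the true class receives high probability.
    return -math.log(probs[label])

probs = softmax([2.0, 0.5, -1.0])
loss = cross_entropy(probs, 0)
```

In training, this loss is what stochastic gradient descent drives down across the labeled review sequences.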

5.
Long Wen, Jiao Jianjun, Xu Songjin. Journal of Computer Applications, 2012, 32(6): 1704-1706
By constructing a suitable fitness function, the parameter-estimation problem for the kinetic model of residual-oil hydrofining is transformed into a multidimensional optimization problem, and a combined genetic algorithm is proposed to solve it. The algorithm initializes the population with a chaotic sequence so that individuals are uniformly distributed over the search space, and at each iteration randomly combines different crossover and mutation strategies to produce new offspring. Simulations on four standard numerical optimization benchmarks demonstrate the effectiveness of the combined genetic algorithm. Taking catalytic cracking, a typical unit in the petroleum-refining industry, as an example, the parameters of the residual-oil hydrofining kinetic model were optimized with satisfactory results.
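Chaotic population initialization of the kind the abstract describes is commonly done with the logistic map. A minimal sketch, with illustrative bounds and sizes (not the paper's settings):

```python
def chaotic_population(pop_size, dim, lower, upper, x0=0.7, mu=4.0):
    # The logistic map x <- mu*x*(1-x) with mu=4 wanders over (0, 1);
    # each chaotic value is scaled into [lower, upper] to seed one gene.
    pop, x = [], x0
    for _ in range(pop_size):
        individual = []
        for _ in range(dim):
            x = mu * x * (1.0 - x)
            individual.append(lower + (upper - lower) * x)
        pop.append(individual)
    return pop

pop = chaotic_population(pop_size=20, dim=4, lower=-5.0, upper=5.0)
```

Compared with pseudo-random seeding, the chaotic sequence spreads individuals over the search space deterministically, which is the uniformity property the abstract relies on.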

6.
A high-accuracy global motion estimation algorithm based on a mask pyramid
In global motion estimation for video sequences, foreground moving objects often substantially degrade estimation accuracy. A high-accuracy global motion estimation algorithm that adapts to foreground objects is therefore proposed. Working on pixel blocks, the algorithm identifies foreground regions from the proportion of inliers and outliers within each block, and introduces Markov clustering as a post-processing step, effectively improving the localization accuracy of moving objects. A weighting coefficient introduced into the objective function strengthens robustness to residuals, further improving estimation accuracy. In addition, a three-level pyramid based on pixel masks is built over the image sequence, and an improved gradient method is introduced into the optimization process, improving real-time performance. Experiments on standard video sequences with different motion types show that the algorithm effectively improves both the accuracy and the speed of global motion estimation.

7.
Registration of temporal image sequences is a necessary step in the clinical research, diagnosis, and treatment uses of medical images. To register medical image sequences quickly, accurately, and simply, a new method for automatic registration of temporal sequences is proposed, based on the joint histogram of the images. First, a simple threshold segmentation divides the joint histogram into four regions; then, according to the image data being registered, count values defined over different regions are chosen as the criterion function for parameter computation. The design is simple and elegant: counting replaces the heavy floating-point computation of other methods. Because the criterion function is smooth and the Powell algorithm is used for the optimization search, the accuracy of the result is guaranteed. Compared with other algorithms, the method greatly simplifies the computation of the criterion function and thus markedly speeds up the registration search. Experimental results and a comparison against the mutual-information method show that the proposed method is accurate, simple, fast, and effective.
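The counting criterion can be sketched as follows: threshold both images, split the joint histogram into four regions by the thresholds, and work with region counts instead of floating-point similarity measures. The thresholds and the toy image data here are illustrative, not from the paper.

```python
def joint_region_counts(img_a, img_b, thr_a, thr_b):
    # Classify each pixel pair into one of four joint-histogram regions:
    # low/low, low/high, high/low, high/high relative to the thresholds.
    counts = {"ll": 0, "lh": 0, "hl": 0, "hh": 0}
    for a, b in zip(img_a, img_b):
        key = ("h" if a >= thr_a else "l") + ("h" if b >= thr_b else "l")
        counts[key] += 1
    return counts

# Toy 1-D "images": well-aligned data concentrates counts in ll and hh.
a = [10, 200, 220, 15, 30, 210]
b = [12, 190, 230, 20, 25, 205]
counts = joint_region_counts(a, b, thr_a=100, thr_b=100)
```

A misregistration mixes pixel intensities across the thresholds, shifting mass into the off-diagonal regions, which is what a count-based criterion function can penalize.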

8.
Partial transmit sequences (PTS) is an effective technique for reducing PAPR in OFDM systems, but its computational complexity is high. A polyphase cyclic-shift-and-conjugate (PCSC) method based on FFT properties is proposed: a single IFFT operation first produces one time-domain sequence, and the cyclic-shift and conjugation properties of the FFT are then combined to construct different transforms, generating more candidate sequences in the time domain. To reduce computational complexity further, PCSC fully reuses antenna Tx1 and performs no IFFT operation on antenna Tx2. Simulation results show that, with the same number of sub-blocks and phase weighting coefficients, the proposed PCSC method achieves better PAPR reduction than the PTS method.
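The candidate-generation idea can be sketched in PTS style: take the time-domain sequences of two sub-blocks, derive extra variants of one sub-block by cyclic shifts and conjugation (the FFT properties PCSC exploits), and keep the combined sequence with the lowest PAPR. The base sequences below are illustrative, and this is a structural sketch rather than the paper's PCSC algorithm.

```python
import cmath

def papr(seq):
    # Peak-to-average power ratio of a complex time-domain sequence.
    powers = [abs(s) ** 2 for s in seq]
    return max(powers) / (sum(powers) / len(powers))

def candidates(seq):
    # All cyclic time shifts of seq, each with and without conjugation.
    out, n = [], len(seq)
    for shift in range(n):
        shifted = seq[shift:] + seq[:shift]
        out.append(shifted)
        out.append([z.conjugate() for z in shifted])
    return out

def combine(a, b_variant):
    # PTS-style combining: sum the sub-block time sequences.
    return [x + y for x, y in zip(a, b_variant)]

a = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]
b = [cmath.exp(2j * cmath.pi * k * k / 8) for k in range(8)]
best = min((combine(a, v) for v in candidates(b)), key=papr)
```

Because the unshifted, unconjugated variant is among the candidates, the selected sequence can only match or improve on the plain combination.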

9.
Redundant computation partitioning for communication optimization in distributed-memory systems
For sequences of parallel loop nests, a communication-optimization method based on redundant computation partitioning is proposed. Based on data-flow analysis, a general method is given for determining the amount of redundant computation for each loop nest, on which the implementation and the decision procedure of redundant computation partitioning are built. For programs with regular dependences, an efficient implementation of redundant computation partitioning is also proposed. The technique has been implemented in a parallelizing compiler, and experimental results show that it has clear advantages over traditional communication-optimization techniques.

10.
Parameter optimization of fuzzy neural networks based on cultural quantum-behaved particle swarm optimization
Parameter learning of a fuzzy neural network is a function-optimization problem. To overcome the low convergence accuracy of existing optimization methods, a parameter-optimization method for fuzzy neural networks based on a cultural quantum-behaved particle swarm optimization algorithm is proposed and applied to chaotic time-series prediction. Simulation results confirm the superiority of the algorithm.
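The quantum-behaved PSO (QPSO) core of such an approach can be sketched on a toy objective. This is a minimal standard QPSO, minimizing a sphere function; the cultural-algorithm layer from the abstract is omitted, and all parameters are illustrative.

```python
import math
import random

def qpso(obj, dim=2, swarm=15, iters=60, lo=-5.0, hi=5.0, alpha=0.75):
    random.seed(1)
    xs = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm)]
    pbest = [x[:] for x in xs]
    gbest = min(pbest, key=obj)
    for _ in range(iters):
        # Mean-best position over all personal bests.
        mbest = [sum(p[d] for p in pbest) / swarm for d in range(dim)]
        for i, x in enumerate(xs):
            for d in range(dim):
                phi = random.random()
                u = random.random() + 1e-12   # guard against log(1/0)
                # Local attractor between personal and global best.
                p = phi * pbest[i][d] + (1 - phi) * gbest[d]
                step = alpha * abs(mbest[d] - x[d]) * math.log(1.0 / u)
                x[d] = p + step if random.random() < 0.5 else p - step
            if obj(x) < obj(pbest[i]):
                pbest[i] = x[:]
        gbest = min(pbest, key=obj)
    return gbest

sphere = lambda v: sum(t * t for t in v)
best = qpso(sphere)
```

In the paper's setting, `obj` would score a fuzzy neural network's prediction error over the chaotic time series rather than a sphere function.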

11.
The paper presents approaches to the validation of optimizing compilers. The emphasis is on aggressive and architecture-targeted optimizations which try to obtain the highest performance from modern architectures, in particular EPIC-like microprocessors. Rather than verify the compiler, the approach of translation validation performs a validation check after every run of the compiler, producing a formal proof that the produced target code is a correct implementation of the source code. First, we survey the standard approach to validation of optimizations which preserve the loop structure of the code (though they may move code in and out of loops and radically modify individual statements), present a simulation-based general technique for validating such optimizations, and describe a tool, VOC-64, which implements this technique. For more aggressive optimizations which typically alter the loop structure of the code, such as loop distribution and fusion, loop tiling, and loop interchange, we present a set of permutation rules which establish that the transformed code satisfies all the implied data dependencies necessary for the validity of the considered transformation. We describe the necessary extensions to VOC-64 in order to validate these structure-modifying optimizations. Finally, the paper discusses preliminary work on run-time validation of speculative loop optimizations, which involves using run-time tests to ensure the correctness of loop optimizations whose correctness neither the compiler nor compiler-validation techniques can guarantee. Unlike compiler validation, run-time validation has not only the task of determining when an optimization has generated incorrect code, but also the task of recovering from the optimization without aborting the program or producing an incorrect result.
This technique has been applied to several loop optimizations, including loop interchange, loop tiling, and software pipelining, and appears to be quite promising.

12.
High-level program optimizations, such as loop transformations, are critical for high performance on multi-core targets. However, complex sequences of loop transformations are often required to expose parallelism (both coarse-grain and fine-grain) and improve data locality. The polyhedral compilation framework has proved to be very effective at representing these complex sequences and restructuring compute-intensive applications, seamlessly handling perfectly and imperfectly nested loops. It models arbitrarily complex sequences of loop transformations in a unified mathematical framework, dramatically increasing the expressiveness (and expected effectiveness) of the loop optimization stage. Nevertheless, identifying the most effective loop transformations remains a major challenge: current state-of-the-art heuristics in polyhedral frameworks simply fail to expose good performance over a wide range of numerical applications. Their lack of effectiveness is mainly due to simplistic performance models that do not reflect the complexity of today's processors (CPU, cache behavior, etc.). We address the problem of selecting the best polyhedral optimizations with dedicated machine learning models, trained specifically on the target machine. We show that these models can quickly select high-performance optimizations with very limited iterative search. We decouple the problem of selecting good complex sequences of optimizations into two stages: (1) we narrow the set of candidate optimizations using static cost models to select the loop transformations that implement specific high-level optimizations (e.g., tiling, parallelism, etc.); (2) we predict the performance of each high-level complex optimization sequence with trained models that take as input a performance-counter characterization of the original program. Our end-to-end framework is validated using numerous benchmarks on two modern multi-core platforms.
We investigate a variety of different machine learning algorithms and hardware counters, and we obtain performance improvements over production compilers ranging on average from 3.2× to 8.7×, by running no more than 6 program variants from a polyhedral optimization space.

13.
This paper presents new approaches to the validation of loop optimizations that compilers use to obtain the highest performance from modern architectures. Rather than verify the compiler, the approach of translation validation performs a validation check after every run of the compiler, producing a formal proof that the produced target code is a correct implementation of the source code. As part of an active and ongoing research project on translation validation, we have previously described approaches for validating optimizations that preserve the loop structure of the code and have presented a simulation-based general technique for validating such optimizations. In this paper, for more aggressive optimizations that alter the loop structure of the code, such as distribution, fusion, tiling, and interchange, we present a set of permutation rules which establish that the transformed code satisfies all the implied data dependencies necessary for the validity of the considered transformation. We describe the extensions to our tool voc-64 which are required to validate these structure-modifying optimizations. This paper also discusses preliminary work on run-time validation of speculative loop optimizations. This involves using run-time tests to ensure the correctness of loop optimizations whose correctness cannot be guaranteed at compile time. Unlike compiler validation, run-time validation must not only determine when an optimization has generated incorrect code, but also recover from the optimization without aborting the program or producing an incorrect result. This technique has been applied to several loop optimizations, including loop interchange and loop tiling, and appears to be quite promising. This research was supported in part by NSF grant CCR-0098299, ONR grant N00014-99-1-0131, and the John von Neumann Minerva Center for Verification of Reactive Systems.

14.
This paper presents a new compiler optimization algorithm that parallelizes applications for symmetric, shared-memory multiprocessors. The algorithm considers data locality, parallelism, and the granularity of parallelism. It uses dependence analysis and a simple cache model to drive its optimizations. It also optimizes across procedures by using interprocedural analysis and transformations. We validate the algorithm by hand-applying it to sequential versions of parallel Fortran programs operating over dense matrices. The programs were initially hand-coded to target a variety of parallel machines using loop parallelism. We ignore the user's parallel loop directives, and use known and implemented dependence and interprocedural analysis to find parallelism. We then apply our new optimization algorithm to the resulting program. We compare the original parallel program to the hand-optimized program, and show that our algorithm improves three programs, matches four programs, and degrades one program in our test suite on a shared-memory, bus-based parallel machine with local caches. This experiment suggests that existing dependence and interprocedural array analysis can automatically detect user parallelism, and demonstrates that user-parallelized codes often benefit from our compiler optimizations, providing evidence that we need both parallel algorithms and compiler optimizations to effectively utilize parallel machines.

15.
Simultaneous Multithreading (SMT) is a processor architectural technique that promises to significantly improve the utilization and performance of modern wide-issue superscalar processors. An SMT processor is capable of issuing multiple instructions from multiple threads to a processor's functional units each cycle. Unlike shared-memory multiprocessors, SMT provides and benefits from fine-grained sharing of processor and memory system resources; unlike current uniprocessors, SMT exposes and benefits from inter-thread instruction-level parallelism when hiding long-latency operations. Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine, particularly for parallel processors. For example, when targeting shared-memory multiprocessors, parallel programs are compiled to minimize sharing, in order to decrease high-cost inter-processor communication. Therefore, optimizations that are appropriate for these conventional machines may be inappropriate for SMT, which can benefit from fine-grained resource sharing within the processor. This paper reexamines several compiler optimizations in the context of simultaneous multithreading. We revisit three optimizations in this light: loop-iteration scheduling, software speculative execution, and loop tiling. Our results show that all three optimizations should be applied differently in the context of SMT architectures: threads should be parallelized with a cyclic rather than a blocked algorithm; non-loop programs should not be software-speculated; and compilers no longer need to be concerned about precisely sizing tiles to match cache sizes. By following these new guidelines, compilers can generate code that improves the performance of programs executing on SMT machines.
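Loop tiling, one of the three optimizations the abstract revisits, can be sketched as a reordering of an N×N iteration space into T×T blocks. This is a generic illustration of the transformation, not the paper's experimental code.

```python
def tiled_indices(n, tile):
    # Visit an n x n iteration space tile by tile: the two outer loops
    # step between tiles, the two inner loops sweep within a tile, so
    # iterations touching nearby data are executed close together.
    order = []
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    order.append((i, j))
    return order

order = tiled_indices(4, 2)
# Every (i, j) is covered exactly once, in tile-major order.
```

The abstract's finding is about the tile parameter: on SMT machines, precisely matching `tile` to the cache size matters less than on conventional processors.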

16.
Translation validation is an approach for validating the output of optimizing compilers. Rather than verifying the compiler itself, translation validation mandates that every run of the compiler generate a formal proof that the produced target code is a correct implementation of the source code. Speculative loop optimizations are aggressive optimizations which are only correct under certain conditions which cannot be validated at compile time. We propose using an automatic theorem prover together with the translation validation framework to automatically generate run-time tests for such speculative optimizations. This run-time validation approach must not only detect the conditions under which an optimization generates incorrect code, but also provide a way to recover from the optimization without aborting the program or producing an incorrect result. In this paper, we apply the run-time validation technique to a class of speculative reordering transformations and give some initial results of run-time tests generated by the theorem prover CVC.

17.
The performance of scientific programs on modern processors can be significantly degraded by memory references that frequently arise due to load and store operations associated with array references. We have developed techniques for optimally allocating registers to array elements whose values are repeatedly referenced over one or more loop iterations. The resulting placement of loads and stores is optimal in that the number of loads and stores encountered along each path through the loop is minimal for the given program branching structure. To place load, store, and register-to-register shift operations without introducing fully/partially redundant and dead memory operations, a detailed value-flow analysis of array references is required. We present an analysis framework to efficiently solve various data-flow problems required by array load-store optimizations. The framework determines the collective behavior of recurrent references spread over multiple loop iterations. We also demonstrate how our algorithms can be adapted for various fine-grain architectures.
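The load-store optimization described above can be illustrated with scalar replacement, where values of `a[i]` and `a[i+1]` that are reused across iterations are kept in "registers" (scalars) and shifted, so each array element is loaded only once. A minimal sketch, with illustrative names:

```python
def smooth_naive(a):
    # Each iteration loads a[i] and a[i+1]; a[i+1] is reloaded as a[i]
    # on the next iteration.
    return [(a[i] + a[i + 1]) / 2 for i in range(len(a) - 1)]

def smooth_scalar_replaced(a):
    out = []
    r0 = a[0]                    # "register" holding a[i]
    for i in range(len(a) - 1):
        r1 = a[i + 1]            # exactly one load per iteration
        out.append((r0 + r1) / 2)
        r0 = r1                  # register-to-register shift, no reload
    return out

data = [1.0, 3.0, 5.0, 7.0]
assert smooth_scalar_replaced(data) == smooth_naive(data)
```

The register-to-register shift (`r0 = r1`) is exactly the kind of operation whose placement the value-flow analysis in the abstract reasons about.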

18.
Loop cleaning     
Removing computations from a repeatedly executed region, such as a loop or recursive procedure body (loop cleaning), is one of the most powerful program optimizations. In this paper, equivalent program transformations are considered that simultaneously remove a set of region statements and place the set either before the repeatedly executed region or behind it. Their ability to perform more complete loop cleaning than the known transformations is demonstrated.
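The simplest instance of the transformation described above is hoisting a loop-invariant computation out of the region. A minimal before/after sketch, with illustrative names:

```python
def before(xs, k):
    total = 0
    for x in xs:
        scale = k * k + 1        # invariant: recomputed every iteration
        total += scale * x
    return total

def after(xs, k):
    scale = k * k + 1            # hoisted before the repeated region
    total = 0
    for x in xs:
        total += scale * x
    return total

assert before([1, 2, 3], 2) == after([1, 2, 3], 2)
```

The paper's transformations generalize this: whole sets of statements are removed simultaneously and placed before or behind the region, rather than hoisting single expressions.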

19.
Loop-optimization testing plays an important role in assuring the quality of modern compilers. Traditional hand-construction of test cases is inefficient, while current automatic construction methods are not sufficiently targeted at loop optimizations. An automatic test-case generation method for loop optimizations based on parameterized branching temporal logic (pCTL) is proposed and implemented. Coverage testing of GCC-4.1.1 with the generated test cases shows that the method produces highly targeted loop-optimization test cases, and that a high degree of coverage can be reached with very few cases.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号