
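Loop Distribution and Aggregation Optimization for Partial Vectorization
Cite as: HAN Lin, XU Jin-long, LI Ying-ying, WANG Yang. Loop Distribution and Aggregation Optimization for Partial Vectorization[J]. Computer Science, 2017, 44(2): 70-74, 81.
Authors: HAN Lin  XU Jin-long  LI Ying-ying  WANG Yang
Affiliations: 1) Information Engineering University, Zhengzhou 450001; State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125. 2) Information Engineering University, Zhengzhou 450001; State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125. 3) Information Engineering University, Zhengzhou 450001. 4) Information Engineering University, Zhengzhou 450001.
Funding: This work was supported by the Zhengzhou Science and Technology Bureau, the Frontier Technology Research and Development Program (141PQYJS558), and the Open Project of the State Key Laboratory of Mathematical Engineering and Advanced Computing (2013A11).
Abstract: Many loops contain a few statements that cannot be vectorized alongside many that can. Loop distribution can usually separate these statements into different loops, which enables partial vectorization of the original loop. Current mainstream optimizing compilers support only simple, aggressive loop distribution, so the vectorized loops carry excessive loop overhead and reuse registers and cache poorly. To address these problems, a loop distribution and aggregation method for partial vectorization is proposed. First, two key issues in general loop distribution are analyzed: partitioning the statement set and determining the execution order of the resulting loops. Second, a node-ordering method for the condensation graph, aimed at maximal aggregation, is proposed to guide loop merging; it reduces loop overhead without affecting parallelism. Finally, the proposed method is validated experimentally. The results show that, for the test cases, the method generates correct vectorized code and significantly improves the execution efficiency of the vectorized programs.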

Keywords: Partial vectorization  Loop distribution  Loop aggregation  Condensation graph
Received: 2015/11/3
Revised: 2016/2/26

Method of Loop Distribution and Aggregation for Partial Vectorization
HAN Lin, XU Jin-long, LI Ying-ying, WANG Yang. Method of Loop Distribution and Aggregation for Partial Vectorization[J]. Computer Science, 2017, 44(2): 70-74, 81.
Authors: HAN Lin  XU Jin-long  LI Ying-ying  WANG Yang
Affiliation: 1) Information Engineering University, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China. 2) Information Engineering University, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China. 3) Information Engineering University, Zhengzhou 450001, China. 4) Information Engineering University, Zhengzhou 450001, China.
Abstract: Many loops contain a few unvectorizable statements among many vectorizable ones. Loop distribution separates these statements into different loops so that partial vectorization can be achieved. Currently, mainstream optimizing compilers support only simple, aggressive loop distribution, which results in large loop overhead and poor register and cache reuse. To solve these problems, a method of loop distribution and aggregation for partial vectorization is proposed. First, two key issues in loop distribution are analyzed: the grouping of statements and the execution order of the distributed loops. Second, a modified topological sorting method is presented to achieve better loop aggregation, which reduces loop overhead. Finally, the proposed method is evaluated experimentally. The results show that it produces correct SIMD code and significantly improves the execution efficiency of the resulting programs.
Keywords: Partial vectorization  Loop distribution  Loop aggregation  Aggregation graph
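
The abstract describes ordering the nodes of the dependence graph so that distributed loops can be merged again. The sketch below is one way to picture that idea and is not the authors' algorithm: it runs a greedy topological sort that prefers a ready node of the same kind as the previous one, then merges neighbouring nodes of the same kind into one loop. The function names `order_for_aggregation` and `fuse_adjacent`, the labels `"vec"` and `"seq"`, and the example graph are all hypothetical. It also assumes the graph is already acyclic (each node stands for one group of statements) and that any two adjacent nodes of the same kind may legally be merged, which ignores fusion-preventing dependences.

```python
from collections import defaultdict


def order_for_aggregation(kinds, edges):
    """Greedy topological order that keeps nodes of the same kind adjacent.

    kinds: dict mapping node -> "vec" (vectorizable) or "seq" (not vectorizable).
    edges: iterable of (src, dst) pairs; src must execute before dst.
    """
    succs = defaultdict(list)
    indeg = {node: 0 for node in kinds}
    for src, dst in edges:
        succs[src].append(dst)
        indeg[dst] += 1

    ready = sorted(node for node in kinds if indeg[node] == 0)
    order = []
    last_kind = None
    while ready:
        # Prefer a ready node of the same kind as the node just emitted.
        same = [node for node in ready if kinds[node] == last_kind]
        node = same[0] if same else ready[0]
        ready.remove(node)
        order.append(node)
        last_kind = kinds[node]
        for nxt in succs[node]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
        ready.sort()  # deterministic tie-breaking by node name

    if len(order) != len(kinds):
        raise ValueError("dependence graph contains a cycle")
    return order


def fuse_adjacent(order, kinds):
    """Merge consecutive nodes of the same kind into one loop."""
    loops = []
    for node in order:
        if loops and loops[-1][0] == kinds[node]:
            loops[-1][1].append(node)
        else:
            loops.append((kinds[node], [node]))
    return loops


# Hypothetical example: five statement groups, two dependences.
kinds = {"S1": "vec", "S2": "seq", "S3": "vec", "S4": "seq", "S5": "vec"}
edges = [("S1", "S4"), ("S2", "S5")]

# Source order S1..S5 respects both dependences but alternates kinds.
naive_loops = fuse_adjacent(sorted(kinds), kinds)

order = order_for_aggregation(kinds, edges)
loops = fuse_adjacent(order, kinds)

print(len(naive_loops), "loops in source order")
print(len(loops), "loops after reordering:", loops)
```

In this example the source order alternates between vectorizable and non-vectorizable groups, so no neighbours can be merged, while the reordered sequence needs fewer loops and still respects both dependences. A greedy choice like this is not guaranteed to find the minimum number of loops in general.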