首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
BWDSP100是一款采用超长指令字(VLIW)和单指令多数据流(SIMD)架构的针对高性能计算领域而设计的32位静态标量数字信号处理器,其指令级并行(ILP)主要是通过其特殊的分簇体系结构和SIMD指令来实现,然而现有的编译框架无法对这些特殊的SIMD指令提供支持。由于BWDSP100拥有丰富的SIMD向量化资源,且其所运用的雷达数字信号处理领域对程序的性能要求极高,因此针对BWDSP100结构的特点,在传统Open64编译器中SIMD编译优化框架的基础上提出并实现了一种支持单双字模式选择的SIMD编译优化算法,通过该算法可以显著提高一些在DSP上有着广泛运用计算密集型程序的性能。实验结果表明,与优化前相比,该算法方案在BWDSP编译器上的实现能够平均取得5.66的加速比。  相似文献   

2.
GCC后端中四路双精度短向量寄存器的实现   总被引:1,自引:1,他引:0  
设计和实现一个新的产品化的编译器通常需要几年时间。基于已有的编译器进行修改和扩展,是研发面向新体系结构的编译器的主要途径。GNU编译器集合(GCC)支持多种高级语言和多种目标处理器平台、文档及源代码开放等。基于GCC的Sparc后端,实现了支持四路双精度SIMD指令的四路双精度短向量寄存器的描述。在此过程中,定义了新的目标机,扩充了一类向量模式,定义了一类新的寄存器约束,实现了四路双精度寄存器的描述,定义了四路双精度SIMD指令的机器描述。对于面向此类SIMD指令的内嵌函数,GCC编译器能够正确使用该类向量寄存器来生成对应的SIMD指令。  相似文献   

3.
分簇结构超长指令字DSP编译器的设计与实现   总被引:5,自引:0,他引:5  
超长指令字(VLIW)是高端DSP普遍采用的体系结构。VLIW DSP在硬件上没有调度和冲突判决的机制,其性能的发挥完全依靠编译嚣的优化效果.基于可重定向编译基础设施IMPACT,为分簇VLIW DSP YHFT—D4设计与实现了优化编译器.其中着重讨论了可重定向信息的定义、代码注释、SIMD指令的支持、分簇寄存器分配以度指令级并行开发和资源冲突解决等内容.实验结果表明该编译器可以达到较好的优化效果.  相似文献   

4.
《软件工程师》2018,(2):14-17
Open64是一个拥有GNU通用公共许可证的开源高性能编译器,设计结构好,分析优化全面,是编译器高级研究的理想平台。本文针对BWDSP处理器所提供的高效特殊运算指令,在Open64基础上研究并实现了面向BWDSP中的特殊指令合成策略。该策略通过扩展并重定向编译器后端模块,能够充分地利用BWDSP中的复数指令、累加指令、乘累加指令和平方和指令等特殊指令。实验结果表明,本文提出的特殊指令合成策略能够很大程度上提高程序的性能。  相似文献   

5.
BWDSP是针对高性能计算设计的一款新型的处理器, 采用多簇超长指令字体系结构和SIMD架构, 有丰富的指令集. 为充分利用BWDSP提供的向量化资源, 迫切需要提出一种向量化算法. 本文在open64基础上研究并实现了面向多簇超长指令字(VLIW)DSP的SIMD编译优化算法. 算法基于OPEN64的中间语言WHIRL, 能够充分地利用BWDSP丰富的硬件资源和向量化指令. 最终实验结果表明, 对于能够合成双字和单字的循环程序, 该优化算法能够平均取得6倍和4倍的加速比.  相似文献   

6.
SIMD计算机的优化编译器设计   总被引:1,自引:1,他引:0       下载免费PDF全文
赵辉  黄石 《计算机工程》2009,35(1):201-203
利用处理器的相关资源,提高编译器优化性能和增强代码可适应性是SIMD处理器优化编译的关键。该文基于M语言和LSSIMD体系结构,结合现代编译器的编译技术,提出针对SIMD协处理器编译器的优化和实现方法,包括寄存器分配、单值合并、代码压缩等。实验结果表明,编译生成的目标代码准确、高效。  相似文献   

7.
目前BWDSP104X编译器对程序中条件分支的处理是采用传统的谓词优化方法,及每条指令和一个谓词相关,只有当谓词为真时指令才被执行,但它存在的局限性是当涉及到多条件谓词时,并不能消除跳转分支,且多条件谓词之间可能存在控制依赖关系,不利于指令并行和指令流水. 因此在现有编译器框架下,针对传统谓词优化方法的不足之处,本文提出一种基于BWDSP104X体系结构下多条件谓词编译优化方法. 实验结果表明,与传统谓词优化方法相比,该优化算法在BWDSP104X编译器上能够取得平均5.62的加速比.  相似文献   

8.
多媒体处理器的SIMD代码生成   总被引:1,自引:0,他引:1  
通用处理器的SIMD(Single Instruction Multiple Data)多媒体扩展,为提高多媒体应用的性能提供了新的体系结构支持。但目前编译技术对这类指令不能提供很好的支持。本文提出了一个新的SIMD指令生成算法,基于把编译器前端的程序分析和编译器后端的机器信息相结合的思想,采用扩展的treeparsing技术,有效识别程序中的并行操作以生成SIMD指令。基于SUIF(Stanford University Intermediate Format)编译器框架的实验表明,针对一组多媒体kernel,本文提出的算法可平均减少其非SIMD代码47%的cycles。  相似文献   

9.
国防科技大学自主研制的高性能加速器采用中央处理器(CPU)+通用数字信号处理器(GPDSP)的片上异构融合架构,使用超长指令集(VLIW)+单指令多数据流(SIMD)的向量化结构的GPDSP是峰值性能主要支撑的加速核。主流编译器在密集的数据计算指令排布、为指令静态分配硬件执行单元、GPDSP特有的向量指令等方面不能很好地支持高性能加速器。基于低级虚拟器(LLVM)编译框架,在前寄存器分配调度阶段,结合峰值寄存器压力感知方法(PERP)、蚁群优化(ACO)算法与GPDSP结构特点,优化代价模型,设计支持寄存器压力感知的指令调度模块;在后寄存器分配阶段提出支持静态功能单元分配的指令调度策略,通过冲突检测机制保证功能单元分配的正确性,为指令并行执行提供软件基础;在后端封装一系列丰富且规整的向量指令接口,实现对GPDSP向量指令的支持。实验结果表明,所提出的LLVM编译架构优化方法从功能和性能上实现了对GPDSP的良好支撑,GCC testsuite测试整体性能平均加速比为4.539,SPEC CPU 2017浮点测试整体性能平均加速比为4.49,SPEC CPU 2017整型测试整体性能平均...  相似文献   

10.
协作式全局指令调度与寄存器分配   总被引:1,自引:1,他引:0  
指令级并行是现代高性能代理器的重要特征,对于发挥这类处理器所具有的并行处理能力来说,编译器有至关重要的影响。文中讨论指令级并行编译中的核心问题-全局指令调度与 器分配,并以作者为一种新型的显式并行体系结构微处理器的编译系统为背景,介绍了此类编译器后端设计中面临的指令调度与寄存器分配的时序问题,以及为解决这一问题而提出了的一种协作式全局指令调度与寄存器分配方法。  相似文献   

11.
European Community policy and the market   总被引:1,自引:0,他引:1  
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.  相似文献   

12.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。  相似文献   

13.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.  相似文献   

14.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.  相似文献   

15.
基于复小波噪声方差显著修正的SAR图像去噪   总被引:4,自引:1,他引:3  
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。  相似文献   

16.
Abstract  This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.  相似文献   

17.
正The demands of a rapidly advancing technology for faster and more accurate controllers have always had a strong influence on the progress of automatic control theory.In recent years control problems have been arising with increasing frequency in widely different areas,which cannot be addressed using conventional control techniques.The principal reason for this is the fact that a highly competitive economy is forcing systems to operate in regimes where  相似文献   

18.
正Aim The Journals of Zhejiang University-SCIENCE(A/B/C)areedited by the international board of distinguished Chinese andforeign scientists,and are aimed to present the latest devel-opments and achievements in scientific research in China andoverseas to the world’s scientific circles,especially to stimulateand promote academic exchange between Chinese and for-eign scientists everywhere.  相似文献   

19.
The relative concentrations of different pigments within a leaf have significant physiological and spectral consequences. Photosynthesis, light use efficiency, mass and energy exchange, and stress response are dependent on relationships among an ensemble of pigments. This ensemble also determines the visible characteristics of a leaf, which can be measured remotely and used to quantify leaf biochemistry and structure. But current remote sensing approaches are limited in their ability to resolve individual pigments. This paper focuses on the incorporation of three pigments—chlorophyll a, chlorophyll b, and total carotenoids—into the LIBERTY leaf radiative transfer model to better understand relationships between leaf biochemical, biophysical, and spectral properties.Pinus ponderosa and Pinus jeffreyi needles were collected from three sites in the California Sierra Nevada. Hemispheric single-leaf visible reflectance and transmittance and concentrations of chlorophylls a and b and total carotenoids of fresh needles were measured. These data were input to the enhanced LIBERTY model to estimate optical and biochemical properties of pine needles. The enhanced model successfully estimated reflectance (RMSE = 0.0255, BIAS = 0.00477, RMS%E = 16.7%), had variable success estimating transmittance (RMSE = 0.0442, BIAS = 0.0294, RMS%E = 181%), and generated very good estimates of carotenoid concentrations (RMSE = 2.48 µg/cm2, BIAS = 0.143 µg/cm2, RMS%E = 20.4%), good estimates of chlorophyll a concentrations (RMSE = 10.7 µg/cm2, BIAS = − 0.992 µg/cm2, RMS%E = 21.1%), and fair estimates of chlorophyll b concentrations (RMSE = 7.49 µg/cm2, BIAS = − 2.12 µg/cm2, RMS%E = 43.7%). Overall root mean squared errors of reflectance, transmittance, and pigment concentration estimates were lower for the three-pigment model than for the single-pigment model. The algorithm to estimate three in vivo specific absorption coefficients is robust, although estimated values are distorted by inconsistencies in model biophysics. The capacity to invert the model from single-leaf reflectance and transmittance was added to the model so it could be coupled with vegetation canopy models to estimate canopy biochemistry from remotely sensed data.  相似文献   

20.
This article discusses the history and design of the special versions of the bombe key-finding machines used by Britain’s Government Code & Cypher School (GC&CS) during World War II to attack the Enigma traffic of the Abwehr (the German military intelligence service). These special bombes were based on the design of their more numerous counterparts used against the traffic of the German armed services, but differed from them in important ways that highlight the adaptability of the British bombe design, and the power and flexibility of the diagonal board. Also discussed are the changes in the Abwehr indicating system that drove the development of these machines, the ingenious ways in which they were used, and some related developments involving the bombes used by the U.S. Navy’s cryptanalytic unit (OP-20-G).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号