Similar literature
 Found 20 similar documents; search took 78 ms
1.
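Targeting the patterns specific to OpenCL (Open Computing Language) compilation, this paper proposes a new compile-time optimization method for heterogeneous computing platforms. Based on the respective characteristics of the device side and the host side, the method moves some redundant device-side operations to the host side or into a new device-side kernel, in order to reduce memory reads and writes. The method makes full use of the characteristics of heterogeneous computing platforms and is more flexible than traditional optimization methods. In most cases it effectively improves the running speed of OpenCL programs; in the test cases, the best speedup was 270% on top of the existing compiler optimizations.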
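
The idea of moving a redundant device-side operation to the host can be illustrated with a small sketch. This is not the paper's method or an OpenCL program: the loop stands in for work-items, and the function names (`kernel_naive`, `kernel_hoisted`, `run_hoisted`) and the work-item-invariant `scale` value are assumptions made for illustration only.

```python
import math


def kernel_naive(data, a, b):
    """Simulated kernel: every work-item recomputes the same scale factor."""
    out = []
    evaluations = 0
    for x in data:  # each iteration stands for one work-item
        scale = math.hypot(a, b)  # identical in every work-item, hence redundant
        evaluations += 1
        out.append(x * scale)
    return out, evaluations


def kernel_hoisted(data, scale):
    """Simulated kernel after hoisting: the scale factor arrives as an argument."""
    return [x * scale for x in data]


def run_hoisted(data, a, b):
    """Host side: compute the invariant once, then launch the kernel."""
    scale = math.hypot(a, b)  # evaluated a single time on the host
    return kernel_hoisted(data, scale), 1
```

Both versions produce the same output; the hoisted one evaluates the invariant once instead of once per work-item, which is the kind of redundancy the abstract describes removing.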

2.
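Research on key technologies for implementing PL/SQL stored functions in GKD-Base   Total citations: 1, self-citations: 0, citations by others: 1
This paper introduces the PL/SQL engine of GKD-Base, a secure database management system with independent intellectual property rights, and, based on this engine, studies the key technologies for implementing the GKD-Base stored function mechanism. A function manager and an execution state stack are designed; the intermediate code generated by compiling a stored function is represented as a syntax tree, and the problem of executing this intermediate code is solved. Finally, a parameter-passing mechanism is implemented for functions with the three parameter modes IN, OUT, and INOUT.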

3.
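With the development of intelligent computing and big data applications, demand for accelerators such as GPUs keeps growing. Compute software stacks, such as the CUDA and OpenCL stacks, are key to fully exploiting GPU hardware performance. Considering the future portability and adaptability of compute software stacks on domestic (Chinese) basic software and hardware platforms (such as the Phytium (飞腾) CPU and the Kylin (麒麟) operating system), this paper focuses on open-source OpenCL compute software stacks. It tests and analyzes the behavior of OpenCL applications on different platforms, evaluates the GPU computing performance of applications on different OpenCL software stacks (such as Mesa and ROCm), and evaluates the impact of the driver, kernel, and other stack components on GPU computing. The tests cover the time cost of every stage of OpenCL computing, including compilation, data transfer, and kernel execution. The evaluation finds that domestic platforms have a more urgent need for, and are better suited to, GPU-accelerated computing; ROCm is a fairly ideal open-source OpenCL software stack with good performance and stability, and compared with closed-source stacks there is still some room for optimization.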

4.
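Traditional programmable logic controllers (PLCs) execute ladder diagrams by interpretation, which is inefficient. This paper proposes replacing interpreted execution with compiled execution to improve efficiency. Because conventional compiled execution is very difficult to implement, the paper proposes a solution based on the GNU Compiler Collection (GCC): the ladder diagram is first converted into a C program, the tools needed for compiled execution are then obtained through GCC's open platform, and these tools are used to compile the C program, thereby realizing compiled execution on the PLC. Tests show that compiled execution greatly improves PLC execution efficiency.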
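
The first step, converting a ladder diagram into C, might look roughly like the following sketch. The rung representation (`Contact`, `Rung`), the restriction to series contacts driving one coil, and the generated `scan_cycle` function name are all assumptions for illustration; the abstract does not describe the paper's actual translation rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contact:
    name: str
    normally_closed: bool = False  # a normally closed contact negates its input


@dataclass
class Rung:
    contacts: List[Contact]  # contacts wired in series (logical AND)
    coil: str                # output driven by the rung


def rung_to_c(rung: Rung) -> str:
    """Emit one C assignment for a rung of series contacts."""
    terms = [("!" + c.name) if c.normally_closed else c.name
             for c in rung.contacts]
    condition = " && ".join(terms) if terms else "1"
    return f"{rung.coil} = {condition};"


def ladder_to_c(rungs: List[Rung]) -> str:
    """Emit a C function that evaluates every rung once per scan cycle."""
    body = "\n".join("    " + rung_to_c(r) for r in rungs)
    return "void scan_cycle(void) {\n" + body + "\n}\n"
```

The generated text could then be handed to a C compiler such as GCC, which is the step the abstract relies on for compiled execution.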

5.
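Heterogeneous computing is now widely used in artificial intelligence; it aims to have parallel accelerators, mainly GPGPUs, work together with CPUs to complete large-scale parallel computation more efficiently. Building, training, and running inference with deep learning models depends on machine learning frameworks, but the mainstream frameworks today essentially support only the CUDA heterogeneous programming model. CUDA's proprietary and closed nature makes machine learning frameworks heavily dependent on NVIDIA GPGPUs, so hardware accelerators from many other vendors, especially domestic (Chinese) accelerators, struggle to realize their full potential in deep learning. Replacing the proprietary CUDA programming model with OpenCL, an open and unified heterogeneous programming standard, is an effective way to break this technical barrier. This paper proposes a scheme for converting CUDA kernel functions in TensorFlow to OpenCL, and summarizes key techniques including the basic rules of kernel conversion, solutions to typical difficult problems, and performance optimization of OpenCL kernels. It is the first to complete implementations of 135 OpenCL kernel functions for TensorFlow 2.2. A series of tests verify that the 135 converted OpenCL kernels run correctly on a variety of accelerators that support the OpenCL standard; after optimization, nearly 80% of the OpenCL kernels achieve computing performance comparable to the CUDA kernels on an NVIDIA Tesla V100S. The test results validate the proposed CUDA-to-OpenCL kernel...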

6.
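[Objective] TensorFlow is the most representative deep learning framework in artificial intelligence. Domestic (Chinese) accelerators need a TensorFlow that supports OpenCL in order to deliver their acceleration performance, which requires converting the CUDA code in the TensorFlow framework to OpenCL. How to verify the correctness of the OpenCL kernel functions is an important problem for this development work. [Methods] Based on TensorFlow custom operators built as dynamic link libraries and the raw_ops testing interface, this paper proposes a testing solution for OpenCL kernel functions, including source code design conventions for custom operators, test code conventions, a code review method, and a testing process. [Results] The code of 135 OpenCL kernel functions was reviewed and tested. Comparative tests across various data types and multiple data sizes verified the correctness of the OpenCL kernels and compared their performance with the CUDA kernels. [Conclusions] The paper provides a reliable and effective solution for testing OpenCL kernel functions under TensorFlow.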

7.
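金凯峰 (Jin Kaifeng), 王雅文 (Wang Yawen). 《软件》 (Software), 2013, (12): 14-17
The test case execution framework is an important part of the Code Testing System (CTS); it executes test cases and captures instrumentation results. While CTS was being ported from Windows to Linux, the framework ran into incompatible thread invocation and control, failed exception capture, and code that would not compile. This paper resolves the thread invocation problem by using the Pthread library commonly used on Linux, resolves the failure to capture C exceptions by designing and implementing an exception stack, and resolves the compilation failure by parsing the makefile of the C project that contains the unit under test to extract the header files needed for compilation and the link options needed for shared or static libraries. With these problems solved, the CTS test case execution framework runs normally on Linux.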

8.
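To solve the adaptive scheduling problem in OpenCL multi-task environments, this paper analyzes the drop in kernel execution efficiency caused by resource contention and proposes an OpenCL task scheduling framework that can efficiently schedule the kernels of multiple programs on CPU-GPU heterogeneous platforms. A random forest model is used to analyze the running state of OpenCL tasks on different devices, and a set of formulas that quantify OpenCL kernel data transfer is proposed, improving the accuracy of OpenCL task analysis. A hybrid-metric strategy that combines the degree of load balance with single-task scheduling time is adopted to ensure both system-level efficiency and the execution efficiency of individual tasks. Experiments verify the framework's performance: under varying degrees of resource contention, it improves on two common scheduling strategies in both load balance and task execution efficiency.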

9.
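Uncertainty quantification analysis of climate system models involves complex workflow execution. This paper analyzes the workflows of two stages, the quantification analysis process and the model post-processing tasks, and designs and implements a workflow execution platform for this analysis work. The platform can automatically execute the workflows of typical climate system simulation experiments and analyses, and supports flexible workflow configuration based on expert knowledge. It includes workflow-level fault tolerance and supports distributed concurrent execution to accelerate compute-intensive tasks in a workflow. The platform also supports user-defined plug-ins, which improves the modularity and standardization of the workflow system. The platform was used to run an uncertainty analysis experiment on the physical-process parameters of the GAMIL2 climate system model, verifying its feasibility and effectiveness.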

10.
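To address the excessive overhead of JavaScript engines in just-in-time (JIT) compilation mode and long web page load times, this paper improves how the JIT mode compiles code and designs a dynamic compilation scheme for JavaScript: only the frequently executed hot regions of the JavaScript code are JIT-compiled, while the remaining code runs in interpreted mode, so that JIT compilation is used where it pays off. Experimental results show that dynamic compilation reduces the overhead of the JavaScript engine and speeds up web page loading.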
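
A counter-based hot-spot policy of this kind can be sketched as follows. This is an analogy in Python rather than a JavaScript engine: evaluating a source string on every call stands in for interpretation, caching a code object from the built-in `compile` stands in for JIT compilation, and the class name `TieredFunction` and the threshold of 3 calls are assumptions, not values from the paper.

```python
HOT_THRESHOLD = 3  # hypothetical call count after which code is treated as hot


class TieredFunction:
    """Runs an expression in 'interpreted' mode until it becomes hot."""

    def __init__(self, source, threshold=HOT_THRESHOLD):
        self.source = source        # a Python expression in the variable x
        self.threshold = threshold
        self.calls = 0
        self.compiled = None        # cached code object once the code is hot

    def __call__(self, x):
        self.calls += 1
        if self.compiled is None and self.calls >= self.threshold:
            # Hot region detected: pay the compilation cost once.
            self.compiled = compile(self.source, "<hot>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {}, {"x": x})
        # Cold path: the source string is parsed again on every call.
        return eval(self.source, {}, {"x": x})
```

Code that is called only once or twice never pays the compilation cost, which mirrors the trade-off the abstract describes.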

11.
This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.

12.
European Community policy and the market   Total citations: 1, self-citations: 0, citations by others: 1
This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.

13.
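Fusion and ensemble methods are widely used in pattern recognition, but some base classifiers have unstable real-time performance, which degrades the performance of multi-classifier fusion. To address this, the paper proposes a new sub-fusion ensemble classifier system based on multiple classifiers. Working above the measurement-level fusion layer, the method dynamically selects among the base classifiers, and the class with the most votes is taken as the fusion system's recognized class for the feature vector, yielding a new adaptive sub-fusion ensemble classifier method. Experiments show that the method achieves markedly higher recognition accuracy than traditional classifiers and classifier fusion methods, and is more robust.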
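
Selection followed by voting can be sketched in a few lines. The selection rule used here (keep only base classifiers whose confidence reaches a threshold) and the names `fuse_by_vote` and `min_confidence` are assumptions for illustration; the abstract does not say how the paper selects base classifiers.

```python
from collections import Counter


def fuse_by_vote(predictions, min_confidence=0.5):
    """Pick the class with the most votes among the selected classifiers.

    predictions: non-empty list of (label, confidence) pairs, one per
    base classifier.
    """
    # Dynamic selection (assumed rule): drop low-confidence classifiers.
    selected = [label for label, conf in predictions if conf >= min_confidence]
    if not selected:
        # If nothing passes the threshold, fall back to all classifiers.
        selected = [label for label, _ in predictions]
    # Plurality vote over the selected labels.
    return Counter(selected).most_common(1)[0][0]
```

For example, with predictions `[("cat", 0.9), ("dog", 0.8), ("cat", 0.6), ("dog", 0.2)]` the low-confidence "dog" vote is dropped and "cat" wins two votes to one.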

14.
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.

15.
This paper presents control chart models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented: one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion of defectives.

16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008

17.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.

18.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)℠ and Team Software Process (TSP)℠. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.

19.
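SAR image denoising based on significance-corrected noise variance in the complex wavelet domain   Total citations: 4, self-citations: 1, citations by others: 3
This paper proposes a speckle noise filtering method for synthetic aperture radar (SAR) images that combines statistical modeling in the complex wavelet domain with significance correction of the noise variance estimate. The method first converts the multiplicative noise model into an additive noise model through a logarithmic transform, then applies the dual-tree complex wavelet transform (DCWT) to the transformed image and models the statistical distribution of the complex wavelet coefficients. Based on this prior distribution, Bayesian estimation is used to recover the original coefficients from the noisy coefficients, thereby filtering out the noise. Experimental results show that the method preserves image detail while removing noise and achieves good denoising performance.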
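
The logarithmic step can be written out explicitly. The symbols below are assumptions chosen for illustration (the abstract does not give its notation): $I$ is the observed intensity, $R$ the noise-free reflectivity, and $n$ the speckle term.

```latex
% Multiplicative speckle model (symbols I, R, n are illustrative)
I(x, y) = R(x, y)\, n(x, y)

% Taking logarithms turns the product into a sum,
% so the noise becomes additive in the log domain
\ln I(x, y) = \ln R(x, y) + \ln n(x, y)
```

In the log domain, methods designed for additive noise, such as wavelet-coefficient estimation, can be applied, and an exponential maps the result back to the intensity domain.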

20.
This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.
