1.

黄昕《计算机与现代化》2009,(6)

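OpenTM introduces transaction syntax and semantics on top of OpenMP, providing a directive-based programming interface for transactional memory programming. Taking the LU application from the standard parallel benchmark suite NPB as an example, this paper uses the speculative parallel execution capability of transactional memory and the OpenTM interface to parallelize the pipelined algorithm. Experiments show that OpenTM programming is simple, avoids the complexity of lock-based programming, and can play a major role in scientific computing.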

2.

A source-to-source transactional memory programming environment based on OpenMP/Fortran

黄春贾建斌彭林《计算机科学》2011,38(4):299-302

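Transactional memory is introduced into the Fortran language for the first time: the OpenMP Fortran API is extended, and a prototype FortranTM compiler is implemented by source-to-source translation. To suit the characteristics of software transactional memory implementations, the EXCLUDED and SCHEDULE directive clauses are extended to give programmers an API for performance tuning and optimization. Test results show that the FortranTM API is convenient to program with and delivers good performance.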

3.

Transactional memory systems

彭林谢伦国张小强《计算机研究与发展》2009,46(8)

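The performance of multi-core processors depends on program parallelism. Most multi-core processors adopt the shared-memory parallel programming model, in which effectively synchronizing multiple threads' accesses to shared variables is both the key and the main difficulty. Borrowing the idea of transactions from databases, transactional memory was proposed to provide a synchronization mechanism that is simple to program with and makes program correctness easy to reason about. This paper outlines the origin of transactional memory, explains the concept of a transactional memory system, and describes its programming interface and execution model. It discusses the main components of transactional memory systems, compares the various methods and policies, and examines open problems. Finally, it introduces several open-source transactional memory research platforms.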
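
The execution model summarized in this abstract (run speculatively, detect conflicts, commit or re-execute) can be illustrated with a toy sketch. This is not taken from the paper: the class `VersionedCell` and its method `atomically` are hypothetical names, and the commit step itself is guarded by an ordinary lock, which real transactional memory systems aim to avoid.

```python
import threading


class VersionedCell:
    """A shared value with a version counter (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()

    def atomically(self, update):
        """Apply update(old) -> new as a transaction; return the retry count."""
        retries = 0
        while True:
            # Read phase: take a consistent snapshot of value and version.
            with self._commit_lock:
                seen_version, snapshot = self.version, self.value
            # Speculative phase: compute the new value without holding a lock.
            new_value = update(snapshot)
            # Commit phase: publish only if nobody committed in the meantime.
            with self._commit_lock:
                if self.version == seen_version:
                    self.value = new_value
                    self.version += 1
                    return retries
            # Conflict detected: another transaction committed first, so redo.
            retries += 1


def run_demo(num_threads=4, increments=500):
    """Increment one shared cell from several threads and return its value."""
    cell = VersionedCell(0)

    def worker():
        for _ in range(increments):
            cell.atomically(lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

Calling `run_demo(4, 500)` should leave the cell at 2000 regardless of how the threads interleave, because a transaction that loses a race is re-executed rather than overwriting the winner's update.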

4.

Research on synchronization semantics for transactional memory based on OpenMP

田祖伟李勇帆《计算机科学》2009,36(5):166-168

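In a multi-core environment, the parallel programming problem must be solved before the performance of multi-core processors can be fully exploited. Transactional memory provides a way to execute and synchronize programs in parallel in a multi-core environment. Existing work has extended OpenMP with transactional memory, giving programmers shared-memory access that satisfies transactional atomicity, consistency, and isolation. However, current transactional memory semantics are incomplete: transactions cannot exchange intermediate results, and some lock semantics cannot be expressed. This paper proposes and implements synchronization semantics for transactional memory based on open nesting, which solves the problem that transactions cannot exchange intermediate results and strengthens the parallel programming capability of OpenMP extended with transactional memory.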

5.

Design and implementation of a compiler for embedded systems

李光宇李延新袁爱进《微计算机信息》2006,22(29):175-177

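Supervisory control and configuration software is increasingly widely used in industrial control, and support for user programming interfaces is becoming ever more important. This paper studies the user programming interface in configuration software. Taking full account of the application characteristics of the supervisory configuration software industry, it designs a configuration language, the INVA language, and describes how its compiler and development environment are implemented. Language design, syntax analysis, semantic analysis, and compiler implementation are discussed in depth, and an implementation scheme is given. The scheme has been applied successfully in a major project of the Liaoning Provincial Department of Education, "Research on an embedded software component platform for intelligent industrial field devices", demonstrating the effectiveness of the compiler.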

6.

Issues to note in 51C programming

张景元《工业控制计算机》2001,14(1):56-58

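This paper studies and analyzes C compilers for single-chip microcontrollers, points out how they differ from standard C, and identifies several key issues that require attention during software development: locating and accessing memory areas, accessing special function registers, defining parallel interfaces, and defining bit variables. A programming example based on an 8279 keyboard/display expansion is given.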

7.

A survey of hardware transactional memory systems

王永会张鑫伟刘轶《小型微型计算机系统》2013,34(5)

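With the development of multi-core processors, hardware platforms now offer ample parallelism, which places higher demands on parallel software programming. The traditional lock-based parallel programming model suffers from many difficulties. Borrowing the idea of transactions from databases, transactional memory was proposed to provide a synchronization mechanism with good programmability. The speed and efficiency of hardware transactional memory have made it a research hotspot. This paper explains the basic concepts, execution model, and programming interface of transactional memory, introduces the three core components of hardware transactional memory systems, and compares two typical hardware transactional memory systems. It analyzes current research hotspots and difficulties, and finally introduces research platforms and benchmark programs for hardware transactional memory.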

8.

A method for storing Oracle large-object data in LabVIEW

孙熙文王友钊《工业控制计算机》2005,18(2):38-40

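Using Oracle's precompiler Pro*C/C++ and LabVIEW's C interface, CIN, this paper presents a method for storing Oracle large objects (LOBs) from LabVIEW. The method has been applied successfully in a vibration condition monitoring system for large rotating machinery.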

9.

Research on mixed-language programming for the TS201 (cited by: 1; self-citations: 0; citations by others: 1)

李长军于雷李云松《单片机与嵌入式系统应用》2007,(9):74-77

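This paper introduces several approaches to software development on the TS201, compares their advantages and disadvantages from an engineering perspective, and points out the advantages of mixing C/C++ with assembly language. It then details the calling conventions and interface specifications that the C/C++ runtime model of the ccts compiler imposes on mixed-language C/C++ programming for TigerShark DSPs, and gives a program design example. The work is a useful reference for engineering practice.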

10.

Research on transactional memory (cited by: 1; self-citations: 0; citations by others: 1)

黄国睿张平魏广博马航《计算机工程与设计》2010,31(2)

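To study parallel programming on multi-core processor systems, this paper investigates the transactional memory model. It explains transactional memory, introduces how transactional memory systems are implemented, and uses four transactional memory systems to illustrate the implementation in detail. It focuses on six key technologies that affect the development of transactional memory, including implementation approach, data structure organization, concurrency control, conflict detection, and contention management. It predicts that transactional memory will develop toward hardware/software co-design, better performance, improved correctness, and meeting the needs of multi-core applications.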

11.

OpenGR: A directive-based grid programming environment

Motonori Hirano Mitsuhisa Sato Yoshio Tanaka 《Parallel Computing》2005,31(10-12):1140

A new grid programming environment for remote procedure call (RPC) based master–worker type task parallelization is presented. The environment is realized through the use of a set of compiler directives, called OpenGR, and is implemented in the present study based on the Omni OpenMP compiler system and Ninf-G grid-enabled RPC system as a parallel execution mechanism. Using OpenGR directives, existing sequential applications can be readily adapted to the grid environment as master–worker parallel programs using the RPC architecture. The combination of OpenGR and OpenMP directives also allows for the hybrid parallelization of sequential programs, supporting both synchronous and asynchronous parallelism.

12.

MPtoStream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems

《中国科学:信息科学(英文版)》2012,(9):1961-1971

In light of GPUs’ powerful floating-point operation capacity, heterogeneous parallel systems incorporating general purpose CPUs and GPUs have become a highlight in the research field of high performance computing (HPC). However, due to the complexity of programming on GPUs, porting a large number of existing scientific computing applications to the heterogeneous parallel systems remains a big challenge. The OpenMP programming interface is widely adopted on multi-core CPUs in the field of scientific computing. To effectively inherit existing OpenMP applications and reduce the transplant cost, we extend OpenMP with a group of compiler directives, which explicitly divide tasks among the CPU and the GPU, and map time-consuming computing fragments to run on the GPU, thus dramatically simplifying the transplantation. We have designed and implemented MPtoStream, a compiler of the extended OpenMP for AMD’s stream processing GPUs. Our experimental results show that programming with the extended directives deviates from programming with OpenMP by less than 11% modification and achieves significant speedup ranging from 3.1 to 17.3 on a heterogeneous system, incorporating an Intel Xeon E5405 CPU and an AMD FireStream 9250 GPU, over the execution on the Xeon CPU alone.

13.

A Comparison of Co-Array Fortran and OpenMP Fortran for SPMD Programming

Alan J. Wallcraft 《The Journal of supercomputing》2002,22(3):231-250

Co-Array Fortran, formerly called F--, is a small set of extensions to Fortran 90/95 for Single-Program-Multiple-Data (SPMD) parallel processing. OpenMP Fortran is a set of compiler directives that provide a high level interface to threads in Fortran, with both thread-local and thread-shared memory. OpenMP is primarily designed for loop-level directive-based parallelization, but it can also be used for SPMD programs by spawning multiple threads as soon as the program starts and having each thread then execute the same code independently for the duration of the run. The similarities and differences between these two SPMD programming models are described. Co-Array Fortran can be implemented using either threads or processes, and is therefore applicable to a wider range of machine types than OpenMP Fortran. It has also been designed from the ground up to support the SPMD programming style. To simplify the implementation of Co-Array Fortran, a formal Subset is introduced that allows the mapping of co-arrays onto standard Fortran arrays of higher rank. An OpenMP Fortran compiler can be extended to support Subset Co-Array Fortran with relatively little effort.
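
The thread-based SPMD style described here (spawn all threads at program start, have each run the same code on its own share of the data) can be sketched outside Fortran. The following is a minimal illustration in Python, not code from the paper; `spmd_sum` and `program` are hypothetical names, and standard Python threads show only the structure, since they do not give a real speedup for this kind of arithmetic.

```python
import threading


def spmd_sum(data, num_threads=4):
    """Sum `data` SPMD-style: every thread runs the same `program`."""
    partial = [0] * num_threads          # thread-shared memory
    total = []                           # filled in by rank 0 only
    barrier = threading.Barrier(num_threads)
    chunk = (len(data) + num_threads - 1) // num_threads  # ceiling division

    def program(rank):
        # Same code on every thread; only the rank differs.
        lo = rank * chunk                # thread-local bounds
        hi = min(lo + chunk, len(data))
        partial[rank] = sum(data[lo:hi])
        # Wait until every thread has written its partial result.
        barrier.wait()
        if rank == 0:
            total.append(sum(partial))

    threads = [threading.Thread(target=program, args=(r,))
               for r in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

The barrier plays the role of an explicit synchronization point between the compute phase and the final combination, which is the part an SPMD program has to manage itself instead of relying on the end of a parallel loop.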

14.

OpenMP-based parallel computing on multi-core PCs

蔡佳佳李名世郑锋《微机发展》2007,17(10):87-91

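With quad-core PCs reaching the market and an 80-core processor successfully developed in the laboratory, multi-core is driving a fundamental change in software development. Developers need to add threads to their code to exploit the multiple cores a system provides, thereby improving the functionality and performance of PC application software. This paper explores techniques for parallel computing on multi-core PCs. It introduces the model, directives, and library functions of OpenMP, a parallel programming interface for shared-memory systems, as well as OpenMP support in the Intel C++ Compiler 9.1 and Microsoft Visual Studio 2005. It focuses on the design, implementation, and optimization of a parallel two-dimensional discrete fast Fourier transform algorithm, and looks ahead to the development of high-performance parallel computing software component libraries.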
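
The abstract does not say how the two-dimensional transform is parallelized. One common approach, assumed here purely for illustration, is the row-column decomposition: transform every row independently, then every column. The sketch below uses a naive O(n²) DFT in place of a real FFT to stay short, and the names `dft`, `transpose`, and `dft2_parallel` are hypothetical.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft(xs):
    """Naive 1-D discrete Fourier transform (stand-in for a real FFT)."""
    n = len(xs)
    return [
        sum(xs[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*matrix)]


def dft2_parallel(matrix, workers=4):
    """2-D DFT by row-column decomposition; rows are independent tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Pass 1: transform each row independently.
        rows_done = list(pool.map(dft, matrix))
        # Pass 2: transform each column (as a row of the transpose).
        cols_done = list(pool.map(dft, transpose(rows_done)))
    return transpose(cols_done)
```

Because the rows within each pass share no data, they map naturally onto a parallel loop, which is the pattern an OpenMP `parallel for` would express in C or Fortran.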

15.

Compiler support for general-purpose computation on GPUs

Yu-Te Lin Peng-Sheng Chen 《The Journal of supercomputing》2009,50(1):78-97

In recent years, the GPU (graphics processing unit) has evolved into an extremely powerful and flexible processor, and it now represents an attractive platform for general-purpose computation. Moreover, changes to the design and programmability of GPUs provide the opportunity to perform general-purpose computation on a GPU (GPGPU). Even though many programming languages, software tools, and libraries have been proposed to facilitate GPGPU programming, the unusual and specific programming model of the GPU remains a significant barrier to writing GPGPU programs. In this paper, we introduce a novel compiler-based approach for GPGPU programming. Compiler directives are used to label code fragments that are to be executed on the GPU. Our GPGPU compiler, Guru, converts the labeled code fragments into ISO-compliant C code that contains appropriate OpenGL and Cg APIs. A native C compiler can then be used to compile it into the executable code for GPU. Our compiler is implemented based on the Open64 compiler infrastructure. Preliminary experimental results from selected benchmarks show that our compiler produces significant performance improvements for programs that exhibit a high degree of data parallelism.

16.

Runtime support and compilation methods for user-specified irregular data distributions

Ponnusamy R. Saltz J. Choudhary A. Yuan-Shin Hwang Fox G. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(8):815-831

This paper describes two new ideas by which a High Performance Fortran compiler can deal with irregular computations effectively. The first mechanism invokes a user-specified mapping procedure via a set of proposed compiler directives. The directives allow use of program arrays to describe graph connectivity, spatial location of array elements, and computational load. The second mechanism is a conservative method for compiling irregular loops in which dependence arises only due to reduction operations. This mechanism in many cases enables a compiler to recognize that it is possible to reuse previously computed information from inspectors (e.g., communication schedules, loop iteration partitions, and information that associates off-processor data copies with on-processor buffer locations). This paper also presents performance results for these mechanisms from a Fortran 90D compiler implementation.

17.

A preliminary evaluation of OpenACC implementations

Ruymán Reyes Iván López Juan J. Fumero Francisco de Sande 《The Journal of supercomputing》2013,65(3):1063-1075

During the last few years, the availability of hardware accelerators, such as GPUs, has rapidly increased. However, the entry cost to GPU programming is high and requires a considerable porting and tuning effort. Some research groups and vendors have made attempts to ease the situation by defining APIs and languages that simplify these tasks. In the wake of the success of OpenMP, industry and academia are working toward defining a new standard of compiler directives to leverage the GPU programming effort. Support from vendors and similarities with the upcoming OpenMP 4.0 standard lead us to believe that OpenACC is a good alternative for developers who want to port existing codes to accelerators. In this paper, we evaluate three OpenACC implementations: two commercial implementations (PGI and CAPS) and our own research implementation, accULL, to evaluate the current status and future directions of the standard.

18.

Generic Programming and High-Performance Libraries

Douglas Gregor Jaakko Järvi Mayuresh Kulkarni Andrew Lumsdaine David Musser Sibylle Schupp 《International journal of parallel programming》2005,33(2-3):145-164

Generic programming is an especially attractive paradigm for developing libraries for high-performance computing because it simultaneously emphasizes generality and efficiency. In the generic programming approach, interfaces are based on sets of specified requirements on types, rather than on any particular types, allowing algorithms to inter-operate with any data types meeting the necessary requirements. These sets of requirements, known as concepts, can specify syntactic as well as semantic requirements. Besides providing a powerful means of describing interfaces to maximize software reuse, concepts provide a uniform mechanism for more closely coupling libraries with compilers and for effecting domain-specific library-based compiler extensions. To realize this goal, however, programming languages and their associated tools must support concepts as first-class constructs. In this paper we advocate better syntactic and semantic support to make concepts first-class and present results demonstrating the kinds of improvements that are possible with static checking, compiler optimization, and algorithm correctness proofs for generic libraries based on concepts.

19.

Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives

José M. Andión Manuel Arenaz François Bodin Gabriel Rodríguez Juan Touriño 《International journal of parallel programming》2016,44(3):620-643

The use of GPUs for general purpose computation has increased dramatically in the past years due to the rising demands of computing power and their tremendous computing capacity at low cost. Hence, new programming models have been developed to integrate these accelerators with high-level programming languages, giving rise to heterogeneous computing systems. Unfortunately, this heterogeneity is also exposed to the programmer, complicating its exploitation. This paper presents a new technique to automatically rewrite sequential programs into a parallel counterpart targeting GPU-based heterogeneous systems. The original source code is analyzed through domain-independent computational kernels, which hide the complexity of the implementation details by presenting a non-statement-based, high-level, hierarchical representation of the application. Next, a locality-aware technique based on standard compiler transformations is applied to the original code through OpenHMPP directives. Two representative case studies from scientific applications have been selected: the three-dimensional discrete convolution and the single-precision general matrix multiplication. The effectiveness of our technique is corroborated by a performance evaluation on NVIDIA GPUs.

20.

The Oberon System family

M. Brandis R. Crelier M. Franz J. Templ 《Software》1995,25(12):1331-1366

Oberon simultaneously refers to a modular, extensible operating system and an object-oriented programming language developed for its implementation. Although the original Oberon System had been conceived as the native operating system for a custom-built workstation, further implementations for several commercial platforms were developed later and are described here. All of these implementations are based on an efficient, retargetable Oberon compiler, and each provides a complete Oberon environment and the original library interface. This paper describes the structure of the compiler, summarizes the experience gained in adapting it for various CISC and RISC processors, and presents some empirical performance data. It also sheds light on the task of grafting an operating environment onto a variety of existing operating systems.