Similar Documents
19 similar documents found
1.
OpenTM extends OpenMP with transactional syntax and semantics, providing a directive-based programming interface for transactional memory programming. Taking the LU application from the standard NAS Parallel Benchmarks (NPB) as an example, this paper uses the speculative parallel execution capability of transactional memory together with the OpenTM interface to parallelize LU's pipelined algorithm. Experiments show that OpenTM programs are simple to write, avoid the complexity of lock-based schemes, and can play a significant role in scientific computing.
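The abstract does not show the transactional pipeline itself; below is a minimal, hypothetical sketch of a wavefront pipeline in which the inter-stage handshake is expressed with an OpenTM-style `#pragma omp transaction` block instead of locks or flushes. The directive syntax follows the published OpenTM proposal and needs an OpenTM-aware compiler; the array sizes, block counts, and the `done` progress array are illustrative, not taken from the LU code.

```c
#define NX 64                  /* rows: one pipeline stage per row (illustrative size) */
#define NB 8                   /* column blocks flowing through the pipeline           */
#define BW (NX / NB)

static double grid[NX][NX];
static int done[NX];           /* done[i] = number of blocks row i has completed       */

/* Pipelined wavefront: row i may process column block b only after row i-1 has
   finished block b. The shared progress array is read and updated inside
   transactions, replacing the lock/flush handshake of a lock-based pipeline. */
void pipelined_sweep(void)
{
    done[0] = NB;              /* boundary row is already "complete" */

    #pragma omp parallel for schedule(static, 1)
    for (int i = 1; i < NX; ++i) {
        for (int b = 0; b < NB; ++b) {
            int ready = 0;
            while (!ready) {                       /* wait for the upstream stage   */
                #pragma omp transaction            /* OpenTM-style atomic block     */
                { ready = (done[i - 1] > b); }
            }
            for (int j = b * BW; j < (b + 1) * BW; ++j)
                if (j > 0)
                    grid[i][j] = 0.5 * (grid[i - 1][j] + grid[i][j - 1]);

            #pragma omp transaction
            { done[i] = b + 1; }                   /* publish progress downstream   */
        }
    }
}
```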

2.
Core loops in data-intensive applications consume most of a program's execution time, and mapping them efficiently onto coarse-grained reconfigurable architectures (CGRAs) remains a difficult open problem. To maximize application parallelism on a CGRA, reduce loop memory-access overhead, and improve hardware resource utilization, this paper proposes a novel data-parallel optimization method for CGRA loop pipelining. A new reconfigurable computing model, TMGC2, is defined to accelerate loops with multiple parallel data pipelines. To prevent the extra memory-bank conflicts introduced by parallel execution from degrading CGRA performance, and to provide favorable data conditions for subsequent loop mapping, a bank-conflict elimination strategy is introduced to reorganize the data, combined with a data-reuse graph to complete the data-parallel optimization. Experiments show that applying this method to existing CGRA loop-pipelining mappings improves data throughput by 37.2% and resource utilization by 41.3%.
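Neither TMGC2 nor the paper's bank-elimination strategy is specified in the abstract; the sketch below only illustrates the general idea that reorganizing (here, padding) data removes the bank conflicts created when parallel pipelines access memory in lockstep. The bank count, lane count, and interleaving rule are assumptions.

```c
#include <stdio.h>

#define NUM_BANKS 8   /* assumed number of word-interleaved memory banks          */
#define LANES     8   /* parallel data pipelines issuing accesses in the same cycle */
#define ROW_LEN   64  /* original row length: a multiple of NUM_BANKS -> conflicts  */

/* Word-interleaved mapping: flat element k lives in bank k % NUM_BANKS. */
static int bank_of(int row, int col, int row_len)
{
    return (row * row_len + col) % NUM_BANKS;
}

int main(void)
{
    /* Each lane reads element (lane, 0) of a row-major 2-D array in the same cycle. */
    printf("original layout (row length %d):\n", ROW_LEN);
    for (int lane = 0; lane < LANES; ++lane)
        printf("  lane %d -> bank %d\n", lane, bank_of(lane, 0, ROW_LEN));
    /* All lanes land in bank 0: an 8-way conflict serializes the accesses. */

    printf("padded layout (row length %d):\n", ROW_LEN + 1);
    for (int lane = 0; lane < LANES; ++lane)
        printf("  lane %d -> bank %d\n", lane, bank_of(lane, 0, ROW_LEN + 1));
    /* Padding each row by one word spreads the lanes over all banks. */
    return 0;
}
```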

3.
Parallel programming can exploit the computing power of multi-core processors and improve program concurrency, but traditional parallel programs built on locks, semaphores, and similar synchronization mechanisms are prone to errors such as deadlock and priority inversion. Transactional memory is a newer parallel programming model that can substantially reduce the difficulty of developing parallel programs under the shared-memory model; although researchers have proposed many transactional memory implementations, practical application cases remain scarce. This paper applies transactional memory to a seismic precursor instrument simulation system, describes in detail parallel implementations based on DSTM2 and DeuceSTM, compares the performance of the different parallel schemes through extensive experiments, and analyzes the causes of the differences. The results show that the transactional memory version of the simulation system performs on par with the coarse-grained-lock scheme and slightly below the fine-grained-lock scheme; nevertheless, because transactional programming is simple and easy to use, it is still worth considering for certain applications.

4.
Service-oriented transaction technology in data grids
This paper discusses the background of data grids and the current state of research at home and abroad. Driven by application requirements, it proposes a service-oriented view of transaction management in data grids, defines the position, functions, and implementation scheme of service transactions within the data grid architecture, and describes the implementation protocols of two kinds of transactions, atomic transactions and business transactions. Finally, an atomic-transaction example illustrates the execution of a service transaction.
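The abstract does not reproduce the two protocols; atomic transactions in service-oriented settings are conventionally coordinated with two-phase commit, so the sketch below shows that pattern rather than the paper's own protocol. The `participant` structure and its callbacks are illustrative stand-ins for remote service invocations.

```c
#include <stdbool.h>

/* Each participating service exposes prepare/commit/rollback operations; the
   function pointers are placeholders for what would really be remote calls. */
typedef struct {
    bool (*prepare)(void);     /* vote: can this participant commit?  */
    void (*commit)(void);
    void (*rollback)(void);
} participant;

/* Two-phase commit: the atomic outcome is all-commit or all-rollback. */
bool coordinate_atomic_tx(participant *p, int n)
{
    for (int i = 0; i < n; ++i)            /* phase 1: collect votes             */
        if (!p[i].prepare()) {
            for (int j = 0; j < i; ++j)    /* any "no" vote aborts those prepared */
                p[j].rollback();
            return false;
        }
    for (int i = 0; i < n; ++i)            /* phase 2: everyone voted yes        */
        p[i].commit();
    return true;
}
```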

5.
李辉  杨燕  王帅  白琳  钟华 《计算机系统应用》2011,20(12):46-49,9
Building applications from reusable components or modules is an important technique in software engineering. In application-integration research, most work has focused on the data and business layers; integrating at the presentation layer, however, allows the data, business logic, and user interface of sub-modules to be reused together, which further reduces development cost and is therefore of real research interest. Addressing the shortcomings of current presentation-integration techniques, this paper proposes a new component model for presentation-layer integration. The component model supports hierarchical...

6.
Aspect-oriented programming (AOP) is a newer programming technique that makes up for the weakness of object-oriented programming (OOP) in handling behavior that cuts across modules. AOP introduces the aspect, which encapsulates behavior affecting multiple classes into a reusable module. It lets programmers modularize crosscutting concerns, eliminating the code tangling and scattering caused by OOP and improving system maintainability and code reuse. This paper analyzes traditional implementations of access control and transaction control, studies how both can be implemented under AOP, and thereby provides a reference for applying AOP in real projects.

7.
Aspect-oriented programming (AOP) is a newer programming technique that makes up for the weakness of object-oriented programming (OOP) in handling behavior that cuts across modules. AOP introduces the aspect, which encapsulates behavior affecting multiple classes into a reusable module. It lets programmers modularize crosscutting concerns, eliminating the code tangling and scattering caused by OOP and improving system maintainability and code reuse. This paper analyzes traditional implementations of access control and transaction control, studies how both can be implemented under AOP, and thereby provides a reference for applying AOP in real projects.

8.
The stream programming model is a parallel programming model that has been widely studied in recent years and has been applied successfully on stream architectures with software-managed streaming memories such as stream register files. Other work has noted that the model is equally suited to architectures with hardware-managed coherent caches. GPGPU, currently the model's most important application setting, has also gradually introduced general-purpose data caches, so exploiting the cache locality of stream programs has become the key to improving their performance on such architectures. Because of the stream model's particular execution semantics, the way reuse is converted into locality differs from that of conventional sequential programs, so traditional locality analysis cannot be applied to stream programs directly. After analyzing this reuse-to-locality conversion in depth, this paper introduces the concept of "iteration order" to describe how stream and sequential programs differ in converting reuse into locality, extends traditional locality analysis theory toward parallelism based on the execution characteristics of stream programs, and presents an iteration-order-based locality analysis method. Two cache-locality optimizations for stream programs are also proposed on top of the analysis model. Validation on the GPGPU-Sim platform shows that the quantitative locality analysis is effective and that the proposed optimizations improve the cache locality and overall performance of stream programs.

9.
Research on programming interfaces for transactional memory parallel programs
Programming interfaces for transactional memory parallel programs fall into three forms, according to how and at what level they are implemented: library interfaces, language extensions, and compiler directives. Taking RSTM, the Intel C/C++ software transactional memory compiler prototype, and OpenTM as examples, this paper discusses the characteristics of the three kinds of transactional memory programming interfaces, extends and refines the OpenTM interface, and looks ahead to the future development of such interfaces.
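For concreteness, here is the same shared-counter update written in the three interface styles the paper compares. The `stm_*` calls are hypothetical placeholders standing in for a library-style API such as RSTM's, not its real function names; `__transaction_atomic` is GCC's `-fgnu-tm` language extension; the `#pragma omp transaction` form follows the directive syntax proposed by OpenTM.

```c
/* Hypothetical prototypes for a library-style STM API (illustrative names only). */
void stm_begin(void);
long stm_read(long *addr);
void stm_write(long *addr, long value);
void stm_commit(void);

/* 1. Library interface: the programmer calls the STM runtime explicitly. */
void add_library(long *counter, long v)
{
    stm_begin();
    stm_write(counter, stm_read(counter) + v);
    stm_commit();                     /* abort/retry handled inside the library */
}

/* 2. Language extension: GCC's -fgnu-tm transactional memory support. */
void add_extension(long *counter, long v)
{
    __transaction_atomic { *counter += v; }
}

/* 3. Compiler directive: syntax as proposed by OpenTM on top of OpenMP. */
void add_directive(long *counter, long v)
{
    #pragma omp transaction
    { *counter += v; }
}
```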

10.
On heterogeneous multi-core cluster systems, using a dynamic scheduling strategy for data task blocks together with a full-locking technique, this paper presents a multi-process plus multi-thread parallel programming mechanism for data-intensive applications that dynamically schedules data blocks sized to each node's main memory and available shared L2 cache, along with strategies and techniques for optimizing the performance of such parallel programs. Experimental results for parallel multi-keyword search over random sequences on a heterogeneous cluster of multi-core computers show that the proposed multi-core parallel programming mechanism and performance optimizations are feasible and efficient.
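The abstract does not name the underlying libraries; a common realization of a process-level plus thread-level scheme is MPI across nodes with OpenMP inside each node, sketched below for a block-wise multi-keyword search over a synthetic random sequence. The block size, keyword list, and the simplified cyclic block-to-rank assignment are assumptions; the paper's scheme sizes blocks from cache capacity and schedules them dynamically.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK 4096                 /* stand-in for a block size derived from cache capacity */

static const char *keys[] = { "abc", "xyz", "qqq" };   /* hypothetical keywords */
enum { NKEYS = 3 };

/* Count keyword occurrences inside one data block; keywords are distributed
   over threads, so each hits[k] is written by exactly one thread. */
static void search_block(const char *data, long n, long *hits)
{
    #pragma omp parallel for schedule(dynamic)         /* thread-level dynamic scheduling */
    for (int k = 0; k < NKEYS; ++k) {
        long count = 0, klen = (long)strlen(keys[k]);
        for (long i = 0; i + klen <= n; ++i)
            if (memcmp(data + i, keys[k], klen) == 0)
                count++;
        hits[k] += count;
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long N = 1L << 20;                           /* synthetic random sequence */
    char *data = malloc(N);
    srand(42);                                         /* same sequence on every rank */
    for (long i = 0; i < N; ++i)
        data[i] = 'a' + rand() % 26;

    /* Process-level scheduling simplified to a cyclic assignment of blocks to
       ranks; a dynamic scheme as in the paper would use a work queue instead. */
    long hits[NKEYS] = { 0 };
    for (long b = (long)rank * BLOCK; b < N; b += (long)size * BLOCK) {
        long n = (b + BLOCK <= N) ? BLOCK : N - b;
        search_block(data + b, n, hits);
    }

    long total[NKEYS];
    MPI_Reduce(hits, total, NKEYS, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("hits: %ld %ld %ld\n", total[0], total[1], total[2]);

    free(data);
    MPI_Finalize();
    return 0;
}
```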

11.
We investigate techniques for efficiently executing multiquery workloads from data- and computation-intensive applications in parallel and/or distributed computing environments. In this context, we describe a database optimization framework that supports data and computation reuse, query scheduling, and active semantic caching to speed up the evaluation of multiquery workloads. Its most striking feature is its ability to optimize the execution of queries in the presence of application-specific constructs by employing a customizable data and computation reuse model. Furthermore, we discuss how the proposed optimization model is flexible enough to work efficiently irrespective of the underlying parallel/distributed environment. To evaluate the proposed optimization techniques, we present experimental evidence using real data-analysis applications. For this purpose, a common implementation of the queries under study was provided according to the database optimization framework and deployed on top of three distinct experimental configurations: a shared-memory multiprocessor, a cluster of workstations, and a distributed computational Grid-like environment.

12.
《Parallel Computing》2007,33(7-8):497-520
In this paper, we present a multi-query optimization framework based on the concept of active semantic caching. The framework permits the identification and transparent reuse of data and computation in the presence of multiple queries (or query batches) that specify user-defined operators and aggregations originating from scientific data-analysis applications. We show how query scheduling techniques, coupled with intelligent cache replacement policies, can further improve the performance of query processing by leveraging the active semantic caching operators. We also propose a methodology for functionally decomposing complex queries in terms of primitives so that multiple reuse sites are exposed to the query optimizer, to increase the amount of reuse. The optimization framework and the database system implemented with it are designed to be efficient irrespective of the underlying parallel and/or distributed machine configuration. We present experimental results highlighting the performance improvements obtained by our methods using real scientific data-analysis applications on multiple parallel and distributed processing configurations (e.g., single symmetric multiprocessor (SMP) machine, cluster of SMP nodes, and a Grid computing configuration).
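The framework's actual reuse model is not reproduced here; as a generic illustration of semantic caching, the sketch below keys cached aggregation results on the (dataset, operator, parameter) triple describing a query, so a later, semantically equivalent query reuses both the data access and the computation. The structure, names, and fixed-size cache are illustrative.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 16

typedef struct {
    char   dataset[32];   /* which data the query touches              */
    char   op[16];        /* user-defined operator, e.g. "avg"         */
    int    param;         /* operator parameter, e.g. a resolution     */
    double result;        /* cached aggregation result                 */
    int    valid;
} cache_entry;

static cache_entry cache[CACHE_SLOTS];

/* Look up a semantically equivalent previous query; return 1 on a hit. */
int cache_lookup(const char *dataset, const char *op, int param, double *out)
{
    for (int i = 0; i < CACHE_SLOTS; ++i)
        if (cache[i].valid && cache[i].param == param &&
            strcmp(cache[i].dataset, dataset) == 0 &&
            strcmp(cache[i].op, op) == 0) {
            *out = cache[i].result;
            return 1;                         /* reuse: no recomputation, no re-read */
        }
    return 0;
}

/* Store a freshly computed result in the first free slot (no eviction policy here). */
void cache_insert(const char *dataset, const char *op, int param, double result)
{
    for (int i = 0; i < CACHE_SLOTS; ++i)
        if (!cache[i].valid) {
            snprintf(cache[i].dataset, sizeof cache[i].dataset, "%s", dataset);
            snprintf(cache[i].op, sizeof cache[i].op, "%s", op);
            cache[i].param  = param;
            cache[i].result = result;
            cache[i].valid  = 1;
            return;
        }
}
```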

13.
Software reuse based on a Web Service open system architecture
饶元  冯博琴 《计算机工程》2004,30(20):72-74
This paper introduces the concept of a three-dimensional feature space of software reuse behavior spanning time, space, and application domain, identifies three fundamental problems that software reuse must solve, and proposes RSRPM, a Web Service reuse architecture model for open systems. Analysis of the RSRPM reuse pattern and a case study shows that it reuses not only the Web services themselves but also the database schema and data content of the systems providing those services, cutting the development effort from 4 person-months to 16 person-days and improving development efficiency by nearly a factor of 10.

14.
To address the problem that the SLP algorithm cannot efficiently handle large applications in which parallelizable code makes up only a small fraction of the program, this paper proposes and evaluates a new compilation framework based on an improved superword-level parallelism (SLP) algorithm. The framework has three main phases: first, structurally similar but non-isomorphic statements are converted into isomorphic statements wherever possible by the improved SLP algorithm; second, a data-reuse model of the code is derived from a global viewpoint before the target code is optimized; finally, data-layout optimization is applied for further performance gains. Extensive experiments show that the framework outperforms the original SLP algorithm by about 15.3%.
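The paper's rewriting rules are not given in the abstract; the sketch below shows one way structurally similar but non-isomorphic statements (a mix of adds and subtracts) can be rewritten into isomorphic ones so that an SLP vectorizer may pack them into a single vector operation. The example and the sign-vector trick are illustrative, not the paper's algorithm.

```c
/* Before: structurally similar but non-isomorphic statements; the mixed
   +/- operations prevent packing them into one superword operation. */
void before(float *a, const float *b, const float *c)
{
    a[0] = b[0] + c[0];
    a[1] = b[1] - c[1];
    a[2] = b[2] + c[2];
    a[3] = b[3] - c[3];
}

/* After: rewriting each subtraction as the addition of a negated operand makes
   all four statements isomorphic, so an SLP vectorizer can emit one 4-wide
   vector multiply-add pair instead of four scalar operations. */
void after(float *a, const float *b, const float *c)
{
    static const float s[4] = { 1.0f, -1.0f, 1.0f, -1.0f };
    a[0] = b[0] + s[0] * c[0];
    a[1] = b[1] + s[1] * c[1];
    a[2] = b[2] + s[2] * c[2];
    a[3] = b[3] + s[3] * c[3];
}
```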

15.
In this paper, we propose a compilation scheme to analyze and exploit the implicit reuses of vector register data. Based on the reuse analysis, we present a translation strategy that translates vectorized loops into assembly-level vector code while exploiting vector reuse. Experimental results show that our compilation technique can improve execution time and reduce traffic between shared memory and vector registers. The techniques discussed here are simple, systematic, and easy to implement in conventional vector compilers or translators to enhance the data locality of vector registers.

16.
The CPU–GPU communication bottleneck limits the performance improvement of GPU applications in heterogeneous GPGPU systems and is usually addressed by data-reuse optimization. This paper analyzes data reuse through a DAG abstraction and derives rules showing that run-time data-reuse optimization can effectively relieve the bottleneck. Based on these rules, the paper proposes a run-time optimization framework for data reuse, called R-Tracker. R-Tracker uses a locality-aware searching approach to handle reuses; it not only implements data-reuse optimization at low cost but also overlaps the searching, the data transfers, and the GPU computation effectively. R-Tracker relaxes the constraints required by compiler-based approaches and thus achieves better reuse. The experimental results show that R-Tracker improves performance by 1.77–16.42% over the compiler-based approach OpenMPC and by 1.40–8.39% over CGCM in single-node execution, and by 48.78–60% over CGCM in multi-node execution.
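R-Tracker's own interface is not given in the abstract; as a generic illustration of run-time reuse tracking, the sketch below keeps a per-buffer validity flag and skips redundant host-to-device transfers. The `dev_buffer` structure and function names are illustrative; only the CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`) are real APIs.

```c
#include <cuda_runtime.h>
#include <stdbool.h>
#include <stddef.h>

/* Host-side bookkeeping for one array that the GPU reads repeatedly. */
typedef struct {
    void  *host;       /* host copy                                 */
    void  *dev;        /* device copy (allocated lazily)            */
    size_t bytes;
    bool   dev_valid;  /* true if the device copy matches the host  */
} dev_buffer;

/* Return a device pointer, transferring only when the cached copy is stale. */
void *get_on_device(dev_buffer *b)
{
    if (b->dev == NULL)
        cudaMalloc(&b->dev, b->bytes);
    if (!b->dev_valid) {                       /* reuse check: skip redundant copy */
        cudaMemcpy(b->dev, b->host, b->bytes, cudaMemcpyHostToDevice);
        b->dev_valid = true;
    }
    return b->dev;
}

/* The host must call this after modifying b->host, invalidating the device copy. */
void mark_host_dirty(dev_buffer *b) { b->dev_valid = false; }
```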

17.
In a real-time database system, an application supports a mix of transactions. These include the real-time transactions that require completion by a given deadline. Time-critical requirements also exist in many distributed multi-media system applications. Existing concurrency control procedures introduce excessive delays due to non-availability of data resources. In this study, we ignore the delays incurred by ordinary transactions, in order to achieve a non-interference mode of execution (near parallel) for the time-critical transactions. For this purpose, a data allocation model has been studied. It is a stochastic process model based on the use of two-phase locking. It highlights the possibilities for reducing delays for time-critical transactions within a distributed real-time database system. Based on the new conceptual model, modified synchronization techniques for time-critical transactions have been proposed.
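The model builds on two-phase locking, which the abstract does not spell out; a minimal sketch of the protocol is given below. The lock array and the assumption that each transaction requests its items in ascending order (to avoid deadlock) are illustrative.

```c
#include <pthread.h>

#define NLOCKS 4                      /* illustrative number of data items */
static pthread_mutex_t item_lock[NLOCKS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER };

/* Strict two-phase locking: the growing phase acquires every lock the
   transaction needs, the shrinking phase releases them only after the work is
   done. items[] must be sorted in ascending order so that all transactions
   acquire locks in the same global order and cannot deadlock. */
void run_transaction(const int *items, int n, void (*work)(void))
{
    for (int i = 0; i < n; ++i)                 /* growing phase   */
        pthread_mutex_lock(&item_lock[items[i]]);
    work();                                     /* reads/writes the locked items */
    for (int i = n - 1; i >= 0; --i)            /* shrinking phase */
        pthread_mutex_unlock(&item_lock[items[i]]);
}
```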

18.
Many researchers have developed applications using transactional memory (TM) with the purpose of benchmarking different implementations, and studying whether or not TM is easy to use. However, comparatively little has been done to provide general-purpose tools for profiling and optimizing programs which use transactions. In this paper we introduce a series of profiling and optimization techniques for TM applications. The profiling techniques are of three types: (i) techniques to identify multiple potential conflicts from a single program run, (ii) techniques to identify the data structures involved in conflicts by using a symbolic path through the heap, rather than a machine address, and (iii) visualization techniques to summarize how threads spend their time and which of their transactions conflict most frequently. Altogether they provide in-depth and comprehensive information about the wasted work caused by aborting transactions. To reduce the contention between transactions we suggest several TM-specific optimizations that leverage nested transactions, transaction checkpoints, early release, and so on. To examine the effectiveness of the profiling and optimization techniques, we provide a series of illustrations from the STAMP TM benchmark suite and from the synthetic WormBench workload. First, we analyze the performance of TM applications using our profiling techniques and then we apply various optimizations to improve the performance of the Bayes, Labyrinth and Intruder applications. We discuss the design and implementation of the profiling techniques in the Bartok-STM system. We process data offline or during garbage collection, where possible, in order to minimize the probe effect introduced by profiling.

19.
刘琪  郭荣新  蒋文贤  马登极 《计算机应用》2022,42(12):3785-3791
At present, every consensus node of a parallel chain must send its own consensus transaction to the main chain in order to take part in consensus, so a large number of consensus transactions occupy much of the main chain's block capacity and waste transaction fees. To address this, exploiting the fact that the consensus transactions on a parallel chain carry the same consensus data but different signatures, and using bilinear pairings, this paper proposes an optimization of the parallel-chain consensus algorithm based on BLS aggregate signatures. First, each consensus node signs the transaction data; then the parallel-chain nodes broadcast their consensus transactions internally over a peer-to-peer (P2P) network and synchronize messages; finally, a Leader node tallies the consensus transactions and, once more than 2/3 of them have been collected, aggregates the corresponding BLS signatures and sends a single aggregate-signature transaction to the main chain for verification. Experimental results show that, compared with the original parallel-chain consensus algorithm, the proposed scheme effectively eliminates the repeated sending of consensus transactions from parallel-chain nodes to the main chain, reducing main-chain storage usage while saving transaction fees: it occupies only 4 KB of main-chain storage and incurs only a single transaction fee of 0.01 BitYuan (BTY).
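A sketch of why a single aggregate signature suffices when every node signs the same consensus data, using standard BLS notation (secret key \(x_i\), public key \(pk_i = g^{x_i}\), hash-to-group \(H\), pairing \(e\)); the notation is generic rather than taken from the paper:

\[
\sigma_i = H(m)^{x_i}, \qquad
\sigma = \prod_{i=1}^{n} \sigma_i, \qquad
e(\sigma, g) = \prod_{i=1}^{n} e\big(H(m), g^{x_i}\big) = e\Big(H(m), \prod_{i=1}^{n} pk_i\Big),
\]

so the main chain checks the single pairing equation \(e(\sigma, g) = e\big(H(m), \prod_i pk_i\big)\) instead of verifying \(n\) individual signatures.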
