
1.

Efficient architectures for data access in a shared memory hierarchy

Krishnan Padmanabhan 《Journal of Parallel and Distributed Computing》1991,11(4)

Interconnection structures that can provide access to multiple levels of a shared memory hierarchy in a multiprocessor are investigated. The results are also applicable to distributed memory architectures in which localities of communication can be statically defined. All the structures presented conform in some fashion to the binary cube topology, with per-processor logic cost ranging from O(log N) to O(log²N). The results illustrate that without resorting to separate networks for access at each level, several architectures can provide fast access at lower levels in the hierarchy and progressively slower access at higher levels. Even at the highest communication level (corresponding to systemwide communication), messages encounter less delay than in a nonhierarchical access situation.

2.

Concurrent C++: Concurrent programming with class(es)

N. H. Gehani, W. D. Roome 《Software》1988,18(12):1157-1177

C++ and Concurrent C are both upward-compatible supersets of C that provide data abstraction and parallel programming facilities, respectively. Although data abstraction facilities are important for writing concurrent programs, we did not provide data abstraction facilities in Concurrent C because we did not want to duplicate the C++ research effort. Instead, we decided that we would eventually integrate C++ and Concurrent C facilities to produce a language with both data abstraction and parallel programming facilities, namely, Concurrent C++. Data abstraction and parallel programming facilities are orthogonal. Despite this, the merger of Concurrent C and C++ raised several integration issues. In this paper, we will give introductions to C++ and Concurrent C, give two examples illustrating the advantages of using data abstraction facilities in concurrent programs, and discuss issues in integrating C++ and Concurrent C to produce Concurrent C++.

3.

Support for Parallel and Concurrent Programming in C++

N. I. V’yukova, V. A. Galatenko, S. V. Samborskii 《Programming and Computer Software》2018,44(1):35-42

C++ was originally designed as a sequential programming language. For development of multithreaded applications, libraries, such as Pthreads, Windows threads, and Boost, are traditionally used. The C++11 standard introduced some basic concepts and means for developing parallel and concurrent programs, but the direct use of these low-level means requires high programming skills and significant efforts. The absence of high-level models of parallelism in C++ is somewhat compensated for by various parallel libraries and directive parallelization tools (such as OpenMP), as well as by language extensions supported by some compilers (Intel CilkPlus). Nevertheless, we still require more advanced means to express parallelism in programs at the level of language standard and language library. In this survey, we consider the means for parallel and concurrent programming that are included into the C++17 standard, as well as some capabilities that are to be expected in the future standards.

4.

Concurrent computation of attribute filters on shared memory parallel machines (Cited by: 1; self-citations: 0; citations by others: 1)

Wilkinson MH, Gao H, Hesselink WH, Jonker JE, Meijster A 《IEEE transactions on pattern analysis and machine intelligence》2008,30(10):1800-1813

Morphological attribute filters have not previously been parallelized, mainly because they are both global and non-separable. We propose a parallel algorithm which achieves efficient parallelism for a large class of attribute filters, including attribute openings, closings, thinnings and thickenings, based on Salembier's Max-Trees and Min-trees. The image or volume is first partitioned into multiple slices. We then compute the Max-trees of each slice using any sequential Max-Tree algorithm. Subsequently, the Max-trees of the slices can be merged to obtain the Max-tree of the image. A C-implementation yielded good speed-ups on both a 16-processor MIPS 14000 parallel machine, and a dual-core Opteron-based machine. It is shown that the speed-up of the parallel algorithm is a direct measure of the gain with respect to the sequential algorithm used. Furthermore, the concurrent algorithm shows a speed gain of up to 72% on a single-core processor, due to reduced cache thrashing.

5.

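A Concurrency Extension Scheme for C++

Chen Jiajun, Zhao Jianhua, Zheng Guoliang 《Journal of Software (软件学报)》1998,9(8)

This paper presents a scheme for extending C++ with concurrency. It is based on the following concurrent object-oriented model: a system consists of a set of autonomous concurrent objects; an object may have a body, and once the object is created its body begins executing; objects communicate by synchronous message passing, and concurrency within an object is permitted; an object's concurrency control is distributed among the activation conditions of its methods. The paper also gives a translation strategy that converts programs in the extended C++ into standard C++ so that they can be handled by existing C++ compilers. The translation uses the multithreading and synchronization facilities provided by some multitasking operating systems (such as Windows 95).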

6.

Finding shuffle words that represent optimal scheduling of shared memory access

Daniel Reidenbach 《International Journal of Computer Mathematics (国际计算机数学杂志)》2013,90(6):1292-1309

In the present paper, we introduce and study the problem of computing, for any given finite set of words, a shuffle word with a minimum so-called scope coincidence degree. The scope coincidence degree is the maximum number of different symbols that parenthesize any position in the shuffle word. This problem is motivated by an application of a new automaton model and can be regarded as the problem of scheduling shared memory accesses of some parallel processes in a way that minimizes the number of memory cells required. We investigate the complexity of this problem and show that it can be solved in polynomial time.
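
To make the definition in the abstract above concrete, here is a minimal sketch that computes the scope coincidence degree of a single word. It assumes that a symbol "parenthesizes" a position when it occurs both strictly before and strictly after that position; the function name `scope_coincidence_degree` is hypothetical, and the sketch only evaluates one given word. It does not implement the paper's polynomial-time search for an optimal shuffle word.

```python
def scope_coincidence_degree(word):
    """Return the largest number of distinct symbols that enclose any position.

    Assumption: a symbol encloses position i when it occurs at some index
    smaller than i and at some index larger than i.
    """
    first = {}  # leftmost index of each symbol
    last = {}   # rightmost index of each symbol
    for i, symbol in enumerate(word):
        first.setdefault(symbol, i)
        last[symbol] = i

    best = 0
    for i in range(len(word)):
        # Count the symbols whose leftmost and rightmost occurrences surround i.
        enclosing = sum(1 for s in first if first[s] < i < last[s])
        best = max(best, enclosing)
    return best


if __name__ == "__main__":
    # Interleaved accesses keep more symbols "open" at once than grouped ones.
    for w in ("aabbcc", "abab", "abcabc"):
        print(w, scope_coincidence_degree(w))
```

Under the scheduling reading given in the abstract, a word such as `aabbcc` finishes with each symbol before starting the next, so no position is enclosed by any symbol, whereas `abcabc` keeps two other symbols open at its middle positions.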

7.

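Research on MCacheTree, an Access Mechanism for Aerospace Real-Time In-Memory Databases (Cited by: 1; self-citations: 0; citations by others: 1)

Gan Shan, Guo Lili 《Computer Engineering and Design (计算机工程与设计)》2010,31(17)

Data produced by space environment exploration and space science experiments are highly time-sensitive. To manage them effectively, we study indexing techniques that improve real-time performance. Because I/O is slow and external-storage latency is hard to predict, real-time database systems usually adopt in-memory database techniques. On this basis, we propose MCacheTree, a new index structure suited to aerospace real-time in-memory database systems. It combines an in-memory cache with a search tree and applies deferred-write and deferred-delete optimizations, which effectively reduces query time and improves real-time performance. Experiments confirm the efficiency of the design.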

8.

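Ch: A Cross-Platform C/C++ Interpretive Computing Environment for Interactive Teaching (Cited by: 4; self-citations: 1; citations by others: 3)

Cheng Hui 《Computer Education (计算机教育)》2009,(7):34-46

C is one of the most popular, and also one of the harder to learn, programming languages used in introductory computer programming courses. This paper introduces Ch, a cross-platform C/C++ interpretive computing environment for interactive teaching. Ch is a complete C interpreter that supports most of the new features of the latest C standard, C99, as well as C++ classes, and consists of two main modules: an interactive command shell and a user-friendly integrated development environment designed for teaching (ChIDE). Ch supports computational arrays and provides a graphics plotting library and a library of advanced numerical functions, so that many complex engineering and scientific problems can be solved conveniently and quickly. On Windows, the Ch environment supports common Unix and Linux commands, helping students learn Unix and Linux within the familiar Windows environment. Ch can also be seamlessly embedded into compiled programs as a scripting engine, enabling flexible programming. Finally, the paper outlines a Ch-based teaching platform for C programming that the author developed and used over many years of teaching at the University of California, Davis. Teaching practice shows that the platform considerably improves the practicality of programming instruction, the effectiveness of lectures, and students' motivation to learn, and helps students fully understand and master computer programming, an important basic skill in engineering and science.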

9.

Cosmo++: An object-oriented C++ library for cosmology

Grigor Aslanyan 《Computer Physics Communications》2014

This paper introduces a new publicly available numerical library for cosmology, Cosmo++. The library has been designed using object-oriented programming techniques, and fully implemented in C++. Cosmo++ introduces a unified interface for using most of the frequently used numerical methods in cosmology. Most of the features are implemented in Cosmo++ itself, while a part of the functionality is implemented by linking to other publicly available libraries. The most important features of the library are Cosmic Microwave Background anisotropies power spectrum and transfer function calculations, likelihood calculations, parameter space sampling tools, sky map simulations, and mask apodization. Cosmo++ also includes a few mathematical tools that are frequently used in numerical research in cosmology and beyond. A few simple examples are included in Cosmo++ to help the user understand the key features. The library has been fully tested, and we describe some of the important tests in this paper. Cosmo++ is publicly available at http://cosmo.grigoraslanyan.com.

10.

Synchronous C++: a language for interactive applications

Petitpierre C. 《Computer》1998,31(9):65-72

Synchronous C++ defines active objects that contain their own execution threads and can communicate with each other by means of synchronizing method calls. The author shows how to model programs in sC++ and compares sC++ with event driven programming. He focuses on examples in which the dynamic and functional models dominate and the object model is secondary. In doing so, he proposes a mapping between the elements of all three models and sC++ statements. Several other concepts have been proposed to extend OO languages to concurrency: delayed evaluations (a concept that proposes to launch each method on a separate thread, as in Actors), processes orthogonal to the objects (Ada), asynchronous channels and exceptions (Eiffel), and others. However, the researchers who proposed these solutions emphasized the improvements of speed they expected from achieving concurrency, rather than the characteristics that make concurrency suitable for the analysis and the implementation of specifications such as the ones provided by models

11.

NOBRAINER: A Tool for Example-Based Transformation of C/C++ Code

Savchenko V. V., Sorokin K. S., Bronshtein I. E., Volkov A. S., Kachanov V. V., Pankratenko G. A., Ermakov M. K., Markov S. I., Spiridonov A. V., Aleksandrov I. V. 《Programming and Computer Software》2020,46(5):362-372

Refactoring is an integral part of the modern software development process. Often, the refactoring must be performed at the global level with modifications in a...

12.

Portable C/C++ code for portable XML data

Zhaoqing Wang, Cheng H.H. 《Software, IEEE》2006,23(1):76-81


13.

High-fidelity C/C++ code transformation

《Science of Computer Programming》2007,68(2):64-78

As software systems become increasingly massive, the advantages of automated transformation tools are clearly evident. These tools allow the machine to both reason about and manipulate high-level source code. They enable off-loading of mundane and laborious programming tasks from human developer to machine, thereby reducing cost and development time frames. Although there has been much work in software transformation, there still exist many hurdles in realizing this technology in a commercial domain. From our own experience, there are two significant problems that must be addressed before transformation technology can be usefully applied in a commercial setting. These are: (1) Avoiding disruption of the style (i.e., layout and commenting) of source code and the introduction of any undesired modifications that can occur as a side effect of the transformation process. (2) Correct automated handling of C preprocessing and the presentation of a semantically correct view of the program during transformation. Many existing automated transformation tools require source to be manually modified so that preprocessing constructs can be parsed. The real semantics of the program remain obscured, resulting in the need for complicated analysis during transformation. Many systems also resort to pretty printing to generate transformed programs, which inherently disrupts coding style. In this paper we describe our own C/C++ transformation system, Proteus, that addresses both these issues. It has been tested on millions of lines of commercial C/C++ code and has been shown to meet the stringent criteria laid out by Lucent’s own software developers.

14.

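Analysis and Improvement of C/C++ Preprocessing

Rao Wei 《Digital Community & Smart Home (数字社区&智能家居)》2006,(8)

The usual C/C++ preprocessor is a macro processor that automatically converts source files, before compilation, into a form the compiler can recognize. The traditional preprocessing method is based on textual line replacement and does not take the specific context into account. This mechanism has shortcomings in file inclusion, macro scope, and header-file relationships, which hinder code reuse in engineering projects and reduce program maintainability and extensibility. Starting from an analysis of the deficiencies of the C preprocessor, and using FOG [1] and its language, a solution based on a syntactic substitution mechanism with meta-variables and meta-functions can be obtained.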

15.

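Improving the Preprocessing Facilities of C/C++

Chu Yongli 《Microcomputer Information (微计算机信息)》2006,22(9):281-284

Preprocessing plays an important role in C/C++, yet these preprocessing facilities have shortcomings: for example, the contents of a header file cannot be changed when it is included, code reusability is low, and there is a great deal of duplicated code. This paper proposes replacing the existing preprocessing mechanism with a high-level configuration language, XVCL (XML-based Variant Configuration Language), to overcome these problems. Files are organized into a tree structure, and a variable scoping mechanism that favors reuse is defined. An example is used to verify the effectiveness of the proposed method.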

16.

High-Fidelity C/C++ Code Transformation

Daniel G. Waddington, Bin Yao 《Electronic Notes in Theoretical Computer Science》2005,141(4):35

As software systems become increasingly massive, the advantages of automated transformation tools are clearly evident. These tools allow the machine to both reason about and manipulate high-level source code. They enable off-loading of mundane and laborious programming tasks from human developer to machine, thereby reducing cost and development timeframes. Although there has been much academic work in software transformation, there still exist many hurdles in realising this technology in a commercial domain. From our own experience, there are two significant problems that must be addressed before transformation technology can be usefully applied in a commercial setting. These are: (1) avoiding disruption of style (i.e. layout and commenting) and the introduction of any undesired modifications which occur as a side effect of the transformation process; (2) correct handling of C preprocessing and the presentation of a semantically correct view of the program during transformation. Many existing automated transformation tools inherently disrupt style through the use of pretty printing and the need to perform preprocessing before any transformation. Some also require source to be modified so that it conforms to a subset of the grammar. In this paper we describe our own C/C++ transformation system, Proteus, that is able to meet the stringent criteria laid out by Lucent's own software developers.

17.

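Analysis and Improvement of C/C++ Preprocessing

Rao Wei 《Digital Community & Smart Home (数字社区&智能家居)》2006,(3):117-119

The usual C/C++ preprocessor is a macro processor that automatically converts source files, before compilation, into a form the compiler can recognize. The traditional preprocessing method is based on textual line replacement and does not take the specific context into account. This mechanism has shortcomings in file inclusion, macro scope, and header-file relationships, which hinder code reuse in engineering projects and reduce program maintainability and extensibility. Starting from an analysis of the deficiencies of the C preprocessor, and using FOG [1] and its language, a solution based on a syntactic substitution mechanism with meta-variables and meta-functions can be obtained.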

18.

libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations

Benjamin S. Kirk, John W. Peterson, Roy H. Stogner, Graham F. Carey 《Engineering with Computers》2006,22(3-4):237-254

In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but as with other open-source software projects, contributions are being made elsewhere in the US and abroad. The main goals of this article are: (1) to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach; (2) to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers; and (3) to describe the parallel implementation and data structures with supporting discussion of domain decomposition, message passing, and details related to dynamic repartitioning for parallel AMR/C. Other aspects related to C++ programming paradigms, reusability for diverse applications, adaptive modeling, physics-independent error indicators, and similar concepts are briefly discussed. Finally, results from some applications using the library are presented and areas of future research are discussed.

19.

Signalling regions: Multiprocessing in a shared memory reconsidered

Charles W. Reynolds 《Software》1990,20(4):325-356

The goal of this paper is to return attention to two problems that arise in the context of supporting the monitor as a mechanism for concurrent programming. This paper will re-examine the monitor concept in its original context (a multiprocessing environment implemented on a single processor sharing memory with and being interrupted by asynchronous peripheral devices) and will address the two previously unresolved problems. The first is the conflict between the immediate resumption requirement in explicit signalling and the policies and priorities of the process scheduler. The second is the possibility of deadlock inherent in nested monitors and in its most important instance, the dynamic resource allocation problem. After briefly describing the historical context of these two problems, the paper proposes a language structure called a signalling region that together with the notion of encapsulation by modules solves the immediate resumption problem and avoids the nested monitor problem. The former is done by a combination of the signal-and-return semantics of Concurrent Pascal and the signal-and-continue semantics of Mesa and StarMod. The latter is done by suggesting that mutual exclusion and data encapsulation are distinct concepts that, if separated, make nested encapsulation possible while avoiding the problems of nested mutual exclusion. Classical examples of the use of signalling regions in an extended Modula-2 are given as well as an implementation by translation to unextended Modula-2 together with a Kernel module.
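
As a rough illustration of the signal-and-continue semantics mentioned in the abstract above, the sketch below uses Python's `threading.Condition`, whose `notify()` leaves the signalling thread running and still holding the lock, so a woken waiter must re-check its condition in a loop. The `BoundedBuffer` class and the `run_demo` helper are hypothetical names introduced here. This is not the signalling-region construct the paper proposes, nor its Modula-2 implementation.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Monitor-style bounded buffer with two condition variables on one lock."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self._not_empty = threading.Condition(self._lock)

    def put(self, item):
        with self._lock:
            # Signal-and-continue: the state may have changed between the
            # notify and this thread reacquiring the lock, so re-check.
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            # The signaller keeps the lock and continues after notifying.
            self._not_empty.notify()

    def get(self):
        with self._lock:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(count=20, capacity=3):
    """Send `count` integers through the buffer to one consumer thread."""
    buffer = BoundedBuffer(capacity)
    received = []

    def consumer():
        for _ in range(count):
            received.append(buffer.get())

    worker = threading.Thread(target=consumer)
    worker.start()
    for value in range(count):
        buffer.put(value)
    worker.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

The `while` loops around `wait()` are what distinguish this style from immediate-resumption signalling, where a waiter could assume its condition still holds when it resumes.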

20.

CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs

CHEN DeHao, CHEN WenGuang & ZHENG WeiMin 《Science China Information Sciences (中国科学:信息科学(英文版))》2012,(3):663-676

With the prevalence of general-purpose computation on GPUs, shared memory programming models were proposed to ease the pain of GPU programming. However, with the demands of more intensive workloads, it is desirable to port GPU programs to more scalable distributed memory environments, such as multi-GPUs. To achieve this, programs need to be rewritten with mixed programming models (e.g. CUDA and message passing). Programmers need to work carefully not only on workload distribution, but also on scheduling mechanisms to ensure the efficiency of the execution. In this paper, we study the possibility of automating the process of parallelization to multi-GPUs. Starting from a GPU program written in a shared memory model, our framework analyzes the access patterns of arrays in kernel functions to derive the data partition schemes. To acquire the access patterns, we propose a three-tier approach: static analysis, profile-based analysis, and user annotation. Experiments show that most access patterns can be derived correctly by the first two tiers, which means that zero effort is needed to port an existing application to a distributed memory environment. We use our framework to parallelize several applications, and show that for certain kinds of applications, CUDA-Zero can achieve efficient parallelization in a multi-GPU environment.