1.

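Cheng Dongnian, Wang Shuying 《微机发展》 (Microcomputer Development) 1996,6(3):35-37

The "Parallel Processing Simulator" is intended to help users deepen their understanding of how typical parallel computer systems work, and provides a basic parallel program development environment in which they can design parallel programs and execute them in simulation. By simulating the basic features of the architecture, compiler, and operating system of an MIMD multiprocessor system, it supports explicit and implicit exploitation of parallelism at the job, job-step, and DO-loop levels, and can automatically detect the size of the system configuration and accurately locate connection faults.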

2.

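A decoupled decomposition parallel algorithm for solving general banded linear systems

Song Xiaoqiu 《计算机工程与设计》 (Computer Engineering and Design) 1995,16(5):51-56

This paper presents a decoupled decomposition parallel algorithm for solving general banded linear systems on MIMD multiprocessor systems, and reports numerical experiments on the S10-2 parallel machine. Both the theoretical analysis and the numerical experiments show that the parallel algorithm is effective.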

3.

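A C-extension-based parallel programming language for SIMD

Jing Xiaojun, Fang Binxing 《软件学报》 (Journal of Software) 1996,7(7):401-408

SIMC (SIMD C) is a parallel language that supports SIMD (single instruction, multiple data) parallel programming. It is obtained by extending the syntax of C, without extending its semantics. SIMC can conveniently describe SIMD parallel algorithms and can define SIMD computer architectures, so it supports research on parallel algorithms across a variety of architectures. A simulated execution system for SIMC has been implemented on a single machine. SIMC serves as the parallel programming language of the SIMD computer programming and performance evaluation simulation environment developed by the authors, and is used to evaluate the performance of SIMD algorithms and architectures.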

4.

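Parallel hidden-surface removal algorithms on a multi-Transputer system and their graphical display

Wang Yanchun, Xu Youxin 《计算机应用》 (Computer Applications) 1994,14(3):34-36

This paper discusses parallel processing for hidden-surface removal of three-dimensional objects. A class of MIMD parallel depth-buffer algorithms is presented and implemented on a multi-Transputer system. The efficiency of these algorithms is also compared.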

5.

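SICE: a simulation environment for SIMD computer programming

Jing Xiaojun, Fang Binxing 《计算机工程与应用》 (Computer Engineering and Applications) 1996,32(5):56-61

SICE (SIMD C Emulator), implemented in this work, is a software package that simulates SIMD computer programming in a serial machine environment. SIC (SIMD C) is a SIMD parallel extension of the C language defined by the authors. It supports parallel statements that reflect the characteristics of SIMD architectures and, more importantly, supports the definition of SIMD architectures, so it can conveniently be used for algorithm research on SIMD machines.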

6.

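Research on the interacting multiple model filter and its parallel implementation (total citations: 5; self-citations: 1; citations by others: 5)

Pan Quan, Dai Guanzhong, Zhang Hongcai 《控制理论与应用》 (Control Theory & Applications) 1997,14(4):544-550

This paper studies the interacting multiple model filter (IMMF). The IMMF performs well in estimation for a class of hybrid systems whose dynamic mode undergoes random abrupt changes. The paper first gives a mathematical description of the IMMF and reveals its underlying mechanism. Then, in view of the structure of the IMMF and based on the PD-100 multiprocessor system, it studies the processor topology, the task partitioning and allocation, and the parallel mapping needed for a parallel implementation of the IMMF, and gives a parallel implementation algorithm. Simulation on the PD-100 system shows that the parallel algorithm has good speedup linearity, …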

7.

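Design and implementation of a data-parallel language

Chen Siyu, Huang Linpeng 《计算机工程》 (Computer Engineering) 1997,23(3):3-6

Applying the data-parallel model to MIMD machines in the loosely synchronous SPMD mode is attracting increasing attention. This paper presents the design and implementation of Multi-c, a data-parallel language targeting heterogeneous parallel systems. The Multi-c compiler, currently being implemented, works as a precompiler: it accepts program descriptions in SIMD form, relaxes the synchronization requirements, and generates C programs that can run on a parallel system in SPMD mode.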

8.

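On UIMS design methods for CIMS-oriented management and decision software

Du Qingxiu 《计算机工程与应用》 (Computer Engineering and Applications) 1997,33(1):27-30

A User Interface Management System (UIMS) is a design aid that gives user interface designers a good design environment. This paper first discusses the special characteristics of human-computer interaction software, studies its development process from a methodological point of view, and proposes a development method for human-computer interaction software systems, HCSDM (Human-Computer Interaction System Development Method). It then discusses the issues to consider in UIMS research and design, starting from the basic features of a UIMS, UIMS models, and UIMS specifications. Finally, it offers some prospects for UIMS research and development.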

9.

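PROLOG parallel processing techniques

Lu Hanrong 《计算机工程》 (Computer Engineering) 1986,(1)

We use the term PROLOG parallel processing for the parallel interpretation of (conventional sequential) PROLOG programs on a multiprocessor system. This paper discusses related issues, such as the background of PROLOG parallel processing, parallel models of PROLOG interpretation, and the corresponding processing approaches (including clause distribution, control strategies for parallel interpretation, allocation and scheduling of processors and processes, preservation of PROLOG procedural semantics, control of communication complexity, and the supporting multiprocessor architecture). Finally, it introduces the "A" parallel processing approach for PROLOG that we have proposed and plan to implement: a backward-recursive, assertion-parallel interpretation approach that encompasses multiple modes of parallelism.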

10.

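Parallel computing and the application of parallel processing capabilities

Mu Daonan 《计算机应用与软件》 (Computer Applications and Software) 1995,12(5):21-25

Parallel processing capabilities were developed and parallel computing was applied on IBM 4381-P03 and CPU computers, achieving a relatively high speedup.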

11.

Fast neural net simulation with a DSP processor array (total citations: 1; self-citations: 0; citations by others: 1)

Muller U.A. Gunzinger A. Guggenbuhl W. 《Neural Networks, IEEE Transactions on》1995,6(1):203-213

This paper describes the implementation of a fast neural net simulator on a novel parallel distributed-memory computer. A 60-processor system, named MUSIC (multiprocessor system with intelligent communication), is operational and runs the backpropagation algorithm at a speed of 330 million connection updates per second (continuous weight update) using 32-b floating-point precision. This is equal to 1.4 Gflops sustained performance. The complete system with 3.8 Gflops peak performance consumes less than 800 W of electrical power and fits into a 19-in rack. While reaching the speed of modern supercomputers, MUSIC still can be used as a personal desktop computer at a researcher's own disposal. In neural net simulation, this gives a computing performance to a single user which was unthinkable before. The system's real-time interfaces make it especially useful for embedded applications.

12.

PAPS — A testbed for performance prediction of parallel applications

《Parallel Computing》1997,22(13):1837-1851

The PAPS (Performance Analysis of Parallel Systems) toolset is a testbed for the model-based performance prediction of message-passing parallel applications executed on private-memory multiprocessor computer systems. PAPS allows the execution behavior of the computer hardware and operating system software resources to be described at a very detailed level. This enables very accurate performance prediction of parallel applications even in the case of substantial performance degradation due to contention for shared resources. In this paper the fundamental design principles and implementation methodologies for the development of the PAPS toolset are presented and the PAPS parallel system specification formalisms are described. A simplified performance study of a parallel Gaussian elimination application on the nCUBE 2 multiprocessor system is used to demonstrate the usage of the tool.

13.

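Design and implementation of the multiprocessor simulator SimDSM (total citations: 2; self-citations: 1; citations by others: 1)

Chen Xin, Hong Jinwei, Gao Yuan, Li Xiaofeng, Zheng Shirong 《小型微型计算机系统》 (Mini-Micro Systems) 2000,21(2):186-189

Simulators are powerful tools for computer architecture research. Taking the multiprocessor simulator SimDSM as an example, this paper discusses the general principles of simulator design and the criteria for evaluating simulators. SimDSM was designed for analyzing and studying distributed shared memory systems. It is based on the CISC-style Intel architecture, and because it uses execution-driven simulation and a custom thread implementation, simulation is both efficient and accurate. Experimental results show that the simulator is practical and effective.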

14.

Energy efficient scheduling of parallel tasks on multiprocessor computers (total citations: 2; self-citations: 1; citations by others: 1)

Keqin Li 《The Journal of supercomputing》2012,60(2):223-247

In this paper, scheduling parallel tasks on multiprocessor computers with dynamically variable voltage and speed are addressed as combinatorial optimization problems. Two problems are defined, namely, minimizing schedule length with energy consumption constraint and minimizing energy consumption with schedule length constraint. The first problem has applications in general multiprocessor and multicore processor computing systems where energy consumption is an important concern and in mobile computers where energy conservation is a main concern. The second problem has applications in real-time multiprocessing systems and environments where timing constraint is a major requirement. Our scheduling problems are defined such that the energy-delay product is optimized by fixing one factor and minimizing the other. It is noticed that power-aware scheduling of parallel tasks has rarely been discussed before. Our investigation in this paper makes some initial attempt to energy-efficient scheduling of parallel tasks on multiprocessor computers with dynamic voltage and speed. Our scheduling problems contain three nontrivial subproblems, namely, system partitioning, task scheduling, and power supplying. Each subproblem should be solved efficiently, so that heuristic algorithms with overall good performance can be developed. The above decomposition of our optimization problems into three subproblems makes design and analysis of heuristic algorithms tractable. A unique feature of our work is to compare the performance of our algorithms with optimal solutions analytically and validate our results experimentally, not to compare the performance of heuristic algorithms among themselves only experimentally. The harmonic system partitioning and processor allocation scheme is used, which divides a multiprocessor computer into clusters of equal sizes and schedules tasks of similar sizes together to increase processor utilization. 
A three-level energy/time/power allocation scheme is adopted for a given schedule, such that the schedule length is minimized by consuming a given amount of energy or the energy consumed is minimized without missing a given deadline. The performance of our heuristic algorithms is analyzed, and accurate performance bounds are derived. Simulation data which validate our analytical results are also presented. It is found that our analytical results provide very accurate estimation of the expected normalized schedule length and the expected normalized energy consumption and that our heuristic algorithms are able to produce solutions very close to optimum.
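The trade-off between schedule length and energy described in this abstract can be illustrated with a minimal sketch for a single task. It assumes the commonly used power model in which power equals speed raised to an exponent `alpha`; the model, the default exponent, and all function names are assumptions made for illustration and are not taken from the paper.

```python
def execution_time(work: float, speed: float) -> float:
    """Time to finish `work` units of computation at a constant `speed`."""
    return work / speed


def energy_consumed(work: float, speed: float, alpha: float = 3.0) -> float:
    """Energy = power * time, under the assumed model power = speed ** alpha."""
    return (speed ** alpha) * execution_time(work, speed)


def speed_for_energy_budget(work: float, energy: float, alpha: float = 3.0) -> float:
    """Fastest constant speed whose energy use stays within `energy`.

    Solves work * speed ** (alpha - 1) == energy for speed.
    """
    return (energy / work) ** (1.0 / (alpha - 1.0))


def speed_for_deadline(work: float, deadline: float) -> float:
    """Slowest (and therefore least energy-hungry) speed that meets `deadline`."""
    return work / deadline
```

Fixing the energy budget determines the shortest achievable time, and fixing the deadline determines the lowest achievable energy, which mirrors the two problems the abstract defines.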

15.

Simulating the DYNIX operating system parallel programming interface on a UNIX system

Mehdi Badii 《Software》1998,28(5):463-480

This paper presents the implementation of multitasking functions of DYNIX Sequent computers on the UNIX operating system. The Sequent computers are shared memory multiprocessor computers running the DYNIX operating system. These functions support data and function partitioning. They let the user implement subprograms by the processors of a Sequent computer in parallel. The functions can synchronize, lock, and unlock data and program segments. As a result, the simulator allows the users to develop their multitasking programs on a uniprocessor computer such as a SUN workstation, and later port them to a Sequent computer. Further, the simulator adds a level of abstraction on top of UNIX for concurrent programming. The functions of the simulator allow the user to handle the communication and synchronization of the processes in a program at a higher level of abstraction, while concentrating on the design of multitasking algorithms. The simulator is applied to a parallel selection algorithm. © 1998 John Wiley & Sons, Ltd.

16.

Simulation of hierarchical multiprocessor database systems

P. S. Kostenetskii L. B. Sokolinsky 《Programming and Computer Software》2013,39(1):10-24

The paper is dedicated to issues concerning simulation and analysis of hierarchical multiprocessor systems oriented to database applications. Requirements for a parallel database system model are given. A survey and comparative analysis of known parallel database system models are presented. A new multiprocessor database system model is introduced. This model allows us to simulate and evaluate arbitrary hierarchical multiprocessor configurations in the context of the OLTP class database applications. Examples of using the database multiprocessor model for simulation study of multiprocessor database systems are presented.

17.

Response time analysis of parallel computer and storage systems

Varki E. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(11):1146-1161

Fork-join structures have gained increased importance in recent years as a means of modeling parallelism in computer and storage systems. The basic fork-join model is one in which a job arriving at a parallel system splits into K independent tasks that are assigned to K unique, homogeneous servers. In the paper, a simple response time approximation is derived for parallel systems with exponential service time distributions. The approximation holds for networks modeling several devices, both parallel and nonparallel. (In the case of closed networks containing a stand-alone parallel system, a mean response time bound is derived.) In addition, the response time approximation is extended to cover the more realistic case wherein a job splits into an arbitrary number of tasks upon arrival at a parallel system. Simulation results for closed networks with stand-alone parallel subsystems and exponential service time distributions indicate that the response time approximation is, on average, within 3 percent of the seeded response times. Similarly, simulation results with nonexponential distributions also indicate that the response time approximation is close to the seeded values. Potential applications of our results include the modeling of data placement in disk arrays and the execution of parallel programs in multiprocessor and distributed systems.
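The basic fork-join model in this abstract can be sketched for the simplest case: one isolated job, with no queueing, whose K tasks have independent exponential service times with a common rate. The job finishes when its slowest task finishes, and the mean of the maximum of K such exponentials is the K-th harmonic number divided by the rate. This is a standard probability result, not the approximation derived in the paper; the function names are hypothetical.

```python
import random


def harmonic(k: int) -> float:
    """K-th harmonic number H_K = 1 + 1/2 + ... + 1/K."""
    return sum(1.0 / i for i in range(1, k + 1))


def expected_fork_join_time(k: int, rate: float) -> float:
    """Mean of the maximum of K i.i.d. exponential service times: H_K / rate."""
    return harmonic(k) / rate


def simulate_fork_join_time(k: int, rate: float, jobs: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same mean, ignoring queueing entirely."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(jobs):
        # A job completes only when its slowest task completes.
        total += max(rng.expovariate(rate) for _ in range(k))
    return total / jobs
```

Calling `simulate_fork_join_time(4, 1.0, 20000)` should land close to `expected_fork_join_time(4, 1.0)`, which equals 25/12. Under load, tasks also wait in each server's queue, which is the harder case the paper addresses.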

18.

Multiprocessor platform for partitioned real‐time systems

Héctor Pérez Tijero Mario Aldea Rivas Daniel Medina Ortega 《Software》2017,47(1):61-78

Two current trends in the real-time and embedded systems are the multiprocessor architectures and the partitioning technology that enables several isolated applications with different criticality levels to share the same computer. This paper presents a real-time platform for multiprocessor and partitioned systems, in which communication requirements are also considered. The paper describes the adaptation of MaRTE OS (a monoprocessor real-time operating system) to the XtratuM hypervisor for the multiprocessor Intel x86 architecture. This adaptation makes two contributions to ease the development process of future mixed-criticality applications: firstly, it integrates the hypervisor technology and the fully partitioned scheduling in a multiprocessor environment, and secondly, it provides the basis to interconnect partitioned and non-partitioned applications via a homogeneous communication subsystem. Copyright © 2016 John Wiley & Sons, Ltd.

19.

A massively parallel architecture for self-organizing feature maps

Porrmann M. Witkowski U. Ruckert U. 《Neural Networks, IEEE Transactions on》2003,14(5):1110-1121

A hardware accelerator for self-organizing feature maps is presented. We have developed a massively parallel architecture that, on the one hand, allows a resource-efficient implementation of small or medium-sized maps for embedded applications, requiring only small areas of silicon. On the other hand, large maps can be simulated with systems that consist of several integrated circuits that work in parallel. Apart from the learning and recall of self-organizing feature maps, the hardware accelerates data pre- and postprocessing. For the verification of our architectural concepts in a real-world environment, we have implemented an ASIC that is integrated into our heterogeneous multiprocessor system for neural applications. The performance of our system is analyzed for various simulation parameters. Additionally, the performance that can be achieved with future microelectronic technologies is estimated.

20.

Adaptive execution techniques of parallel programs for multiprocessors

Jaejin Lee Jung-Ho Park Honggyu Kim Changhee Jung Daeseob Lim SangYong Han 《Journal of Parallel and Distributed Computing》2010

In simultaneous multithreading (SMT) multiprocessors, using all the available threads (logical processors) to run a parallel loop is not always beneficial due to the interference between threads and parallel execution overhead. To maximize the performance of a parallel loop on an SMT multiprocessor, it is important to find an appropriate number of threads for executing the parallel loop. This article presents adaptive execution techniques that find a proper execution mode for each parallel loop in a conventional loop-level parallel program on SMT multiprocessors. A compiler preprocessor generates code that, based on dynamic feedback, automatically determines at run time the optimal number of threads for each parallel loop in the parallel application. We evaluate our technique using a set of standard numerical applications and running them on a real SMT multiprocessor machine with 8 hardware contexts. Our approach is general enough to work well with other SMT multiprocessor or multicore systems.
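The idea of choosing a thread count from run-time feedback can be sketched as follows. The abstract does not describe the actual decision procedure, so this sketch simply samples every candidate once and keeps the fastest; `choose_thread_count` and the `run_loop` callback are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable, Optional


def choose_thread_count(
    run_loop: Callable[[int], float],
    candidates: Iterable[int],
) -> int:
    """Try each candidate thread count once and return the fastest one.

    `run_loop(n)` is a hypothetical callback that executes one invocation of
    the parallel loop with `n` threads and returns the measured elapsed time.
    """
    best_count: Optional[int] = None
    best_time = float("inf")
    for n in candidates:
        elapsed = run_loop(n)
        if elapsed < best_time:
            best_count, best_time = n, elapsed
    if best_count is None:
        raise ValueError("candidates must not be empty")
    return best_count
```

With measured times such as `{1: 8.0, 2: 4.5, 4: 3.0, 8: 3.6}`, the function returns 4, reflecting the abstract's point that using every hardware context is not always the best choice.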