Similar literature
1.
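SICE (SIMDC Emulator), implemented in this paper, is a software package that simulates SIMD computer programming in a serial-machine environment. SIC (SIMDC) is a C-based SIMD parallel extension language defined by the authors. It supports parallel statements that reflect the characteristics of SIMD structures and, more importantly, supports the definition of SIMD structures themselves, so it can be conveniently used for algorithm research on SIMD machines.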

2.
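The C* language is a data-parallel language supporting the SIMD mode, obtained by extending the syntax and semantics of ANSI C. It lets users describe, on the basis of a SIMD virtual machine, how data are distributed across the virtual processors and which parallel computations are applied to those data. It provides a simple, effective, machine-independent data-parallel programming model.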

3.
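The C* language is a data-parallel language supporting the SIMD mode, obtained by extending the syntax and semantics of ANSI C. It lets users describe, on the basis of a SIMD virtual machine, how data are distributed across the virtual processors and which parallel computations are applied to those data. It provides a simple, effective, machine-independent data-parallel programming model. The paper introduces the applications of C* in supercomputing and the parallel extensions of C*, and gives typical programming examples.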

4.
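Applying the data-parallel model to MIMD machines, with loose synchronization in the SPMD mode, is attracting growing attention. This paper presents the design and implementation of Multi-c, a data-parallel language that targets heterogeneous parallel systems. The Multi-c compiler, currently being implemented, accepts SIMD-style program specifications through precompilation, relaxes the synchronization requirements, and generates C programs that run on parallel systems in SPMD fashion.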

5.
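The design and implementation of the peripheral subsystem of a parallel processor directly affect the price-performance ratio of the whole system. Based on the characteristics of the SPP architecture and practical application needs, this paper designs a dedicated I/O processor between the front-end server and the SM/SSM, so that system I/O devices and the SM/SSM exchange data directly at high speed, greatly improving the system's I/O performance. The I/O processor adopts an overall i860 + 82380 + SRAM structure, which lets processor accesses to main memory proceed in parallel with DMA controller accesses to SRAM.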

6.
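PJVM (Parallel Java Virtual Machine) extends the Java language to provide a programming interface based on distributed shared memory, DSM (Distributed Sharing Memory), and a programming interface based on message passing, MPI (Message Passing Interface). Using Java and these two extended interfaces, users can conveniently design parallel/distributed programs that run in heterogeneous environments.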

7.
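Parallel CMD-Join algorithm on parallel databases (total citations: 1; self-citations: 1; citations by others: 1)
Li Jianzhong (李建中), Du Wei (都薇). Journal of Software (《软件学报》), 1998, 9(4): 256-262
How a parallel database is distributed among multiple processors strongly affects the performance of parallel data operation algorithms. If the design of such algorithms fully exploits the characteristics of the data distribution method, very efficient parallel algorithms can be obtained. This paper studies how to exploit those characteristics when designing parallel data operation algorithms, and proposes a parallel CMD-Join algorithm based on the CMD multidimensional data distribution method. Theoretical analysis and experimental results show that the parallel CMD-Join algorithm is more efficient than other parallel Join algorithms.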

8.
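Using the idea of merge selection and the optimal algorithm on heaps, this paper gives a new algorithm for the selection problem together with its parallelization. It reduces the complexity of the serial merge-selection algorithm from n log k + O(n) to (n log k)/2 + (n log log k)/2 + O(n) while preserving the structure of the original parallel algorithm. On the parallel computation model of a SIMD tree machine, the parallel running…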

9.
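Research on the interacting multiple model filter and its parallel implementation* (total citations: 5; self-citations: 1; citations by others: 5)
This paper studies the interacting multiple model filter (IMMF). The IMMF performs well in estimation for a class of hybrid systems whose dynamic modes undergo random abrupt changes. The paper first gives a mathematical description of the IMMF and reveals its mechanism. Then, based on the structure of the IMMF and using the PD-100 multiprocessor system, it studies the processor topology, the task partitioning and allocation, and the parallel mapping for a parallel IMMF implementation, and gives a parallel IMMF algorithm. Simulation on the PD-100 system shows that the parallel algorithm has good speedup linearity,…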

10.
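For multi-disturbance systems with a series structure, an online DMC control scheme is proposed, and a parallel algorithm and structural design are given on the basis of existing DMC control. This multi-loop, approximate-model extended DMC method lets the system meet its targets for stability, disturbance rejection, and robustness. Because the design makes full use of the system's redundant information and parallelism, the system's response speed is greatly improved.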

11.
Fork95 is an imperative parallel programming language intended to express algorithms for synchronous shared memory machines (PRAMs). It is based on ANSI C and offers additional constructs to hierarchically divide processor groups into subgroups and manage shared and private address subspaces. Fork95 makes the assembly-level synchronicity of the underlying hardware available to the programmer at the language level. Nevertheless, it supports locally asynchronous computation where desired by the programmer. We present a one-pass compiler, fcc, which compiles Fork95 and C programs to the SB-PRAM machine. The SB-PRAM is a lock-step synchronous, massively parallel multiprocessor currently being built at Saarbrücken University, with a physically shared memory and uniform memory access time. We examine three important types of parallel computation frequently used for the parallel solution of real-world problems. While farming and parallel divide-and-conquer are directly supported by Fork95 language constructs, pipelining can be easily expressed using existing language features; an additional language construct for pipelining is not required.

12.
A new parallel algorithm for transforming an arithmetic infix expression into a parse tree is presented. The technique is based on a result due to Fischer (1980) which enables the construction of the parse tree by appropriately scanning the vector of precedence values associated with the elements of the expression. The algorithm presented here is suitable for execution on a shared memory model of an SIMD machine with no read/write conflicts permitted. It uses O(n) processors and has a time complexity of O(log2n) where n is the expression length. Parallel algorithms for generating code for an SIMD machine are also presented.
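
The idea of building a parse tree from a vector of precedence values can be illustrated with a small sequential sketch. This is not the parallel algorithm of the abstract, and the details are assumptions made for illustration: the function names `precedence_vector` and `build_tree`, the base precedence table, the nesting step of 3, and the rule of splitting at the rightmost lowest-precedence operator (which yields left-associative trees).

```python
# Sequential illustration only; the names and the precedence scheme are hypothetical.
BASE = {"+": 1, "-": 1, "*": 2, "/": 2}
STEP = 3  # assumed: larger than any base precedence, so nesting depth dominates


def precedence_vector(tokens):
    """Drop parentheses and return (items, prec), one precedence value per item."""
    depth = 0
    items, prec = [], []
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        else:
            items.append(tok)
            # Operands get an infinite value so they are never chosen as a root.
            prec.append(depth * STEP + BASE[tok] if tok in BASE else float("inf"))
    return items, prec


def build_tree(items, prec, lo=0, hi=None):
    """Return a nested tuple (op, left, right) for items[lo:hi]."""
    if hi is None:
        hi = len(items)
    if hi - lo == 1:
        return items[lo]  # a single operand is a leaf
    # Root = rightmost operator with the lowest precedence value in the span.
    root = max(range(lo, hi), key=lambda i: (-prec[i], i))
    return (items[root],
            build_tree(items, prec, lo, root),
            build_tree(items, prec, root + 1, hi))


tree = build_tree(*precedence_vector("a - b - c * ( d + e )".split()))
print(tree)
```

The sketch assumes a well-formed expression with space-separated tokens; it performs no error checking.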

13.
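Based on the characteristics of data-parallel computer architectures built on PIM (Processor-In-Memory) technology and on the application requirements of multimedia computing, this paper proposes PIMC, a data-parallel language for embedded SIMD (Single Instruction Multiple Data) computing. It briefly discusses the formal definition of the PIMC language and illustrates its use with the mean filter algorithm from data-parallel image processing. Together with many other data-parallel programming examples, this shows that the language can correctly describe data-parallel implementations of basic multimedia processing algorithms on PIM-based SIMD parallel computers.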
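
For readers unfamiliar with the mean filter mentioned above, the following is a minimal sequential sketch in Python, not PIMC code. The function name `mean_filter`, the `radius` parameter, and the border rule (averaging only the pixels that fall inside the image) are assumptions chosen for illustration. Each output pixel depends only on a small input neighbourhood, which is what makes the algorithm a natural fit for data-parallel execution.

```python
def mean_filter(image, radius=1):
    """Replace each pixel with the mean of its (2*radius+1)^2 neighbourhood.

    `image` is a non-empty list of equal-length rows of numbers. At the
    borders, only in-bounds neighbours are averaged (an assumed convention).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


smoothed = mean_filter([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(smoothed[1][1])
```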

14.
A software behavioural simulator for a new massively parallel single-instruction/multiple-data (SIMD) architecture has been developed that can accurately simulate the entire 16,384 bit-serial processor array. The key to this high performance modelling is the exploitation of an inherent mapping that exists between massively parallel SIMD architectures and the vector architectures used in many high performance scientific super-computers. The new SIMD architecture, called BLITZEN, is based on the Massively Parallel Processor (MPP) built for NASA by Goodyear in the late 1970s. By simulating the full-scale machine with very high performance, the simulator allows development of algorithms and high-level software to proceed before realization of the hardware. This paper describes the SIMD-vector architecture mapping, the highly vectorized simulator in which it is used, and how the result was a simulator that achieved a level of performance three orders of magnitude faster than the conventional uniprocessor approach.

15.
Many sorting algorithms have been studied in the past, but there are only a few algorithms that can effectively exploit both single‐instruction multiple‐data (SIMD) instructions and thread‐level parallelism. In this paper, we propose a new high‐performance sorting algorithm, called aligned‐access sort (AA‐sort), that exploits both the SIMD instructions and thread‐level parallelism available on today's multicore processors. Our algorithm consists of two phases, an in‐core sorting phase and an out‐of‐core merging phase. The in‐core sorting phase uses our new sorting algorithm that extends combsort to exploit SIMD instructions. The out‐of‐core algorithm is based on mergesort with our novel vectorized merging algorithm. Both phases can take advantage of SIMD instructions. The key to high performance is eliminating unaligned memory accesses that would reduce the effectiveness of SIMD instructions in both phases. We implemented and evaluated the AA‐sort on PowerPC 970MP and Cell Broadband Engine platforms. In summary, a sequential version of the AA‐sort using SIMD instructions outperformed IBM's optimized sequential sorting library by 1.8 times and bitonic mergesort using SIMD instructions by 3.3 times on PowerPC 970MP when sorting 32 million random 32‐bit integers. Also, a parallel version of AA‐sort demonstrated better scalability with increasing numbers of cores than a parallel version of bitonic mergesort on both platforms. Copyright © 2011 John Wiley & Sons, Ltd.
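
As background for the in-core phase, the sketch below shows plain scalar combsort in Python. It is not AA-sort: it has no SIMD instructions, no aligned-access handling, and no merging phase. The function name `comb_sort` and the shrink factor of 1.3 (a commonly used value) are assumptions for illustration. Combsort compares elements a fixed gap apart and shrinks the gap each pass, and this regular access pattern is the property a vectorized variant can build on.

```python
def comb_sort(values, shrink=1.3):
    """Return a sorted copy of `values` using scalar combsort."""
    data = list(values)
    gap = len(data)
    swapped = True
    # Keep going until the gap has shrunk to 1 and a full pass makes no swap.
    while gap > 1 or swapped:
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(data) - gap):
            if data[i] > data[i + gap]:
                data[i], data[i + gap] = data[i + gap], data[i]
                swapped = True
    return data


print(comb_sort([5, 2, 9, 1, 5, 6, 0]))
```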

16.
17.
Real-time image analysis requires the use of massively parallel machines. Conventional parallel machines consist of an array of identical processors organized in either single instruction multiple data (SIMD) or multiple instruction multiple data (MIMD) configurations. Machines of this type generally only operate effectively on parts of the image analysis problem: SIMD on the low-level processing and MIMD on the high-level processing. In this paper we describe the Warwick Pyramid Machine, an architecture consisting of both SIMD and MIMD parts in a multiple-SIMD (MSIMD) organization which can operate effectively at all levels of the image analysis problem.

18.
A reconfigurable network termed the reconfigurable multi-ring network (RMRN) is described. The RMRN is shown to be a truly scalable network in that each node in the network has a fixed degree of connectivity and the reconfiguration mechanism ensures a network diameter of O(log2 N) for an N-processor network. Algorithms for the two-dimensional mesh and the SIMD or SPMD n-cube are shown to map very elegantly onto the RMRN. Basic message passing and reconfiguration primitives for the SIMD/SPMD RMRN are designed for use as building blocks for more complex parallel algorithms. The RMRN is shown to be a viable architecture for image processing and computer vision problems using the parallel computation of the stereocorrelation imaging operation as an example. Stereocorrelation is one of the most computationally intensive imaging tasks. It is used as a visualization tool in many applications, including remote sensing, geographic information systems and robot vision. An earlier version of this paper was presented at the 1995 International Conference on Parallel and Distributed Processing Techniques and Applications.

19.
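The more kinds of neural network models a neural network processing system can implement, the more general it is and the wider its range of applications. This paper proposes an architecture for a parallel neural network processor that implements, with a high degree of parallelism, the algorithms of a typical feedforward network (the BP network) and a typical feedback network (the Hopfield network). The processor uses SIMD (Single Instruction Multiple Data) as its main computing structure and, based on the characteristics of these two network algorithms, includes a one-dimensional systolic array and a fully connected interconnection network, which allow flexible and convenient data sharing among processing elements. Experimental results show that the architecture effectively increases the running speed of neural networks.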

20.
The AIS-5000 parallel processor (total citations: 1; self-citations: 0; citations by others: 1)
The AIS-5000 is a commercially available massively parallel processor which was designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. The overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O (input/output) pathways and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is also presented. Performance benchmarks are given for certain simple and complex functions.

