Found 20 similar documents (search time: 109 ms)
1.
2.
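Research and application of MPI-based parallel GMRES(m) computation and communication on cluster systems (Total citations: 1, self-citations: 1, citations by others: 0)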
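Exploiting the inherent parallelism of the GMRES(m) algorithm for solving large dense linear systems, a coarse-grained parallel algorithm with low communication overhead is designed for distributed-memory parallel systems, using the collective communication mechanism of the portable message-passing standard MPI. The algorithm is applied to large elasticity problems solved by the boundary element method. Compared with the serial algorithm, the parallel algorithm achieves high accuracy and high computational efficiency.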
3.
4.
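Sparse code multiple access (SCMA), a promising 5G wireless air-interface technology, can meet the demand for massive connectivity. Existing SCMA systems all perform multiuser detection with a message passing algorithm (MPA) based on a parallel schedule, which converges slowly. This paper proposes a multiuser detection algorithm based on a serial schedule. The algorithm updates and passes messages sequentially in resource-node order, so that updated messages enter the current iteration immediately. This speeds up message convergence and, compared with the parallel-schedule detector, lowers the algorithm's complexity. The algorithm also exploits the correlation between messages to merge message-passing steps, which reduces memory requirements. Theoretical and simulation results show that the algorithm strikes a good balance between bit error rate (BER) performance and complexity.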
5.
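A reusable message-passing software framework for parallel DSP applications is designed and implemented. To make DSP software code independent of the underlying parallel processor architecture, a layered framework structure and the relationships among the classes in each layer are designed. To make DSP application development independent of the underlying data communication paths, and to improve the scalability and configurability of the parallel system, a per-node routing table structure and a routing algorithm are designed, based on an interprocessor data-flow model and a hardware platform topology model. Using a parallel DSP system built from TigerSHARC DSPs as an example, the paper shows how the framework is applied to a specific multiprocessor platform; the message-passing framework is implemented and verified.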
6.
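The finite-difference time-domain (FDTD) method is now widely used in numerical electromagnetics. For many complex electromagnetic problems, FDTD consumes enormous computing time and memory, which is a pressing problem for the method. This paper presents a parallel FDTD algorithm based on message passing and compares the efficiency of parallel FDTD implementations that use different MPI communication modes. Using the put operation of MPI-2.0 one-sided communication with post-start-complete-wait (PSCW) active-target synchronization, a three-dimensional parallel FDTD program was implemented on a 16-node Beowulf cluster, achieving high speedup and parallel efficiency.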
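As a rough illustration of the domain decomposition this abstract describes, the sketch below splits a one-dimensional FDTD grid into two subdomains that exchange one boundary value per half-step. It is not the paper's code: it is 1D rather than 3D, it uses no MPI, and the two `ghost_*` assignments merely stand in for the put operations between neighbouring processes. The Courant number, the Gaussian initial pulse, and all function names are assumptions made for the example.

```python
import math

C = 0.5  # Courant number in normalized units (assumed for this sketch)

def initial_field(n):
    """Gaussian pulse in Ez centred in the grid (assumed initial condition)."""
    return [math.exp(-((i - n / 2) ** 2) / 20.0) for i in range(n)]

def serial_fdtd(n, steps):
    """Reference 1D update; the two end cells of Ez are never updated."""
    ez, hy = initial_field(n), [0.0] * (n - 1)
    for _ in range(steps):
        for i in range(n - 1):
            hy[i] += C * (ez[i + 1] - ez[i])
        for i in range(1, n - 1):
            ez[i] += C * (hy[i] - hy[i - 1])
    return ez

def split_fdtd(n, steps, k):
    """Two subdomains: left owns ez[0:k] and hy[0:k]; right owns the rest.

    Requires 1 <= k <= n - 2 so that the cut lies in the interior.
    """
    ez = initial_field(n)
    ez_l, ez_r = ez[:k], ez[k:]
    hy_l, hy_r = [0.0] * k, [0.0] * (n - 1 - k)
    for _ in range(steps):
        ghost_ez = ez_r[0]            # stands in for a put: right -> left
        for i in range(k - 1):
            hy_l[i] += C * (ez_l[i + 1] - ez_l[i])
        hy_l[k - 1] += C * (ghost_ez - ez_l[k - 1])
        for i in range(len(hy_r)):
            hy_r[i] += C * (ez_r[i + 1] - ez_r[i])
        ghost_hy = hy_l[k - 1]        # stands in for a put: left -> right
        for i in range(1, k):
            ez_l[i] += C * (hy_l[i] - hy_l[i - 1])
        ez_r[0] += C * (hy_r[0] - ghost_hy)
        for i in range(1, len(ez_r) - 1):
            ez_r[i] += C * (hy_r[i] - hy_r[i - 1])
    return ez_l + ez_r
```

Because each subdomain performs the same floating-point operations in the same order as the serial loop, `split_fdtd(64, 100, 32)` returns exactly the same list as `serial_fdtd(64, 100)`. In a real MPI program each ghost value would have to be transferred and synchronized before the neighbouring update reads it.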
7.
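The basic principles of cloud computing, and the theoretical framework for message-passing algorithms built on them, are analyzed. An approach to applying cloud computing to message passing is proposed, and a cloud-computing design model for message-passing algorithms is derived from that framework. The distributed and parallel nature of cloud computing is highlighted as a source of new ideas for distributing and parallelizing algorithms.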
8.
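1. Introduction. CCITT Signalling System No. 7 is the common-channel signalling system best suited to digital communication networks. To support a variety of communication services, it adopts the functional-level structure shown in Figure 1. The structure consists of the Message Transfer Part (MTP) and the individual User Parts (UP). The Message Transfer Part serves as a transport system that reliably transfers messages between the locations of communicating user functions. By functional level, Signalling System No. 7 is divided into the following four levels: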
9.
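NoC_MPSim, a configurable simulation platform for exploring the design space of multicore systems built on network-on-chip (NoC) communication architectures, is implemented. The platform comprises a processor toolchain, automated platform configuration scripts, and an RTL model library containing processors, network adapters, and several routers. From user-supplied system configuration it automatically generates a cycle-accurate multicore simulation system. Based on the characteristics of NoC communication architectures, a high-level communication abstraction is defined for multicore systems built on them. Drawing on the message-passing mechanisms of parallel machines, a parallel programming model and communication primitives that effectively hide out-of-order network delivery are proposed, and the required software and hardware models are built. Using the proposed programming model, a distributed parallel implementation of the MUSIC algorithm is realized on a four-core simulated system; experiments show a speedup of up to 2.6 on this system.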
10.
11.
《Latin America Transactions, IEEE (Revista IEEE America Latina)》2009,7(1):114-121
In this paper we describe a high-performance environment, such as a computer cluster, in which high accuracy is obtained by using the C-XSC library. C-XSC is a free C++ class library for scientific computing, intended for developing numerical algorithms that deliver highly accurate and automatically verified results by means of interval arithmetic. High-accuracy computation must be available for certain basic arithmetic operations, mainly summation and the dot product. For this reason, we aim to obtain high performance through a cluster environment in which several nodes execute tasks or calculations. Communication is done by message passing using the MPI communication library. To obtain high accuracy in this environment, extensions or changes were made to the parallel programs to guarantee that the final result computed on the cluster, where several nodes collaborate on the calculation, has the same quality as the result obtained in a sequential high-accuracy environment. To validate the environment, we ran basic tests on the dot product, matrix multiplication, interval solvers for banded and dense matrices, and several numerical methods for solving linear systems with high accuracy (some of the implemented methods are used in real-life applications such as hydrodynamics, agriculture, and electric power systems). With these tests we analyzed and compared the performance and accuracy obtained with and without the C-XSC library in sequential and parallel programs. The implementation of these routines and methods opens a large research field: the study of real-life applications that need, during all or part of their solution, arithmetic operations more accurate than those provided by traditional computational tools.
Our software
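The concern raised in this abstract, that a dot product computed across several nodes should match the quality of the sequential high-accuracy result, can be illustrated with a small sketch. It does not use C-XSC or MPI. Python's `fractions.Fraction` stands in for an exact accumulator, contiguous list slices stand in for per-node data, and the input vectors and all function names are assumptions chosen to make the rounding effect visible.

```python
from fractions import Fraction
import math

def naive_dot(x, y):
    """Plain floating-point dot product; every addition rounds."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total

def exact_partial_dot(x, y):
    """Partial dot product accumulated exactly as a rational number."""
    return sum((Fraction(a) * Fraction(b) for a, b in zip(x, y)), Fraction(0))

def chunks(seq, parts):
    """Split seq into contiguous slices, one per simulated node."""
    size = math.ceil(len(seq) / parts)
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def distributed_dot(x, y, nodes, exact):
    """Each simulated node reduces its slice; the partial results are then combined."""
    xs, ys = chunks(x, nodes), chunks(y, nodes)
    if exact:
        partials = [exact_partial_dot(a, b) for a, b in zip(xs, ys)]
        return float(sum(partials, Fraction(0)))  # round only once, at the end
    partials = [naive_dot(a, b) for a, b in zip(xs, ys)]
    return naive_dot(partials, [1.0] * len(partials))

# Assumed test data: the true dot product is 2.
x = [1e16, 1.0, -1e16, 1.0]
y = [1.0, 1.0, 1.0, 1.0]
```

With IEEE-754 doubles, `distributed_dot(x, y, 1, exact=False)` gives 1.0 and `distributed_dot(x, y, 2, exact=False)` gives 0.0, so the naive result depends on how the data is partitioned. The exact variant returns 2.0 for 1, 2, or 4 simulated nodes, which is the partition-independence that a high-accuracy parallel environment aims for.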
12.
High-performance computing for vision (Total citations: 2, self-citations: 0, citations by others: 2)
Cho-Li Wang Bhat P.B. Prasanna V.K. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1996,84(7):931-946
The main focus of the paper is on effectively using commercial-off-the-shelf (COTS) based general purpose parallel computing platforms to realize high speed implementations of vision tasks. Due to the successful use of the COTS-based systems in a variety of high performance applications, it is attractive to consider their use for vision applications as well. However, the irregular data dependencies in vision tasks lead to large communication overheads in the HPC systems. At the University of Southern California, our research efforts have been directed toward designing scalable parallel algorithms for vision tasks on the HPC systems. In our approach, we use the message passing programming model to develop portable code. Our algorithms are specified using C and MPI. In this paper, we summarize our efforts, and illustrate our approach using several example vision tasks.
13.
Distributed network computing over local ATM networks (Total citations: 1, self-citations: 0, citations by others: 1)
Mengjou Lin Hsieh J. Du D.H.C. Thomas J.P. MacDonald J.A. 《Selected Areas in Communications, IEEE Journal on》1995,13(4):733-748
Communication between processors has long been the bottleneck of distributed network computing. However, recent progress in switch-based high-speed local area networks (LANs) may be changing this situation. Asynchronous transfer mode (ATM) is one of the most widely-accepted and emerging high-speed network standards which can potentially satisfy the communication needs of distributed network computing. We investigate distributed network computing over local ATM networks. We first study the performance characteristics involving end-to-end communication in an environment that includes several types of workstations interconnected via a Fore Systems' ASX-100 ATM switch. We then compare the communication performance of four different application programming interfaces (APIs). The four APIs were Fore Systems' ATM API, the BSD socket programming interface, Sun's remote procedure call (RPC), and the parallel virtual machine (PVM) message passing library. Each API represents distributed programming at a different communication protocol layer. We evaluated two popular distributed applications, parallel matrix multiplication and parallel partial differential equations, over the local ATM network. The experimental results show that network computing is promising over local ATM networks, provided that the higher level protocols, device drivers, and network interfaces are improved.
14.
15.
Emmanouel A. Varvarigos 《Telecommunication Systems》2000,13(1):3-20
Communication efficiency is one of the keys to the broad success of parallel computation, as one can see by looking at the
successes of parallel computation, which are currently limited to applications that have small communication requirements,
or applications that use a small number of processors. In order to use fine grain parallel computation for a broader range
of applications, efficient algorithms to execute the underlying interprocessor communications have to be developed. In this
paper we survey several generic static and dynamic communication problems that are important for parallel computation, and
present some general methodologies for addressing these problems. Our objective is to obtain a collection of communication
algorithms to execute certain prototype communication tasks that arise often in applications. These algorithms can be called
as communication primitives by the programmer or the compiler of a multiprocessor computer, in the same way that subroutines
implementing standard functions are called from a library of functions in a conventional computer. We discuss both algorithms
to execute static (deterministic) primitive communication tasks, as well as schemes that are appropriate for dynamic (stochastic)
environments. Our emphasis is on algorithms that apply to many similar problems and can be used in various network topologies.
This revised version was published online in June 2006 with corrections to the Cover Date.
16.
A computer-aided design tool has been developed to study the hardware/software structure of various types of data switching systems used in the local loop distribution of computer communication networks. A simulation package is used to evaluate the performance parameters (such as the system throughput, average message delay, and probability of data loss) of local access systems under different input traffic conditions. Two types of traffic commonly used in teleprocessing applications are considered: the inquiry/response mode and the file transfer mode. Design tradeoffs for a line concentrator and a message interswitch are discussed and their performance is compared. The message interswitch permits a number of low-speed terminals to share communication lines and also to gain access to local common resources such as line printers, databases, and optical character readers.
17.
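Design and implementation of serial communication on the embedded real-time operating system μC/OS-II (Total citations: 1, self-citations: 0, citations by others: 1)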
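To address how semaphores and message mailboxes should be used when designing serial communication under the real-time operating system μC/OS-II, a serial communication program design is proposed that uses the STM32V evaluation board as the hardware platform together with μC/OS-II. The design uses the STM32F103VB, an ARM processor with the Cortex-M3 architecture, as the main control chip and the ST3232 as the serial level converter. The software design section describes the use cases of semaphores and message mailboxes...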
18.
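To address how semaphores and message mailboxes should be used when designing serial communication under the real-time operating system μC/OS-II, a serial communication program design is proposed that uses the STM32V evaluation board as the hardware platform together with μC/OS-II. The design uses the STM32F103VB, an ARM processor with the Cortex-M3 architecture, as the main control chip and the ST3232 as the serial level converter. The software design section describes the use cases and basic operations of semaphores and message mailboxes; used together, they keep data transfer between tasks synchronized. The overall design approach of the program is given. The program is developed with the firmware library supplied for the STM32F103VB processor, which avoids tedious register configuration and reduces development effort. Experiments at two different communication rates show that data transfer has a low bit error rate and is stable and reliable. Combined with a suitable data-checking algorithm, the design could be applied to data communication in industrial field settings.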
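The pairing of a mailbox with a semaphore described in this abstract can be sketched on a desktop with Python threads. This is only an analogy, not the paper's μC/OS-II code: `queue.Queue(maxsize=1)` stands in for a one-slot message mailbox, `threading.Semaphore` for a kernel semaphore, and the two functions for a receive source and an application task. All names are hypothetical.

```python
import queue
import threading

mailbox = queue.Queue(maxsize=1)         # one-slot mailbox carrying a received byte
byte_handled = threading.Semaphore(0)    # signalled after the task consumes a byte

def uart_rx_source(data):
    """Stands in for the receive side: post each byte, then wait for the ack."""
    for byte in data:
        mailbox.put(byte)                # post the message to the mailbox
        byte_handled.acquire()           # pend until the task has handled it
    mailbox.put(None)                    # sentinel: no more data

def receiver_task(out):
    """Stands in for the application task: pend on the mailbox, then signal."""
    while True:
        byte = mailbox.get()             # blocks until a message is posted
        if byte is None:
            break
        out.append(byte)
        byte_handled.release()           # signal that the next byte may be sent

def run(data):
    """Transfer `data` through the mailbox and return what the task received."""
    received = []
    producer = threading.Thread(target=uart_rx_source, args=(data,))
    consumer = threading.Thread(target=receiver_task, args=(received,))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return bytes(received)
```

Calling `run(b"hello")` returns the same bytes in the same order. The mailbox carries the data while the semaphore paces the sender, which is one plausible reading of how the two primitives keep inter-task transfers synchronized.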
19.
Current and emerging safety-critical applications such as the automotive X-by-wire systems require a high degree of reliability. These dependable embedded distributed systems require an ultra-reliable communication system to exchange data between the distributed components. In addition to guaranteeing a high level of reliability, these communication systems should facilitate the development of fault-tolerant applications. This can be achieved by providing additional communication system services such as interactive consistency. Interactive consistency on a communication system can be defined as a means to ensure that all non-faulty nodes on the communication system receive a consistent value for any message communicated. This paper describes the adoption of an explicit interactive consistency algorithm on a time-triggered broadcast communication system, using a shared communication medium. This is supported by the development of a prototype implementation of the interactive consistency algorithm. This prototype system demonstrates that interactive consistency is successfully achieved in the presence of a number of faults.
20.
《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1972,60(11):1321-1332
A review is given of the use of small digital computers for the processing of data received over communication lines. A detailed discussion is presented of the hardware and software requirements of front-end processors, network processors, remote data concentrators, and message switching systems. Finally, the desirable features common to all communications processors are analyzed. Examples of actual applications are given, so that a realistic basis can be established for the determination of the features which should be included in the design of new communication processors.