
1.

A Modbus Protocol Software Design Method Based on the MSP430 Microcontroller Total citations: 1, self-citations: 0, citations by others: 1

高旭彬《工矿自动化》2013,39(4):87-90

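This paper analyzes the Modbus protocol specification and, taking the MSP430 microcontroller as an example, presents a software design method for implementing the Modbus communication protocol on a microcontroller. The method decomposes the communication flow into four independent processes (receive, receive-to-send transition, send, and send-to-receive transition) and distributes the tasks of receiving, checking, unpacking, packing, and responding to Modbus messages sensibly among them, which yields a modular implementation of the Modbus protocol.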
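The four-process decomposition in this abstract can be pictured as a small state machine. The following is only a minimal sketch, assuming Modbus RTU framing (a CRC-16 appended low byte first); the abstract does not say which framing the paper uses, and the names `Phase`, `add_crc`, `run_cycle`, and `make_reply` are hypothetical, not taken from the paper.

```python
from enum import Enum, auto


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: start at 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def add_crc(body: bytes) -> bytes:
    """Pack: append the CRC, low byte first (assumed RTU framing)."""
    crc = crc16_modbus(body)
    return body + bytes([crc & 0xFF, crc >> 8])


class Phase(Enum):
    # Hypothetical names for the four processes listed in the abstract.
    RECEIVE = auto()
    RECEIVE_TO_SEND = auto()
    SEND = auto()
    SEND_TO_RECEIVE = auto()


def run_cycle(frame: bytes, make_reply):
    """Walk one frame through the four phases; return (phases, reply)."""
    phases = [Phase.RECEIVE]              # frame bytes have been collected
    phases.append(Phase.RECEIVE_TO_SEND)  # check, unpack, build the reply
    reply = b""
    # Check: a frame is valid if recomputing the CRC reproduces it.
    if len(frame) >= 4 and add_crc(frame[:-2]) == frame:
        reply = add_crc(make_reply(frame[:-2]))
    if reply:
        phases.append(Phase.SEND)         # transmit the reply
    phases.append(Phase.SEND_TO_RECEIVE)  # reset buffers, listen again
    return phases, reply
```

In this sketch a frame that fails the check produces no reply, so the cycle skips the send phase and returns directly to listening. On a real microcontroller each phase would typically be driven by UART and timer interrupts rather than by a single function call.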

2.

A Multicast Communication Routing Method Based on Network Decomposition

谢澎朱怡安康继昌王雅昆《软件学报》1996,7(10):606-610

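Efficient message communication is a key factor in improving the performance of distributed-memory parallel computers. Point-to-point and broadcast communication are two commonly used message communication methods, whereas multicast communication (multicasting) sends a message from one source node to an arbitrary number of destination nodes at the same time. Multicast is more general than either point-to-point or broadcast communication and meets the needs of many practical applications. For the two-dimensional mesh structure of the PAR95 parallel computer, this paper proposes a multicast message communication method based on network decomposition and compares its performance with that of multicast implemented using multiple point-to-point messages.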

3.

A Communication Scheduling Method Combining Communication Reordering and Message Merging

彭晋韬杨章刘青凯张倩《计算机工程与科学》2020,42(2):191-196

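Network communication is critical to high-performance computing applications. As numerical simulation applications grow more complex and parallel scale keeps increasing, application software has an increasingly urgent need to relieve congestion and reduce communication protocol overhead. Traditional message merging methods aim only to reduce protocol overhead and latency, and therefore merge only small messages. In contrast, this paper takes a scheduling-algorithm perspective and proposes an algorithm that reorders messages to ease the network congestion caused by large messages and merges messages by priority to raise effective network utilization. Experiments show that the algorithm improves the communication performance of real applications by up to 41%, and by 10% per application on average.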

4.

Where Does the Time Go in Software DSMs?—Experiences with JIAJIA

SHI Weisong HU Weiwu TANG Zhimin 《计算机科学技术学报》1999,14(3):193-205

1 Introduction. Software distributed shared memory (DSM) system, or shared virtual memory (SVM) system, provides an abstraction of single shared space on top of the physically distributed memories presented on network of workstations. It has been extensively studied in the past decade since it combines the programmability of shared memory systems and scalability of distributed systems [1]. However, the performance gap between software DSM systems and message passing platforms remains existing, which prevents the prevalence of the software DSM systems greatly. In genera…

5.

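Design of a Message Memory Network Interface for Network Parallel Computing Systems Total citations: 4, self-citations: 0, citations by others: 4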

武剑锋李三立戈弋《计算机学报》2000,23(2):195-201

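Through a qualitative analysis of typical parallel applications, this paper proposes and defines a message-passing independence factor R, the proportion of all message passing that is accounted for by transfers of data residing on the heap. Trace statistics collected for a set of typical parallel applications in a real NPC environment then confirm the analysis that R is close to 1. Based on this qualitative analysis and the quantitative statistics, and in light of advances in memory technology, a message memory is introduced into the network interface of the NPC so that each node can directly access the message memories of other nodes. The paper concludes that an NPC whose network interface is equipped with a message memory…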

6.

Comparison of JIAJIA and MPI on PC Clusters Total citations: 3, self-citations: 2, citations by others: 3

胡明昌史岗胡伟武唐志敏张福新《软件学报》2003,14(7):1187-1194

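This paper compares JIAJIA and MPI (Message Passing Interface), which represent the shared-memory and message-passing programming models, respectively. MPI transfers data explicitly and is complex to program; JIAJIA maintains data consistency in its underlying layer and additionally provides simple message-passing functions, which makes programming easy and flexible. JIAJIA incurs a relatively large overhead when allocating shared memory, so its initialization takes longer than MPI's. The paper proposes an approximate empirical formula relating parallel speedup to the number of processes and derives the conclusion that the performance gap between JIAJIA and MPI grows as the number of processes increases. Test results show that, for most applications, the parallel performance gap between the JIAJIA and MPI versions does not exceed 10%. For applications with very little communication the gap is small, while for applications with inherently heavy communication the gap depends mainly on the actual communication volume generated at run time.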

7.

Device level communication libraries for high‐performance computing in Java

Guillermo L. Taboada Juan Touriño Ramón Doallo Aamir Shafi Mark Baker Bryan Carpenter 《Concurrency and Computation》2011,23(18):2382-2403

Since its release, the Java programming language has attracted considerable attention from the high‐performance computing (HPC) community because of its portability, high programming productivity, and built‐in multithreading and networking support. As a consequence, several initiatives have been taken to develop a high‐performance Java message‐passing library to program distributed memory architectures, such as clusters. The performance of Java message‐passing applications relies heavily on the communications performance. Thus, the design and implementation of low‐level communication devices that support message‐passing libraries is an important research issue in Java for HPC. MPJ Express is our Java message‐passing implementation for developing high‐performance parallel Java applications. Its public release currently contains three communication devices: the first one is built using the Java New Input/Output (NIO) package for the TCP/IP; the second one is specifically designed for the Myrinet Express library on Myrinet; and the third one supports thread‐based shared memory communications. Although these devices have been successfully deployed in many production environments, previous performance evaluations of MPJ Express suggest that the buffering layer, tightly coupled with these devices, incurs a certain degree of copying overhead, which represents one of the main performance penalties. This paper presents a more efficient Java message‐passing communications device, based on Java Input/Output sockets, that avoids this buffering overhead. Moreover, this device implements several strategies, both in the communication protocol and in the HPC hardware support, which optimizes Java message‐passing communications. 
In order to evaluate its benefits, this paper analyzes the performance of this device comparatively with other Java and native message‐passing libraries on various high‐speed networks, such as Gigabit Ethernet, Scalable Coherent Interface, Myrinet, and InfiniBand, as well as on a shared memory multicore scenario. The reported communication overhead reduction encourages the upcoming incorporation of this device in MPJ Express (http://mpj-express.org). Copyright © 2011 John Wiley & Sons, Ltd.

8.

Supporting Cost-Effective Fault Tolerance in Distributed Message-Passing Applications with File Operations Total citations: 1, self-citations: 0, citations by others: 1

Ouyang Jinsong Maheshwari Piyush 《The Journal of supercomputing》1999,14(3):207-232

In this paper we present an approach to reliable distributed computing, which incorporates fault tolerance into applications at low cost, in terms of both run-time performance and programming effort required to construct reliable application software. In our model fault tolerance is based on distributed consistent checkpointing and rollback-recovery integrated with a user-level reliable transmission protocol. By employing novel techniques and algorithms, our approach is distinguished from other consistent checkpointing schemes by the following features: first, minimum communication overhead for constructing a consistent distributed checkpoint and catching messages in transit during checkpointing; second, tolerance to message losses due to site failures or unreliable non-FIFO networks; and third, efficient checkpointing and recovery of persistent state, i.e., user files. Based on the model, a software library prototype called Libra has been implemented for supporting fault tolerance in distributed message-passing applications with file operations. The library provides an easy to use programming interface including message-passing and file I/O primitives, which hides the complexity of both fault-tolerant network communications and checkpointing and recovering user files from the application level. Experience with a number of long-running distributed applications shows that Libra can provide fault tolerance in a cost-effective manner.

9.

Active Messages and MPI Total citations: 1, self-citations: 0, citations by others: 1

赵军锁周恩强《小型微型计算机系统》1999,20(3):209-213

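This paper analyzes the overhead of message passing in detail and examines in depth one mechanism that can reduce it: active messages. On this basis, it describes how active messages can be introduced into MPI and presents a prototype that implements MPI on top of active messages.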

10.

LoGPC: Modeling network contention in message-passing programs

Moritz C.A. Frank M.I. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(4):404-415

In many real applications, for example, those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP and LogGP models to account for the impact of network contention and network interface DMA behavior on the performance of message passing programs. We validate LoGPC by analyzing three applications implemented with Active Messages on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50 percent of the total execution time. In addition, we show that the impact of communication locality on the communication costs is at most a factor of two on Alewife. Finally, we use the model to identify trade-offs between synchronous and asynchronous message passing styles.
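For orientation, the two baseline models that this abstract says LoGPC extends can be written as short cost functions. This is a sketch of the contention-free LogP and LogGP estimates only; the contention and DMA terms that LoGPC adds are not given in the abstract and are not reproduced here. The function names and all parameter values used with them are illustrative assumptions.

```python
def logp_burst_time(n, L, o, g):
    """LogP: time until the last of n back-to-back short messages is received.

    L = network latency, o = per-message send/receive overhead,
    g = minimum gap between consecutive messages at one processor.
    """
    # Consecutive sends start max(g, o) apart; the last one then pays
    # send overhead, latency, and receive overhead.
    return (n - 1) * max(g, o) + o + L + o


def loggp_message_time(k, L, o, G):
    """LogGP: end-to-end time of one k-byte message.

    G = gap per byte for long messages (inverse of bulk bandwidth).
    """
    return o + (k - 1) * G + L + o
```

With hypothetical values L = 10, o = 2, and G = 0.5 (in arbitrary time units), `loggp_message_time(101, 10, 2, 0.5)` adds 50 units of per-byte cost to the 14-unit cost of a one-byte message. Neither function accounts for two messages competing for the same link, which is the gap the paper sets out to fill.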

11.

A Gatherv Optimization Method for Large-Scale Concurrency

孙浩男王飞魏迪尹万旺史俊达《计算机工程与科学》2022,44(9):1542-1549

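Gatherv, MPI's irregular collective communication operation, offers great flexibility for describing parallel communication behavior, but its irregularity makes it hard to implement. Existing methods suffer from pronounced communication hotspots, high memory overhead, and low memory-access efficiency, and they struggle to meet the performance demands of today's large-scale parallel applications. This paper proposes a Gatherv optimization method for large-scale concurrency. Addressing several key issues, including optimization levels and buffer management, it applies the Binomial-Tree structure commonly used in implementations of regular collectives to Gatherv, and it proposes a message-chain scheduling mechanism that further reduces overhead and improves the optimization. Test results show that the method effectively resolves the performance problems of existing methods and makes Gatherv collective communication efficient and scalable under large-scale concurrency.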
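The Binomial-Tree structure mentioned in this abstract can be illustrated with a small simulation. What follows is a sketch of the textbook binomial-tree gather applied to variable-length blocks, not the paper's implementation; in particular, its message-chain scheduling mechanism is not described in the abstract and is not modeled. The function name `binomial_gatherv` and the returned schedule format are assumptions made for illustration.

```python
def binomial_gatherv(blocks, root=0):
    """Simulate a binomial-tree gather of variable-length blocks.

    blocks[r] is the bytes object contributed by rank r.
    Returns (gathered, rounds): the blocks in rank order as seen at the
    root, and, per round, a list of (src, dst, n_bytes) transfers.
    """
    p = len(blocks)
    # held[v] maps original rank -> block; v is the rank relative to root.
    held = [{(v + root) % p: blocks[(v + root) % p]} for v in range(p)]
    active = [True] * p
    rounds = []
    mask = 1
    while mask < p:
        transfers = []
        for v in range(p):
            if active[v] and v & mask:
                dst = v - mask  # parent in the binomial tree
                n_bytes = sum(len(b) for b in held[v].values())
                held[dst].update(held[v])
                held[v] = {}
                active[v] = False  # a rank sends once, then is done
                transfers.append(((v + root) % p, (dst + root) % p, n_bytes))
        rounds.append(transfers)
        mask <<= 1
    gathered = [held[0][r] for r in range(p)]
    return gathered, rounds
```

The root receives from at most ceil(log2 P) children instead of P - 1 senders, which is how a tree relieves the hotspot at the root. The cost is that interior ranks must buffer and forward blocks belonging to their subtrees, and with irregular block sizes those forwarded messages can differ greatly in length.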

12.

THNPSC-1: A Network Parallel Supercomputing System Total citations: 2, self-citations: 0, citations by others: 2

李三立都志辉马群生王小鸽《计算机学报》2001,24(6):627-632

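Network parallel computing (also called cluster computing) is an important way to achieve high-performance computing. This paper introduces THNPSC-1, a network parallel supercomputing system developed at Tsinghua University. It is built from Pentium III SMP compute nodes and uses two high-speed interconnects: THNet, a self-developed crossbar switch network with dynamic arbitration and routing, and 100 Mbps Ethernet. THSwitch, the crossbar switch in THNet, is built from a 150,000-gate ALTERA FPGA chip; THNet also includes THNIA, a network adapter with a DMA engine. Each THNet port provides a data transfer rate of 1.056 Gbps, and the aggregate bandwidth reaches 8.4 Gbps. Using techniques such as fixed user buffers and extended active message passing, THNet performs message passing at user level, bypasses operating system calls, and achieves zero-copy message passing. Ping-pong tests show that one-way message latency can be reduced to 8 μs. The THNet software includes the THNIA driver and a function library that supports user-level communication. The paper briefly compares related work and describes applications of the system.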

13.

A real-time messaging system for token ring networks

Alfred C. Weaver M. Alex Colvin 《Software》1987,17(12):885-897

The Computer Networks Laboratory at the University of Virginia has developed a real-time messaging service that runs on IBM PCs and PC/ATs when interconnected with a Proteon ProNET-10 token ring local area network. The system is a prototype for a real-time communications network to be used aboard ships. The system conforms to the IEEE 802.2 logical link control standard for type I (connectionless, or datagram) service, with an option for acknowledged datagrams. The application environment required substantial network throughput and bounded message delay. Thus, the development philosophy was to emphasize performance initially and to offer only primitive user services. After providing and measuring the performance of a basic datagram service, the intent is to add additional user services one at a time and to retain only those which the user can ‘afford’ in terms of their impact on throughput, delay, and CPU utilization. The current system is programmed in C. The user interface is a set of C procedure calls that initialize tables, reserve buffer space, send and receive messages, and report network status. The system is now operational, and initial performance measurements are complete. Using this system, an individual PC can transmit or receive approximately 200 short (about 100 bytes) messages per second, and the PC/AT operates at nearly 500 short messages per second.

14.

Structural testing for message‐passing concurrent programs: an extended test model

Paulo S.L. Souza Simone R.S. Souza Ed Zaluska 《Concurrency and Computation》2014,26(1):21-50

Developing high‐quality, error‐free message‐passing concurrent programs is not trivial. Although a number of different primitives with associated semantics are available to assist such development, they often increase the complexity of the testing process. In this paper, we extend our previous test model for message‐passing programs and present new structural testing criteria, taking into account additional features used in this paradigm, such as collective communication, non‐blocking sends, distinct semantics for non‐blocking receives, and persistent operations. Our new model also recognizes that sender primitives cannot always be matched with every receive primitive. This improvement allows us to remove statically a significant number of infeasible synchronization edges that would otherwise have to be analyzed later by the tester. In this paper, the test model is presented using the Message‐Passing Interface standard; however, our new model has been designed to be flexible, and it can be configured to support a range of different message‐passing environments or languages. We have carried out case studies showing the applicability of the new test model to represent message‐passing programs and also to reveal errors, mainly those errors related to inter‐process communication. In addition to increasing the number of features supported by the test model, we have also reduced the overall cost of testing significantly. Our case studies suggest that the number of synchronization edges can be reduced by up to 93%, mainly by eliminating infeasible edges between unmatchable communication primitives. The main contribution of the paper is to present a more flexible test model that provides improved coverage for message‐passing programs and at the same time reduces the cost of testing significantly. Copyright © 2012 John Wiley & Sons, Ltd.

15.

Research and Implementation of Broadcast Communication Based on General Logical Topologies

熊玉庆张祥《计算机研究与发展》2000,37(3):300-306

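In message-passing systems for distributed-memory parallel computing, the message paths used by many broadcast operations are transparent to the programmer, who cannot change them, yet the conditions under which an application runs are complex. Letting programmers choose message paths according to the computing environment and the characteristics of the application helps improve broadcast efficiency. During communication, message tags are used to distinguish messages so that the receiving process can receive them correctly. However, message tags easily lead to application errors and add to programming complexity. The paper first gives a formal definition of logical topologies and their basic properties, …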

16.

Data movement and control substrate for parallel adaptive applications

Kevin Barker Nikos Chrisochoides Jeffrey Dobbelaere Démian Nave Keshav Pingali 《Concurrency and Computation》2002,14(2):77-101

In this paper, we present the Data Movement and Control Substrate (DMCS), a library which implements low‐latency one‐sided communication primitives for use in parallel adaptive and irregular applications. DMCS is built on top of low‐level, vendor‐specific communication subsystems such as LAPI (Low‐level Application Programme Interface) for IBM SP machines, as well as on widely available message‐passing libraries like MPI for clusters of workstations and PCs. DMCS adds a small overhead to the communication operations provided by the lower communication system. In return, DMCS provides a flexible and easy to understand application program interface for one‐sided communication operations. Furthermore, DMCS is designed so that it can be easily ported and maintained by non‐experts. Copyright © 2002 John Wiley & Sons, Ltd.

17.

Active optimistic and distributed message logging for message‐passing applications

Thomas Ropars Christine Morin 《Concurrency and Computation》2011,23(17):2167-2178

Message logging is an attractive solution to provide fault tolerance for message‐passing applications because it is more scalable than coordinated checkpointing. Sender‐based message logging is a well‐known optimization that allows the saving of message payload in the sender memory. Thus, only message reception events have to be logged reliably by using an event logger. This paper proposes solutions to further improve message logging protocol scalability. In existing works on message logging, the event logger has always been considered as a centralized process. We propose a distributed event logger that takes advantage of multi‐core processors that are to be executed in parallel with application processes, leveraging the volatile memory of the nodes to save events reliably. We also propose the combination of our distributed event logger and O2P, an active optimistic message logging protocol using a gossip‐based protocol to disseminate information on new stable events. Our distributed event logger and O2P are implemented in the Open MPI library. Our results show the following: (i) distributed event logging improves message logging protocol scalability and (ii) using O2P with a distributed event logger provides an efficient and scalable fault‐tolerant solution for message‐passing applications. Copyright © 2011 John Wiley & Sons, Ltd.
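The division of labor described in this abstract, with payloads kept in sender memory and only reception events logged reliably, can be shown in a toy model. This sketch assumes a single in-memory dictionary as the event logger, which corresponds to the centralized baseline rather than the distributed logger or the O2P protocol the paper proposes. The names `Process`, `deliver`, and `replay` are hypothetical.

```python
class Process:
    """A process that keeps the payload of every message it sends."""

    def __init__(self, name):
        self.name = name
        self.next_ssn = 0    # send sequence number
        self.sent_log = {}   # (dest name, ssn) -> payload, in sender memory
        self.delivered = []  # payloads in delivery order

    def send(self, dest, payload, network):
        ssn = self.next_ssn
        self.next_ssn += 1
        self.sent_log[(dest.name, ssn)] = payload
        network.append((self, dest, ssn, payload))


def deliver(message, event_log):
    """Deliver one message and log only the reception event."""
    sender, dest, ssn, payload = message
    event_log.setdefault(dest.name, []).append((sender.name, ssn))
    dest.delivered.append(payload)


def replay(failed_name, processes, event_log):
    """Rebuild a failed process's deliveries, in the original order."""
    by_name = {p.name: p for p in processes}
    return [
        by_name[sender].sent_log[(failed_name, ssn)]
        for sender, ssn in event_log.get(failed_name, [])
    ]
```

The event log stores only (sender, sequence number) pairs, so its size does not depend on payload length. Recovery fetches each payload from the surviving sender's memory and replays the deliveries in the logged order, which reproduces the nondeterministic reception order of the original run.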

18.

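A Location-Transparent Reliable Message Passing Mechanism for Mobile Agents (MA) Total citations: 1, self-citations: 0, citations by others: 1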

杨娟李建国《计算机应用》2004,24(3):25-26,30

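Communication in mobile agent systems is mostly implemented with RMI plus a message delivery mechanism. Building on improvements to the three existing mainstream message delivery mechanisms, this paper proposes a new message forwarding strategy, the Resource Distributed Model (RDM). RDM provides an addressing strategy that resembles a combination of the home-agent and path-forwarding approaches, which ensures that messages reach their destination. The RDM-based Mobile Service (MS) forwards messages quickly once fast addressing has completed; MS reduces the volume of messages held in the message buffering component, and multiple MS instances can perform addressing at the same time to increase message delivery speed.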

19.

The Design and Performance Evaluation of the DI-Multicomputer

Lynn Choi Andrew A. Chien 《Journal of Parallel and Distributed Computing》1996,36(2):119

In this paper, we propose a new multicomputer node architecture, the DI-multicomputer, which uses packet routing on a uniform point-to-point interconnect for both local memory access and internode communication. This is achieved by integrating a router into each processor chip and eliminating the memory bus interface. Since communication resources such as pins and wires are allocated dynamically via packet routing, the DI-multicomputer is able to maximize the available communication resources, providing much higher performance for both intranode and internode communication. Multi-packet handling mechanisms are used to implement a high performance memory interface based on packet routing. The DI-multicomputer network interface provides efficient communication for both short and long messages, decoupling the processor from the transmission overhead for long messages while achieving minimum latency for short messages. Trace-driven simulations based on a suite of message passing applications show that the communication mechanisms of the DI-multicomputer can achieve up to four times speedup when compared to existing architectures.

20.

Portable and Scalable Algorithm for Irregular All-to-All Communication

《Journal of Parallel and Distributed Computing》2002,62(10):1493-1526

In irregular all-to-all communication, messages are exchanged between every pair of processors. The message sizes vary from processor to processor and are known only at run time. This is a fundamental communication primitive in parallelizing irregularly structured scientific computations. Our algorithm reduces the total number of message start-ups. It also reduces node contention by smoothing out the lengths of the messages communicated. As compared to the earlier approaches, our algorithm provides deterministic performance and also reduces the buffer space at the nodes during message passing. The performance of the algorithm is characterised using a simple communication model of high-performance computing (HPC) platforms. We show the implementation on T3D and SP2 using C and the message passing interface standard. These can be easily ported to other HPC platforms. The results show the effectiveness of the proposed technique as well as the interplay among the machine size, the variance in message length, and the network interface.