1.

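杨娟 邱玉辉 李建国 郑樯 《计算机研究与发展》2005,42(5):830-834

The communication overhead of a mobile agent (MA) system affects its execution performance. Most of that overhead arises between MAs, and because inter-MA traffic depends on many factors it cannot be quantified with a unified model, so no generally applicable method exists for reducing it. A feasible alternative is to attack the other source of overhead: the traffic among application components that reside at distributed locations in the system. The paper builds an MA system model that separates mobile components from fixed components and, on that basis, applies the Avrampopoulos resource-location strategy with improved robustness.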

2.

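Analysis of Factors Affecting Communication in High-Performance Computer Systems and a Study of Optimization Strategies

戴梅萼 麦联叨 《小型微型计算机系统》2002,23(3):262-264

In high-performance parallel computer systems, as CPU speeds keep rising, the communication mechanism has become the primary factor limiting system performance. To optimize communication performance, this paper analyzes the factors that affect communication from both the hardware and the software side. On the hardware side it describes a tightly coupled strategy and a strategy of equipping the network interface circuit with a coprocessor and larger memory. On the software side it proposes a fixed-buffer strategy and a user-level communication strategy. Within user-level communication it describes a management-and-monitoring-process approach and an atomic-access approach, with emphasis on the idea and implementation of the atomic-access mechanism. These strategies are significant for improving the performance of cluster systems.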

3.

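A Hardware-Supported Profile Technique Based on Dynamic Binary Translation

杨辉兵 管海兵 梁阿磊 《微型电脑应用》2010,26(5)

Traditional profiling strategies for dynamic binary translators fall into three kinds: basic-block based, branch-edge based, and path-trace based. A purely software profiling system generally incurs an average performance overhead of 30%. With hardware support during dynamic optimization, overall system performance can improve significantly. The difficulties in such hardware/software co-design are the communication overhead between hardware and software and the hardware/software partitioning. Targeting the optimization phase of dynamic binary translation, this paper replaces pure-software profiling with a new hardware-supported method for collecting profiles at run time, minimizing hardware/software communication overhead and thereby improving the overall performance of dynamic binary translation. The method collects profile information accurately at run time with very little overhead, enabling better optimization of the system.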

4.

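An Improved Graph Partitioning Model for Parallel Computing

马永刚 谭国真 杨际祥 潘东 《小型微型计算机系统》2011,32(3)

Graph partitioning has been applied successfully in many fields. When applied to parallel computing, however, it measures communication volume by edge cut, whose main drawback is that it does not represent communication volume accurately; moreover, the graph partitioning model ignores how communication latency and the distribution of extra communication overhead affect parallel performance. The paper proposes an improved graph partitioning model that integrates several factors affecting parallel performance (communication latency, the maximum local extra communication overhead, and the overall extra communication overhead) into a single unified cost function. The model overcomes some drawbacks of the edge-cut metric and, by adjusting weighting parameters, can handle different optimization objectives and emphasize the influence of different factors on parallel performance.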
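
The abstract does not give the cost function itself, so the following is only a minimal sketch of what a weighted combination of the three named factors could look like. The function name `partition_cost`, the weights `alpha`, `beta`, `gamma`, and the use of "number of neighbouring parts" as a latency proxy are all assumptions made for illustration, not the paper's definitions.

```python
from collections import defaultdict


def partition_cost(edges, part, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted cost of a partition (illustrative form, not the paper's function).

    edges: iterable of (u, v, volume) tuples
    part:  dict mapping each vertex to its part id
    """
    local_overhead = defaultdict(float)  # volume each part sends or receives
    neighbours = defaultdict(set)        # distinct parts each part talks to
    for u, v, volume in edges:
        pu, pv = part[u], part[v]
        if pu == pv:
            continue                     # intra-part edge: no communication
        local_overhead[pu] += volume
        local_overhead[pv] += volume
        neighbours[pu].add(pv)
        neighbours[pv].add(pu)
    # Assumed latency proxy: the largest number of peers any part must contact.
    latency = max((len(s) for s in neighbours.values()), default=0)
    max_local = max(local_overhead.values(), default=0.0)
    total = sum(local_overhead.values()) / 2  # each cut edge was counted twice
    return alpha * latency + beta * max_local + gamma * total
```

Setting `alpha` and `beta` to zero reduces this sketch to a plain weighted edge cut, which is one way to see how the weighting parameters shift emphasis between objectives.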

5.

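A MANET Location Service Protocol Combining Strong and Weak State Information

宗明 王晓东 周兴铭 《计算机应用研究》2012,29(2):734-738

To optimize location service, a location service protocol combining strong and weak state information is proposed. The protocol dynamically adjusts its location service strategy to adapt to changing communication demand. Experiments analyze the protocol's storage and communication overhead and its effect on the packet delay and delivery ratio of the upper-layer routing protocol. The results show that the protocol reduces overhead while maintaining relatively high performance.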

6.

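A Transmission Load Optimization Strategy Based on Instance Reallocation in Heron

刘宇 于炯 蒲勇霖 李梓杨 张译天 《计算机应用研究》2021,38(1):198-203

Heron, a new-generation big-data stream computing framework, ignores the differences between communication modes among task instances and the imbalance of node resource utilization, which degrades system performance. To address this, the paper designs a node resource constraint model, a communication overhead optimization model, and an instance data-stream relationship model, and on that basis proposes a transmission load optimization strategy based on instance reallocation in Heron (TLIR-Heron). The strategy comprises a node resource constraint algorithm and an instance reallocation algorithm; by checking the reallocation conditions and running the reallocation algorithm, it converts inter-node data streams into intra-node data streams and so lowers communication overhead. Experiments on three test topologies show that, compared with Heron's default scheduling strategy, TLIR-Heron reduces inter-node communication overhead and system computation latency and improves the balance of resource utilization across compute nodes.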

7.

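Analysis of Communication Mechanism Optimization in Computer Systems

牛国新 王育欣 《网络与信息》2008,22(1):69

In high-performance parallel computer systems, as CPU speeds keep rising, the communication mechanism has become the primary factor limiting system performance. To optimize communication performance, this paper analyzes the factors that affect communication from both the hardware and the software side.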

8.

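A Stream Classification Task Scheduling Strategy in the Big-Data Stream Computing Framework Heron

张译天 于炯 鲁亮 李梓杨 《计算机应用》2019,39(4):1106-1116

Apache Heron, a new big-data stream computing framework, schedules tasks with a round-robin algorithm by default, ignoring the run-time state of the topology and the effect of different communication modes between task instances on system performance. To address this, a stream classification task scheduling strategy for Heron (DSC-Heron) is proposed, comprising a stream classification algorithm, a stream cluster allocation algorithm, and a stream classification scheduling algorithm. First, a Heron job model is built to make explicit the differences in communication overhead between the communication modes of task instances. Second, based on a stream classification model, data streams are classified by the real-time size of the data streams between task instances. Finally, interrelated high-frequency data streams are treated as a whole as the basic scheduling unit when building the task allocation plan, so that, subject to resource constraints, as much inter-node communication as possible is converted into intra-node communication to minimize system communication overhead. With the SentenceWordCount, WordCount, and FileWordCount topologies run on a 9-node Heron cluster, DSC-Heron improved on Heron's default scheduling strategy by an average of 8.35% in system completion latency, 7.07% in inter-node communication overhead, and 6.83% in system throughput; for load balance, the standard deviations of worker-node CPU utilization and memory utilization fell by an average of 41.44% and 41.23%, respectively. The results show that DSC-Heron improves the run-time performance of the test topologies to some extent, most markedly for the FileWordCount topology, which is closest to real application scenarios.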
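
The idea of turning inter-node streams into intra-node streams can be illustrated with a small greedy placement. This is not the DSC-Heron algorithm, whose details the abstract does not give; it is a hypothetical sketch in which the names `greedy_colocate` and `inter_node_traffic`, the per-node `capacity` limit, and the "heaviest stream first" rule are all assumptions.

```python
def greedy_colocate(streams, capacity, num_nodes):
    """Place instances so that heavy streams stay inside one node where possible.

    streams:  list of (instance_a, instance_b, rate) tuples
    capacity: maximum number of instances per node (assumed resource limit)
    """
    placement = {}          # instance -> node index
    load = [0] * num_nodes  # instances already placed on each node

    def place(instance, preferred=None):
        if instance in placement:
            return
        # Prefer the partner's node when it has room, else the least-loaded node.
        if preferred is not None and load[preferred] < capacity:
            node = preferred
        else:
            node = min(range(num_nodes), key=lambda i: load[i])
        if load[node] >= capacity:
            raise ValueError("not enough capacity for all instances")
        placement[instance] = node
        load[node] += 1

    # Handle the highest-rate streams first so they get co-located first.
    for a, b, _rate in sorted(streams, key=lambda s: s[2], reverse=True):
        place(a, placement.get(b))
        place(b, placement.get(a))
    return placement


def inter_node_traffic(streams, placement):
    """Total rate of streams whose two endpoints sit on different nodes."""
    return sum(rate for a, b, rate in streams if placement[a] != placement[b])
```

For example, with streams `[("s1", "b1", 100), ("b1", "c1", 10), ("s2", "b2", 90), ("b2", "c1", 5)]`, two nodes, and a capacity of three instances per node, only the lightest stream (rate 5) ends up crossing nodes.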

9.

Load Balancing in Multi-User Distributed Systems Based on the M/M/1 Model (cited by: 1; self-citations: 0; citations by others: 1)

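陈国栋 陈永生 《计算机工程》2008,34(23):125-127

For the load balancing problem in distributed systems, the dynamic globally optimal strategy is improved and combined with the static globally optimal strategy to yield a hybrid dynamic-static load balancing strategy. The strategy overcomes the weakness of dynamic load balancing when communication overhead is high and effectively improves the overall performance of the distributed system. Simulation results show that when communication overhead is high and the system load ratio exceeds 40%, the strategy achieves a lower expected system response time than dynamic load balancing.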
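
To make the role of communication overhead concrete, recall the standard M/M/1 result that the expected response time is 1/(μ − λ) for arrival rate λ below service rate μ. The sketch below uses that formula to decide whether sending a job to a less loaded node pays off once a transfer delay is added. The decision rule and the names `mm1_response_time` and `should_transfer` are illustrative assumptions, not the strategy proposed in the paper.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Expected response time of an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def should_transfer(local_rate, remote_rate, service_rate, comm_delay):
    """True if running remotely (plus the transfer delay) beats running locally."""
    local = mm1_response_time(local_rate, service_rate)
    remote = mm1_response_time(remote_rate, service_rate) + comm_delay
    return remote < local
```

With a service rate of 10, a node receiving 8 jobs per unit time has an expected response time of 0.5, while one receiving 2 has 0.125, so under these assumptions a transfer only helps when the communication delay stays below 0.375.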

10.

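Analysis and Optimization of Communication Efficiency in Mobile Agent Systems (cited by: 7; self-citations: 0; citations by others: 7)

杨博 刘大有 杨鲲 于万钧 《计算机研究与发展》2004,41(4):531-538

Communication efficiency is one of the key factors affecting the running efficiency of a mobile agent system, and how to improve it remains an open problem. The paper proposes LCEOM, a communication efficiency optimization model for mobile agent systems. LCEOM has the following advantages: (1) it takes full account of the main factors affecting communication efficiency; (2) it can describe the communication tasks of mobile agents; (3) it can analyze communication overhead quantitatively; (4) it can plan the communication scheme with the lowest communication overhead; (5) it uses compression and buffering techniques to further improve the efficiency of remote communication. Experiments show that, under certain conditions, LCEOM outperforms existing methods and effectively improves system communication efficiency.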

11.

Improving communication latency with the write-only architecture

Simon Spacey Wayne Luk Paul H.J. Kelly Daniel Kuhn 《Journal of Parallel and Distributed Computing》2012

This paper introduces a novel execution paradigm called the Write-Only Architecture (WOA) that reduces communication latency overheads by up to a factor of five over previous methods. The WOA writes data through distributed control flow logic rather than using a read–write paradigm or a centralised message hub which allows tasks to be partitioned at a fine-grained level without suffering from excessive communication overheads on distributed systems. In this paper we provide formal assignment results for software benchmarks partitioned using the WOA and previous execution paradigms for distributed heterogeneous architectures along with bounds and complexity information to demonstrate the robust performance improvements possible with the WOA.

12.

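Design and Optimization of a Runtime System Based on Process-Level Speculative Parallelism

刘雷 李晶 陈莉 冯晓兵 《计算机工程》2014,(3):99-102,112

Speculative parallelization is an important technique for parallelizing legacy sequential code, but earlier speculative parallelization runtime systems suffer from many performance problems, such as unbalanced task assignment, frequent communication, high conflict cost, and excessive overhead from frequent process startup and termination. To address these, a process-based speculative parallelization runtime system is proposed. It adopts an implicit single-program-multiple-data mode for partitioning and executing parallel tasks. A speculative task scheduling strategy that reuses processes, together with a delegated correctness-checking technique, lowers the overhead of speculative process startup/termination and of communication and raises the utilization of speculative processes. Daemon processes execute cooperatively with speculative processes so that the program still runs correctly when a speculative process hits an exception. Experimental results show that the process-based speculative runtime system improves performance by 231% over systems of the same type.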

13.

Scheduling communication in multithreaded programs: experimental results

Juan Carlos Gomez Vernon Rego V. S. Sunderam 《Concurrency and Computation》2006,18(1):1-28

When the critical path of a communication session between end points includes the actions of operating system kernels, there are attendant overheads. Along with other factors, such as functionality and flexibility, such overheads motivate and favor the implementation of communication protocols in user space. When implemented with threads, such protocols may hold the key to optimal communication performance and functionality. Based on implementations of reliable user‐space protocols supported by a threads framework, we focus on our experiences with internal threads' scheduling techniques and their potential impact on performance. We present scheduling strategies that enable threads to do both application‐level and communication‐related processing. With experiments performed on a Sun SPARC‐5 LAN environment, we show how different scheduling strategies yield different levels of application‐processing efficiency, communication latency and packet‐loss. This work forms part of a larger study on the implementation of multiple thread‐based protocols in a single address space, and the benefits of coupling protocols with applications. Copyright © 2005 John Wiley & Sons, Ltd.

14.

Local interactions over global broadcasts for improved task allocation in self-organized multi-robot systems

《Robotics and Autonomous Systems》2014,62(10):1453-1462

We present a study of self-organized multi-robot task-allocation, examining performance under local and centralized communication strategies. The results extend our current understanding of the effects of communication by providing evidence that local strategies can improve system performance over centralized strategies, in terms of total task throughput as well as reduced communication overheads. The framework employed is the attractive field model, a generic model of self-organized division of labour derived from observations of ant, human and robot social systems. The framework provides sufficient abstraction to accommodate both communication strategies. Each of the studies used 16 e-puck robots in a simplified manufacturing environment where sensing and communication was realized using camera-based overhead tracking and centralized communication. In terms of task throughput, communication overhead and energy efficiency, the experimental results show that systems with restricted access to information perform better than systems with free flow of information. This suggests a potential paradigm shift where, for self-organizing systems, diminishing access to information renders a system more efficient.

15.

Analysis of processor allocation in multiprogrammed, distributed-memory parallel processing systems

Setia S.K. Squillante M.S. Tripathi S.K. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(4):401-420

A main objective of scheduling independent jobs composed of multiple sequential tasks in shared-memory and distributed-memory multiprocessor computer systems is the assignment of these tasks to processors in a manner that ensures efficient operation of the system. Achieving this objective requires the analysis of a fundamental tradeoff between maximizing parallel execution, suggesting that the tasks of a job be spread across all system processors, and minimizing synchronization and communication overheads, suggesting that the job's tasks be executed on a single processor. The authors consider a class of scheduling policies that represent the essential aspects of this processor allocation tradeoff, and model the system as a distributed fork-join queueing system. They derive an approximation for the expected job response time, which includes the important effects of various parallel processing overheads (such as task synchronization and communication) induced by the processor allocation policy.

16.

Channel-aware multi-user uplink transmission scheme for SIMO-OFDM systems

ChengKang Pan YueMing Cai YouYun Xu 《中国科学F辑(英文版)》2009,52(9):1678-1687

The problem of medium access control (MAC) in wireless single-input multiple-output-orthogonal frequency division multiplexing (SIMO-OFDM) systems is addressed. Traditional random access protocols have low overheads and inferior performance. Centralized methods have superior performance and high overheads. To achieve the tradeoff between overhead and performance, we propose a channel-aware uplink transmission (CaUT) scheme for SIMO-OFDM systems. In CaUT, users transmit request-to-send (RTS) at some subcarriers whos...

17.

A study of overheads and accuracy for efficient monitoring of wireless mesh networks

Dhruv Gupta Daniel Wu Prasant Mohapatra Chen-Nee Chuah 《Pervasive and Mobile Computing》2010,6(1):93-111

IEEE 802.11-based wireless mesh networks are being increasingly deployed in enterprise and municipal settings. A lot of work has been done on developing measurement-based schemes for resource provisioning and fault management in these networks. The above goals require an efficient monitoring infrastructure to be deployed, which can provide the maximum amount of information regarding the network status, while utilizing the least possible amount of network resources. However, network monitoring involves overheads, which can adversely impact performance from the perspective of the end user. The impact of monitoring overheads on data traffic has been overlooked in most of the previous works. It remains unclear as to how parameters such as number of monitoring agents, or frequency of reporting monitoring data, among others, impact the performance of a wireless network. In this work, we first evaluate the impact of monitoring overheads on data traffic, and show that even small amounts of overhead can cause a large degradation in the network performance. We then explore several different techniques for reducing monitoring overheads, while maintaining the objective (resource provisioning, fault management, and others) that needs to be achieved. Via extensive simulations and experiments, we validate the efficiency of our proposed approaches in reducing overheads, their impact on the quality of data collected from the network, and the impact they have on the performance of the applications using the collected data. Based on results, we conclude that it is feasible to make the current monitoring techniques more efficient by reducing the communication overheads involved while still achieving the desired application-layer objectives.

18.

Parallel computing optimization in the Apollo domain network

Pekergin M.F. 《IEEE transactions on pattern analysis and machine intelligence》1992,18(4):296-303

The performance of parallel computing in a network of Apollo workstations where the processes use the remote procedure call (RPC) mechanism for communication is addressed. The speedup in such systems cannot be accurately estimated without taking into account the relatively large communication overheads. Moreover, it decreases by increasing parallelism when the latter exceeds some certain limit. To estimate the speedup and determine the optimum degree of parallelism, the author characterizes the parallelization and the communication overheads in the system considered. Then, parallel applications are modeled and their execution times are expressed for the general case of nonidentical tasks and workstations. The general case study allows the structural constraints of the applications to be taken into account by permitting their partitioning into heterogeneous tasks. A simple expression of the optimum degree of parallelism is obtained for identical tasks where the inherent constraints are neglected. The fact that the theoretical maximum speedup is bounded by half of the optimum degree of parallelism shows the importance of this measure.
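
One simple model that reproduces the "half of the optimum degree of parallelism" bound is sketched below. It assumes, purely for illustration, that total work W splits evenly over n identical tasks and that overhead grows linearly with n, so T(n) = W/n + c·n. The abstract does not state the paper's actual model, and the names `execution_time`, `optimum_parallelism`, and `speedup` are hypothetical. Under this assumption T(n) is minimized at n* = sqrt(W/c), where the speedup W/T(n*) equals n*/2.

```python
import math


def execution_time(work, overhead_per_task, n):
    """Assumed model: evenly split work plus overhead that grows linearly in n."""
    return work / n + overhead_per_task * n


def optimum_parallelism(work, overhead_per_task):
    """Degree of parallelism minimizing the assumed T(n): n* = sqrt(W / c)."""
    return math.sqrt(work / overhead_per_task)


def speedup(work, overhead_per_task, n):
    """Speedup relative to overhead-free sequential execution of the same work."""
    return work / execution_time(work, overhead_per_task, n)
```

With W = 100 and c = 1, the optimum is n* = 10 and the speedup there is 5; pushing to n = 20 lowers the speedup to 4, matching the observation that speedup decreases once parallelism exceeds a certain limit.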

19.

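Eliminating GPGPU Control-Flow Divergence Through Partial Warp Regrouping

沈立 杨耀华 王志英 《计算机工程与科学》2019,41(8):1335-1342

GPUs are widely used in today's high-performance computing systems, but their performance is severely constrained by divergent control-flow directions at run time. This problem is usually addressed with dynamic warp regrouping, which combines threads from one or more warps that follow the same control flow into a new warp. Such methods, however, commonly perform unnecessary regrouping, which introduces substantial extra performance overhead. The paper analyzes the performance overhead of thread regrouping and proposes an optimization called "partial regrouping". While preserving regrouping efficiency, it avoids regrouping warps that contain many active threads, effectively reducing the overhead introduced by thread regrouping. Test results show that partial regrouping delivers a fairly clear performance improvement while preserving regrouping efficiency.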

20.

Improving the real-time performance of Ethernet for plant automation (EPA) based industrial networks

Li LU Dong-qin FENGT Jian CHU 《浙江大学学报:C卷英文版》2013,14(6):433-448

Real-time Ethernet (RTE) control systems with critical real-time requirements are called fast real-time (FRT) systems. To improve the real-time performance of Ethernet for plant automation (EPA), we propose an EPA-FRT scheme. The minimum macrocycle of EPA networks is reduced by redefining the EPA network frame format, and the synchronization process is modified to acquire higher accuracy. A multi-segmented topology with a scheduling scheme is introduced to increase effective bandwidth utilization and reduce protocol overheads, and thus to shorten the communication cycle significantly. Performance analysis and practical tests on a prototype system show the effectiveness of the proposed scheme, which achieves the best performance at small periodic payload in large scale systems.