1.

王继曾 黎富贵 《计算机工程与设计》2009,30(21)

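This paper introduces the basic principles of the RSVP protocol and proposes a dynamic mobile RSVP extension scheme for mobile IP network environments. Within each cell, the scheme uses mobile agents and relies on the mobile IP network's multicast support and on GPS to predict node movement. From the signal transmission distance it computes how long the mobile node's local cell link will remain available, establishes two paths (a current-range reservation and an advance-range reservation), and switches data transmission between them automatically. Experimental results show that, in mobile IP networks, dynamic mobile RSVP outperforms mobile RSVP in packet delivery ratio and delay.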
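The link-availability calculation mentioned in this abstract can be illustrated with a small geometric sketch. The abstract does not give the formula, so the model below is an assumption: a circular cell of radius `radius` centred on the base station at the origin, and a node whose GPS position `pos` and velocity `vel` stay constant. The function name `link_available_time` is hypothetical.

```python
import math

def link_available_time(pos, vel, radius):
    """Seconds until a node leaves a circular cell centred at the origin.

    Assumed model (not from the paper): constant velocity, circular coverage.
    Solves |pos + vel * t| = radius for the positive root t.
    """
    px, py = pos
    vx, vy = vel
    speed_sq = vx * vx + vy * vy
    dist_sq = px * px + py * py
    if dist_sq > radius * radius:
        return 0.0          # already outside coverage
    if speed_sq == 0:
        return math.inf     # a stationary node never leaves the cell
    pv = px * vx + py * vy
    # Discriminant of speed_sq*t^2 + 2*pv*t + (dist_sq - radius^2) = 0,
    # divided by 4; it is non-negative whenever the node is inside the cell.
    disc = pv * pv - speed_sq * (dist_sq - radius * radius)
    return (-pv + math.sqrt(disc)) / speed_sq
```

A scheme of this kind could start the advance reservation once the returned time drops below some handover threshold; that threshold is likewise an assumption, not something the abstract specifies.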

2.

A scalable organization for distributed directories

Alberto Ros Manuel E. Acacio José M. García 《Journal of Systems Architecture》2010,56(2-3):77-87

Although directory-based cache-coherence protocols are the best choice when designing chip multiprocessors with tens of cores on-chip, the memory overhead introduced by the directory structure may not scale gracefully with the number of cores. Many approaches aimed at improving the scalability of directories have been proposed. However, they do not bring perfect scalability and usually reduce the directory memory overhead by compressing coherence information, which in turn results in extra unnecessary coherence messages and, therefore, wasted energy and some performance degradation. In this work, we present a distributed directory organization based on duplicate tags for tiled CMP architectures whose size is independent of the number of tiles of the system up to a certain number of tiles. We demonstrate that this number of tiles corresponds to the number of sets in the private caches. Additionally, we show that the area overhead of the proposed directory structure is 0.56% with respect to the on-chip data caches. Moreover, the proposed directory structure keeps the same information as a non-scalable full-map directory. Finally, we propose a mechanism that takes advantage of this directory organization to remove the network traffic caused by replacements. This mechanism reduces total traffic by 15% for a 16-core configuration compared to a traditional directory-based protocol.

3.

Practical Application and System Performance Analysis of RSVP-Based VoD

陆健贤 张凌 《计算机工程》1999,25(12):81-83

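This paper outlines the characteristics and working principles of the Resource Reservation Protocol (RSVP). Using a self-developed RSVP-based VoD (Video on Demand) experimental system running over an IP network, it analyzes and studies the actual performance of RSVP when applied to real-time transmission systems. The research and experimental results obtained offer some guidance for building systems of this kind.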

4.

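Using RSVP Tunnels to Efficiently Implement Quality-of-Service Requests on the Internet (total citations: 2; self-citations: 0; citations by others: 2)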

郭国强 张尧学 傅晓明 《计算机研究与发展》2000,37(1):55-60

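When RSVP is applied on the Internet, the additional traffic and the time and space overhead it introduces are among the key problems in RSVP research. This paper proposes a solution based on RSVP tunnels and QoS request aggregates. Its core ideas are as follows: a QoS request aggregate represents multiple RSVP session messages with specified characteristics; the RSVP modules of routers along the tunnel segment set up soft state for the aggregate's session messages and ignore the individual session messages that make up the aggregate; and each RSVP session message in the aggregate is encapsulated and traverses the tunnel segment as an ordinary data packet.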

5.

Experiences implementing efficient Java thread serialization,mobility and persistence

S. Bouchenak D. Hagimont S. Krakowiak N. De Palma F. Boyer 《Software》2004,34(4):355-393

Today, mobility and persistence are important aspects of distributed computing. They have many fields of use such as load balancing, fault tolerance and dynamic reconfiguration of applications. In this context, Java provides many useful mechanisms for the mobility of code via dynamic class loading, and the mobility or persistence of data via object serialization. However, Java does not provide any mechanism for the mobility/persistence of computation (i.e. threads). We designed and implemented a new mechanism, called Java thread serialization, that is used to build thread mobility or thread persistence. With it, a running Java thread can, at an arbitrary state of its execution, migrate to a remote machine where it resumes its execution, or be checkpointed on disk for possible subsequent recovery. With our services, migrating a thread is simply performed by calling our go primitive, and checkpointing/recovering a thread is performed by calling our store and load primitives. Several projects have recently addressed the issue of Java thread serialization, e.g. Sumatra, Wasp, JavaGo, Brakes, JavaGoX, Merpati. Some of them have attempted to minimize the overhead incurred by the thread serialization mechanism on thread performance, but none of them has been able to completely avoid this overhead. We propose a generic Java thread serialization mechanism that does not impose any performance overhead on serialized threads. This is achieved thanks to the use of type inference and dynamic de-optimization techniques. In this paper, we describe the design and implementation details of our thread serialization prototype in Sun Microsystems' JDK. We report on experiments conducted with our prototype, present a comparative performance evaluation of the main thread serialization techniques, and confirm the elimination of the performance overhead with our thread serialization mechanism. Copyright © 2003 John Wiley & Sons, Ltd.

6.

Epidemic-based reliable and adaptive multicast for mobile ad hoc networks

Oznur Ozkasap Zulkuf Genc Emre Atsan 《Computer Networks》2009,53(9):1409-1430

An emerging approach to distributed systems exploits the self-organization, autonomy and robustness of biological epidemics. In this article, we propose a novel bio-inspired protocol: EraMobile (Epidemic-based Reliable and Adaptive Multicast for Mobile ad hoc networks). We also present extensive performance analysis results for it. EraMobile supports group applications that require high reliability. The protocol aims to deliver multicast data reliably with minimal network overhead, even under adverse network conditions. With an epidemic-based multicast method, it copes with dynamic and unpredictable topology changes due to mobility. Our epidemic mechanism does not require maintaining any tree- or mesh-like structure for multicasting. It requires neither a global nor a partial view of the network, nor does it require information about neighboring nodes and group members. In addition, it substantially lowers overhead by eliminating redundant data transmissions. Another distinguishing feature is its ability to adapt to varying node densities. This lets it deliver data reliably in both sparse networks (where network connectivity is prone to interruptions) and dense networks (where congestion is likely). We describe the working principles of the protocol and study its performance through comparative and extensive simulations in the ns-2 network simulator.

7.

Hiding message delivery latency using Direct-to-Cache-Transfer techniques in message passing environments

Farshad Khunjush Nikitas J. Dimopoulos 《Microprocessors and Microsystems》2009,33(7-8):430-440

Communication overhead is the key obstacle to reaching hardware performance limits. The majority is associated with software overhead, a significant portion of which is attributed to message copying. To reduce this copying overhead, we have devised techniques that do not require copying a received message in order for it to be bound to its final destination. Rather, a late-binding mechanism, which involves address translation and a dedicated cache, facilitates fast access to received messages by the consuming process/thread. We have introduced two policies, namely Direct to Cache Transfer (DTCT) and lazy DTCT, that determine whether a message, after it is bound, needs to be transferred into the data cache. We have studied the proposed methods in simulation and have shown their effectiveness in reducing access times to message payloads by the consuming process.

8.

QoS Mechanism in Hierarchical Mobile IPv6 Environments (total citations: 2; self-citations: 0; citations by others: 2)

赵艳琼 杨寿保 《计算机科学》2004,31(3):44-47

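This paper proposes a QoS signaling scheme for hierarchical mobile IPv6 networks that provides quality-of-service guarantees for mobile nodes. After analyzing the characteristics of RSVP signaling and the complexity added by its multicast support, we design a simplified RSVP signaling protocol. We then combine this signaling with the hierarchical mobility protocol to obtain an integrated QoS handover scheme and, in a hierarchical mobile environment, compare this QoS signaling with several other QoS mechanisms in terms of resource reservation time and signaling load.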

9.

ASA-FTL: An adaptive separation aware flash translation layer for solid state drives

《Parallel Computing》2017

The flash-memory based Solid State Drive (SSD) presents a promising storage solution for increasingly critical data-intensive applications due to its low latency (high throughput), high bandwidth, and low power consumption. Within an SSD, its Flash Translation Layer (FTL) is responsible for exposing the SSD’s flash memory storage to the computer system as a simple block device. The FTL design is one of the dominant factors determining an SSD’s lifespan and performance. To reduce the garbage collection overhead and deliver better performance, we propose a new, low-cost, adaptive separation-aware flash translation layer (ASA-FTL) that combines sampling, data clustering and selective caching of recency information to accurately identify and separate hot/cold data while incurring minimal overhead. We use sampling for light-weight identification of separation criteria, and our dedicated selective caching mechanism is designed to save the limited RAM resource in contemporary SSDs. Using simulations of ASA-FTL with both real-world and synthetic workloads, we have shown that our proposed approach reduces the garbage collection overhead by up to 28% and the overall response time by 15% compared to one of the most advanced existing FTLs. We find that data clustering using a small sample size provides a significant performance benefit while incurring only a very small computation and memory cost. In addition, our evaluation shows that ASA-FTL is able to adapt to changes in the access pattern of workloads, which is a major advantage compared to existing fixed data separation methods.

10.

Lock Coarsening: Eliminating Lock Overhead in Automatically Parallelized Object-Based Programs

Pedro C. Diniz Martin C. Rinard 《Journal of Parallel and Distributed Computing》1998,49(2):858

Atomic operations are a key primitive in parallel computing systems. The standard implementation mechanism for atomic operations uses mutual exclusion locks. In an object-based programming system, the natural granularity is to give each object its own lock. Each operation can then make its execution atomic by acquiring and releasing the lock for the object that it accesses. But this fine lock granularity may have high synchronization overhead because it maximizes the number of executed acquire and release constructs. To achieve good performance it may be necessary to reduce the overhead by coarsening the granularity at which the computation locks objects. In this article we describe a static analysis technique, lock coarsening, designed to automatically increase the lock granularity in object-based programs with atomic operations. We have implemented this technique in the context of a parallelizing compiler for irregular, object-based programs and used it to improve the generated parallel code. Experiments with two automatically parallelized applications show these algorithms to be effective in reducing the lock overhead to negligible levels. The results also show, however, that an overly aggressive lock coarsening algorithm may harm the overall parallel performance by serializing sections of the parallel computation. A successful compiler must therefore negotiate a trade-off between reducing lock overhead and increasing the serialization.
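The effect of lock coarsening can be shown with a hand-written before/after sketch. The paper applies the transformation automatically through static analysis in a compiler; the code below only illustrates the result by hand, and the `Counter` class, its `acquisitions` instrumentation field, and both driver functions are hypothetical names introduced for this illustration.

```python
import threading

class Counter:
    """An object with its own lock, as in per-object locking."""
    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0
        self.acquisitions = 0   # instrumentation: counts lock acquisitions

    def add(self, n):
        # Fine-grained: every operation acquires and releases the lock.
        with self.lock:
            self.acquisitions += 1
            self.value += n

    def add_unlocked(self, n):
        # Caller must already hold self.lock.
        self.value += n

def fine_grained(counter, items):
    for n in items:
        counter.add(n)                   # one acquire/release per item

def coarsened(counter, items):
    with counter.lock:                   # one acquire/release for the whole loop
        counter.acquisitions += 1
        for n in items:
            counter.add_unlocked(n)
```

Both versions compute the same sum, but the coarsened loop performs a single acquire/release pair instead of one per element. It also holds the lock for the entire loop, which is the serialization cost the abstract warns about.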

11.

Research on a Persistent Storage Method Resistant to Board-Level Physical Attacks

李闽 张倩颖 王国辉 施智平 关永 《计算机工程》2022,48(2):132-139

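To protect file system security, this paper proposes a persistent storage method that resists board-level physical attacks. It uses ARM TrustZone technology to build a persistent storage architecture that implements a memory protection mechanism and a persistent storage protection service, improving the physical security of the file system. The memory protection mechanism is built on on-chip memory (OCM) in the kernel layer of the trusted execution environment (TEE), ensuring that trusted applications in the TEE can resist board-level physical attacks. Based on the TEE memory protection mechanism, a persistent storage protection service is implemented to protect sensitive data in the file system, ensuring the file system's confidentiality and integrity. A prototype of the persistent storage architecture is implemented on a physical development board; benchmark tools are used to evaluate the prototype's performance, and the causes of the performance loss are analyzed. The test results show that the time overhead introduced by the memory protection mechanism in protecting the physical security of the TEE system decreases as the OCM grows, and that the time overhead of the persistent storage protection service when protecting small amounts of sensitive data is within a range acceptable to users.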

12.

A Speculative and Adaptive MPI Rendezvous Protocol Over RDMA-enabled Interconnects

Mohammad J. Rashti Ahmad Afsahi 《International journal of parallel programming》2009,37(2):223-246

Overlapping computation with communication is a key technique to conceal the effect of communication latency on the performance of parallel applications. Message Passing Interface (MPI) is a widely used message passing standard for high performance computing. One of the most important factors in achieving a good level of overlap is the MPI ability to make progress on outstanding communication operations. In this paper, we propose a novel speculative MPI Rendezvous protocol that uses RDMA Read and RDMA Write to effectively improve communication progress and consequently the overlap ability. Performance results based on a modified MPICH2 implementation over 10-Gigabit iWARP Ethernet reveal a significant (80–100%) improvement in receiver side overlap and progress ability. We have also observed up to 30% improvement in application wait time for some NPB applications as well as the RADIX application. For applications that do not benefit from this protocol, an adaptation mechanism is used to stop the speculation to effectively reduce the protocol overhead.

13.

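Design and Implementation of the Integrated Services Model in the OPNET Environment (total citations: 1; self-citations: 0; citations by others: 1)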

许伟 周建中 王天慧 《计算机工程与科学》2005,27(6):21-23

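Modeling and simulation is one of the best ways to study the integrated services model. This paper describes the working mechanism of RSVP signaling under the integrated services architecture, together with the design and implementation of the protocol structure. An integrated services prototype system is designed in the OPNET simulation environment, and the QoS mechanisms of the model are studied. Analysis of the simulation results confirms the correctness of the protocol and model design.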

14.

VMMB: Virtual Machine Memory Balancing for Unmodified Operating Systems

Changwoo Min Inhyeok Kim Taehyoung Kim Young Ik Eom 《Journal of Grid Computing》2012,10(1):69-84

Virtualization technology has been widely adopted in Internet hosting centers and cloud-based computing services, since it reduces the total cost of ownership by sharing hardware resources among virtual machines (VMs). In a virtualized system, a virtual machine monitor (VMM) is responsible for allocating physical resources such as CPU and memory to individual VMs. Whereas CPU and I/O devices can be shared among VMs in a time sharing manner, main memory is not amenable to such multiplexing. Moreover, it is often the primary bottleneck in achieving higher degrees of consolidation. In this paper, we present VMMB (Virtual Machine Memory Balancer), a novel mechanism to dynamically monitor the memory demand and periodically re-balance the memory among the VMs. VMMB accurately measures the memory demand with low overhead and effectively allocates memory based on the memory demand and the QoS requirement of each VM. It is applicable even to guest OSes whose source code is not available, since VMMB does not require modifying the guest kernel. We implemented our mechanism on Linux and experimented on synthetic and realistic workloads. Our experiments show that VMMB can improve the performance of VMs that suffer from insufficient memory allocation by up to 3.6 times with low performance overhead (below 1%) for monitoring memory demand.

15.

Low complexity LMS-type adaptive algorithm with selective coefficient update for stereophonic acoustic echo cancellation

Khaled Mayyas 《Computers & Electrical Engineering》2009,35(3):450-458

Stereophonic acoustic echo cancellation (SAEC) has recently attracted much attention and found a viable place in a number of hands-free applications. In this paper, we propose an LMS-type algorithm for SAEC based on decomposing the long adaptive filter of each channel of the SAEC system into smaller subfilters. We further reduce the complexity of the algorithm by employing the selective coefficient update (SCU) method in each subfilter. This leads to a significant improvement in the convergence rate of the algorithm with low computational overhead. However, the algorithm has a high final mean-square error (MSE) at steady state that increases as the number of subfilters increases. A combined-error algorithm is presented that achieves fast convergence without compromising the steady-state error level. Simulations demonstrate the convergence speed advantages of the combined-error algorithm.
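The selective coefficient update idea can be sketched for a single filter. The abstract does not say which selection rule the paper uses, so the rule below (update only the `m` taps whose input samples have the largest magnitude, often called M-max selection) is an assumption, and the sketch omits the paper's subfilter decomposition and combined-error scheme. The name `scu_lms_step` is hypothetical.

```python
def scu_lms_step(w, x, d, mu, m):
    """One LMS step that updates only m of the len(w) coefficients.

    w  : list of filter coefficients (modified in place)
    x  : list of the most recent input samples, same length as w
    d  : desired (reference) sample
    mu : step size
    m  : number of coefficients to update (assumed M-max selection)
    Returns the a-priori error e = d - w.x
    """
    y = sum(wi * xi for wi, xi in zip(w, x))   # filter output
    e = d - y                                  # error before the update
    # Pick the m taps with the largest input magnitude.
    selected = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:m]
    for i in selected:
        w[i] += mu * e * x[i]                  # standard LMS update on those taps
    return e
```

With `m == len(w)` this reduces to ordinary LMS; a smaller `m` cuts the per-sample update cost, at the price of the selection step.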

16.

Analysis of burstiness monitoring and detection in an adaptive Web system

Katja Gilly Salvador Alcaraz Carlos Juiz Ramon Puigjaner 《Computer Networks》2009,53(5):668-679

Due to the heavy-tailed pattern of Internet traffic, it is crucial to monitor the incoming arrival rate in a Web system to preserve its performance. In this study, we focus on the arrival rate processing mechanism as part of the design of an adaptive load balancing Web algorithm. The arrival rate is one of the most important metrics to be monitored in a Web site to avoid the possible congestion of Web servers. Six methods are analysed to detect the burstiness of incoming traffic in a Web system. We define six burstiness factors to be individually included in an adaptive load balancing algorithm, which also needs to monitor some Web servers’ parameters continuously, such as the arrival rate at the server or its CPU utilization, in order to avoid an unexpected overload situation. We also define adaptive time slot scheduling based on the burstiness factor, which considerably reduces the overhead of the monitoring process by increasing the monitoring frequency when bursty traffic arrives at the system and by decreasing the frequency when no bursts are detected in the arrival rate. Simulation results of the behaviour of the six burstiness factors and adaptive time slot scheduling when sudden changes are detected in the arrival rate are presented and discussed. We have considered a scenario made up of a locally distributed cluster-based Web information system for simulations.
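Adaptive time slot scheduling of the kind described here can be sketched as a rule that shortens the monitoring interval during bursts and lengthens it otherwise. The abstract does not define the six burstiness factors, so the factor below (current arrival rate divided by the long-run mean rate), the halving/doubling rule, and all parameter defaults are assumptions made for illustration.

```python
def next_interval(interval, rate, mean_rate, threshold=1.5,
                  min_interval=1.0, max_interval=60.0):
    """Return the next monitoring interval in seconds.

    Assumed burstiness factor: rate / mean_rate. Above `threshold` the
    traffic counts as bursty and the system is monitored more often.
    """
    burst_factor = rate / mean_rate if mean_rate > 0 else 0.0
    if burst_factor > threshold:
        interval /= 2      # bursty: monitor twice as often
    else:
        interval *= 2      # calm: back off to reduce monitoring overhead
    # Keep the interval inside the configured bounds.
    return max(min_interval, min(max_interval, interval))
```

For example, with an 8-second interval, a rate three times the mean shortens the next slot to 4 seconds, while a rate equal to the mean lengthens it to 16 seconds.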

17.

RSVP Browser: Web Browsing on Small Screen Devices

O. de Bruijn R. Spence M. Y. Chong 《Personal and Ubiquitous Computing》2002,6(4):245-252

In this paper, we illustrate the use of space-time trade-offs for information presentation on small screens. We propose the use of Rapid Serial Visual Presentation (RSVP) to provide a rich set of navigational information for Web browsing. The principle of RSVP browsing is applied to the development of a Web browser for small screen devices, the RSVP browser. The results of an experiment in which Web browsing with the RSVP browser is compared with that of a typical WAP browser suggest that RSVP browsing may indeed offer an alternative to other forms of Web browsing on small screen devices.

18.

Compiler Controlled Prefetching for Multiprocessors Using Low-Overhead Traps and Prefetch Engines

《Journal of Parallel and Distributed Computing》2000,60(5):585-615

In this paper we propose and evaluate a new data-prefetching technique for cache coherent multiprocessors. Prefetches are issued by a functional unit called a prefetch engine which is controlled by the compiler. We let second-level cache misses generate cache miss traps and start the prefetch engine in a trap handler. The trap handler is fast (40–50 cycles) and does not normally delay the program beyond the memory latency of the miss. Once started, the prefetch engine executes on its own and causes no instruction overhead. The only instruction overhead in our approach is when a trap handler completes after data arrives. The advantages of this technique are (1) it exploits static compiler analysis to determine what to prefetch, which is hard to do in hardware, (2) it uses prefetching with very little instruction overhead, which is a limitation for traditional software-controlled prefetching, and (3) it is accurate in the sense that it generates very little useless traffic while maintaining a high prefetching coverage. We also study whether one could emulate the prefetch engine in software, which would not require any additional hardware beyond support for generating cache miss traps and ordinary prefetch instructions. In this paper we present the functionality of the prefetch engine and a compiler algorithm to control it. We evaluate our technique on six parallel scientific and engineering applications using an optimizing compiler with our algorithm and a simulated multiprocessor. We find that the prefetch engine removes up to 67% of the memory access stall time at an instruction overhead less than 0.42%. The emulated prefetch engine removes in general less stall time at a higher instruction overhead.

19.

Design of optimized cascade fuzzy controller based on differential evolution: Simulation studies and practical insights

Sung-Kwun Oh Wook-Dong Kim Witold Pedrycz 《Engineering Applications of Artificial Intelligence》2012,25(3):520-532

In this study, we discuss the design of an optimized cascade fuzzy controller for the rotary inverted pendulum system and the ball & beam system, using differential evolution (DE) as the optimization vehicle. The structure of the differential evolution optimization environment is simple, and the convergence to optimal values realized here is very good in comparison to the convergence reported for other optimization algorithms. DE is easy to use given its mathematical operators. It also requires limited computing overhead. The rotary inverted pendulum system and the ball & beam system are nonlinear systems that exhibit unstable motion. The performance of the proposed fuzzy controller is evaluated from the viewpoint of several performance criteria such as overshoot, steady-state error, and settling time. Their values are obtained through simulation studies and practical, real-world experiments. We evaluate and analyze the performance of the proposed fuzzy controller when optimized by a Genetic Algorithm (GA) and by DE. In this setting, we show the superiority of DE over the other methods used here and highlight the characteristics of this optimization tool.
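For readers unfamiliar with differential evolution, a minimal sketch of the common DE/rand/1/bin variant follows. The abstract does not state which DE variant or parameter values the authors used, so the variant, the defaults for `F` and `CR`, the bound clipping, and the function name are all assumptions. In a controller-tuning setting, `f` would be a cost such as a weighted sum of overshoot and settling time evaluated over the controller parameters.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=100, seed=0):
    """Minimise f over box `bounds` with DE/rand/1/bin (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)   # guarantees at least one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])  # mutation
                    v = min(hi, max(lo, v))                      # clip to bounds
                else:
                    v = pop[i][j]                                # inherit from target
                trial.append(v)
            ft = f(trial)
            if ft <= fit[i]:              # greedy selection: never gets worse
                pop[i], fit[i] = trial, ft
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]
```

Because selection is greedy, the best cost in the population never increases from one generation to the next, which is one reason the method is simple to reason about.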

20.

A New RSVP Scheme Based on HMIPv6

吴进 贺辉 邹波 《计算机工程》2009,35(14):117-119

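In mobile IPv6 environments, the IntServ/RSVP model has difficulty guaranteeing QoS. To address this problem, the paper proposes a new resource reservation scheme built on the hierarchical mobile IPv6 protocol. It analyzes the resource reservation mechanism for mobile nodes within a domain and compares the scheme's performance with MRSVP and HMRSVP; the results show that the scheme achieves relatively high performance.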