1.
We revisit memory hierarchy design viewing memory as an inter-operation communication mechanism. We show how dynamically collected information about inter-operation memory communication can be used to improve memory latency. We propose two techniques: (1) Speculative Memory Cloaking, and (2) Speculative Memory Bypassing. In the first technique, we use memory dependence prediction to speculatively identify dependent loads and stores early in the pipeline. These instructions may then communicate prior to address calculation and disambiguation via a fast communication mechanism. In the second technique, we use memory dependence prediction to speculatively transform DEF-store-load-USE dependence chains within the instruction window into DEF-USE ones. As a result, dependent stores and loads are taken off the communication path, resulting in a further reduction in communication latency. Experimental analysis shows that our methods, on average, correctly handle 40% (integer) and 19% (floating point) of all memory loads. Moreover, our techniques result in performance improvements of 4.28% (integer) and 3.20% (floating point) over a highly aggressive, dynamically scheduled processor implementing naive memory dependence speculation. We also study the value and address locality characteristics of the values our methods correctly handle. We demonstrate that our methods are orthogonal to both address and value prediction.
2.
《Advanced Robotics》2013,27(12-13):1743-1760
Earthworm-like peristaltic motion has attracted attention in recent years because it is useful for moving through small spaces. An earthworm robot built from a shape memory alloy, the BioMetal Helix, and a polyester braided tube was studied. BioMetal is a fiber-like actuator that resembles muscle tissue; the BioMetal Helix (BMX series) was chosen to obtain good unit contraction force. The BMX, elongated at room temperature, becomes stiff and sharply contracts when a current is fed through it. For the unit expansion force, a polyester braided tube was used. The braided tube maintains a long and thin shape without compression, and when an external force presses the tube in the axial direction, it becomes shorter and thicker. When the external force is removed, the tube becomes longer and thinner again. A prototype robot consisting of four units was developed. The robot was designed with three-dimensional computer-aided design, and the expansion and contraction timing of units was calculated through computer simulations. The simulated results closely resembled the experiments and the robot was improved by adaptation according to the simulated results.
3.
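Data deduplication effectively improves storage utilization and is now widely used in backup and archival systems. However, its bitstream-based hash matching strategy is too strict for many applications, such as duplicate image removal. To address this problem, a fast and accurate image deduplication method is proposed. The method first defines duplicate images according to the characteristics of Web images, then divides image deduplication into two stages. In the duplicate detection stage, multiple filtering techniques, including perceptual hashing, improve the speed and precision of image retrieval; in the duplicate elimination stage, fuzzy logic inference selects a centroid image to carry out the deduplication. Experimental results show that the method removes duplicate images quickly and accurately, and that its choice of centroid image also meets users' perceptual requirements.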
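The contrast between strict bitstream hashing and perceptual hashing can be illustrated with a minimal sketch. It uses a simple average hash, which is an assumption chosen for illustration and not necessarily the perceptual hash or the filtering pipeline used in the paper; the function names and the `max_distance` threshold are likewise hypothetical.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is above the image mean, else 0.

    `pixels` is a small grayscale thumbnail given as a list of rows of ints.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(img_a, img_b, max_distance=1):
    # max_distance is an illustrative threshold, not a value from the paper.
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance


original = [[10, 200], [220, 30]]
brighter = [[25, 215], [235, 45]]   # same picture, uniformly brightened
different = [[200, 10], [30, 220]]  # bright and dark regions swapped
```

A byte-level hash would treat `original` and `brighter` as unrelated because every pixel value differs, whereas the average hash maps both to the same bit string and so flags them as near-duplicates.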
4.
P. Fua 《International Journal of Computer Vision》1998,26(3):215-234
We propose an automated approach to modeling drainage channels, and more generally linear features that lie on the terrain, from multiple images. It produces models of the features and of the surrounding terrain that are accurate and consistent and requires only minimal human intervention. We take advantage of geometric constraints and photometric knowledge. First, rivers flow downhill and lie at the bottom of valleys whose floors tend to be either V- or U-shaped. Second, the drainage pattern appears in gray-level images as a network of linear features that can be visually detected. Many approaches have explored individual facets of this problem. Ours unifies these elements in a common framework. We accurately model terrain and features as 3-dimensional objects from several information sources that may be in error and inconsistent with one another. This approach allows us to generate models that are faithful to sensor data, internally consistent and consistent with physical constraints. We have proposed generic models that have been applied to the specific task at hand. We show that the constraints can be expressed in a computationally effective way and, therefore, enforced while initializing the models and then fitting them to the data. Furthermore, these techniques are general enough to work on other features that are constrained by predictable forces.
5.
Martino Ruggiero Alessio Guerri Davide Bertozzi Michela Milano Luca Benini 《International journal of parallel programming》2008,36(1):3-36
The problem of allocating and scheduling precedence-constrained tasks on the processors of a distributed real-time system is NP-hard. As such, it has been traditionally tackled by means of heuristics, which provide only approximate or near-optimal solutions. This paper proposes a complete allocation and scheduling framework, and deploys an MPSoC virtual platform to validate the accuracy of modelling assumptions. The optimizer implements an efficient and exact approach to the mapping problem based on a decomposition strategy. The allocation subproblem is solved through Integer Programming (IP) while the scheduling one through Constraint Programming (CP). The two solvers interact by means of an iterative procedure which has been proven to converge to the optimal solution. Experimental results show significant speed-ups over pure IP and CP exact solution strategies as well as high accuracy with respect to cycle-accurate functional simulation. Two case studies further demonstrate the practical viability of our framework for real-life applications.
6.
Stanoi I. Mihaila G.A. Lang C.A. Palpanas T. 《Knowledge and Data Engineering, IEEE Transactions on》2007,19(9):1214-1226
Monitoring systems today often involve continuous queries over streaming data in a distributed collaborative fashion. The distribution of query operators over a network of processors, together with their processing sequence, forms a query configuration with inherent constraints on the throughput that it can support. In this paper, we discuss the implications of measuring and optimizing for output throughput, as well as its limitations. We propose to use instead the more granular input throughput and a variant of this measure, the profiled input throughput, that is focused on matching the expected behavior of the input streams. We show how we can evaluate a query configuration based on profiled input throughput and that the problem of finding the optimal configuration is NP-hard. Furthermore, we describe how we can overcome the complexity limitation by adapting hill-climbing heuristics to reduce the search space of configurations. We show experimentally that the approach used is not only efficient but also effective.
7.
Ken’ichiro Ohta Kunihiko Sadakane Akiyoshi Shioura Takeshi Tokuyama 《Algorithmica》2005,42(2):141-158
We propose an efficient and accurate randomized approximation algorithm for computing the price of European-Asian options. Our algorithm can be seen as a modification of the approximation algorithm developed by Aingworth et al. (2000) into a randomized algorithm, which improves the accuracy theoretically as well as practically. We also propose a new option named the Saving-Asian option which enjoys advantages of both the European-Asian and American-Asian options. It is shown that our approximation algorithm also works for pricing Saving-Asian options.
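To make the pricing problem concrete, the sketch below prices an average-price (European-Asian) call on a binomial price model in two ways: by enumerating all 2^n price paths, and by plain Monte Carlo sampling of paths. Both are illustrative baselines only; neither is the algorithm of Aingworth et al. nor the randomized algorithm proposed in the paper. The payoff convention (averaging the prices after each of the n steps, excluding the initial price) and all function and parameter names are assumptions made for this example.

```python
import random
from itertools import product


def asian_call_exact(s0, strike, up, down, p_up, n_steps, discount=1.0):
    """Expected discounted payoff max(average - strike, 0) over all 2**n paths."""
    total = 0.0
    for moves in product((True, False), repeat=n_steps):
        price, running_sum, prob = s0, 0.0, 1.0
        for went_up in moves:
            price *= up if went_up else down
            prob *= p_up if went_up else 1.0 - p_up
            running_sum += price
        total += prob * max(running_sum / n_steps - strike, 0.0)
    return discount * total


def asian_call_sampled(s0, strike, up, down, p_up, n_steps, n_samples,
                       seed=0, discount=1.0):
    """Plain Monte Carlo estimate of the same quantity from random paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        price, running_sum = s0, 0.0
        for _ in range(n_steps):
            price *= up if rng.random() < p_up else down
            running_sum += price
        total += max(running_sum / n_steps - strike, 0.0)
    return discount * total / n_samples
```

The exact version visits every path, so its cost doubles with each additional step; this exponential growth is what motivates approximation algorithms for path-dependent options in the first place.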
8.
9.
10.
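Principles of Ferroelectric Memory and a Comparison of Its Applications (cited by: 2 in total; self-citations: 1; citations by others: 2)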
孙树印 《单片机与嵌入式系统应用》2004,(9):15-18
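Introduces the general essentials and basic principles of ferroelectric memory (FRAM) and analyzes its read and write operations and their timing in detail. FRAM is compared with other memory types, and the strengths and weaknesses of each in different settings are analyzed. Finally, using the FM1808 as an example, the article describes a practical interface between parallel FRAM and 8051-series microcontrollers, focusing on how it differs from using ordinary SRAM.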
11.
Kaleigh Smith Pierre-Edouard Landes Joëlle Thollot Karol Myszkowski 《Computer Graphics Forum》2008,27(2):193-200
This paper presents a quick and simple method for converting complex images and video to perceptually accurate greyscale versions. We use a two-step approach: first, we globally assign grey values and determine colour ordering; second, we locally enhance the greyscale to reproduce the original contrast. Our global mapping is image independent and incorporates the Helmholtz-Kohlrausch colour appearance effect for predicting differences between isoluminant colours. Our multiscale local contrast enhancement reintroduces lost discontinuities only in regions that insufficiently represent original chromatic contrast. All operations are restricted so that they preserve the overall image appearance, lightness range and differences, colour ordering, and spatial details, resulting in perceptually accurate achromatic reproductions of the colour original.
12.
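This paper discusses the dual-port memory (DPM), a high-speed communication facility based on the YHF2 simulation computer. The DPM provides the foundation for concurrent multi-machine simulation. On the hardware side, the paper describes the structure of the DPM and of the DDA adapter. On the software side, it describes the operating characteristics of the high-speed communication device and proposes a method for accessing the DPM. Finally, it gives an example of YHF2 accessing the DPM.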
13.
Junchang Wang Kai Zhang Xinan Tang Bei Hua 《International journal of parallel programming》2013,41(1):137-159
Core-to-core communication is critical to the effective use of multi-core processors. A number of software-based concurrent lock-free queues have been proposed to address this problem. Existing solutions, however, suffer from performance degradation in real testbeds, or rely on auxiliary hardware or software timers to handle the deadlock problem when batching is used, making those solutions good in theory but difficult to use in practice. This paper describes the pros and cons of existing concurrent lock-free queues in both dummy and real testbeds and proposes B-Queue, an efficient and practical single-producer-single-consumer concurrent lock-free queue that solves the deadlock problem gracefully by introducing a self-adaptive backtracking mechanism. Experiments show that in real massively-parallel applications, B-Queue is faster than FastForward and MCRingBuffer, the two state-of-the-art concurrent lock-free queues, by up to 10x and 5x, respectively. Moreover, B-Queue outperforms FastForward and MCRingBuffer in terms of stability and scalability, making it a good candidate for fast core-to-core communication on multi-core architectures.
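For readers unfamiliar with the data structure, the sketch below shows a classic bounded single-producer-single-consumer ring buffer, in which each index is written by only one side. It is not B-Queue: it has no batching and no backtracking mechanism, and as a Python sketch it shows only the index discipline, not the memory-ordering or cache behaviour that the paper's performance results depend on. The class and method names are hypothetical.

```python
class SpscRingQueue:
    """Bounded FIFO intended for exactly one producer and one consumer."""

    def __init__(self, capacity):
        # One slot stays unused so that head == tail always means "empty".
        self._slots = [None] * (capacity + 1)
        self._head = 0  # next slot to read; advanced only by the consumer
        self._tail = 0  # next slot to write; advanced only by the producer

    def try_enqueue(self, item):
        """Producer side: return False instead of blocking when full."""
        next_tail = (self._tail + 1) % len(self._slots)
        if next_tail == self._head:
            return False
        self._slots[self._tail] = item
        self._tail = next_tail  # publish only after the slot is filled
        return True

    def try_dequeue(self):
        """Consumer side: return (False, None) instead of blocking when empty."""
        if self._head == self._tail:
            return (False, None)
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        return (True, item)
```

Because the producer only advances `_tail` and the consumer only advances `_head`, neither side needs a lock to decide whether the queue is full or empty; batching schemes build on this layout by checking several slots at once.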
14.
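In network I/O schemes built on PCI and other traditional I/O buses, network communication performance is limited by the bus interface. To address this, Direct Memory Communication (DMC) is proposed. A DMC network card plugs directly into a memory slot; the storage on the card is reserved by the system as a dedicated communication region and is managed and accessed in the same way as ordinary memory. Data to be sent is written directly into the card's dedicated communication region with ordinary memory writes, and network data received in that region is obtained by the user with ordinary memory reads, so that computers communicate directly from memory to memory. As a result, DMC frees network communication speed from the limits of PCI and other traditional I/O buses and eliminates the data copying between the network card and memory that traditional communication mechanisms require, giving it a high communication rate, low latency, and simple operation. A prototype DMC network card was designed for a high-speed Fibre Channel switched network, demonstrating the correctness and feasibility of DMC.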
15.
Accurate and Fast Proximity Queries Between Polyhedra Using Convex Surface Decomposition (cited by: 9 in total; self-citations: 0; citations by others: 9)
The need to perform fast and accurate proximity queries arises frequently in physically-based modeling, simulation, animation, real-time interaction within a virtual environment, and game dynamics. The set of proximity queries includes intersection detection, tolerance verification, exact and approximate minimum distance computation, and (disjoint) contact determination. Specialized data structures and algorithms have often been designed to perform each type of query separately. We present a unified approach to perform any of these queries seamlessly for general, rigid polyhedral objects with boundary representations that are orientable 2-manifolds. The proposed method involves a hierarchical data structure built upon a surface decomposition of the models. Furthermore, the incremental query algorithm takes advantage of coherence between successive frames. It has been applied to complex benchmarks and compares very favorably with earlier algorithms and systems.
16.
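Metal magnetic memory testing is a new nondestructive testing technique for the early diagnosis of hidden damage in ferromagnetic equipment components and for preventing sudden fatigue failure. It can accurately locate dangerous components and regions on the test object that are characterized by stress concentration zones, and it is the only effective nondestructive testing method for early diagnosis of ferromagnetic components. By measuring the leakage magnetic field and identifying the distinctive zero crossing of the field's normal component, the method determines that a ferromagnetic component has a defect or a stress concentration. This leakage-field testing method finds stress concentration zones quickly and conveniently, enabling early diagnosis of equipment components.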
17.
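In the Internet of Things, every smart device can go online, so the total volume of information is very large even though each device transmits only a small amount; new techniques are therefore needed to process or store these many small messages quickly. Building on batch processing with sequential output, and combining in-memory processing, fast packet processing, and log storage on remote machines, the approach provides a way to receive and process IoT messages quickly, and applies updates to disk in sequential batches so as to maximize disk performance.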
18.
张宇 《电脑编程技巧与维护》2010,(4):97-98,103
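Traditional image processing generally assigns values pixel by pixel, which is extremely slow and nearly unusable for large images. The Delphi scanline method instead scans each row of the image to obtain the memory addresses of its pixels. Operating on memory in this way is far more efficient than conventional per-pixel assignment and can greatly speed up the processing of large bitmap images.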
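The difference between per-pixel assignment and row-at-a-time memory access can be sketched as follows. This is a Python analogy, not Delphi code and not the article's implementation: the image is assumed to be an 8-bit grayscale bitmap stored row by row in a `bytearray`, and the operation (inverting the image) and all function names are chosen only for illustration.

```python
# Lookup table mapping every byte value v to 255 - v.
INVERT_TABLE = bytes(255 - v for v in range(256))


def invert_per_pixel(buf, width, height):
    """Visit and assign each pixel individually."""
    for y in range(height):
        for x in range(width):
            i = y * width + x
            buf[i] = 255 - buf[i]


def invert_per_row(buf, width, height):
    """Fetch each row once and rewrite it as a single block of memory."""
    for y in range(height):
        start = y * width
        row = buf[start:start + width]
        buf[start:start + width] = row.translate(INVERT_TABLE)
```

Both functions modify the buffer in place and produce the same result; the row-based version replaces `width` separate assignments per row with one block operation, which mirrors the idea behind scanline access.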
19.
Y. Yamamoto 《Quantum Information Processing》2006,5(5):299-311
This paper reviews single-photon sources based on semiconductor quantum dots and their applications to quantum information systems. By optically pumping a system consisting of a single semiconductor quantum dot confined in a monolithic microcavity, it is possible to produce a single-photon pulse stream at the Fourier transform limit with negligible jitter. This single-photon source is not only useful for BB84 quantum key distribution (QKD), but also finds applications in other quantum information systems such as Ekert91/BBM92 QKD and quantum teleportation gate linear optical quantum computers.
20.
《Advanced Robotics》2013,27(7):979-1002
In recent years, SLAMMOT (simultaneous localization, mapping and moving object tracking) has attracted widespread attention in the mobile robot field. This paper proposes a new approach, SLAMMOT-SP, which combines SLAMMOT and scene prediction (SP). It extends the SLAMMOT problem to simultaneous map prediction and moving object trajectory prediction. The robot not only passively collects data and executes SLAMMOT, but also actively predicts the scene. The recursive Bayesian formulation of SLAMMOT-SP is derived for real-time operations. A generalized framework for tracking and predicting the moving objects is also proposed. Simulations and experiments show that the proposed SLAMMOT-SP is effective and can be performed in real time.