Similar literature
1.
Can a parallel computer with n processors solve a computational problem more than n times faster than a sequential computer? Can it solve it more than n times better? New computational paradigms offer an affirmative answer to the above questions through concrete examples in which the improvement in speed or quality is superlinear in the number of processors used by the parallel computer. Furthermore, the improvement is consistent and provable. All examples are characterized by the presence of one or several real-time input streams. In one of the examples, an exponential improvement in speed is achieved despite the fact that the processors of the parallel computer are significantly slower than their sequential counterpart. In another example, the improvement in quality is unbounded. A metaphor from everyday life motivates each computational paradigm in which a superlinear improvement in performance is exhibited.

2.
DLoVe (Distributed Links over Variables evaluation) is a new model for specifying and implementing virtual reality and other next-generation or non-WIMP user interfaces. Our approach matches the parallel and continuous structure of these interfaces by combining a data-flow or constraint-like component with an event-based component for discrete interactions. Moreover, because the underlying constraint graph naturally lends itself to parallel computation, DLoVe allows the constraint graph to be partitioned and executed in parallel across several machines for improved performance. With our system, one can write a program designed for a single machine and execute it in a distributed environment with minor code modifications. The system also supports mechanisms for implementing multi-user programs or transforming single-user programs into multi-user programs. We present experiments demonstrating how DLoVe improves performance by dramatically increasing the validity of the rendered frames. We also present performance measures that quantify statistical skew in the frames, which we believe are more suitable for interactive systems than traditional measures of parallel systems, such as throughput or frame rate, because the latter fail to capture the freshness of each rendered frame.

3.
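Parallel cube computation on graphics processors   Total citations: 1, self-citations: 0, citations by others: 1
Cube computation is a core problem in data warehousing and online analytical processing (OLAP). Improving its performance has drawn wide attention from academia and industry, yet most existing cube algorithms do not take the latest processor architectures into account. In recent years, processors have evolved from a single computing core to multiple or many cores, as in multi-core CPUs and graphics processing units (GPUs). To make full use of the multi-core resources of modern processors, this paper proposes GPU-Cubing, a GPU-based parallel cube algorithm. The algorithm uses a bottom-up, breadth-first partitioning strategy and computes and outputs one cuboid at a time in parallel; while a cuboid is being computed, multiple partitions are processed simultaneously, with multiple threads working in parallel within each partition. GPU-Cubing suits the GPU architecture and offers a high degree of parallelism. Compared with the BUC algorithm, it achieves a speedup of more than one order of magnitude for full cube computation on real data sets, and a speedup of at least 2x for iceberg cubes.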
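For readers unfamiliar with cube computation, the following is a minimal sequential sketch of what a full or iceberg cube contains. It is not the GPU-Cubing algorithm: the function name `compute_cube`, the COUNT aggregate, and the `min_support` threshold are assumptions made for illustration, and each cuboid is simply recomputed from the raw rows rather than derived by partitioning.

```python
from collections import Counter
from itertools import combinations


def compute_cube(rows, dims, min_support=1):
    """Return {group_by_dims: {cell_key: count}} for every cuboid.

    rows: list of dicts; dims: list of dimension names (hypothetical inputs).
    Cuboids are produced level by level (0 dimensions, then 1, and so on),
    i.e. breadth-first from the apex cuboid toward the base cuboid.
    """
    cube = {}
    for level in range(len(dims) + 1):
        for group_by in combinations(dims, level):
            counts = Counter(tuple(row[d] for d in group_by) for row in rows)
            # Iceberg condition: keep only cells backed by enough rows.
            cube[group_by] = {k: c for k, c in counts.items() if c >= min_support}
    return cube
```

With `min_support=1` this yields the full cube; a larger threshold yields an iceberg cube. The cuboids within one level are independent of each other, which is the kind of structure a parallel algorithm can exploit.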

4.
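Multi-level parallel processing for an IXP2400-based network test system   Total citations: 1, self-citations: 0, citations by others: 1
Multi-level parallel processing has long been an important problem in the design and application of computers and their networks. This paper studies multiprocessor parallelization on the IXP2400, a multi-core programmable chip, and proposes a multi-level parallel technique that balances processing capability with development flexibility. Taking a "network test system based on a network processor" as the application example, it focuses on the microengine parallel scheme and a thread-level static scheduling algorithm, and validates both through WorkBench simulation and measured average maximum transmission rates for seven types of Ethernet frames. Finally, it summarizes the proposed technique and discusses its prospects.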

5.
Like colour video displays and laser printers, laser photoplotters are raster scan devices. For such devices, the pixel stream representing the image must be generated in real time, and in the (scan line) order required. However, the typical size of the images photoplotters produce is an order of magnitude higher than that of video displays and laser printers, precluding the use of full-size bitmap memories. These requirements pose particular implementation problems for the raster image processor generating the pixel stream. The parallel RIP system presented here is aimed at high-resolution laser photoplotters, and features a largely scalable performance ranging from 40 to several hundred megapixels per second. It is built with standard components such as graphics microprocessors and VRAM memories. Its architecture is that of a distributed memory multiprocessor system with a global ring-like topology. And most importantly, it can be programmed using the traditional sequential programming paradigm. Only minor additions are needed to sequential graphical algorithms to be executable on the system with an arbitrary number of processors. A prototype 8-processor PRIP system has been built and tested generating printed circuit board images for a direct imaging photoplotter. The prototype exhibits a near-linear speedup with respect to a monoprocessor solution. Architectural simulations indicate that the system can be expanded to well over 10 processors.

6.
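To study parallel graphics rendering, this paper introduces the pipeline process of graphics rendering, analyzes its inherent parallelism, and examines ways of implementing parallel rendering, including pipeline parallelism, data parallelism, and job parallelism, as well as pre-distribution, mid-distribution, and post-distribution tiled compositing. It also discusses the main problems facing parallel rendering and its development trends.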

7.
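In view of the broad application value of secure multi-party computation theory and the surge of research on it, this article surveys the current state of research in the field. After analyzing the mathematical model of secure multi-party computation, its relationship to cryptography, its application areas, and its basic protocols, the article reviews research progress in detail and goes on to discuss research hotspots and development trends.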

8.
An interoperability model for parallel distributed objects   Total citations: 3, self-citations: 0, citations by others: 3
Wang Chen, Zhou Ying, Zhang Defu. 《软件学报》 (Journal of Software), 1999, 10(8): 861-867
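The inherent complexity of parallel software design makes its reuse a prominent issue. Distributed object technology can not only encapsulate parallel software into components but also make it possible to use heterogeneous systems for parallel computing; however, this often lowers the efficiency of interoperation among those components. The interoperability model for parallel distributed objects proposed in this article attempts to solve this problem. The model is compatible with the existing distributed object model, and test results show that it can also uncover more parallelism among parallel distributed objects.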

9.
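Research and implementation of node machine architecture for highly reliable real-time distributed processing systems   Total citations: 3, self-citations: 0, citations by others: 3
This paper analyzes and compares several hardware architectures for highly reliable node machines in terms of reliability, real-time performance, and implementability. A detailed analysis of node machine structures built from multiple computers shows that parallel implementation of fault-tolerance algorithms, intelligent voting, and standardized, modular implementation allow the resulting node machine to meet real-time requirements while remaining highly reliable, and make it easy to upgrade.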

10.
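Although data parallelism is widely used, some applications, such as tree-structured algorithms, do not fit the parallel model of data-parallel languages. Combining data parallelism with task parallelism can solve these problems well. This paper discusses the issues that arise when task parallelism is introduced into data parallelism, including shared variables, code generation, and processor allocation, and compares and analyzes compiler-based, language-based, and coordination-library-based approaches.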

11.
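Dong Yuning. 《计算机学报》 (Chinese Journal of Computers), 2003, 26(3): 332-339
This paper proposes a method for efficiently computing (spatially) variant templates on parallel machines. It shows that an optimized algorithm for evaluating polynomials at image grid points can greatly reduce the computation required by variant templates. For variant templates containing non-polynomial functions, the Taylor series expansion of the function can be used to perform recursive computation at the pixel points. The validity, accuracy, and effectiveness of using the Taylor expansions of several commonly used functions to implement template operations are analyzed in detail. The influence of hardware and the scope of applicability of the method are also discussed.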

12.
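In recent years computer hardware has advanced considerably, especially in large memories and multiple cores, but algorithm efficiency has not improved along with it. The root causes are insufficient use of the CPU cache and the limitations of single-threaded program design. In online analytical processing, data cube computation is an important but time-consuming operation, so improving its efficiency is a research challenge in the field. This paper explores parallel cube algorithms based on the characteristics of multi-core CPUs and proposes the MT-Multi-Way (multi-threading multi-way) and MT-BUC (multi-threading bottom-up computation) algorithms. Through effective data partitioning and multi-thread cooperation, these algorithms avoid cache contention, ensure load balance, and achieve near-linear speedup. Building on these algorithms, the paper proposes a multi-core framework for cube algorithms, covering data partitioning strategies and multi-core handling of recursive algorithms, to guide the parallelization of cube algorithms.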

13.
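This paper analyzes the necessity of parallel implementation of network protocols, discusses implementation architectures and development approaches for parallel protocol systems in end systems and interconnection devices, and illustrates through examples the application prospects of protocol parallelization.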

14.
ParC is an extension of the C programming language with block-oriented parallel constructs that allow the programmer to express fine-grain parallelism in a shared-memory model. It is suitable for the expression of parallel shared-memory algorithms, and also conducive to the parallelization of sequential C programs. In addition, performance-enhancing transformations can be applied within the language, without resorting to low-level programming. The language includes closed constructs to create parallelism, as well as instructions to cause the termination of parallel activities and to enforce synchronization. The parallel constructs are used to define the scope of shared variables, and also to delimit the sets of activities that are influenced by termination or synchronization instructions. The semantics of parallelism are discussed, especially relating to the discrepancy between the limited number of physical processors and the potentially much larger number of parallel activities in a program.

15.
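To meet the demands that underwater acoustic communication, underwater signal processing, and target recognition place on high-speed real-time parallel processing systems, this paper designs and develops a multiprocessor parallel digital signal processing system based on four SHARC DSP chips (ADSP21160) and multi-channel synchronous-sampling ADC chips. The system solves the problems of multi-channel synchronous sampling and high-speed real-time processing of large data volumes in underwater acoustic communication and array signal processing, and offers good stability and generality.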

16.
In this paper, we propose a method for efficiently computing variant templates for image processing on parallel machines. It is demonstrated that the cumbersome computation of the variant templates can be greatly relieved by the use of an optimised algorithm for evaluating polynomials at grid points. For variant templates containing non-polynomial functions, the Taylor series of the function is exploited for iterative computation. The validity, accuracy and effectiveness of the series form of some commonly used functions (for implementing the variant templates) are analysed in detail. The influence of hardware and the limitations of the proposed approach are also discussed.
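The abstract does not say which optimised algorithm is used for evaluating polynomials at grid points. One well-known technique of this kind is forward differencing, sketched below purely as an assumption: after a short seeding step, each further grid value of a degree-d polynomial costs only d additions and no multiplications. The function names `horner` and `eval_on_grid` are hypothetical, and the coefficient list is assumed to be non-empty.

```python
def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are given highest degree first."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def eval_on_grid(coeffs, x0, h, n):
    """Return the polynomial's values at x0, x0 + h, ..., x0 + (n - 1) * h.

    Uses forward differences: only the first degree + 1 points are
    evaluated directly; all later values come from additions alone.
    """
    degree = len(coeffs) - 1
    # Seed: direct evaluation at the first degree + 1 grid points.
    row = [horner(coeffs, x0 + i * h) for i in range(degree + 1)]
    # Collect f(x0), the first difference, the second difference, ...
    diffs = []
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    values = []
    for _ in range(n):
        values.append(diffs[0])
        # Advance one grid step: each entry absorbs the next higher difference.
        for k in range(degree):
            diffs[k] += diffs[k + 1]
    return values
```

With integer inputs the results are exact; with floating-point inputs, rounding error accumulates along the grid, which is one reason an accuracy analysis like the one the abstract mentions matters.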

17.
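Real-time vehicle supervision faces a growing need to process large-scale vehicle monitoring data in real time. A distributed parallel computing architecture is needed to improve processing performance for such data, support diverse processing tasks, and meet the scalability needs of the supporting environment. Under this architecture, scheduling vehicle monitoring data processing tasks across the system's computing nodes becomes more demanding. To address this, and considering both the need for continuous processing of streaming and historically accumulated vehicle monitoring data and the memory-sensitive nature of large-scale real-time processing, this paper proposes a parallel task scheduling algorithm based on a routing table. The algorithm builds the routing table from the spatio-temporal attributes of the vehicle monitoring data and the memory information of each computing node, and uses the routing table to partition tasks in parallel and to allocate and schedule them, so that the computing nodes reach a load-balanced state. Experiments show that the algorithm reduces the load difference among computing nodes to within 12%. In addition, its practical use in a city's real-time vehicle supervision system also demonstrates its effectiveness.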
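The abstract gives no details of how the routing table is built, so the following is only an illustrative sketch of the general idea: partitions keyed by spatio-temporal attributes are assigned to nodes according to free memory, and incoming records are then dispatched by table lookup. The greedy placement rule, the `(region, time bucket)` key, the field names `region` and `timestamp`, and both function names are assumptions, not the paper's algorithm.

```python
def build_routing_table(partitions, free_memory):
    """Map each partition key to a node name.

    partitions: {partition_key: estimated_memory_needed}   (hypothetical)
    free_memory: {node_name: free_memory_available}        (hypothetical)
    """
    remaining = dict(free_memory)
    table = {}
    # Place the largest partitions first, each on the node with most free memory.
    for key, need in sorted(partitions.items(), key=lambda kv: (-kv[1], kv[0])):
        node = max(sorted(remaining), key=lambda name: remaining[name])
        table[key] = node
        remaining[node] -= need
    return table


def route(record, table, bucket_seconds=3600):
    """Look up the node responsible for a record's region and time bucket."""
    key = (record["region"], record["timestamp"] // bucket_seconds)
    return table[key]
```

Because dispatching is a plain dictionary lookup, the balancing decision is made once when the table is built instead of once per record.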

18.
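To overcome the high computational complexity and slow running speed of the mean shift algorithm, a fast GPU-based mean shift algorithm is proposed. First, the k-means algorithm pre-classifies the image pixels; mean shift clustering is then performed on the smaller data set obtained after pre-classification and downsampling, which effectively reduces the algorithm's complexity. In addition, both k-means and mean shift are parallelized using the general-purpose computing capability of the GPU. Experimental results show that preprocessing the image effectively improves the recognition rate of geometric template matching in images with strong noise and low signal-to-noise ratio; meanwhile, the improved mean shift algorithm runs nearly 40 times faster, meeting the real-time requirements of high-speed machine vision inspection.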

19.
This article describes how a medium-sized, Midwestern power company implemented an IT alignment planning process. The IT alignment planning process was a successful four-year activity that involved, first, a pilot implementation and then a companywide implementation of the IT alignment planning process. Designed to be flexible and to dovetail with corporate strategic planning processes, IT alignment planning achieved acknowledgment and approval in all divisions of the company. The IT alignment planning process improved and facilitated communication on IT and IT projects throughout the company, from the executive level to the operational level, and brought the IT and client units closer together.

20.
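Research and progress in task-parallel programming models   Total citations: 1, self-citations: 0, citations by others: 1
The task-parallel programming model has been widely studied and used on multi-core platforms in recent years; it aims to simplify parallel programming and improve multi-core utilization. This paper first introduces the basic programming interfaces and supporting mechanisms of task-parallel programming models. It then presents the research problems, difficulties, and latest results from three perspectives: expressing parallelism, data management, and task scheduling. Finally, it looks ahead to future research directions for task parallelism.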
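As a rough illustration of what a basic task-parallel interface looks like, the sketch below uses Python's standard `concurrent.futures` module: work is expressed as independent tasks that are spawned and later joined, while the runtime decides which worker runs each one. The function name `parallel_sum` and its parameters are assumptions for the example and are not taken from the surveyed models.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, chunk_size=4, max_workers=4):
    """Sum a list by splitting it into chunks and summing each chunk as a task."""
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # "Spawn": each chunk becomes an independent task handed to the pool.
        futures = [pool.submit(sum, chunk) for chunk in chunks]
        # "Sync": wait for every task and combine the partial results.
        return sum(f.result() for f in futures)
```

The caller describes only the tasks and the join point; how tasks map to threads is left to the pool, which reflects the separation between expressing parallelism and task scheduling that the abstract lists.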
