Similar literature
 20 similar documents found (search time: 328 ms)
1.
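This paper proposes a parallel conjugate gradient algorithm for large-scale sparse matrix eigenvalue problems. To improve parallel efficiency, a load-balanced row partitioning scheme is designed, a sparse matrix reordering method that overlaps computation with communication is implemented, and preprocessing is used to reduce the volume of message-passing communication among processes during computation. In addition, for high-performance parallel computing on multicore processors, a hybrid parallel algorithm combining MPI with fine-grained (thread-level) OpenMP is implemented. The parallel algorithm was tested on the DeepComp 7800 (深腾7800) parallel computer. The results show that communication time remains stable as the number of processes grows and that the algorithm scales well on the parallel computer, making it suitable for large-scale sparse eigenvalue problems.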
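The abstract above does not say how the load-balanced row partition is computed. A minimal sketch, assuming the load of a row is its number of nonzeros and that each process receives a contiguous block of rows (the function name `partition_rows_by_nnz` and both assumptions are illustrative, not taken from the paper):

```python
def partition_rows_by_nnz(row_nnz, num_procs):
    """Split rows into contiguous blocks with roughly equal nonzero counts.

    row_nnz[i] is the number of nonzeros in row i (assumed load measure).
    Returns one half-open (start, end) row range per process.
    """
    total = sum(row_nnz)
    ranges = []
    start = 0
    acc = 0  # nonzeros assigned so far, across all processes
    for p in range(num_procs):
        # Cumulative load this process should reach.
        target = total * (p + 1) / num_procs
        end = start
        last = p == num_procs - 1
        # The last process takes every remaining row.
        while end < len(row_nnz) and (acc < target or last):
            acc += row_nnz[end]
            end += 1
        ranges.append((start, end))
        start = end
    return ranges


# Hypothetical nonzero counts for an 8-row matrix split over 4 processes.
EXAMPLE_RANGES = partition_rows_by_nnz([4, 1, 1, 1, 1, 4, 2, 2], 4)
```

In this example every process ends up with four nonzeros even though the row counts per process differ, which is the point of balancing by work rather than by row count.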

2.
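Wang Haibing (王海兵). Journal of Computer Applications (《计算机应用》), 2011, 31(Z1): 172-173, 176
By overloading the MPI message-passing functions and calling the logging functions of the MPE library inside the overloaded functions, user-defined parallel performance monitoring is implemented for a large-scale object-oriented finite element program. A typical impact dynamics problem was simulated with the parallel finite element code on 16 CPUs, and its parallel finite element algorithm was analyzed using the performance monitoring.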

3.
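Design and Implementation of the Message Mechanism of a Distributed Real-Time Operating System   Cited by: 1 in total (self-citations: 1, citations by others: 0)
With the rapid development of digital signal processing technology, a high-performance distributed real-time operating system that meets user needs, the Tengfei distributed real-time operating system (TF-RTOS), was developed in-house for parallel digital signal processing (DSP) applications. The message mechanism is used for inter-thread communication and is an important part of an operating system. During the development of TF-RTOS, a direct message-passing mechanism was designed and implemented in terms of four aspects: message command packets, message queues, the message-passing process, and message primitives. The mechanism simplifies inter-thread communication, extends system functionality, and improves system performance.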

4.
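Functional Partitioning of Petri Net Systems and Its Parallel Algorithm   Cited by: 1 in total (self-citations: 1, citations by others: 1)
For the parallel control and execution of Petri net systems, a functional partitioning of Petri net systems and a corresponding parallel algorithm are proposed. Place invariant techniques are used to partition the Petri net system by function, and a functional partitioning algorithm based on non-negative place invariants is given. Parallelism within and between processes is analyzed, and a Petri net parallel algorithm for a message-passing environment is presented together with an application example. Experimental results show that the algorithm reflects the actual execution of a Petri net system well and is an effective way to realize parallel control and execution of Petri net systems.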
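The partitioning algorithm itself is not given in the abstract. The sketch below only illustrates the underlying definition: a non-negative weight vector y is a place invariant when y^T C = 0 for the incidence matrix C, and the places it covers can be read as one functional component. The list-of-lists matrix layout and the helper names are assumptions made for illustration.

```python
def is_place_invariant(incidence, weights):
    """Return True if weights^T * incidence is zero for every transition.

    incidence[p][t] is the net token change of place p when transition t
    fires (hypothetical dense list-of-lists layout).
    """
    num_places = len(incidence)
    num_transitions = len(incidence[0])
    return all(
        sum(weights[p] * incidence[p][t] for p in range(num_places)) == 0
        for t in range(num_transitions)
    )


def support(weights):
    """Places with positive weight, read here as one functional component."""
    return {p for p, w in enumerate(weights) if w > 0}


# Two independent two-place cycles: places 0-1 and places 2-3.
INCIDENCE = [
    [-1, 1, 0, 0],
    [1, -1, 0, 0],
    [0, 0, -1, 1],
    [0, 0, 1, -1],
]
```

For this small net the invariants [1, 1, 0, 0] and [0, 0, 1, 1] have disjoint supports, so each cycle could be assigned to its own process.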

5.
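Because finite element analysis of large-scale nonlinear structural dynamics problems is very time-consuming, several explicit finite element parallel algorithms based on parallel solution strategies are proposed for a Message Passing Interface (MPI) cluster environment. Building on domain decomposition with explicit message passing, overlapping and non-overlapping domain decomposition techniques and dynamic task allocation are adopted, and inter-processor communication is optimized by overlapping computation with communication. Five algorithms are studied: a domain decomposition parallel algorithm with non-overlapped communication, a domain decomposition parallel algorithm with overlapped communication, a group dynamic task allocation algorithm, a dynamic task allocation algorithm, and a dynamic load balancing algorithm. To enable nonlinear dynamic finite element analysis in a cluster environment, explicit finite element parallel algorithms based on effective parallel solution strategies were developed. A parallel finite element program was written using the message-passing programming model, numerical examples were run on a workstation cluster, the performance of the algorithms was analyzed, and the results were compared with the traditional Newmark algorithm. The examples show that the group dynamic task allocation algorithm outperforms the dynamic task allocation algorithm but is slower than the domain decomposition algorithms, and that the dynamic load balancing algorithm performs best. For problems of the same size, the proposed algorithms are faster than the Newmark algorithm. The proposed parallel algorithms are feasible and effective for finite element analysis of nonlinear structural dynamics problems.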

6.
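Design and Implementation Analysis of an Object-Oriented Parallel Message-Passing Library   Cited by: 1 in total (self-citations: 0, citations by others: 1)
Ma Kejiang (马珂绛). Journal of Computer Applications (《计算机应用》), 2005, 25(3): 628-630, 636
MPI is a message-passing library widely used in parallel programming. Although the MPI-2 standard defines C++ bindings, they do not strictly follow object-oriented principles. After analyzing existing object-oriented message-passing systems, an MPI-based parallel message-passing library was designed and implemented in C++. The library is object-oriented, makes it easy to pass objects (including user-defined types and STL containers), is consistent with MPI, and is type-safe. Usage examples and a performance analysis are also given.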

7.
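This paper studies and implements YHLIB, a distributed parallel computing support library for finite-difference discretization models. YHLIB is built on the MPI message-passing interface. Through its parallel computing interface for finite-difference models, it supports computational domain decomposition, inter-domain communication, intra-domain communication, loop index conversion, distributed I/O, and dynamic load balancing, encapsulating the implementation details of parallel computing and improving the efficiency of parallel program development. An abstract model implementation and tests on real applications show that YHLIB achieves high parallel efficiency.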

8.
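Distributed Parallel Extraction Algorithm for SIFT Features   Cited by: 1 in total (self-citations: 0, citations by others: 1)
SIFT (scale invariant feature transform) features are widely used in object detection and recognition, image registration and fusion, texture recognition, scene classification, face detection, image retrieval, 3D reconstruction, digital watermarking, video tracking, and other fields, but they are computationally expensive and slow to extract. Based on message passing and a data-parallel strategy, a distributed parallel algorithm (the DP-SIFT algorithm) is proposed for extracting SIFT features from images on a PC cluster or COW (cluster of workstations). According to the characteristics of the feature space, the Gaussian scale pyramid, a data block partitioning algorithm with constrained block height and width is proposed, and data distribution and feature adjustment methods are designed. The effect of block partitioning and data sending methods on communication time is studied, and a strategy is proposed that coordinates block partitioning with the data sending method to optimize communication in message-passing parallel image processing. Experiments show that DP-SIFT achieves good speedup and high processor utilization: on a 32-core PC cluster connected by Gigabit Ethernet, speedup and processor efficiency reach 20 and 0.6, respectively, for 1024×768 images, and 18 and 0.56 for 2048×1536 images.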

9.
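MPI-Based Collective Communication in Parallel Computing and Its Application   Cited by: 4 in total (self-citations: 0, citations by others: 4)
The serial LSQR algorithm, which solves large sparse matrix equations effectively, is analyzed for parallelization, and a parallel LSQR algorithm is designed and implemented on a distributed-memory parallel system using the collective communication mechanism of the portable message-passing standard MPI. The parallel algorithm and program have been applied effectively to tomographic inversion of near-surface seismic models.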

10.
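Image Block Parallel Processing Method in the Normalized Product Correlation Image Matching Algorithm   Cited by: 1 in total (self-citations: 0, citations by others: 1)
The parallel computation of the correlation sum $\sum\sum XY$ in the normalized product correlation algorithm, which has good matching properties, is examined in depth, and a parallel algorithm that computes $\sum\sum XY$ by image blocks is designed for a K-ary 2-cube network computer. For an $N \times N$ reference image, an $M \times M$ real-time image, and a $K \times K$ array of processing elements ($N = BK$, $K = M$, $B > 1$), dividing the reference image into $B \times B$ image blocks allows this algorithm to implement parallel block-wise normalized product correlation matching. Application experiments show that the parallel algorithm achieves high parallel efficiency.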
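The block-wise idea can be illustrated with a simplified serial sketch. It assumes the common definition of normalized product correlation, $R = \sum\sum XY / \sqrt{\sum\sum X^2 \cdot \sum\sum Y^2}$, and accumulates each sum from per-block partial sums that independent processing elements could compute. The function names and the block size are hypothetical, and the paper's mapping onto a K-ary 2-cube network is not reproduced.

```python
import math


def block_sum(x, y, rows, cols):
    """Partial sum of X*Y over one image block (one unit of parallel work)."""
    return sum(x[i][j] * y[i][j] for i in rows for j in cols)


def blockwise_sum(x, y, block):
    """Accumulate the sum of X*Y over a square image from block partial sums."""
    n = len(x)
    total = 0
    for bi in range(0, n, block):
        for bj in range(0, n, block):
            rows = range(bi, min(bi + block, n))
            cols = range(bj, min(bj + block, n))
            total += block_sum(x, y, rows, cols)
    return total


def normalized_product_correlation(x, y, block=2):
    """R = sum(XY) / sqrt(sum(X^2) * sum(Y^2)), each sum taken block-wise."""
    sxy = blockwise_sum(x, y, block)
    sxx = blockwise_sum(x, x, block)
    syy = blockwise_sum(y, y, block)
    return sxy / math.sqrt(sxx * syy)


# A 4x4 test image and a copy scaled by 2; the correlation should be 1.
X = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
Y = [[2 * v for v in row] for row in X]
```

Because each block's partial sum depends only on that block's pixels, the blocks can be evaluated independently and combined with a single reduction.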

11.
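PVM is currently the most influential message-passing parallel software, offering users an effective way to achieve high-performance computing at low cost. This paper proposes a method for parallelizing digital image processing algorithms on the PVM platform. The method takes the characteristics of digital image processing into account and uses a "cluster" model, effectively speeding up digital image processing and achieving the desired results.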

12.
A popular approach to providing nonexperts in parallel computing with an easy-to-use programming model is to design a software library consisting of a set of preparallelized routines, and hide the intricacies of parallelization behind the library's API. However, for regular domain problems (such as simple matrix manipulations or low-level image processing applications, in which all elements in a regular subset of a dense data field are accessed in turn) speedup obtained with many such library-based parallelization tools is often suboptimal. This is because interoperation optimization (or: time-optimization of communication steps across library calls) is generally not incorporated in the library implementations. We present a simple, efficient, finite state machine-based approach for communication minimization of library-based data parallel regular domain problems. In the approach, referred to as lazy parallelization, a sequential program is parallelized automatically at runtime by inserting communication primitives and memory management operations whenever necessary. Apart from being simple and cheap, lazy parallelization guarantees to generate legal, correct, and efficient parallel programs at all times. The effectiveness of the approach is demonstrated by analyzing the performance characteristics of two typical regular domain problems obtained from the field of low-level image processing. Experimental results show significant performance improvements over nonoptimized parallel applications. Moreover, obtained communication behavior is found to be optimal with respect to the abstraction level of message passing programs.

13.
We describe a parallel software library, named MEDITOMO, designed for processing MEDIcal images obtained by SPECT (single photon emission computed tomography) TOMOgraphic systems. MEDITOMO is the core library of the PSE (problem solving environment) MEDIGRID, oriented to medical imaging analysis, which the authors are currently developing. MEDIGRID is employed in a Grid-computing infrastructure involving clinical departments and research institutes. The algorithms of MEDITOMO are the standard ones that are usually applied in the SPECT analysis, i.e. the conjugate gradient and the expectation maximization. The main contribution of this work concerns the introduction of the total variation seminorm as the edge-preserving regularization in both algorithms and the development of the parallel software library. Experiments carried out on synthetic and clinical data are shown.

14.
Parallel Computing, 2002, 28(7-8): 967-993
This paper describes a software architecture that allows image processing researchers to develop parallel applications in a transparent manner. The architecture's main component is an extensive library of data parallel low level image operations capable of running on homogeneous distributed memory MIMD-style multicomputers. Since the library has an application programming interface identical to that of an existing sequential library, all parallelism is completely hidden from the user. The first part of the paper discusses implementation aspects of the parallel library, and shows how sequential as well as parallel operations are implemented on the basis of so-called parallelizable patterns. A library built in this manner is easily maintainable, as extensive code redundancy is avoided. The second part of the paper describes the application of performance models to ensure efficiency of execution on all target platforms. Experiments show that for a realistic application performance predictions are highly accurate. These results indicate that the core of the architecture forms a powerful basis for automatic parallelization and optimization of a wide range of imaging software.

15.
Traditionally, parallel compilers have targeted a standard message passing communication library when generating communication code (e.g. PVM, MPI). The standard message passing model dynamically reserves communication resources for each message. For regular, repeating communication patterns, a static communication resource reservation model can be more efficient. By reserving resources once for many communication exchanges, the communication startup time is better amortized. In addition, with a global view of communication, the static model has a wider choice of routes. While the static resource reservation model can be a more efficient communication target for the compiler, this model reveals the problems of scheduling use of limited communication resources. This paper uses the abstraction of a communication resource to define two resource management problems and presents three algorithms that can be used by the compiler to address these problems. Initial measures of the effectiveness of these algorithms are presented from two programs for an $8 \times 8$ iWarp system. © 1997 by John Wiley & Sons, Ltd.

16.
We present the software library STXXL that is an implementation of the C++ standard template library (STL) for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O‐efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in academic and industrial environments for a range of problems including text processing, graph algorithms, computational geometry, Gaussian elimination, visualization, and analysis of microscopic images, differential cryptographic analysis, etc. The performance of STXXL and its applications are evaluated on synthetic and real‐world inputs. We present the design of the library, how its performance features are supported, and demonstrate how the library integrates with STL. Copyright © 2007 John Wiley & Sons, Ltd.

17.
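Map tiles are widely used in the WebGIS (web geographic information system) solutions of current mainstream GIS software and Internet map applications, and tiling services are a key technology for fast, seamless browsing of imagery in WebGIS. To address the shortcomings of traditional algorithms and commercial GIS software in quickly tiling large raster images, an efficient raster tiling method named ParaTile is proposed. ParaTile uses MPI-based parallelism over shared external storage: multiple processes partition the original raster image, each process independently reads, writes, and computes on its assigned region, and the tiles are then encoded and output according to the TMS or Google Tile standard. Experiments on remote sensing images of different sizes show that ParaTile has a significant advantage over existing algorithms and tools in both speed and stability across data scales, and the advantage grows as the data volume increases.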
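Two pieces of the pipeline described above can be sketched without MPI. The abstract does not state how ParaTile divides the image, so the contiguous band split below is an assumption, and the helper names are hypothetical. The coordinate conversion relies on the well-known difference between the two schemes: TMS counts tile rows from the bottom of the grid, while the Google (XYZ) scheme counts from the top.

```python
def tile_rows_for_rank(num_tile_rows, rank, size):
    """Contiguous band of tile rows handled by one process (assumed split).

    Rows are spread as evenly as possible; the first (num_tile_rows % size)
    ranks receive one extra row.
    """
    base, extra = divmod(num_tile_rows, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def tms_to_google(x, y, zoom):
    """Convert a TMS tile address to the Google/XYZ scheme by flipping y."""
    return x, (2 ** zoom - 1) - y


def google_to_tms(x, y, zoom):
    """The flip is its own inverse, so the same formula converts back."""
    return x, (2 ** zoom - 1) - y
```

With bands like these, each process can read, cut, and write its own tiles without coordinating with the others, which matches the independent per-region processing the abstract describes.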

18.
Digital image processing systems are complex, being usually composed of different computer vision libraries. Algorithm implementations cannot be directly used in conjunction with algorithms developed using other computer vision libraries. This paper formulates a software solution by proposing a processor with the capability of handling different types of image processing algorithms, which allows end users to install new image processing algorithms from any library. This approach has other functionalities, such as the capability to process one or more images, manage multiple processing jobs simultaneously, and record the manner in which an image was processed for later use. It is a computationally efficient and promising technique for handling a variety of image processing algorithms. To promote the reusability and adaptation of the package for new types of analysis, a feature of sustainability is established. The framework is integrated and tested on a medical imaging application, and the software is made freely available for the reader. Future work involves introducing the capability to connect to another instance of the processing service with better performance. Copyright © 2015 John Wiley & Sons, Ltd.

19.
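A Linux cluster was used, consisting of Shuttle XPC nodes with 2 GHz Athlon processors connected through dual Gigabit Ethernet switches. MPICH-2 and openMosix were compared with Linux remote procedure calls (RPC), child process creation, and different low-level communication mechanisms. The message-passing library and the single-system-image software package were compared with direct use of the low-level primitives. The approaches considered were using application packages, writing distributed applications directly by hand, and layering on top of parallel primitives to create distributed languages. The experimental results show that performance is poor in some cases, falling below what the hardware should deliver, which is an issue software developers should pay close attention to in the future.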

20.
This paper describes a general methodology for developing parallel image processing algorithms based on message passing for high resolution images (on the order of several Gigabytes). These algorithms have been applied to histological images and must be executed on massively parallel processing architectures. Advances in new technologies for complete slide digitalization in pathology have been combined with developments in biomedical informatics. However, the efficient use of these digital slide systems is still a challenge. The image processing that these slides are subject to is still limited both in terms of data processed and processing methods. The work presented here focuses on the need to design and develop parallel image processing tools capable of obtaining and analyzing the entire gamut of information included in digital slides. Tools have been developed to assist pathologists in image analysis and diagnosis, and they cover low and high-level image processing methods applied to histological images. Code portability, reusability and scalability have been tested by using the following parallel computing architectures: distributed memory with massive parallel processors and two networks, INFINIBAND and Myrinet, composed of 17 and 1024 nodes respectively. The proposed parallel framework is a flexible, high-performance solution, and it shows that the efficient processing of digital microscopic images is possible and may offer important benefits to pathology laboratories.
