期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

O-Buffer based IFT watershed from markers for large medical datasets

Ernesto Coto Sren Grimm David Williams 《Computers & Graphics》2007,31(6):848-863

The watershed transform from markers is a very popular image segmentation operator. The image foresting transform (IFT) watershed is a common method to compute the watershed transform from markers using a priority queue, but which can consume too much memory when applied to three-dimensional medical datasets. This is a considerable limitation on the applicability of the IFT watershed, as the size of medical datasets keeps increasing at a faster pace than physical memory technologies develop. This paper presents the O-IFT watershed, a new type of IFT watershed based on the O-Buffer framework, and introduces an efficient data representation which considerably reduces the memory consumption of the algorithm. In addition, this paper introduces the O-Buckets, a new implementation of the priority queue which further reduces the memory consumption of the algorithm. The new O-IFT watershed with O-Buckets allows the application of the watershed transform from markers to large medical datasets. 相似文献

2.

交通网络最短路径标号算法的实现与效率分析 总被引：6，自引：0，他引：6

下载免费PDF全文

陈洁陆锋《中国图象图形学报》2005,10(9):1134-1138

标号算法是交通网络最短路径算法族中应用最广泛的算法,其中以各种D ijkstra算法为核心的标号设定算法是各种商用G IS平台网络分析算法的首选。然而,同样隶属于标号算法的标号改正算法在交通网络路径分析中却罕有应用。为了将标号改正算法应用于交通网络路径分析,首先讨论了标号算法的基本结构;然后分析了标号设定算法和标号改正算法的实现过程、复杂度、运行特点和适用性,进而选择了标号设定和标号改正算法中公认的几种优秀算法———基于逼近桶结构和改进四叉堆的D ijkstra算法(D IKBA与D IKQH)以及Pallottino算法(TWO-Q),并结合交通网络邻接链表结构予以实现;最后采用城市交通网络数据,对几种算法的实际运行效率进行了对比试验,试验结果表明,标号改正算法和标号设定算法优点各异;由于交通网络路径算法的应用越来越强调动态性和网络适用性,而且标号改正算法较之标号设定算法具有更大的适用范围,因此其在交通网络路径分析中具有极大的应用潜力。相似文献

3.

BPF 重建算法的 CUDA 并行实现

伍绍佳陈皓廖丽桂建保《集成技术》2014,3(5):61-68

反投影滤波(Backprojection-Filter,BPF)算法凭借其可实现感兴趣区域重建的优点,近年来逐渐被应用到锥束CT中。但是,由于算法的复杂性,实践中存在耗时问题,同时其GPU加速的实现亦存在显存不足等问题。因此,文章提出了一种基于CUDA的BPF并行加速算法。通过设计高效的算法框架,在保留其重建精度的前提下,有效地减少所需显存。此外,总结了正投影算法及BPF算法中采用的加速策略,如利用算法特征加速等,并引入显存池的概念优化算法架构。仿真实验结果表明,在精确重建的前提下,采用新框架重建512×512×512数据只需8.055 s,感兴趣区域重建只需4.566 s,只需1.523 s便可输出第一部分数据,且能把显存占用从2.5 GB减少到100 MB以下,适用于大数据重建。相似文献

4.

颅面外科手术仿真中的网格简化研究

成雄姚莉秀张治国杨杰《计算机仿真》2007,24(8):178-181

快速高质量的网格简化是颅颌面手术仿真中的影响网格的实时绘制和软组织变形建模的一个关键步骤.文中提出了一种改进最小二次误差准则网格简化算法.该算法中将边折叠代价计算、边折叠生成点的最优值计算和边折叠操作集成到一个管道中,并且用固定大小的最小代价选择替代堆来取代传统渐进网格算法中的大数据量的贪婪队列结构,从而大大减少了计算运行复杂度.计算机仿真结果显示,三角形面片的数目简化到原来的20%时仍能满足手术仿真中交互绘制的要求.与基于贪婪队列结构的渐进网格简化算法相比,所提出的改进算法能够将网格简化速度提高三倍左右,而内存的占用仅为原来的50%不到,Hausdorff距离误差也相对变小. 相似文献

5.

基于配对堆改进的Dijkstra算法 总被引：1，自引：0，他引：1

下载免费PDF全文

张林广方金云申排伟《中国图象图形学报》2007,12(5):922-926

在GIS网络分析系统中,Dijkstra算法是求解最短路径的经典算法。为了进一步提高求解最短路径的效率和节省系统的内存空间,提出了使用一种新式的数据结构——配对堆,以便通过实现可降级的优先队列来改进Dijkstra算法,然后通过研究配对堆的基本操作,给出了使用配对堆结构实现Dijkstra算法的方法和流程,并分析了其算法复杂度。该算法在VegaGIS系统中实现,取得到了较好的效果。相似文献

6.

Priority queues and sorting methods for parallel simulation

Grammatikakis M.D. Liesche S. 《IEEE transactions on pattern analysis and machine intelligence》2000,26(5):401-422

The authors examine the design, implementation, and experimental analysis of parallel priority queues for device and network simulation. They consider: 1) distributed splay trees using MPI; 2) concurrent heaps using shared memory atomic locks; and 3) a new, more general concurrent data structure based on distributed sorted lists, designed to provide dynamically balanced work allocation and efficient use of shared memory resources. We evaluate performance for all three data structures on a Cray-TSESOO system at KFA-Julich. Our comparisons are based on simulations of single buffers and a 64×64 packet switch which supports multicasting. In all implementations, PEs monitor traffic at their preassigned input/output ports, while priority queue elements are distributed across the Cray-TBE virtual shared memory. Our experiments with up to 60000 packets and two to 64 PEs indicate that concurrent priority queues perform much better than distributed ones. Both concurrent implementations have comparable performance, while our new data structure uses less memory and has been further optimized. We also consider parallel simulation for symmetric networks by sorting integer conflict functions and implementing a packet indexing scheme. The optimized message passing network simulator can process ~500 K packet moves in one second, with an efficiency that exceeds ~50 percent for a few thousand packets on the Cray-T3E with 32 PEs. All developed data structures form a parallel library. Although our concurrent implementations use the Cray-TSE ShMem library, portability can be derived from Open-MP or MP1-2 standard libraries, which will provide support for one-way communication and shared memory lock mechanisms 相似文献

7.

基于动态纹理载入的大规模数据场体绘制 总被引：1，自引：1，他引：0

下载免费PDF全文

郑杰姬红兵《中国图象图形学报》2008,13(2):316-321

为克服图形硬件对传统纹理映射体绘制的限制,提出了一种在普通PC上进行大规模数据场体绘制的有效方法。该方法中,体数据被划分为合适大小的数据块,这些数据块被动态的载入图形硬件,并利用3维纹理映射进行绘制。在整个绘制过程中,仅有一个数据块存储在图形硬件上,有效地提高了对大规模体数据的绘制能力。同时,充分利用目前PC图形硬件成熟的可编程特性,通过对梯度的实时计算来减少在传统纹理映射体绘制中巨大的内存消耗。实验结果表明,该方法在普通PC上可以对超过纹理内存容量的大规模体数据进行交互式体绘制。相似文献

8.

Efficient computation of 3D Morse–Smale complexes and persistent homology using discrete Morse theory

David Günther Jan Reininghaus Hubert Wagner Ingrid Hotz 《The Visual computer》2012,28(10):959-969

We propose an efficient algorithm that computes the Morse–Smale complex for 3D gray-scale images. This complex allows for an efficient computation of persistent homology since it is, in general, much smaller than the input data but still contains all necessary information. Our method improves a recently proposed algorithm to extract the Morse–Smale complex in terms of memory consumption and running time. It also allows for a parallel computation of the complex. The computational complexity of the Morse–Smale complex extraction solely depends on the topological complexity of the input data. The persistence is then computed using the Morse–Smale complex by applying an existing algorithm with a good practical running time. We demonstrate that our method allows for the computation of persistent homology for large data on commodity hardware. 相似文献

9.

A Constant-time Algorithm for Erosions/Dilations with Applications to Morphological Texture Feature Computation

《Real》2000,6(3):223-239

A computationally efficient algorithm for computing erosions and dilations by one- dimensional grayscale structuring elements with constant slope is proposed. The computational complexity of this algorithm is independent of the size of the support of the structuring function. This is a generalization of the method proposed by Van Herk for the case of erosion and dilation by flat one-dimensional structuring elements. By appropriate combinations of these structuring elements, it is possible to approximate many useful structuring elements. This enables efficient computation of granulometries where the number of operations depends linearly on the number of openings. Theoretical and experimental results comparing the complexity of this algorithm with other standard techniques is presented. Two memory efficient algorithms are then presented. Several implementation issues in computing a granulometry and moments of the associated morphological pattern spectrum are then addressed. An efficient implementation of granulometries for large images on machines with limited memory, by dividing the image into smaller rectangular patches is then discussed. The optimum size of these patches is a function of the specific hardware and has been obtained experimentally for three different hardware platforms. Finally, parallel implementation of the different algorithms on two multi-processor machines is discussed. 相似文献

10.

Analysis of a parallel volume rendering system based on theshear-warp factorization

Lacroute P. 《IEEE transactions on visualization and computer graphics》1996,2(3):218-231

This paper presents a parallel volume rendering algorithm that can render a 256×256×225 voxel medical data set at over 15 Hz and a 512×512×334 voxel data set at over 7 Hz on a 32-processor Silicon Graphics Challenge. The algorithm achieves these results by minimizing each of the three components of execution time: computation time, synchronization time, and data communication time. Computation time is low because the parallel algorithm is based on the recently-reported shear-warp serial volume rendering algorithm which is over five times faster than previous serial algorithms. The algorithm uses run-length encoding to exploit coherence and an efficient volume traversal to reduce overhead. Synchronization time is minimized by using dynamic load balancing and a task partition that minimizes synchronization events. Data communication costs are low because the algorithm is implemented for shared-memory multiprocessors, a class of machines with hardware support for low-latency fine-grain communication and hardware caching to hide latency. We draw two conclusions from our implementation. First, we find that on shared-memory architectures data redistribution and communication costs do not dominate rendering time. Second, we find that cache locality requirements impose a limit on parallelism in volume rendering algorithms. Specifically, our results indicate that shared-memory machines with hundreds of processors would be useful only for rendering very large data sets 相似文献

11.

Concurrent computation of topological watershed on shared memory parallel machines

《Parallel Computing》2017

The watershed transform is considered as the most appropriate method for image segmentation in the field of mathematical morphology. In the following paper, we present an adapted topological watershed algorithm suited for a rapid and effective implementation on Shared Memory Parallel Machine (SMPM). The introduced algorithm allows a parallel watershed computing while preserving the given topology. No prior minima extraction is needed, nor the use of any sorting step or hierarchical queue. The strategy that guides the parallel watershed computing, labeled SDM-Strategy (equivalent to Split-Distributes and Merge), is also presented. Experimental analyses such as execution time, performance enhancement, cache consumption, efficiency and scalability are also presented and discussed. 相似文献

12.

一种基于结构相似性的图聚类算法

金超张龙波王海鹏安建瑞怀浩王晓丹《计算机与现代化》2016,(3):19

图聚类是发现网络中潜在结构的一项重要任务。提出一种基于结构相似性的图聚类算法GNSCAN,给出该算法的相关定义以及算法的执行过程。采用真实数据集对该算法进行测试,从理论分析及结果2方面证明GNSCAN算法在效率上比GN算法得到明显的提高。在GNSCAN算法的基础上,提出一种改进的GNSCAN算法IGNSCAN,算法时间复杂度得到进一步降低。相似文献

13.

基于实时梯度计算的快速纹理映射体绘制

郑杰姬红兵杨万海《计算机科学》2007,34(2):230-233

本文分析讨论了一种基于实时梯度计算的纹理映射体绘制算法。在绘制过程中实时计算每个体素上的梯度，有效地减少了在传统基于纹理映射的体绘制中，对梯度预先计算的耗时操作，也降低了内存的消耗，加快了整个绘制过程。同时，采用三维Sobel算子对梯度进行计算，并进行归一化处理，有效地提高了绘制图像的质量。在实现中充分利用了目前PC图形硬件成熟的可编程特性，特别是fragment program，来完成梯度的实时计算。最后对医学体数据进行绘制，得到了理想的结果。相似文献

14.

基于多效用阈值的分布式高效用序列模式挖掘

曾毅张福泉《计算机工程与设计》2020,41(2):449-457

针对序列模式的高效用模式挖掘过程中搜索空间大、计算复杂度高的问题,提出一种基于多效用阈值的分布式高效用序列模式挖掘算法。采用数组结构保存模式的效用信息,解决效用矩阵导致的内存消耗大的缺点。设计1-项集与2-项集的深度剪枝策略,深入地缩小候选模式的搜索空间,减少搜索时间成本与缓存成本。提出挖掘算法的分布式实现方案,通过并行处理进一步降低模式挖掘的时间。基于中等规模与大规模的序列数据集分别进行实验,实验结果表明,该算法有效减少了候选模式的数量,降低了挖掘的时间成本与存储成本,对于大数据集表现出较好的可扩展能力与稳定性。相似文献

15.

Fast and lock-free concurrent priority queues for multi-thread systems

《Journal of Parallel and Distributed Computing》2005,65(5):609-627

We present an efficient and practical lock-free implementation of a concurrent priority queue that is suitable for both fully concurrent (large multi-processor) systems as well as pre-emptive (multi-process) systems. Many algorithms for concurrent priority queues are based on mutual exclusion. However, mutual exclusion causes blocking which has several drawbacks and degrades the overall performance of the system. Non-blocking algorithms avoid blocking, and several implementations have been proposed. Previously known non-blocking algorithms of priority queues did not perform well in practice because of their complexity, and they are often based on non-available atomic synchronization primitives. Our algorithm is based on the randomized sequential list structure called Skiplist, and a real-time extension of our algorithm is also described. In our performance evaluation we compare our algorithm with a well-representable set of earlier known implementations of priority queues. The experimental results clearly show that our lock-free implementation outperforms the other lock-based implementations in practical scenarios for 3 threads and more, both on fully concurrent as well as on pre-emptive systems. 相似文献

16.

改进的标记分水岭遥感影像分割方法* 总被引：3，自引：0，他引：3

张桂峰巫兆聪易俐娜《计算机应用研究》2010,27(2):760-763

在Meyer算法的基础上,提出一种改进的标记分水岭遥感影像分割方法,该方法针对高空间分辨率遥感影像的特点,依据梯度影像的分布特征自动提取合适的标记影像。浸没过程中,非标记像素按照梯度值由小到大进行处理,并使用正反两个队列记录当前处理的像素。实验证明,将该算法用于高分辨率遥感影像分割,不仅获得高质量的分割结果,而且具有极高的运行效率与空间利用效率。相似文献

17.

基于Spark的MapReduce相似度计算效率优化

廖彬张陶于炯国冰磊刘炎《计算机科学》2017,44(8):46-53

随着互联网的用户及内容呈指数级增长,大规模数据场景下的相似度计算对算法的效率提出了更高的要求。为提高算法的执行效率,对MapReduce架构下的算法执行缺陷进行了分析,结合Spark适于迭代型及交互型任务的特点,基于二维划分算法将算法从MapReduce平台移植到Spark平台;同时,通过参数调整、内存优化等方法进一步提高算法的执行效率。通过2组数据集分别在3组不同规模的集群上的实验表明,与MapReduce相比,在Spark平台下算法的执行效率平均提高了4.715倍,平均能耗效率只有Hadoop能耗的24.86%,能耗效率提升了4倍左右。相似文献

18.

A pipelined array architecture for Euclidean distance transformation and its FPGA implementation

《Microprocessors and Microsystems》2005,29(8-9):405-410

The Euclidean Distance Transform (EDT) is an important tool in image analysis and machine vision. This paper provides an area-efficient hardware solution to the computation of EDT on a binary image. An O(n) hardware algorithm for computing EDT of an n×n image is presented. A pipelined 2D array architecture for harware implementation is designed. The architecture has a regular structure with locally connected identical processing elements. Further, pipelining reduces hardware resources. Such an array architecture is easily scalable to handle images of different sizes and is suitable for implementation on reconfigurable devices like FPGAs. Results of FPGA-based implementation shows that the hardware can process about 6000 images of size 512×512 per second which is much higher than the video rate of 30 frames per second. 相似文献

19.

一种精简的二维DWT结构设计

曹志研季振洲胡铭曾《计算机工程》2007,33(23):228-229

设计了一种低功耗的二维离散小波变换(DWT)结构，用于无线传感器网络中的图像压缩。该结构实现了精简复杂性的(5,3)整数离散小波变换，采用流水线和延迟线技术，在获得高运算吞吐率的同时，使数据尽可能被处理单元高效利用，以减少对片内存储器和片外存储器的访问次数。多级二维DWT采用展开方法实现，这种方法可尽早开始下一级变换，不需要大的片内存储器和片内存取操作。模拟试验和FPGA实现验证了系统在满足需要性能的前提下具有低复杂性、低功耗、片内存储器小等优点。相似文献

20.

信息物理融合系统的动态多优先级调度

刘纯尧张立臣《计算机科学》2015,42(1):28-32

信息物理融合系统(Cyber-physical Systems,CPS)的复杂和异构性给设计者带来了不少挑战,其中任务的多样性使得传统的调度策略不能满足CPS的性能需求.提出了专门针对基于大规模传感器网络的CPS的动态多优先级调度策略.根据任务类型分配4级缓存队列:第1级是来自控制器待处理的实时任务,拥有最高的可抢占式优先级;第2级是来自控制器待转发的实时任务,拥有次高的可抢占式优先级;第3级是来自其他节点待转发的非实时任务,拥有第三高的非抢占式优先级;第4级是来自本地待发送的非实时任务,拥有最低的非抢占式优先级.设计了抢占与非抢占混合的动态调度策略来减少任务的平均等待时间,加入了等待时间阈值机制来保证第4级任务的公平性.通过理论分析和仿真实验对调度策略的性能做了评价.仿真结果显示,动态多优先级调度策略在提高系统性能和稳定性上要优于传统优先级调度. 相似文献