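Parallel optimization of edge detection algorithm for geological images under CUDA
Cite this article: ZHANG Han, QIAN Yu-rong, HOU Hai-yao. Parallel optimization of edge detection algorithm for geological images under CUDA[J]. Computer Engineering and Design, 2019, 40(3): 691-698.
Authors: ZHANG Han  QIAN Yu-rong  HOU Hai-yao
Affiliation: School of Software, Xinjiang University, Urumqi, Xinjiang 830008 (all three authors)
Funding: National Natural Science Foundation of China; National Natural Science Foundation of China; Xinjiang Uygur Autonomous Region project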
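Abstract: To speed up the Prewitt algorithm for edge detection in geological images, and given that the algorithm is both compute-intensive and data-intensive, optimizations are proposed at two levels. At the kernel computation level, thread divergence is reduced by adjusting thread block coordinates, and instruction latency is reduced by accessing data through local variables. At the CPU-GPU data transfer level, CUDA streams are used to reduce transfer overhead. Tests show that with a thread block size of 32*32, independent local variables in place of indexed memory access, and chunked computation on CUDA streams, the speedup for geological images larger than 6168*6168 can be improved by more than 120 times. The parallel optimization scheme is easy to implement and can be applied to large-scale geological image edge detection.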

Keywords: edge detection  Prewitt operator  parallel computing  graphics processing unit (GPU)  compute unified device architecture (CUDA)

Parallel optimization of edge detection algorithm for geological images under CUDA
ZHANG Han, QIAN Yu-rong, HOU Hai-yao. Parallel optimization of edge detection algorithm for geological images under CUDA[J]. Computer Engineering and Design, 2019, 40(3): 691-698.
Authors: ZHANG Han  QIAN Yu-rong  HOU Hai-yao
Affiliation: School of Software, Xinjiang University, Urumqi 830008, China
Abstract: To improve the computational speed of the Prewitt algorithm for edge detection in geological images, optimizations were designed around the compute-intensive and data-intensive nature of the algorithm. At the kernel computation level, thread divergence was reduced by adjusting thread block coordinates, and instruction latency was reduced by accessing data through local variables. At the CPU-GPU data transfer level, CUDA streams were used to reduce transfer overhead. Test results show that, with a thread block size of 32*32, independent local variables in place of indexed memory access, and chunked computation on CUDA streams, the speedup ratio for geological images larger than 6168*6168 can be increased by more than 120 times. The parallel optimization scheme is easy to implement and can be applied to large-scale geological image edge detection.
Keywords:edge detection  Prewitt operator  parallel computing  GPU  CUDA
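
The three settings named in the abstract (32*32 thread blocks, local variables in place of repeated indexed loads, and chunked processing on CUDA streams) can be illustrated with a minimal sketch. This is not the authors' code: the names `prewittKernel` and `prewittStreams`, the |Gx| + |Gy| magnitude clamped to 255, the zeroed image border, and the one-row halo per chunk are all assumptions made for illustration. The paper's thread block coordinate adjustment for divergence is not reproduced here; the border test is a plain branch.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>

#define CUDA_CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e_)); exit(1); } } while (0)

// One thread per output pixel. `tile` holds the chunk's rows plus halo rows;
// `firstOut` is the tile row that corresponds to output row 0 of this chunk.
__global__ void prewittKernel(const unsigned char* tile, int tileRows, int firstOut,
                              unsigned char* out, int outRows, int width)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= outRows) return;
    int ty = y + firstOut;
    // Pixels without a full 3x3 neighbourhood are set to 0 (assumed convention).
    if (x == 0 || x == width - 1 || ty == 0 || ty == tileRows - 1) {
        out[y * width + x] = 0;
        return;
    }
    const unsigned char* r0 = tile + (ty - 1) * width + x;
    const unsigned char* r1 = r0 + width;
    const unsigned char* r2 = r1 + width;
    // Load each neighbour once into its own local variable (register).
    int a = r0[-1], b = r0[0], c = r0[1];
    int d = r1[-1],            f = r1[1];
    int g = r2[-1], h = r2[0], i = r2[1];
    int gx = (c + f + i) - (a + d + g);   // horizontal Prewitt response
    int gy = (g + h + i) - (a + b + c);   // vertical Prewitt response
    int mag = abs(gx) + abs(gy);
    out[y * width + x] = (unsigned char)(mag > 255 ? 255 : mag);
}

// Splits the image into horizontal chunks, one CUDA stream per chunk.
// For copies to overlap with kernels, hIn and hOut should be pinned
// (cudaMallocHost); pageable memory still gives correct results.
void prewittStreams(const unsigned char* hIn, unsigned char* hOut,
                    int width, int height, int nChunks)
{
    assert(width > 0 && height > 0 && nChunks > 0);
    int chunkRows = (height + nChunks - 1) / nChunks;
    size_t tileBytes = (size_t)(chunkRows + 2) * width;  // chunk plus two halo rows
    unsigned char *dIn = nullptr, *dOut = nullptr;
    CUDA_CHECK(cudaMalloc(&dIn, tileBytes * nChunks));
    CUDA_CHECK(cudaMalloc(&dOut, (size_t)width * height));
    std::vector<cudaStream_t> streams(nChunks);
    for (int c = 0; c < nChunks; ++c) CUDA_CHECK(cudaStreamCreate(&streams[c]));

    dim3 block(32, 32);  // block size reported in the abstract
    for (int c = 0; c < nChunks; ++c) {
        int r0 = c * chunkRows;
        int r1 = std::min(r0 + chunkRows, height);
        if (r0 >= r1) continue;                 // trailing empty chunk
        int top = std::max(r0 - 1, 0);          // halo row above, if any
        int bot = std::min(r1 + 1, height);     // halo row below, if any
        int tileRows = bot - top, outRows = r1 - r0;
        unsigned char* tile = dIn + (size_t)c * tileBytes;
        unsigned char* dChunk = dOut + (size_t)r0 * width;
        CUDA_CHECK(cudaMemcpyAsync(tile, hIn + (size_t)top * width,
                                   (size_t)tileRows * width,
                                   cudaMemcpyHostToDevice, streams[c]));
        dim3 grid((width + block.x - 1) / block.x, (outRows + block.y - 1) / block.y);
        prewittKernel<<<grid, block, 0, streams[c]>>>(tile, tileRows, r0 - top,
                                                      dChunk, outRows, width);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpyAsync(hOut + (size_t)r0 * width, dChunk,
                                   (size_t)outRows * width,
                                   cudaMemcpyDeviceToHost, streams[c]));
    }
    CUDA_CHECK(cudaDeviceSynchronize());
    for (int c = 0; c < nChunks; ++c) CUDA_CHECK(cudaStreamDestroy(streams[c]));
    CUDA_CHECK(cudaFree(dIn));
    CUDA_CHECK(cudaFree(dOut));
}
```

A caller would pass an 8-bit grayscale image in row-major order, for example `prewittStreams(in, out, width, height, 4)`. Giving each chunk its own device tile with halo rows keeps the streams from writing to shared input memory, at the cost of copying two extra rows per chunk.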
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.