
Near real time linear stereo cost aggregation on GPU (GPU近实时线性双目立体代价聚合)

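Cite this article:	Chen Bin, Chen Heping, Li Xiaohui. Near real time linear stereo cost aggregation on GPU[J]. Journal of Image and Graphics (中国图象图形学报), 2014, 19(10): 1481-1489.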

Authors:	Chen Bin, Chen Heping, Li Xiaohui

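Affiliations:	School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081; School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430074; School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081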

Funding:	National Natural Science Foundation of China (61105070)

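Abstract:	Objective: In recent years, research in stereo vision has increasingly turned to real-time strategies, and stereo cost aggregation is the most complex and most time-consuming step in stereo vision. This paper therefore proposes a near real-time stereo cost aggregation algorithm based on general-purpose GPU computing (GPGPU). Method: Linear stereo matching, a local algorithm whose matching accuracy approaches that of global matching algorithms, is chosen as the cost aggregation strategy. Following the principle of linear cost aggregation, the computation flow of its main steps (cost computation, mean filtering, and coefficient solving) is optimized specifically for parallel execution. Result: For the same experimental samples, the proposed method computes the cost matrix in less time on an NVIDIA GTX780 platform; compared with the original CPU implementation, the efficiency of cost aggregation improves by tens of times on average. Conclusion: The real-time stereo cost aggregation method provides an efficient and reliable way to obtain high-quality stereo depth information in real time on a general-purpose personal PC platform.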
Keywords:	stereo vision; cost aggregation; general-purpose GPU computing; parallel computing
Received:	2014-01-13
Revised:	2014-06-05
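
The three steps named in the abstract (cost computation, mean filtering, and coefficient solving) can be illustrated with a minimal single-scanline sketch. It assumes a guided-filter-style linear model, in which the cost inside each window is approximated as `a * guide + b`; this page does not reproduce the paper's equations, so the formulas, the absolute-difference cost, the clamped borders, and all function names here are illustrative assumptions rather than the authors' implementation.

```python
def mean_1d(values, radius):
    """Mean filter over a 1-D signal with clamped (replicated) borders."""
    n = len(values)
    width = 2 * radius + 1
    return [
        sum(values[min(max(i + d, 0), n - 1)] for d in range(-radius, radius + 1)) / width
        for i in range(n)
    ]


def absdiff_cost(left, right, disparity):
    """Step 1 (assumed cost): absolute intensity difference at one disparity."""
    return [abs(left[x] - right[max(x - disparity, 0)]) for x in range(len(left))]


def aggregate_linear(guide, cost, radius, eps):
    """Steps 2 and 3: window means, then per-window coefficients a and b."""
    mean_i = mean_1d(guide, radius)
    mean_c = mean_1d(cost, radius)
    mean_ii = mean_1d([g * g for g in guide], radius)
    mean_ic = mean_1d([g * c for g, c in zip(guide, cost)], radius)

    # a = cov(guide, cost) / (var(guide) + eps), b = mean(cost) - a * mean(guide)
    a = [
        (ic - i * c) / (ii - i * i + eps)
        for i, c, ii, ic in zip(mean_i, mean_c, mean_ii, mean_ic)
    ]
    b = [c - ai * i for ai, i, c in zip(a, mean_i, mean_c)]

    # Average the coefficients of all windows covering each pixel.
    mean_a = mean_1d(a, radius)
    mean_b = mean_1d(b, radius)
    return [ma * g + mb for ma, mb, g in zip(mean_a, mean_b, guide)]
```

Every quantity above is either a pointwise product or a mean filter, which is why the mean filter dominates the running time and is the natural target for parallel optimization. `eps` is a regularization constant that keeps the division stable in flat image regions.
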
Near real time linear stereo cost aggregation on GPU

Chen Bin, Chen Heping and Li Xiaohui. Near real time linear stereo cost aggregation on GPU[J]. Journal of Image and Graphics, 2014, 19(10): 1481-1489.

Authors:	Chen Bin, Chen Heping and Li Xiaohui

Affiliation:	School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China; School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430074, China; School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China

Abstract:	Objective: Practical stereo vision depends on approaches that are feasible for real-time or hardware implementation. Cost aggregation, the most complex part of the stereo matching algorithm, substantially affects the overall running time. This study therefore proposes a novel parallelization strategy that maps stereo cost aggregation onto graphics processing units (GPUs) using the compute unified device architecture (CUDA). Method: The linear stereo matching algorithm is selected as the stereo cost aggregation strategy in the proposed approach. Linear stereo matching, a local method with constant complexity, produces disparity maps whose accuracy approaches that of global disparity optimization methods. Although its computational complexity is considerably lower than that of most global approaches, linear stereo matching, even when optimized by some effective strategies, still falls short of the real-time or near real-time performance required by practical applications. The parallelization strategy introduced in this study is based on a separable filter with linear complexity in the filter window size and with proven efficiency on GPU platforms. The computation for each step of the cost aggregation (cost computation, mean filter, and coefficient computation) is reformulated, and each type of GPU memory is used where it is most appropriate. The study proposes several parallelization optimizations to increase the degree of parallelism and the data throughput. With these optimizations, the computation of each CUDA thread is independent of all other threads, which maximizes the degree of parallelism. The optimizations also reduce the per-thread complexity from an exponential to a linear relationship with the window radius, further improving efficiency. Memory access efficiency and data throughput are also dramatically improved in the final implementation, in which data is cached in texture or shared memory in certain circumstances. The experimental results show that the proposed strategy is effective and efficient. Result: Stereo cost aggregation is dramatically accelerated by exploiting the parallel computation performance of GPUs. Compared with the original CPU implementation accelerated by the integral image technique, the CUDA implementation on an NVIDIA GTX780 GPU produces, on the same stereo image pairs, an accurate cost matrix in a significantly shorter running time (less than 80 ms) and improves the average efficiency tenfold. The approach also outperforms other real-time or near real-time stereo cost aggregation implementations on GPUs. Conclusion: The proposed approach outperforms previous constant-time stereo solutions and produces accurate results comparable with those of adaptive weight aggregation on GPUs with CUDA. It also provides an efficient and feasible method to obtain an accurate disparity map on general PC platforms in real time.

Keywords:	stereo vision; cost aggregation; general purpose GPU; parallel computing
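
The separable filter mentioned in the abstract can be illustrated with a small CPU-side sketch. A 2-D mean filter is split into a horizontal pass followed by a vertical pass, so each output value costs on the order of `2 * (2r + 1)` additions rather than `(2r + 1)^2`. The function names and the clamped-border handling are assumptions made for illustration; the paper's CUDA kernels are not shown on this page, and this pure-Python version only demonstrates the decomposition, not the GPU mapping.

```python
def box_filter_1d(row, radius):
    """1-D mean filter with clamped borders: O(radius) work per element."""
    n = len(row)
    width = 2 * radius + 1
    out = []
    for i in range(n):
        total = 0.0
        for d in range(-radius, radius + 1):
            total += row[min(max(i + d, 0), n - 1)]
        out.append(total / width)
    return out


def mean_filter_separable(img, radius):
    """Horizontal pass over rows, then vertical pass over columns."""
    rows = [box_filter_1d(r, radius) for r in img]
    cols = [box_filter_1d(list(c), radius) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


def mean_filter_direct(img, radius):
    """Reference 2-D window sum: O(radius^2) work per pixel."""
    h, w = len(img), len(img[0])
    area = (2 * radius + 1) ** 2
    out = []
    for y in range(h):
        line = []
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            line.append(total / area)
        out.append(line)
    return out
```

Both functions produce the same result up to floating-point rounding; the separable form is the one whose per-output work grows linearly with the window radius. In a GPU setting, each output pixel of a pass would typically be assigned to its own thread, since no output depends on another output of the same pass.
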
