
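Remote sensing image classification combining deep learning and conditional random fields
Cite this article: Xia Meng, Cao Guo, Wang Guangya, Shang Yanfeng. Remote sensing image classification combining deep learning and conditional random fields[J]. Journal of Image and Graphics (中国图象图形学报), 2017, 22(9): 1289-1301.
Authors: Xia Meng (夏梦)  Cao Guo (曹国)  Wang Guangya (汪光亚)  Shang Yanfeng (尚岩峰)
Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; The Third Research Institute of the Ministry of Public Security, Shanghai 201204
Funding: National Natural Science Foundation of China (61371168); Jiangsu Provincial Science and Technology Support Program (BE2014646); Nanjing Science and Technology Program (201505026)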
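Abstract: Objective To further improve the classification accuracy of remote sensing imagery, a new classification method is proposed that combines two models, a convolutional neural network (CNN) and a conditional random field (CRF). Method First, a CNN preclassifies the remote sensing image, and its class membership probabilities define the unary potential of the CRF model. Next, a linear combination of Gaussian kernels defines the pairwise potential of the CRF, with a fully connected neighborhood replacing the usual 4- or 8-neighborhood. A regional constraint is then added: Mean-shift segmentation produces superpixels, and the mean posterior probability of each superpixel is used to correct the per-pixel classification results, which encourages consistent labels within connected regions. Finally, a mean-field approximation algorithm performs inference for the whole model. Result Land cover classification experiments were run on three high-resolution remote sensing images. The proposed method suppresses more classification noise while also reducing oversmoothing and preserving the edges of land cover objects. Four metrics are used for quantitative analysis: per-class accuracy, overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. Compared with support vector machine (SVM), CNN, and fully connected CRF, all accuracies improve markedly: AA by 3.28 percentage points, OA by 3.22 percentage points, and Kappa by 5.07 percentage points. Conclusion Fusing the CNN and CRF models yields intrinsic pixel features while also taking the spatial context of the image into account, which makes the classification more accurate; the added constraint further preserves the local information of land cover objects. The method is suited to remote sensing image classification and is accurate and effective.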
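
The four metrics named in the abstract can all be derived from a confusion matrix. The sketch below is a minimal illustration and not the authors' evaluation code. It assumes that rows are reference classes and columns are predicted classes, that per-class accuracy means the fraction of a class's reference samples labeled correctly, and that Kappa is the standard Cohen's kappa. The function names `confusion_matrix` and `accuracy_metrics` are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples; rows are reference classes, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy_metrics(cm):
    """Return (class_acc, oa, aa, kappa) for a square confusion matrix.

    Assumption: every class has at least one reference sample,
    otherwise its per-class accuracy is reported as 0.0.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]

    # Overall accuracy: correctly labeled samples over all samples.
    oa = sum(cm[i][i] for i in range(n)) / total

    # Per-class accuracy and its unweighted mean (average accuracy).
    class_acc = [cm[i][i] / row_sums[i] if row_sums[i] else 0.0
                 for i in range(n)]
    aa = sum(class_acc) / n

    # Cohen's kappa: agreement corrected for chance agreement pe.
    pe = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 1.0
    return class_acc, oa, aa, kappa
```

For example, `accuracy_metrics(confusion_matrix(reference_labels, predicted_labels, 3))` would evaluate a three-class label map once both label lists are flattened to one entry per test pixel. The names `reference_labels` and `predicted_labels` are placeholders.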

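Keywords: remote sensing image classification  deep learning  convolutional neural network  conditional random field  potential function  regional constraint
Received: 2017/3/31
Revised: 2017/5/26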

Remote sensing image classification based on deep learning and conditional random fields
Xia Meng, Cao Guo, Wang Guangya and Shang Yanfeng. Remote sensing image classification based on deep learning and conditional random fields[J]. Journal of Image and Graphics, 2017, 22(9): 1289-1301.
Authors: Xia Meng  Cao Guo  Wang Guangya  Shang Yanfeng
Affiliation: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; The Third Research Institute of the Ministry of Public Security, Shanghai 201204, China
Abstract: Objective Remote sensing image classification uses computers to analyze the spectral and spatial information of the land cover objects in a remote sensing image, divides the feature space into non-overlapping subspaces, and assigns each pixel to one of those subspaces. In computer vision, the same task, assigning a predefined semantic label to each pixel of an image, is called "semantic segmentation." The rapid development of computing, aerospace, and sensor technology in recent years has produced many ways to acquire different types of remote sensing image data, and the classification of high-resolution remote sensing imagery has gained considerable attention as an important part of remote sensing technology. This study proposes a novel image classification method that combines a fully connected conditional random field (CRF) model with a convolutional neural network (CNN). The two models are merged so that their respective advantages further improve classification accuracy for remote sensing images.
Method On the one hand, most traditional classification methods rely on hand-crafted features designed from human experience to describe the training samples, and after learning they yield a single-layer feature without a hierarchical structure. These methods generally have shallow structures and produce relatively simple features. By contrast, deep learning, a newer research direction in machine learning, transforms the feature representation of training samples from the original space into a new feature space layer by layer and automatically learns a hierarchical feature representation, which benefits classification and feature visualization. In recent years, deep learning has achieved significant breakthroughs in computer vision applications such as visual recognition challenges, image classification, and object detection. As one of its representative models, the CNN has been widely used in pattern recognition because it avoids complex image preprocessing. We use a CNN in place of traditional classification methods to obtain the essential features of the input image. On the other hand, traditional classification methods are based on the spectral statistics of pixels and are therefore known as pixel-wise classification methods. They analyze the spectral information of each pixel individually with an algorithm such as support vector machine (SVM), maximum likelihood classification, the minimum distance method, decision trees, or k-means clustering. Because they ignore the rich spatial contextual information of images, these methods typically produce high classification errors. To solve this problem, we draw on probabilistic graphical models, a research hot spot in machine learning and pattern recognition. With such a model, researchers can use not only Bayesian probability theory but also mature graph theory to handle contextual information. The CRF, a prominent probabilistic graphical model, was proposed by Lafferty et al. in 2001 for processing 1D sequence data. It can incorporate spatial contextual information from both the labels and the observed data, and its distinguishing property is that it models the posterior distribution directly. The early CRF model was mainly used in natural language processing and speech recognition and was successfully applied to image processing by Kumar and Hebert in 2003. Although CRF models have been studied extensively, the conventional CRF still suffers from oversmoothing. We therefore add a regional restriction (RR) that enhances the consistency of the classification results within connected areas and protects the edge structure of land cover objects. In summary, the proposed method proceeds as follows. We preclassify the entire remote sensing image into land cover types with the CNN and use the resulting class membership probabilities as the unary potential of the CRF model. The pairwise potential of the CRF is defined by a linear combination of Gaussian kernels, which forms a fully connected neighborhood structure instead of the common four-neighbor or eight-neighbor structure. RR is then incorporated into the framework to promote the consistency of connected areas: the mean shift algorithm produces superpixels, and the classification results are corrected with the average posterior probability of each superpixel. Finally, an efficient approximate inference algorithm, mean field inference, is used for the final model.
Result Experiments on three different remote sensing images show that the proposed classification framework performs competitively in both quantitative and qualitative terms: it effectively alleviates salt-and-pepper classification noise, reduces oversmoothing, and protects the edge structure of land cover objects. The quantitative analysis uses class accuracy, overall classification accuracy (OA), average classification accuracy (AA), and the kappa coefficient. Compared with SVM, CNN, and fully connected CRF, the proposed method achieves significantly higher accuracies: AA increases by 3.28 percentage points, OA by 3.22 percentage points, and the kappa coefficient by 5.07 percentage points.
Conclusion Traditional classification methods have two shortcomings. The first is insufficient feature extraction, which leads to inaccurate classification results. The second is that pixel-based methods consider only the information of single points and disregard the mutual influence of surrounding points. The combination of CNN and CRF not only obtains the essential characteristics of pixels but also considers the contextual information of the image, so our method achieves accurate classification results. Moreover, integrating RR protects the edge structure of land cover objects and yields satisfactory classification performance. The proposed method is accurate and effective and can be used in remote sensing image classification.
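
The regional restriction step described in the abstract, correcting per-pixel results with the average posterior probability of each superpixel, might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the exact correction rule, so the sketch simply replaces each pixel's posterior with the mean over its superpixel. The superpixel ids are assumed to come from a separate mean shift segmentation, which is not shown, and the function name `region_restriction` is hypothetical.

```python
def region_restriction(probs, segments):
    """Average class posteriors inside each superpixel.

    probs:    one list of class probabilities per pixel (flattened image).
    segments: one superpixel id per pixel, e.g. from mean shift (assumed given).
    Returns (refined_probs, labels), both with one entry per pixel.
    """
    n_classes = len(probs[0])
    sums, counts = {}, {}

    # Accumulate the per-class probability sum and pixel count per superpixel.
    for p, s in zip(probs, segments):
        acc = sums.setdefault(s, [0.0] * n_classes)
        for k in range(n_classes):
            acc[k] += p[k]
        counts[s] = counts.get(s, 0) + 1

    # Mean posterior of every superpixel.
    means = {s: [v / counts[s] for v in acc] for s, acc in sums.items()}

    # Each pixel takes its superpixel's mean posterior and the
    # corresponding most probable class.
    refined = [list(means[s]) for s in segments]
    labels = [max(range(n_classes), key=m.__getitem__) for m in refined]
    return refined, labels
```

With this rule, a pixel whose own posterior disagrees with the rest of its superpixel is relabeled to the superpixel's dominant class, which is one simple way to encourage the consistency within connected areas that the abstract attributes to RR.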
Keywords:remote sensing image classification  deep learning  convolutional neural network (CNN)  conditional random fields (CRFs)  potential function  regional restriction (RR)