
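3D U-Net with dual attention mechanism for lung tumor segmentation
Cite this article:Hao Xiaoyu,Xiong Junfeng,Xue Xudong,Shi Jun,Wen Ke,Han Wenting,Li Xiaoyang,Zhao Jun,Fu Xiaolong. 3D U-Net with dual attention mechanism for lung tumor segmentation[J]. Journal of Image and Graphics, 2020, 25(10): 2119-2127
Authors:Hao Xiaoyu  Xiong Junfeng  Xue Xudong  Shi Jun  Wen Ke  Han Wenting  Li Xiaoyang  Zhao Jun  Fu Xiaolong
Affiliation:School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026;School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240;Tencent HealthCare, Shanghai 200000;Department of Radiation Oncology, The First Affiliated Hospital of University of Science and Technology of China, Hefei 230001;Department of Radiation Oncology, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai 200030
Funding:National Key Research and Development Program of China (2016YFB1000403);Fundamental Research Funds for the Central Universities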
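Abstract:Objective Precise lung tumor segmentation is important for lung cancer diagnosis, surgical planning, and radiotherapy. Computed tomography (CT) is the most important auxiliary tool in the diagnosis and treatment of lung cancer, but reading the images is labor-intensive work that depends on the physician's subjective experience and can easily lead to unstable diagnostic results. Fast, stable, and accurate automatic lung tumor segmentation is therefore an active research topic. With the development of deep learning, convolutional neural networks have become the mainstream approach to automatic lung tumor segmentation. To address the insufficient accuracy of 3D U-Net and its tendency to produce false positives, this paper designs and implements a 3D convolutional neural network, DAU-Net (dual attention U-Net). Method The data are first preprocessed: the in-slice pixel spacing of the CT images is adjusted, the window width and window level are set, and redundant information is removed from the CT images by cropping. DAU-Net uses 3D U-Net as its base structure, replaces every two adjacent convolutional layers with a residual structure, and inserts a position attention module and a channel attention module, connected in parallel, between the contracting path and the expanding path. At prediction time, connected component analysis is applied as postprocessing to the binary image output by the network: all connected components are obtained by checking the connectivity between each pixel and its 26 surrounding pixels, and every region other than the largest connected component is removed, which further improves segmentation accuracy. Result The experimental data come from Shanghai Chest Hospital and comprise 1010 lung cancer patients in total. Each case contains only one lesion, professional radiologists provided the gold standard, and 10-fold cross-validation was used. The results show that, compared with 3D U-Net, the proposed lung tumor segmentation algorithm improves the Dice coefficient and the Hausdorff distance by 2.5% and 9.7%, respectively, and reduces the false positive rate by 13.6%. Conclusion The proposed algorithm effectively improves the accuracy of lung tumor segmentation and helps achieve fast, stable, and accurate segmentation of lung cancer.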

Keywords:U-Net  computed tomography (CT)  lung tumor  segmentation  attention mechanism
Received:2020-06-10
Revised:2020-07-07

3D U-Net with dual attention mechanism for lung tumor segmentation
Hao Xiaoyu,Xiong Junfeng,Xue Xudong,Shi Jun,Wen Ke,Han Wenting,Li Xiaoyang,Zhao Jun,Fu Xiaolong. 3D U-Net with dual attention mechanism for lung tumor segmentation[J]. Journal of Image and Graphics, 2020, 25(10): 2119-2127
Authors:Hao Xiaoyu  Xiong Junfeng  Xue Xudong  Shi Jun  Wen Ke  Han Wenting  Li Xiaoyang  Zhao Jun  Fu Xiaolong
Affiliation:School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China;School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;Tencent HealthCare, Co. Ltd., Shanghai 200000, China;Department of Radiation Oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230001, China; Department of Radiation Oncology, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai 200030, China
Abstract:Objective Precise lung tumor segmentation is a necessary step in computer-aided diagnosis, surgical planning, and radiotherapy of lung cancer. Computed tomography (CT) images are important auxiliary tools in clinical medicine. Diagnosing lung tumors is labor intensive: professional radiologists must carefully examine hundreds of CT slices to find and confirm the location of tumor lesions, and final reports need to be verified by other experienced radiologists. This process consumes time and effort. Because diagnosis depends on subjective experience, different doctors often reach different diagnoses for the same case, and the same doctor may make different decisions at different times. To address these problems, a growing number of researchers have combined artificial intelligence with medical imaging, and the automatic segmentation of lung tumors has been widely investigated. To address the problems that 3D U-Net is insufficiently accurate and is prone to producing false positive pixels, this paper proposes a new network named dual attention U-Net (DAU-Net) that incorporates dual attention mechanisms and residual modules. A postprocessing method based on connected component analysis is used to remove the false positive regions outside the region of interest. Method In accordance with the characteristics of lung CT images, we propose a three-step pipeline to preprocess the CT images. The first step standardizes the pixel spacing, because differing pixel spacings affect the speed and quality of network convergence during training. All 2D slices are 5 mm thick, and the in-plane resolution ranges from 0.607 mm to 0.976 mm. Linear interpolation is therefore applied to each CT slice to obtain a 1 mm in-plane resolution. The interpolated CT images remain 3D volumes.
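The in-plane resampling step described above can be illustrated with a minimal pure-Python sketch. The function names `resample_slice` and `resample_volume` are hypothetical, the volume is assumed to be a nested list indexed `[slice][row][column]`, and the coordinate mapping (output index times target spacing divided by source spacing) is one common convention rather than the authors' documented choice. A real pipeline would more likely use an array library.

```python
def resample_slice(slice_2d, spacing_mm, target_mm=1.0):
    """Bilinearly resample one 2D slice (list of rows) to target_mm spacing."""
    rows, cols = len(slice_2d), len(slice_2d[0])
    # Keep the physical extent (pixel count * spacing) approximately constant.
    new_rows = max(1, round(rows * spacing_mm / target_mm))
    new_cols = max(1, round(cols * spacing_mm / target_mm))
    out = []
    for i in range(new_rows):
        # Position of output row i expressed in input pixel coordinates.
        y = min(i * target_mm / spacing_mm, rows - 1)
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        wy = y - y0
        row = []
        for j in range(new_cols):
            x = min(j * target_mm / spacing_mm, cols - 1)
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            wx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = slice_2d[y0][x0] * (1 - wx) + slice_2d[y0][x1] * wx
            bottom = slice_2d[y1][x0] * (1 - wx) + slice_2d[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def resample_volume(volume, spacing_mm, target_mm=1.0):
    """Resample every slice in-plane; the slice count (and thickness) is unchanged."""
    return [resample_slice(s, spacing_mm, target_mm) for s in volume]
```

With an in-plane spacing below 1 mm, as in the dataset described here, the resampled slices come out smaller than the originals, while the number of slices stays the same.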
The window width and window level were then set to 1600 and -200, respectively; that is, pixel values in the CT image greater than 600 were set to 600 and those less than -1000 were set to -1000. The truncated intensity range [-1000, 600] was then linearly normalized to [0, 1], which enhances the regions of interest and helps the automatic segmentation of lesions. Because the original in-plane spacing is below 1 mm, resampling makes each CT image smaller than N×512×512, where N is the number of slices. After padding to N×512×512, the CT images and their corresponding annotations were cropped to a constant size of N×320×260, starting at the fixed coordinate (0, 90, 130) of the first slice, and interpolation was used to scale the images to 64×320×260. DAU-Net is built on the 3D form of U-Net: every two adjacent convolutional layers are replaced with a residual structure, and two attention mechanisms are added between the contraction path and the expansion path. The residual structures alleviate the degradation, vanishing gradients, and exploding gradients caused by increasing the depth of the neural network. Encoder-decoder networks such as U-Net use skip connections to merge high-resolution feature maps that carry position information with low-resolution feature maps that carry contextual information, which lets them capture targets of different scales. However, they cannot exploit the positional relationships between different objects across the whole image or the associations between different categories. To retain the advantages of the encoder-decoder structure while overcoming these limitations, a position attention module and a channel attention module, connected in parallel, are combined with 3D U-Net.
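The windowing and normalization described above can be written out as a short sketch. The function name `window_and_normalize` and the flat-list input are assumptions made for illustration. Only the width (1600), level (-200), and the resulting [-1000, 600] range come from the text.

```python
def window_and_normalize(values, level=-200.0, width=1600.0):
    """Clip CT intensities to the window and rescale them linearly to [0, 1]."""
    low = level - width / 2.0    # -1000 for the settings in the text
    high = level + width / 2.0   # 600 for the settings in the text
    # Clamp each value into [low, high], then map low -> 0 and high -> 1.
    return [(min(max(v, low), high) - low) / (high - low) for v in values]


# Values outside the window saturate at 0 or 1; the window level maps to 0.5.
print(window_and_normalize([-2000, -1000, -200, 600, 3000]))
```

Clipping before scaling keeps very dense or very sparse material from compressing the intensity range that the lung and lesion tissue occupy.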
The position attention module encodes wide-range context information into local features, and the channel attention module captures the dependencies between different channels, thereby strengthening interdependent features. The network supports end-to-end training and was trained in this work by optimizing the soft Dice loss. After inference, connected component analysis removes wrongly segmented false positive regions by keeping only the largest connected component and discarding the rest. Because this paper uses a 3D CNN (convolutional neural network), a 26-neighborhood connected component analysis is used, which checks the connectivity between a central voxel and its 26 adjacent voxels. The network output has two channels, and softmax maps the outputs to probabilities between zero and one. For binarization, the channel with the higher probability is selected at each voxel, and connected component analysis is applied to the resulting binary image. This postprocessing method effectively improves the segmentation accuracy and decreases the false positive rate (FPR). It relies on the premise that each case in our dataset contains only one lesion. Result We retrospectively collected data from patients in Shanghai Chest Hospital from 2013 to 2017. The study was approved by Shanghai Chest Hospital, Shanghai Jiao Tong University. Ethical approval (ID: KS 1716) was obtained for use of the CT images. Experienced radiologists provided the gold standard for each case. In the experiment, we compared DAU-Net with the standard 3D U-Net and a reimplemented 3D attention U-Net. All networks were evaluated with 10-fold cross-validation, and we adopted the widely used Dice coefficient, Hausdorff distance (HD), FPR, and true positive rate to evaluate the predicted outputs.
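The postprocessing step described above (26-neighborhood connected component analysis that keeps only the largest component) can be sketched with a breadth-first search. This is a minimal pure-Python illustration, not the authors' implementation. The name `keep_largest_component` and the nested-list mask indexed `[z][y][x]` are assumptions.

```python
from collections import deque
from itertools import product

# The 26 neighbors of a voxel: every offset in {-1, 0, 1}^3 except (0, 0, 0).
NEIGHBOR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def keep_largest_component(mask):
    """Return a copy of a binary [z][y][x] mask with only its largest
    26-connected foreground component set to 1."""
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    seen = set()
    largest = []
    for start in product(range(depth), range(height), range(width)):
        z, y, x = start
        if not mask[z][y][x] or start in seen:
            continue
        # Breadth-first search collects one connected component.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            cz, cy, cx = queue.popleft()
            component.append((cz, cy, cx))
            for dz, dy, dx in NEIGHBOR_OFFSETS:
                nz, ny, nx = cz + dz, cy + dy, cx + dx
                inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                if inside and mask[nz][ny][nx] and (nz, ny, nx) not in seen:
                    seen.add((nz, ny, nx))
                    queue.append((nz, ny, nx))
        if len(component) > len(largest):
            largest = component
    out = [[[0] * width for _ in range(height)] for _ in range(depth)]
    for z, y, x in largest:
        out[z][y][x] = 1
    return out
```

Keeping a single component is only appropriate under the stated premise of one lesion per case; with several lesions it would delete true positives.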
The results show that the proposed DAU-Net performs strongly on the lung tumor segmentation task, and that the postprocessing method effectively reduces the interference of false positive regions with the segmentation results. Compared with 3D U-Net, Dice and HD are improved by 2.5% and 9.7%, respectively, and FPR is reduced by 13.6%. Conclusion The proposed lung tumor segmentation algorithm effectively improves the accuracy of tumor segmentation and helps achieve rapid, stable, and accurate segmentation of lung cancer.
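For readers unfamiliar with the Dice coefficient reported above and the soft Dice loss used for training, the following sketch shows the standard definitions on flattened masks. The function names, the flat-list inputs, and the smoothing constant `eps` are illustrative assumptions. The abstract does not state which soft Dice variant the authors used.

```python
def dice_coefficient(pred, truth):
    """Dice overlap 2|A n B| / (|A| + |B|) for two flat binary (0/1) masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def soft_dice_loss(probs, truth, eps=1e-6):
    """One common soft Dice loss: the same ratio computed on foreground
    probabilities instead of hard labels, subtracted from 1."""
    intersection = sum(p * t for p, t in zip(probs, truth))
    return 1.0 - (2.0 * intersection + eps) / (sum(probs) + sum(truth) + eps)


# One of two predicted voxels overlaps one of two true voxels.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```

The soft version is differentiable with respect to the predicted probabilities, which is what makes it usable as a training objective.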
Keywords:U-Net  computed tomography(CT)  lung tumor  segmentation  attention mechanism