Funding: National Natural Science Foundation of China (62062003); Key Research and Development Program of Ningxia (2020BEB04022); Natural Science Foundation of Ningxia (2022AAC03149); Scientific Research Startup Project for Introduced Talents of North Minzu University (2020KYQD08)
Keywords: medical image segmentation; cross-modal semantics; contextual semantics; Transformer; U-Net
Received: 2022-04-14

C2 Transformer U-Net: A Medical Image Segmentation Model for Cross-modality and Contextual Semantics
ZHOU Tao, HOU Senbao, LU Huiling, LIU Yuncan, DANG Pei. C2 Transformer U-Net: A Medical Image Segmentation Model for Cross-modality and Contextual Semantics[J]. Journal of Electronics & Information Technology, 2023, 45(5): 1807-1816.
Authors:ZHOU Tao  HOU Senbao  LU Huiling  LIU Yuncan  DANG Pei
Affiliation: 1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; 2. School of Science, Ningxia Medical University, Yinchuan 750004, China; 3. Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
Abstract: Cross-modal medical images provide richer semantic information about the same lesion. The U-Net network mainly uses single-modal images for segmentation and does not fully exploit cross-modal and contextual semantic correlations. To address this, a cross-modal and contextual semantics-oriented medical image segmentation model, C2 Transformer U-Net, is proposed. The main ideas of the model are as follows. First, a backbone-plus-auxiliary U-Net structure is proposed in the encoder to extract semantic information from different modalities. Then, a Multi-modal Context semantic Awareness Processor (MCAP) is designed to effectively extract the semantic information of the same lesion across modalities; in the skip connections, the two modality images from the backbone network are summed and passed to the Transformer decoder, enhancing the model's ability to represent the lesion. Next, pre-activated residual units and a Transformer architecture are adopted in the encoder-decoder, which both extracts the contextual feature information of the lesion and makes the network pay more attention to the lesion's location while fully exploiting low-level and high-level features. Finally, the effectiveness of the algorithm is verified on a clinical multi-modal lung medical image dataset. Comparative experiments show that the proposed model achieves Acc, Pre, Recall, Dice, Voe, and Rvd of 97.95%, 94.94%, 94.31%, 96.98%, 92.57%, and 93.35%, respectively, for lung lesion segmentation. For lung lesions with complex shapes, it attains high accuracy with relatively low redundancy, and overall it outperforms existing state-of-the-art methods.
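Two of the building blocks named above — the pre-activation residual unit and the cross-modal skip-connection fusion by element-wise addition — can be sketched as follows. This is a minimal illustrative sketch in NumPy, not the authors' implementation; the function names, toy feature-map sizes, and random weights are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution on a (H, W) feature map."""
    h, wid = x.shape
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(wid):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * w)
    return out

def preact_residual_unit(x, w1, w2):
    """Pre-activation ordering: the activation comes BEFORE each weight
    layer (act -> conv -> act -> conv), then the identity shortcut is
    added back to the output."""
    out = conv3x3(relu(x), w1)
    out = conv3x3(relu(out), w2)
    return x + out

def fuse_cross_modal(feat_a, feat_b):
    """Skip-connection fusion: the feature maps of the two modalities
    are summed element-wise before being passed to the decoder."""
    return feat_a + feat_b

# Toy feature maps standing in for two modalities of the same lesion.
ct_feat = rng.standard_normal((8, 8))
pet_feat = rng.standard_normal((8, 8))
w1 = rng.standard_normal((3, 3)) * 0.1
w2 = rng.standard_normal((3, 3)) * 0.1

enc_a = preact_residual_unit(ct_feat, w1, w2)
enc_b = preact_residual_unit(pet_feat, w1, w2)
skip = fuse_cross_modal(enc_a, enc_b)  # would feed the Transformer decoder

print(skip.shape)  # (8, 8) — fusion preserves the spatial resolution
```

Element-wise addition keeps the channel and spatial dimensions unchanged, which is what lets the fused map slot into a standard U-Net skip connection without extra projection layers.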