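Text-based instance segmentation method with noun-guided local feature extraction
Cite this article:Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong. Text-based instance segmentation method with noun-guided local feature extraction[J]. Application Research of Computers (计算机应用研究), 2023, 40(4): 1263-1267.
Authors:Zheng Jian  Shen Shitao  Yu Xiangchun  Pang Qingwei  Wu Zongzong
Affiliation:School of Information Engineering, Jiangxi University of Science and Technology (all five authors)
Funding:Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ190468)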
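Abstract:Local feature information plays an important role in image segmentation. However, text-based instance segmentation depends on the input text expression, so local feature information cannot be extracted directly from the original input image. To address this problem, a noun-guided local feature extraction deep neural network model (NgLFNet) is proposed, which automatically mines the local feature information of the object to be segmented according to the key nouns in the input text expression. Specifically, the model first obtains the key nouns through sentence analysis. Second, it extracts the corresponding features with text and image encoders and uses the key nouns, through a multi-head attention mechanism, to obtain local features of high-attention regions. It then fuses the multimodal features step by step. Finally, the decoding and correction module uses the obtained local features to refine the predicted mask in finer detail, yielding the final result. The method is compared with several mainstream text-based instance segmentation methods, and the experimental results show that it improves segmentation performance.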

Keywords:image processing  deep learning  text-based instance segmentation  multimodal features  feature fusion  attention mechanism
Received:2022/7/18
Revised:2023/3/7

Referring image segmentation method based on local feature extraction guided by noun
Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong. Referring image segmentation method based on local feature extraction guided by noun[J]. Application Research of Computers, 2023, 40(4): 1263-1267.
Authors:Zheng Jian  Shen Shitao  Yu Xiangchun  Pang Qingwei  Wu Zongzong
Affiliation:School of Information Engineering, Jiangxi University of Science and Technology
Abstract:Local feature information plays an important role in image segmentation. However, the referring image segmentation task depends on the text expression, so local feature information cannot be extracted directly from the original input image. To solve this problem, this paper proposes a noun-guided local feature extraction deep neural network model (NgLFNet). The NgLFNet model automatically mines the local feature information of the object to be segmented according to the key nouns in the input text expression. Specifically, the model first obtains the key nouns in the text through sentence analysis. Second, it extracts the corresponding features through text and image encoders and uses the key nouns to obtain local features of high-interest regions through a multi-head attention mechanism. It then gradually fuses the multimodal features. Finally, the decoding and correction module uses the obtained local features to apply finer corrections to the predicted mask and produce the final result. The proposed method is compared with a variety of mainstream referring image segmentation methods, and the experimental results show that it improves accuracy on the text-based instance segmentation task.
Keywords:image processing  deep learning  referring image segmentation  multimodal features  feature fusion  attention mechanism
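
The noun-guided attention step described in the abstract can be illustrated with a minimal sketch. It is not the NgLFNet implementation: the function name `noun_guided_attention`, the toy vectors, and the choice to omit learned query/key/value projections are all assumptions made for illustration. A key-noun embedding acts as the query, the image region features act as both keys and values, and each head attends over its own slice of the channels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def noun_guided_attention(noun_vec, region_feats, num_heads=2):
    """Hypothetical sketch: pool region features using a key-noun query.

    noun_vec:     list of d floats (embedding of the key noun, the query)
    region_feats: list of N lists of d floats (image regions, keys = values)
    Returns (local_feature, head_weights), where local_feature has d floats
    and head_weights holds one length-N weight list per head.
    Assumption: no learned projections; each head uses a channel slice.
    """
    d = len(noun_vec)
    if d % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    head_dim = d // num_heads
    local_feature, head_weights = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        query = noun_vec[lo:hi]
        keys = [region[lo:hi] for region in region_feats]
        # Scaled dot-product score between the noun and every region.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(head_dim)
            for key in keys
        ]
        weights = softmax(scores)
        head_weights.append(weights)
        # Attention-weighted sum of region features for this head.
        pooled = [
            sum(w * key[j] for w, key in zip(weights, keys))
            for j in range(head_dim)
        ]
        local_feature.extend(pooled)
    return local_feature, head_weights


# Toy data (made up): region 0 is aligned with the noun embedding.
noun = [1.0, 0.0, 1.0, 0.0]
regions = [[4.0, 0.0, 4.0, 0.0], [0.0, 4.0, 0.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
feature, weights = noun_guided_attention(noun, regions, num_heads=2)
```

In this toy setup the region whose features align with the noun embedding receives the largest weight in each head, so the pooled vector is dominated by that region. A trained model would add learned projections and operate on encoder outputs rather than hand-written vectors.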