Referring image segmentation method based on local feature extraction guided by noun


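Cite this article:	Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong. Referring image segmentation method based on local feature extraction guided by noun[J]. Application Research of Computers (计算机应用研究), 2023, 40(4): 1263-1267.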

Authors:	Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong

Affiliation:	School of Information Engineering, Jiangxi University of Science and Technology (all five authors)

Funding:	Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ190468)

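Abstract:	Local feature information plays an important role in image segmentation. However, referring image segmentation depends on the input text expression, so local feature information cannot be extracted directly from the original input image. To address this problem, a noun-guided local feature extraction deep neural network model (NgLFNet) is proposed. NgLFNet automatically mines the local feature information of the object to be segmented according to the key nouns in the input text expression. Specifically, the model first obtains the key nouns through sentence analysis. Second, it extracts the corresponding features with text and image encoders and uses the key nouns, through a multi-head attention mechanism, to obtain local features of high-attention regions. It then progressively fuses the multimodal features. Finally, the decoding and correction module uses the obtained local features to refine the predicted mask in finer detail, yielding the final result. The method is compared with several mainstream referring image segmentation methods, and the experimental results show that it improves segmentation performance.
Keywords:	image processing; deep learning; referring image segmentation; multimodal features; feature fusion; attention mechanism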
Received:	2022/7/18
Revised:	2023/3/7
Referring image segmentation method based on local feature extraction guided by noun

Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong. Referring image segmentation method based on local feature extraction guided by noun[J]. Application Research of Computers, 2023, 40(4): 1263-1267.

Authors:	Zheng Jian, Shen Shitao, Yu Xiangchun, Pang Qingwei, Wu Zongzong

Affiliation:	School of Information Engineering, Jiangxi University of Science and Technology

Abstract:	Local feature information plays an important role in image segmentation. However, the referring image segmentation task depends on the input text expression, so local feature information cannot be extracted directly from the original input image. To solve this problem, this paper proposes a noun-guided local feature extraction deep neural network model (NgLFNet). The NgLFNet model automatically mines the local feature information of the object to be segmented according to the key nouns in the input text expression. Specifically, the model first obtains the key nouns in the text through sentence analysis. Second, it extracts the corresponding features through text and image encoders and uses the key nouns to obtain local features of high-attention regions through a multi-head attention mechanism. Third, it progressively fuses the multimodal features. Finally, the decoding and correction module uses the obtained local features to refine the predicted mask in finer detail and produce the final result. The proposed method is compared with a variety of mainstream referring image segmentation methods, and the experimental results show that it improves segmentation accuracy.

Keywords:	image processing; deep learning; referring image segmentation; multimodal features; feature fusion; attention mechanism
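
The noun-guided attention step mentioned in the abstract can be illustrated with a small sketch. This is not the NgLFNet implementation, which the abstract does not specify. It is a minimal pure-Python illustration that assumes the key noun has already been embedded as a vector (`noun_vec`) and that the image encoder has produced one feature vector per region (`region_feats`). The function name `noun_guided_attention`, the variable names, and the toy numbers are all hypothetical, and the learned query/key/value projections of a real multi-head attention layer are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def noun_guided_attention(noun_vec, region_feats, num_heads):
    """Attend over image regions using a key-noun embedding as the query.

    noun_vec:     list of d floats (hypothetical embedding of the key noun)
    region_feats: list of N lists of d floats (one vector per image region)
    num_heads:    number of heads; d must be divisible by it

    Returns (local_feat, head_weights):
      local_feat   - list of d floats, the per-head weighted sums concatenated
      head_weights - num_heads lists, each holding N attention weights
    Assumption: no learned projections; a trained model would apply
    query/key/value matrices before this step.
    """
    d = len(noun_vec)
    if d % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    head_dim = d // num_heads
    local_feat, head_weights = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        query = noun_vec[lo:hi]
        keys = [r[lo:hi] for r in region_feats]
        # Scaled dot-product score between the noun query and each region.
        scores = [
            sum(a * b for a, b in zip(query, k)) / math.sqrt(head_dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the region slices gives this head's local feature.
        pooled = [
            sum(w * k[i] for w, k in zip(weights, keys))
            for i in range(head_dim)
        ]
        local_feat.extend(pooled)
        head_weights.append(weights)
    return local_feat, head_weights


# Toy input: 3 regions with 4-dimensional features, split across 2 heads.
regions = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
]
noun = [2.0, 0.0, 2.0, 0.0]  # made-up query vector for illustration only
local_feat, head_weights = noun_guided_attention(noun, regions, num_heads=2)
```

In this toy input, the second head assigns its largest weight to the first region, whose features align best with the query in that slice. The concatenated `local_feat` plays the role of the "local features of high-attention regions" that the abstract says are later used to refine the predicted mask.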
