Scene Text Detection Based on Learning Active Center Contour Model

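Cite this article:	XIE Binhong, QIN Yaolong, ZHANG Yingjun. Scene Text Detection Based on Learning Active Center Contour Model[J]. Computer Engineering, 2022, 48(3): 244-252+262.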

Authors:	XIE Binhong, QIN Yaolong, ZHANG Yingjun

Affiliation:	School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China

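Funding:	Shanxi Province Key Research and Development Program (Key) High-Tech Field Project (201703D111027); Shanxi Province Key Research and Development Program Projects (201803D121048, 201803D121055).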

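Abstract:	In scene text detection, large variations in text size lead to missed small text, under-detection of large text, and boundary detection errors for multi-scale text. To address these problems, a scene text detection network based on a learning active center contour model is proposed. A multi-scale feature weight fusion model is built on the residual network ResNet; it performs multi-scale feature extraction and weight fusion on the input scene text image and computes the final fused feature map, accommodating the large variation in the aspect ratio of scene text. The fused feature map is then fed into the learning active center contour model to predict the center point and boundary of each text box. This model provides rich prior knowledge for scene text detection and addresses boundary detection errors in which multi-scale text detection boxes contain too much background or only partially enclose the text. Experimental results on the MSRA-TD500, IC13, IC15, and IC17MLT datasets show that the network improves the accuracy of multi-scale scene text detection: the F-measure is 0.83 on MSRA-TD500, 1% higher than the MSR method; 0.91 on IC13, 2% higher than the PixelLink network; 0.87 on IC15, 1% higher than the PSENet network; and 0.74 on IC17MLT, 1% higher than the TridentNet network.
Keywords:	scene text detection; multi-scale feature extraction; weight fusion; active contour model; learning active center contour model
Received:	2021-02-07
Revised:	2021-03-25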
Scene Text Detection Based on Learning Active Center Contour Model

XIE Binhong, QIN Yaolong, ZHANG Yingjun. Scene Text Detection Based on Learning Active Center Contour Model[J]. Computer Engineering, 2022, 48(3): 244-252+262.

Authors:	XIE Binhong, QIN Yaolong, ZHANG Yingjun

Affiliation:	School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China

Abstract:	In scene text detection, large fluctuations in text size cause small text to be missed, large text to be detected with insufficient precision, and the boundaries of multi-scale text to be detected incorrectly. To solve these problems, a scene text detection network based on a Learning Active Center Contour (LACC) model is proposed. First, a Multi-scale Feature Weight Fusion (MSWF) model is constructed on the basis of a Residual Network (ResNet) to extract multi-scale features from the input scene text images and fuse them by weight. The final feature fusion map is then calculated to adapt to the large variation in the aspect ratio of scene text. Finally, the feature fusion map is input into the LACC model to predict the center point and boundary of each text box. The LACC model provides rich prior knowledge for scene text detection and addresses the boundary detection errors caused by multi-scale text detection boxes that contain too much background or only partially enclose the text. Experimental results on the MSRA-TD500, IC13, IC15, and IC17 MLT datasets show that the network improves the accuracy of multi-scale scene text detection. The F-measure on the MSRA-TD500 dataset is 0.83, which is 1% higher than the MSR method. The F-measure on the IC13 dataset is 0.91, which is 2% higher than the PixelLink network. The F-measure on the IC15 dataset is 0.87, which is 1% higher than PSENet. The F-measure on the IC17 MLT dataset is 0.74, which is 1% higher than TridentNet.

Keywords:	scene text detection; multi-scale feature extraction; weight fusion; active contour model; Learning Active Center Contour (LACC) model
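
The F-measure values quoted in the abstract are, by the usual definition, the harmonic mean of detection precision and recall. The sketch below illustrates only that arithmetic. The abstract does not state how detections are matched to ground truth on each dataset, so the function names and the counts used here are hypothetical and are not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 score)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision + recall == 0:
        return 0.0  # convention: no correct detections at all
    return 2 * precision * recall / (precision + recall)


def detection_scores(true_positives: int, num_detections: int, num_ground_truth: int):
    """Return (precision, recall, F-measure) from raw match counts.

    Assumption: the caller has already decided which detected boxes count
    as correct matches; that matching rule is dataset-specific.
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical counts, for illustration only (not results from the paper).
p, r, f = detection_scores(true_positives=80, num_detections=100, num_ground_truth=90)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F-measure by trading away recall for precision or the reverse.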
This article is indexed in databases including VIP and Wanfang Data.