Scene Text Detection Based on Learning Active Center Contour Model
Citation:XIE Binhong,QIN Yaolong,ZHANG Yingjun.Scene Text Detection Based on Learning Active Center Contour Model[J].Computer Engineering,2022,48(3):244-252+262.
Authors:XIE Binhong  QIN Yaolong  ZHANG Yingjun
Affiliation:School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
Funding:Key Research and Development Program of Shanxi Province (Key), High-Tech Field Project (201703D111027); Key Research and Development Program of Shanxi Province (201803D121048, 201803D121055).
Received:2021-02-07
Revised:2021-03-25
Abstract:In scene text detection, large variation in text size leads to missed small text, under-detection of large text, and boundary errors on multi-scale text. To address these problems, a scene text detection network based on a Learning Active Center Contour (LACC) model is proposed. First, a Multi-scale Feature Weight Fusion (MSWF) model is built on a Residual Network (ResNet). It extracts multi-scale features from the input scene text image, fuses them by weight, and computes the final fused feature map, which accommodates the large variation in the aspect ratio of scene text. The fused feature map is then fed into the LACC model, which predicts the center point and boundary of each text box. The LACC model supplies rich prior knowledge for scene text detection and thereby addresses boundary errors in which a multi-scale text detection box contains too much background or only partially encloses the text. Experimental results on the MSRA-TD500, IC13, IC15, and IC17 MLT datasets show that the network improves the accuracy of multi-scale scene text detection. The F-measure is 0.83 on MSRA-TD500, 1% higher than the MSR method; 0.91 on IC13, 2% higher than the PixelLink network; 0.87 on IC15, 1% higher than PSENet; and 0.74 on IC17 MLT, 1% higher than TridentNet.

Keywords:scene text detection  multi-scale feature extraction  weight fusion  active contour model  Learning Active Center Contour (LACC) model
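
Note on the metric: the results above are reported as F-measure, the harmonic mean of precision and recall. The sketch below shows how that value follows from detection counts. The function names and the counts in the usage note are illustrative assumptions, not figures from the paper. How a detected box is matched to a ground-truth box (for example, by an overlap threshold) depends on each dataset's evaluation protocol and is not shown.

```python
from typing import Tuple


def f_measure(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        # No correct detections at all: define the score as 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_scores(
    true_positives: int, num_detections: int, num_ground_truth: int
) -> Tuple[float, float, float]:
    """Compute (precision, recall, F-measure) from match counts.

    true_positives   : detected boxes matched to a ground-truth box
    num_detections   : all boxes the detector produced
    num_ground_truth : all annotated text boxes
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, f_measure(precision, recall)
```

With hypothetical counts, `detection_scores(8, 10, 16)` gives a precision of 0.8 and a recall of 0.5. Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high F-measure by producing tight boxes on only a few text instances, or by covering every instance with many false boxes.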
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).