
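Research on a scene text detection algorithm based on deep learning
Cite this article: XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan, ZENG Chunyan. Research on a scene text detection algorithm based on deep learning[J]. Journal of Optoelectronics·Laser, 2021, 32(7): 728-734.
Authors: XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan, ZENG Chunyan
Affiliations: 1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; 2. Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA
Funding: National Natural Science Foundation of China (61571182, 61601177); Natural Science Foundation of Hubei Province (2019CFB530); Major Special Project of the Hubei Provincial Department of Science and Technology (2019ZYYD020); China Scholarship Council (201808420418)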
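Abstract: To address the information loss that occurs when text lines of arbitrary shape in natural scene images are hard to separate, a scene text detection algorithm based on deep learning is proposed. First, a feature extraction module is built: ResNet50 serves as the backbone network, and a parallel dilated convolution module is introduced into a pyramid network structure augmented with cross-layer connections, in order to extract more semantic information. Second, multi-scale feature fusion is applied to the resulting feature maps to learn features at different scales. Finally, text instances with kernels of different sizes are predicted, and the text line regions are gradually enlarged through scale expansion until the final detection result is obtained. Experimental results show that on the SCUT-CTW1500 curved text dataset the method reaches a precision, recall, and F1 score of 88.5%, 77.0%, and 81.3%, respectively. Compared with other segmentation-based algorithms, it detects curved text well and has practical application value.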
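The final step of the abstract (growing text regions outward from small kernels to larger ones) can be illustrated with a breadth-first expansion. This is a minimal sketch, not the authors' implementation: it assumes the kernels are given as binary grids ordered from smallest to largest, that 4-connectivity is used, and that a contested pixel goes to whichever instance reaches it first. The function name `progressive_scale_expansion` and the toy grids are hypothetical.

```python
from collections import deque

# 4-connected neighbourhood (an assumption of this sketch).
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def _grow(labels, mask, queue):
    """Breadth-first growth of existing labels into unlabeled pixels of `mask`."""
    height, width = len(mask), len(mask[0])
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < height and 0 <= nx < width
            if inside and mask[ny][nx] and labels[ny][nx] == 0:
                labels[ny][nx] = labels[y][x]  # first instance to arrive wins
                queue.append((ny, nx))


def progressive_scale_expansion(kernels):
    """Return (label map, instance count) for kernels ordered smallest first."""
    height, width = len(kernels[0]), len(kernels[0][0])
    labels = [[0] * width for _ in range(height)]

    # Step 1: each connected component of the smallest kernel seeds one instance.
    count = 0
    for r in range(height):
        for c in range(width):
            if kernels[0][r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                _grow(labels, kernels[0], deque([(r, c)]))

    # Step 2: expand all instances simultaneously into each larger kernel.
    for kernel in kernels[1:]:
        seeds = deque(
            (r, c) for r in range(height) for c in range(width) if labels[r][c]
        )
        _grow(labels, kernel, seeds)
    return labels, count


# Toy example: two seeds whose larger kernels touch each other.
small = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
large = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
labels, count = progressive_scale_expansion([small, large])
print(count)      # 2
print(labels[1])  # [1, 1, 1, 1, 2, 2, 2]
```

In the toy example the larger kernel is a single connected strip, so plain connected-component labeling on it would merge both text lines into one. Seeding from the smallest kernel keeps them as two instances.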

Keywords: scene text detection; deep learning; feature extraction; multi-scale feature fusion; dilated spatial pyramid
Received: 2020/12/26

Research on scene text detection algorithm based on deep learning
XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan and ZENG Chunyan. Research on scene text detection algorithm based on deep learning[J]. Journal of Optoelectronics·Laser, 2021, 32(7): 728-734.
Authors: XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan and ZENG Chunyan
Affiliation: School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA
Abstract: To solve the problem of information loss in detecting arbitrarily shaped text in natural scenes, a scene text detection algorithm based on deep learning is proposed. First, a feature extraction module is constructed: ResNet50 is used as the backbone network, and a parallel dilated convolution module is introduced into a pyramid network structure with cross-layer connections to extract more semantic information. Second, multi-scale feature fusion is performed on the resulting feature maps to learn features at different scales. Finally, text-line kernels of different sizes are predicted, and the predicted text-line area is gradually expanded until the final detection result is obtained. Experimental results show that the precision, recall, and F1-measure of the model on the SCUT-CTW1500 curved text dataset reach 88.5%, 77.0%, and 81.3%, respectively. Compared with other segmentation-based algorithms, this algorithm detects curved text well and therefore has practical application value.
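The parallel dilated convolution module mentioned in the abstract can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the paper's module: it uses a single shared 3-tap filter, the dilation rates 1, 2, and 4, zero padding, and element-wise summation to fuse the branches. The abstract does not specify any of these choices, and the names `dilated_conv1d` and `parallel_dilated_block` are hypothetical.

```python
def dilated_conv1d(signal, weights, rate):
    """Zero-padded, same-length 1D convolution whose taps are `rate` apart."""
    half = len(weights) // 2
    output = []
    for i in range(len(signal)):
        total = 0.0
        for k, w in enumerate(weights):
            j = i + (k - half) * rate  # tap position stretched by the rate
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                total += w * signal[j]
        output.append(total)
    return output


def parallel_dilated_block(signal, weights, rates=(1, 2, 4)):
    """Run one branch per dilation rate on the same input and sum the results."""
    branches = [dilated_conv1d(signal, weights, rate) for rate in rates]
    return [sum(values) for values in zip(*branches)]


# A unit impulse shows which input positions each branch can reach.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
fused = parallel_dilated_block(impulse, [1, 1, 1])
print(fused)  # [1.0, 0.0, 1.0, 1.0, 3.0, 1.0, 1.0, 0.0, 1.0]
```

A 3-tap filter with dilation rate r spans 2r + 1 input positions, so the rate-4 branch reaches the ends of the 9-sample signal without adding weights. That is the sense in which parallel branches with different rates gather context at several scales.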
Keywords: scene text detection; deep learning; feature extraction; multi-scale feature fusion; dilated spatial pyramid
This article is indexed in Wanfang Data and other databases.