
Research on a Scene Text Detection Algorithm Based on Deep Learning

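Cite this article:	XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan, ZENG Chunyan. Research on a scene text detection algorithm based on deep learning[J]. Journal of Optoelectronics·Laser, 2021, 32(7): 728-734.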

Authors:	XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan, ZENG Chunyan

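Affiliations:	School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA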

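Funding:	Supported by the National Natural Science Foundation of China (61571182, 61601177), the Natural Science Foundation of Hubei Province (2019CFB530), the Major Special Project of the Department of Science and Technology of Hubei Province (2019ZYYD020), and the State Scholarship Fund of China (201808420418).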

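Abstract:	Arbitrarily shaped text in natural scene images is prone to information loss because adjacent text lines are hard to tell apart. To address this, a scene text detection algorithm based on deep learning is proposed. First, a feature extraction module is built: ResNet50 serves as the backbone network, and a parallel dilated convolution module is introduced into a pyramid network structure with added cross-layer connections to extract more semantic information. Second, multi-scale feature fusion is applied to the resulting feature maps to learn features at different scales. Finally, text instances are predicted at several kernel sizes, and the text line regions are gradually enlarged through scale expansion until the final detection result is obtained. Experimental results show that on the SCUT-CTW1500 curved text dataset the method reaches a precision of 88.5%, a recall of 77.0%, and an F1 score of 81.3%. Compared with other segmentation-based algorithms, it detects curved text well and has some practical application value.
Keywords:	scene text detection; deep learning; feature extraction; multi-scale feature fusion; dilated spatial pyramid
Received:	2020/12/26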
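
The abstract refers to a parallel dilated convolution module. As a rough illustration only, and not the authors' network, the sketch below applies one 3-tap kernel to a 1-D signal at several dilation rates in parallel and fuses the branches by summation. It shows how the receptive field widens without adding weights. The function names, the rates (1, 2, 4), the all-ones kernel, and the summation fusion are assumptions chosen for illustration; the record does not specify any of them.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1-D cross-correlation with zero padding and a dilation rate."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps sit `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def parallel_dilated_branches(signal, kernel, rates=(1, 2, 4)):
    """Run one branch per dilation rate and fuse them by element-wise sum.

    The rates and the sum fusion are illustrative assumptions.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    fused = [sum(vals) for vals in zip(*branches)]
    return branches, fused


# A unit impulse makes each branch's footprint visible directly.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
branches, fused = parallel_dilated_branches(impulse, [1, 1, 1])
for rate, branch in zip((1, 2, 4), branches):
    print(rate, receptive_field(3, rate), branch)
```

With the same three weights, the rate-4 branch responds to samples four positions away from the impulse, while the rate-1 branch only sees immediate neighbours.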
Research on scene text detection algorithm based on deep learning

XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan and ZENG Chunyan. Research on scene text detection algorithm based on deep learning[J]. Journal of Optoelectronics·Laser, 2021, 32(7): 728-734.

Authors:	XIONG Wei, AI Meihui, YANG Dichun, LI Lirong, LIU Min, WANG Juan and ZENG Chunyan

Affiliation:	School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA

Abstract:	To address the information loss that occurs when text lines are hard to separate in arbitrarily shaped text in natural scenes, a scene text detection algorithm based on deep learning is proposed. First, a feature extraction module is constructed: ResNet50 is used as the backbone network, and a parallel dilated convolution module is introduced into a pyramid network structure with cross-layer connections to extract more semantic information. Second, multi-scale feature fusion is performed on the resulting feature maps to learn features at different scales. Finally, text instances are predicted at several kernel sizes, and the predicted text line areas are gradually expanded until the final detection result is obtained. Experimental results show that the model reaches a precision of 88.5%, a recall of 77.0%, and an F1-measure of 81.3% on the SCUT-CTW1500 curved text dataset. Compared with other segmentation-based algorithms, the algorithm detects curved text well and therefore has some application value.

Keywords:	scene text detection; deep learning; feature extraction; multi-scale feature fusion; dilated spatial pyramid
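
The final step in the abstract, predicting text instances at several kernel sizes and growing them by scale expansion, can be sketched as a breadth-first expansion over binary masks. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the kernels are given as nested lists of 0/1 ordered from smallest to largest, uses 4-connectivity, and assigns a contested pixel to whichever instance reaches it first. The names `label_components` and `scale_expand` are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)


def label_components(mask):
    """Label 4-connected regions of a binary mask; 0 marks background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def scale_expand(kernels):
    """Grow instances seeded by the smallest kernel through each larger kernel."""
    labels, count = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for mask in kernels[1:]:
        # Every labelled pixel is a seed for this round of expansion.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and mask[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]  # first arrival keeps the pixel
                    queue.append((ny, nx))
    return labels, count


# Two seeds whose largest kernels touch: the instances stay separate.
kernels = [
    [[0, 1, 0, 0, 0, 1, 0]],
    [[1, 1, 1, 0, 1, 1, 1]],
    [[1, 1, 1, 1, 1, 1, 1]],
]
labels, count = scale_expand(kernels)
print(count, labels)  # 2 [[1, 1, 1, 1, 2, 2, 2]]
```

Seeding from the smallest kernel is what keeps adjacent text lines apart: the largest mask here is a single connected region, yet the result still contains two instances.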
This article is indexed in Wanfang Data and other databases.