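Complex Scene Text Detection Based on Attention Mechanism
Cite this article: LIU Yan, WEN Jing. Complex Scene Text Detection Based on Attention Mechanism[J]. Computer Science, 2020, 47(7): 135-140.
Authors: LIU Yan; WEN Jing
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China (both authors)
Funding: Yunnan Province Applied Basic Research Program; Shanxi Province Engineering Project; National Natural Science Foundation of China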
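Abstract: Most traditional text detection methods follow a bottom-up pipeline: they usually start with the detection of low-level characters or strokes, then filter out non-text components, construct text lines, and verify them. In complex scenes, the shape, scale, and layout of text and its surroundings vary drastically, and the human visual system completes the text detection task at different visual granularities. The performance of these bottom-up methods, however, depends largely on the detection of low-level features, so they struggle to adapt robustly to text features at different granularities. In recent years, deep learning methods have been applied to text detection to preserve text features at different resolutions. Existing methods, however, do not explicitly emphasize key feature information when extracting features at each layer of the network, and information is lost in the feature mapping between layers. As a result, some non-text objects are misclassified, which makes detection time-consuming and produces many false detections and missed detections. To address this, the paper proposes a complex scene text detection method based on the attention mechanism. Its main contribution is a visual attention layer introduced into VGG16, which uses the attention mechanism at fine granularity to enhance the salient information within the network's global information. Experiments show that, in an Ubuntu environment with a GPU, the method preserves the integrity of text regions when detecting text in complex scene images, reduces fragmentation of the detected regions, and achieves up to 87% recall and 89% precision.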

Keywords: text detection; deep learning; attention mechanism
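
The abstract reports its results as recall and precision. As a reminder of how these two metrics are defined for a detector, here is a minimal sketch. The function name `precision_recall` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own matching protocol for deciding what counts as a correct detection is not described in this record.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = correct detections / all detections
    recall    = correct detections / all ground-truth text regions
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing is detected or annotated.
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, not from the paper: 8 correct boxes,
# 2 spurious boxes (false detections), 2 missed text regions.
p, r = precision_recall(true_positives=8, false_positives=2, false_negatives=2)
```

In these terms, the false detections mentioned in the abstract lower precision, while missed detections lower recall.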

Complex Scene Text Detection Based on Attention Mechanism
LIU Yan, WEN Jing. Complex Scene Text Detection Based on Attention Mechanism[J]. Computer Science, 2020, 47(7): 135-140.
Authors: LIU Yan; WEN Jing
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Abstract: Most traditional text detection methods work in a bottom-up manner: they usually start with low-level character or stroke detection, followed by non-text component filtering, text line construction, and text line verification. However, the shape, scale, layout, and surroundings of text in complex scenes change drastically, and humans carry out text detection at a variety of visual granularities. Because these bottom-up methods depend on low-level features, it is difficult for them to retain text features at different resolutions. Recently, deep learning methods have been widely used in text detection to extract more features at different scales. In existing methods, however, key feature information is not emphasized during feature extraction at each layer and is lost in the layer-to-layer feature mapping. This missing information leads to many false alarms and missed detections and makes detection more time-consuming. This paper proposes a complex scene text detection method based on the attention mechanism. Its main contribution is to introduce a visual attention layer into VGG16 and to use the attention mechanism to enhance the salient information within the global information in the network. Experiments show that, in an Ubuntu environment with a GPU, the method preserves the integrity of text regions when detecting text in complex scene images, reduces fragmentation of the detected regions, and achieves up to 87% recall and 89% precision.
Keywords: text detection; deep learning; attention mechanism
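
The abstract describes a visual attention layer that enhances salient information in the network's feature maps, but this record does not specify the layer's design. The sketch below therefore shows only a generic spatial attention step on a single feature map, written with NumPy. The function name `spatial_attention`, the channel-mean scoring, and the residual reweighting are all assumptions made for illustration; this is not the authors' layer and it is not wired into VGG16.

```python
import numpy as np


def spatial_attention(features):
    """Generic spatial attention over a (channels, height, width) feature map.

    Illustrative assumption only: scores are the channel-wise mean,
    normalised with a softmax over all spatial locations.
    Returns the reweighted features and the (height, width) attention map.
    """
    # Score each spatial location by its mean activation across channels.
    scores = features.mean(axis=0)
    # Softmax over all locations, shifted by the max for numerical stability.
    exp_scores = np.exp(scores - scores.max())
    attention = exp_scores / exp_scores.sum()
    # Residual reweighting: keep the original features and add an
    # attention-scaled copy, so salient locations are boosted, not isolated.
    enhanced = features * (1.0 + attention)
    return enhanced, attention


# Usage with random data standing in for a convolutional feature map.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((4, 5, 5))
feature_map[:, 2, 3] += 5.0  # make one location clearly salient
enhanced, attention = spatial_attention(feature_map)
```

The residual form keeps every location's original response, which matches the abstract's stated goal of enhancing salient information rather than discarding the rest; a learned attention layer would replace the fixed channel-mean scoring with trainable weights.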
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).