
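Seal Text Detection and Recognition Algorithm with Angle Optimization Network
Cite this article: XIAO Jinsheng, ZHAO Tao, XIONG Wenxin, YANG Tian, YAO Weiqing. Seal Text Detection and Recognition Algorithm with Angle Optimization Network[J]. Journal of Electronics & Information Technology, 2021, 43(11): 3327-3334.
Authors: XIAO Jinsheng  ZHAO Tao  XIONG Wenxin  YANG Tian  YAO Weiqing
Affiliations: 1. School of Electronic Information, Wuhan University, Wuhan 430072; 2. State Grid Hubei Information & Telecommunication Company Limited, Wuhan 430077
Funding: National Natural Science Foundation of China (61471272); 2019 Science and Technology Project of State Grid Hubei Electric Power Co., Ltd. (52153318004G)
Abstract: Detecting and recognizing seal text with optical character recognition speeds up the classification and authentication of contracts of all kinds. Because the text on a circular seal is arranged in a ring, this paper preprocesses the seal image with a polar coordinate expansion, which removes the problem of inconsistent character orientation. The text regions of the unwrapped image still undulate up and down, so a Connectionist Text Proposal Network (CTPN) carrying angle information is used to detect them, and a Bezier curve is fitted to each text region, giving accurate detection of the seal text area. Finally, an attention transfer mechanism together with the matching algorithm proposed in the paper recognizes the detected text regions and outputs the seal text. In experiments on a self-made Chinese seal dataset, the F-measure of seal text detection reaches 84.73% and the recall of text recognition reaches 84.4%, which shows that the algorithm detects and recognizes seal content effectively and is of value for research on document classification and authentication.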

Keywords: image processing; seal recognition; recurrent neural network; polar coordinate transformation
Received: 2020-11-30
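
The polar coordinate expansion mentioned in the abstract can be illustrated with a minimal sketch. It assumes the seal centre and the inner and outer radii of the text ring are already known, and it uses nearest-neighbour sampling on a plain nested list. The function name `unwrap_polar` and all of its parameters are illustrative assumptions and are not taken from the paper.

```python
import math


def unwrap_polar(image, center, r_inner, r_outer, out_width=360):
    """Unwrap the ring between r_inner and r_outer into a rectangular strip.

    image   : 2-D list of pixel values, indexed as image[y][x]
    center  : (cx, cy) of the seal, assumed to be known in advance
    Output rows run from r_outer (top row) down towards r_inner;
    output columns sweep the angle over one full turn.
    """
    cx, cy = center
    height = len(image)
    width = len(image[0])
    out_height = r_outer - r_inner
    strip = [[0] * out_width for _ in range(out_height)]
    for row in range(out_height):
        radius = r_outer - row
        for col in range(out_width):
            # In image coordinates (y pointing down) increasing theta
            # moves clockwise, starting from the 3 o'clock position.
            theta = 2.0 * math.pi * col / out_width
            x = int(round(cx + radius * math.cos(theta)))
            y = int(round(cy + radius * math.sin(theta)))
            if 0 <= x < width and 0 <= y < height:
                strip[row][col] = image[y][x]
    return strip


# Tiny synthetic image whose pixel value encodes its own position.
demo = [[y * 21 + x for x in range(21)] for y in range(21)]
strip = unwrap_polar(demo, center=(10, 10), r_inner=5, r_outer=10, out_width=8)
```

Placing the outer radius on the top row keeps characters whose tops point away from the centre upright in the strip. A production pipeline would more likely use an interpolating resampler than nearest-neighbour lookup.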

Seal Text Detection and Recognition Algorithm with Angle Optimization Network
Jinsheng XIAO, Tao ZHAO, Wenxin XIONG, Tian YANG, Weiqing YAO. Seal Text Detection and Recognition Algorithm with Angle Optimization Network[J]. Journal of Electronics & Information Technology, 2021, 43(11): 3327-3334.
Authors:Jinsheng XIAO  Tao ZHAO  Wenxin XIONG  Tian YANG  Weiqing YAO
Affiliation: 1. School of Electronic Information, Wuhan University, Wuhan 430072, China; 2. State Grid Hubei Information & Telecommunication Company Limited, Wuhan 430077, China
Abstract: Detecting and recognizing seal characters with Optical Character Recognition (OCR) methods can speed up the classification and authentication of contracts. Because the characters of a circular seal are arranged in a ring, polar coordinate conversion is used to preprocess the seal image, which overcomes the inconsistent orientation of the characters. A Connectionist Text Proposal Network (CTPN) with angle information is then used to detect the undulating text regions of the unwrapped image, and Bezier curves are fitted to those regions so that the seal text area is located accurately. Finally, an attention mechanism combined with the proposed matching algorithm recognizes the detected text regions and outputs the seal text. In tests on a self-made Chinese seal dataset, the F-measure of seal text detection reaches 84.73% and the recall of character recognition reaches 84.4%. These results show that the algorithm detects and recognizes seal content effectively and is useful for research on document classification and authentication.
Keywords: Image processing; Seal recognition; Recurrent neural network; Polar coordinate transformation
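
The abstract states that Bezier curves are fitted to the detected text regions but gives no details of the procedure. As a hedged illustration only, the sketch below fits a single cubic Bezier curve to an ordered list of 2-D points by least squares, with the two end control points pinned to the first and last points and a chord-length parameterisation. The names `fit_cubic_bezier` and `bezier_point` are hypothetical, and this generic fit is not claimed to be the authors' method.

```python
import math


def fit_cubic_bezier(points):
    """Least-squares cubic Bezier through ordered 2-D points, endpoints fixed."""
    if len(points) < 4:
        raise ValueError("need at least 4 points")
    # Chord-length parameterisation: t grows with distance along the polyline.
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    if dists[-1] == 0.0:
        raise ValueError("points are all identical")
    ts = [d / dists[-1] for d in dists]

    p0, p3 = points[0], points[-1]
    a11 = a12 = a22 = 0.0
    rhs1 = [0.0, 0.0]
    rhs2 = [0.0, 0.0]
    for t, p in zip(ts, points):
        b0 = (1 - t) ** 3
        b1 = 3 * (1 - t) ** 2 * t
        b2 = 3 * (1 - t) * t ** 2
        b3 = t ** 3
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in range(2):
            # Part of the point not explained by the fixed end control points.
            resid = p[k] - b0 * p0[k] - b3 * p3[k]
            rhs1[k] += b1 * resid
            rhs2[k] += b2 * resid

    # Solve the 2x2 normal equations for the two inner control points.
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("degenerate point configuration")
    p1 = tuple((a22 * rhs1[k] - a12 * rhs2[k]) / det for k in range(2))
    p2 = tuple((a11 * rhs2[k] - a12 * rhs1[k]) / det for k in range(2))
    return [(float(p0[0]), float(p0[1])), p1, p2, (float(p3[0]), float(p3[1]))]


def bezier_point(ctrl, t):
    """Evaluate a cubic Bezier with control points ctrl at parameter t."""
    b = [(1 - t) ** 3, 3 * (1 - t) ** 2 * t, 3 * (1 - t) * t ** 2, t ** 3]
    return tuple(sum(w * p[k] for w, p in zip(b, ctrl)) for k in range(2))


# Points on a straight line: the fit should reproduce the line exactly.
ctrl = fit_cubic_bezier([(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)])
mid = bezier_point(ctrl, 0.5)
```

Fixing the end points reduces the fit to a 2x2 linear system per coordinate, which keeps the sketch free of external dependencies. A boundary that undulates several times would need a higher-degree curve or several joined segments.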