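Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics
Cite as: Cui Wencheng, Ren Lei, Liu Yang, Shao Hong. Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics[J]. Journal of Data Acquisition & Processing, 2017, 32(1): 119-125
Authors: Cui Wencheng, Ren Lei, Liu Yang, Shao Hong
Affiliation: School of Information Science and Engineering, Shenyang University of Technology, Shenyang, 110870
Abstract: Because of interference such as seal overlap and invoice creases, noise adhesion regions appear in the number area of some invoices and prevent the invoice number from being segmented correctly. To address this problem, a noise adhesion region repair algorithm is proposed, which effectively avoids the impact of such regions on digit segmentation. Based on the font structure and characteristics of ordinary invoice numbers, an invoice number recognition algorithm based on digit structure features is proposed. First, the digit structure features are defined, comprising 4 kinds of filled region, 2 kinds of character crossing count, and 4 kinds of hollow region, which form a 10-dimensional feature vector for the digit to be recognized. The vector is then matched against the template features of the digits in a standard template library, and the digit with the minimum distance is taken as the recognition result. The proposed method is compared with a printed digit recognition method based on improved left and right contour features. The experimental results show that the proposed algorithm has higher accuracy, faster recognition, and stronger robustness to noise.

Keywords: invoice number recognition; noise adhesion region; digit structure features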

Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics
Cui Wencheng,Ren Lei,Liu Yang,Shao Hong. Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics[J]. Journal of Data Acquisition & Processing, 2017, 32(1): 119-125
Authors: Cui Wencheng, Ren Lei, Liu Yang, Shao Hong
Affiliation: School of Information Science and Engineering, Shenyang University of Technology, Shenyang, 110870, China
Abstract: Interference such as seal overlap and creases causes noise adhesion in the number region of some invoices, which prevents the invoice number from being segmented correctly. To address this problem, a noise adhesion region repair algorithm is proposed. In addition, based on the font structure and characteristics of ordinary invoice numbers, an invoice number recognition algorithm based on digit structure features is proposed. First, the digit structure features are defined: four kinds of filled region, two kinds of character crossing count, and four kinds of hollow region, which together form a 10-dimensional feature vector for the digit to be recognized. The feature vector is then matched against the template features in a standard template library by Euclidean distance, and the digit with the minimum distance is taken as the recognition result. The proposed method is compared with a printed digit recognition method based on improved left and right contour features. Experimental results indicate that the proposed algorithm achieves higher accuracy, faster recognition, and stronger robustness to noise.
Keywords: invoice number recognition; noise adhesion region; digit structure features
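
The matching step described in the abstract (take the digit whose template has the minimum Euclidean distance to the feature vector) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `crossing_count` and `nearest_template` are hypothetical, the crossing count shown is one common definition (background-to-foreground transitions along a scan line) and may differ from the two kinds defined in the paper, and the template values are placeholders rather than data from the paper.

```python
import math
from typing import Dict, Sequence


def crossing_count(scan_line: Sequence[int]) -> int:
    """Count background-to-foreground transitions along one scan line.

    Assumes a binarized image where 1 is ink and 0 is background.
    This definition is an assumption; the paper's may differ.
    """
    count = 0
    previous = 0  # pixels outside the image are treated as background
    for pixel in scan_line:
        if pixel == 1 and previous == 0:
            count += 1
        previous = pixel
    return count


def nearest_template(
    features: Sequence[float],
    templates: Dict[int, Sequence[float]],
) -> int:
    """Return the digit whose template vector is closest in Euclidean distance."""
    return min(templates, key=lambda digit: math.dist(features, templates[digit]))


# Placeholder 10-dimensional templates (illustrative values only).
TEMPLATES = {
    0: [0.2, 0.2, 0.2, 0.2, 2, 2, 0.5, 0.0, 0.0, 0.0],
    1: [0.1, 0.4, 0.1, 0.4, 1, 1, 0.0, 0.0, 0.0, 0.0],
    8: [0.2, 0.2, 0.2, 0.2, 2, 2, 0.3, 0.3, 0.0, 0.0],
}

if __name__ == "__main__":
    sample = [0.1, 0.4, 0.1, 0.3, 1, 1, 0.0, 0.0, 0.0, 0.0]
    print(nearest_template(sample, TEMPLATES))
    print(crossing_count([0, 1, 1, 0, 1, 0]))
```

In practice the ten components would come from the filled-region, crossing-count, and hollow-region measurements of a segmented digit. If the components have very different ranges, some normalization before computing the distance would likely be needed so that no single feature dominates.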