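A Form Frame Line Removal Algorithm Based on Gray-Level Images
Cite this article:Zhang Chongyang, Chen Qiang, Lou Zhen, Yang Jingyu. A Form Frame Line Removal Algorithm Based on Gray-Level Images[J]. Journal of Computer Research and Development, 2005, 42(4): 635-639.
Authors:Zhang Chongyang  Chen Qiang  Lou Zhen  Yang Jingyu
Affiliations:1. Department of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094; CVIC Software Engineering Co., Ltd., Jinan 250014
2. Department of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094
Funding:Research Fund for the Doctoral Program of Higher Education (20020288013)
Abstract:Overlap between character strokes and form frame lines is common in form-type documents and seriously degrades the performance of automatic document processing systems. Most existing line removal algorithms work on binary images, in which much useful local information has already been lost. This paper proposes a gray-level line detection and removal algorithm that uses the gray-level information of the image directly. First, edge features of the image are used to detect the lines and the positions where characters and lines intersect. Next, the pairs of intersection points on each line are analyzed to determine how characters and lines overlap, and the overlap patterns are reduced to two simple forms: crossing and non-crossing. Finally, each line is divided into protected zones and erase zones: pixels in the protected zones are kept during line removal, while pixels in the erase zones are erased with a gray-scale morphological algorithm. Experiments on checks currently in use in China show that the algorithm is effective.

Keywords:document processing  form processing  line detection  line removal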
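
The final step described in the abstract above (keep protected zones, erase the rest with gray-scale morphology) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a dark horizontal line on a light background, assumes the line rows and the protected columns are already known, and uses a gray-scale closing with a vertical structuring element that is longer than the line is thick. The names `close_column`, `remove_line`, `line_rows`, and `protected_cols` are hypothetical.

```python
def _dilate(col, k):
    # Gray-scale dilation: sliding maximum over a window of length k.
    h, n = k // 2, len(col)
    return [max(col[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]

def _erode(col, k):
    # Gray-scale erosion: sliding minimum over a window of length k.
    h, n = k // 2, len(col)
    return [min(col[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]

def close_column(col, k):
    # Closing = dilation followed by erosion; it fills dark runs shorter than k.
    return _erode(_dilate(col, k), k)

def remove_line(img, line_rows, protected_cols, k):
    """Erase a horizontal line except at protected columns (assumed inputs)."""
    out = [row[:] for row in img]          # work on a copy
    height, width = len(img), len(img[0])
    for x in range(width):
        if x in protected_cols:            # protected zone: keep stroke pixels
            continue
        closed = close_column([img[y][x] for y in range(height)], k)
        for y in line_rows:                # erase zone: overwrite line pixels only
            out[y][x] = closed[y]
    return out

# Toy image: background 200, a 2-pixel-thick line (value 50) on rows 3-4,
# and a vertical stroke (value 20) in column 2 that crosses the line.
img = [[200] * 6 for _ in range(8)]
for y in (3, 4):
    img[y] = [50] * 6
for y in range(1, 7):
    img[y][2] = 20

cleaned = remove_line(img, line_rows=[3, 4], protected_cols={2}, k=3)
```

In this toy case the line pixels outside column 2 are restored to the background level, while the stroke pixels in column 2 are left untouched. A real system would derive `protected_cols` from the intersection analysis rather than pass them in by hand.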

A Form Frame Line Removal Algorithm Based on Gray-Level Image
Zhang Chongyang, Chen Qiang, Lou Zhen, Yang Jingyu. A Form Frame Line Removal Algorithm Based on Gray-Level Image[J]. Journal of Computer Research and Development, 2005, 42(4): 635-639.
Authors:Zhang Chongyang  Chen Qiang  Lou Zhen  Yang Jingyu
Affiliation:Zhang Chongyang 1,2, Chen Qiang 1, Lou Zhen 1, and Yang Jingyu 1 1
Abstract:Preprocessing is an important step in a document image analysis (DIA) system. In practical document images, characters often overlap the preprinted form frames, which creates serious problems for recognition engines. Most form frame line removal algorithms are based on bi-level images, which have lost much useful information during binarization. This paper proposes a line removal algorithm that works directly on gray-level images. First, the cross-points of characters and lines are detected using the Sobel gradient. Then the overlapping types of characters and lines are converted into a touch type or a crossover type by cross-point analysis. Finally, lines are removed with a topological method. Experimental results on 1225 real-life character string images demonstrate the effectiveness of this algorithm: the recognition rate improves from 75.9% to 91.4%.
Keywords:document processing  form processing  line detection  line removal
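
The first step in the abstract above (detecting lines and cross-points with the Sobel gradient) can be sketched as follows. This is a simplified illustration under stated assumptions, not the published method: it assumes a single dark horizontal line, takes rows with a strong summed vertical Sobel response as the line edges, and treats columns where that response weakens along an edge row as candidate cross-points. The function names and the `frac` parameters are hypothetical.

```python
def sobel_gy(img):
    # Vertical Sobel derivative (kernel rows: -1 -2 -1 / 0 0 0 / 1 2 1).
    # Border pixels are left at 0.
    h, w = len(img), len(img[0])
    gy = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        up, dn = img[y - 1], img[y + 1]
        for x in range(1, w - 1):
            gy[y][x] = (dn[x - 1] + 2 * dn[x] + dn[x + 1]) \
                     - (up[x - 1] + 2 * up[x] + up[x + 1])
    return gy

def edge_rows(gy, frac=0.5):
    # Rows whose summed |Gy| is at least frac of the strongest row
    # (assumed heuristic for locating the edges of a horizontal line).
    strength = [sum(abs(v) for v in row) for row in gy]
    peak = max(strength)
    if peak == 0:
        return []
    return [y for y, s in enumerate(strength) if s >= frac * peak]

def crossing_cols(gy, row, frac=0.6):
    # Columns on an edge row where |Gy| drops below frac of the row maximum,
    # i.e. where something dark continues across the line edge.
    mags = [abs(v) for v in gy[row]]
    peak = max(mags)
    return [x for x in range(1, len(mags) - 1) if mags[x] < frac * peak]

# Toy image: background 200, a 1-pixel line (value 50) on row 4,
# and a vertical stroke (value 20) in column 4 crossing it.
img = [[200] * 9 for _ in range(9)]
img[4] = [50] * 9
for y in range(1, 8):
    img[y][4] = 20

gy = sobel_gy(img)
rows = edge_rows(gy)
cols = crossing_cols(gy, rows[0])
```

Here `rows` holds the two rows bordering the line and `cols` holds the column where the stroke passes through. On real gray-level scans the thresholds would need tuning, and skewed or broken lines would need additional handling that this sketch does not attempt.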
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.