首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
扫描文档图像纠偏的关键是对图像偏转角度进行快速准确的估计。传统的基于图片自身纹理结构的算法,如Hough变换、Radon变换,不仅易受文档自身特殊结构或噪声影响,而且单幅图像纠偏的平均耗时较长。提出了一种基于低秩矩阵分解理论扫描文档图像的批量纠偏方法,该方法将批量图像构造成一个较大的矩阵,通过迭代对每一列进行适当地旋转,达到矩阵具有较低秩的目的,进而实现对每副图像偏转角度的恰当估计及纠偏。实验结果表明,该方法不仅具有较高纠偏的精度,而且单幅图片的平均耗时也小于现有的图片纠偏算法。  相似文献   

2.
基于视窗的OCR页面图像倾斜检测方法   总被引:2,自引:0,他引:2       下载免费PDF全文
文档在扫描输入过程中,所生成的页面图像一般都存在一定的角度倾斜,当页面图像倾斜角度过大时,将对进一步的版面分析以及字符识别产生不良影响。为了快速准确地检测页面图像倾斜角度和降低计算量,提出了一种基于视窗变换的页面图像倾斜检测方法,该算法首先对视窗中的文字及图片的细节部分进行模糊,然后对其边沿进行直线拟合,以便快速检测页面图像倾斜角度。实验结果表明,该方法能快速准确地检测出各类页面图像的倾斜角度,并具有良好的适应性。  相似文献   

3.
计算机光学乐谱识别技术是将传统的纸质型乐谱转化为计算机能够“读懂”的数字音乐,在计算机音乐领域中具有重要的应用价值、乐谱识别系统的输入是乐谱扫描图像,而扫描过程中出现的图像倾斜现象,会给识别过程中的谱线定位和谱段切割带来诸多困难,必须对图像作有效的倾斜校正以保证系统的性能。为此,提出了一种快速的乐谱图像倾角检测方法。该方法首先利用乐谱文档的自身结构特点,对图像进行预处理,滤除乐谱图像中不具备方向性的干扰像素,然后通过多组图像水平投影队列间的交叉相关性计算对倾角进行检测。其特点是在确保检测倾角精度的同时具有非常高的执行效率。实验结果表明这一方法是有效、实用的。  相似文献   

4.
一种改进的中文文档图像倾斜检测方法   总被引:4,自引:0,他引:4  
孙楠  刘志文 《计算机仿真》2006,23(9):184-187
图像获取设备将纸质文档转换为文档图像时,经常会使文档图像出现某种程度的倾斜,从而可能使后续的文档版面理解和OCR识别算法失败。文中提出一种基于近邻法的中文图像的倾斜角度检测方法,并采用最小二乘法减小倾斜估计的误差,从而大大优化了运算速度,增强了算法的鲁棒性,与现有方法相比,具有运算速度快,检测精度高的优势。算法在Visual C++下编程加以实现,通过对检测库中100幅倾斜中文文档图像的检测证明,该方法具有精度高和适应性强的特点。  相似文献   

5.
在对文档图像进行光学字符识别时,由于书籍扭曲的存在,识别率会降低。对于 含有页眉页脚线的扭曲文档图像,提出一种快速校正方法。首先分别检测并定位图像中的页眉 线,保存页眉线的坐标信息。根据等比算法计算页眉线上各点在校正时所需向上或向下移动的 距离,然后以此距离为参数扫描图像,计算页眉页脚线之间的各个目标像素校正所需移动的距 离,同时进行像素点的移动重构图像,最终得到校正的图像。实验结果表明,该方法校正效果明显, 对于包含页眉页脚线的扭曲文档图像有较好的校正效果,校正后OCR 识别率大幅度提高。  相似文献   

6.
针对文本图像倾斜检测的问题, 提出了一种新的基于几何约束的文本图像倾斜角自动检测算法。该算法采用边界标记自动机的方法对一组同行字符轮廓进行检测从而得到该组字符轮廓的最低点信息, 再用矩的方法剔除噪声字符, 并确定页面的倾斜角度。实验结果表明, 该算法在检测效率与准确率上都有了明显的提高, 同时在处理较大倾斜角和较少字符数目的倾斜检测中也有较好的执行效率。因此, 该算法可广泛应用于包括英文、中文、日文在内的多种语言文本图像的倾斜检测中。  相似文献   

7.
谢凤英  姜志国  汪雷 《计算机应用》2006,26(7):1587-1589
针对扫描背景不定且含有图表信息的复杂文本图像,提出了一种有效的倾斜检测方法。该方法首先通过对梯度图像的统计分析,自适应地选取到了包含文字的特征子区;在特征子区内,论文把文字行间的空白条带看作一条隐含的线,用优化理论计算出空白条带的倾斜角度,这也就是文本的倾斜角度。实验结果表明,该倾斜检测方法不受扫描背景、边界大小、文本布局及行间距等情况的限制,具有速度快、精度高、适应性强的特点。  相似文献   

8.
Document representation and its application to page decomposition   总被引:6,自引:0,他引:6  
Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves techniques of thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system greatly depends on the correctness of page segmentation and labeling of different regions such as text, tables, images, drawings, and rulers. We use the traditional bottom-up approach based on the connected component extraction to efficiently implement page segmentation and region identification. A new document model which preserves top-down generation information is proposed based on which a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has a high accuracy and takes approximately 1.4 seconds on a SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler) for a 2550×3300 image of a typical journal page scanned at 300 dpi. This method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise  相似文献   

9.
一种基于Hough变换的文档图像倾斜纠正方法   总被引:10,自引:2,他引:8  
李政  杨扬  颉斌  王宏 《计算机应用》2005,25(3):583-585
在对文本扫描输入的过程中,文本图像不可避免地会发生倾斜,倾斜校正将为图文分割、文字识别等后续处理工作创造良好的条件。提出了一种基于Hough变换的检测图像倾斜度的方法,为了克服Hough变换计算量大的缺点,该方法首先选取局部代表性子区域并提取其图像水平边缘,然后对提取的水平边缘进行两级Hough变换,从而实现了准确性与快速性的很好结合。  相似文献   

10.
目的 在光学字符识别中,文本图像经常会出现一定角度的倾斜.为将倾斜的文本图像校正,以便于字符识别中的后续处理,快速准确地检测倾斜文本图像的倾角是非常重要的.方法 对基于投影轮廓的算法进行改进,提出了一种两级投影直方图方差的算法(TPHV).首先在预定的角度范围内以一定角度步长对选定的图像区域做多方向投影,以获取投影直方图;然后计算各角度投影直方图的均方差,求出所有投影直方图方差的最大差分,将对应的投影角度作为倾角的粗略估值,最后以粗略估值为中心,以第1次投影步长为半径的角度范围内,再次以给定的检测精度为步长进行投影,重复第1次投影的工作,求出投影直方图方差的最大值,以对应的角度作为图像倾角的检测值.结果 该算法能够处理各种复杂的文本图像;对于诸如2 480×3 508像素的较大图像,可在200 ms左右的时间内完成倾角的检测;可检测的倾角范围不受限制;对相关网站提供的5组共500幅测试图像检测误差绝对值均值不超过0.5°,最大值不超过0.7°,检测误差的方差不超过0.1.结论 实验结果表明,该算法具有明显优势:速度快,倾斜角度检测精度高,误差集中,检测范围大,对噪声不敏感,具有广泛的适用性,适合于复杂的排版方式.  相似文献   

11.
在对文本扫描输入的过程中,文本图像不可避免地会发生倾斜,倾斜校正将为图文分割、文字识别等后续处理工作创造良好的条件。基于可变模板技术,提出一种新的倾斜检测方法。在构造表征扫描文本的可变模板和定义合适的能量函数的基础上,采用遗传算法进行快速优化搜索。实验结果表明,该方法能够精确检测文本的倾斜角度,并且不受倾角大小和文字方向的影响,具有较强的抗噪声性和较快的收敛速度。  相似文献   

12.
Character groundtruth for real, scanned document images is crucial for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not practical because (i) accuracy in delineating groundtruth character bounding boxes is not high enough, (ii) it is extremely laborious and time consuming, and (iii) the manual labor required for this task is prohibitively expensive. Ee describe a closed-loop methodology for collecting very accurate groundtruth for scanned documents. We first create ideal documents using a typesetting language. Next we create the groundtruth for the ideal document. The ideal document is then printed, photocopied and then scanned. A registration algorithm estimates the global geometric transformation and then performs a robust local bitmap match to register the ideal document image to the scanned document image. Finally, groundtruth associated with the ideal document image is transformed using the estimated geometric transformation to create the groundtruth for the scanned document image. This methodology is very general and can be used for creating groundtruth for documents in typeset in any language, layout, font, and style. We have demonstrated the method by generating groundtruth for English, Hindi, and FAX document images. The cost of creating groundtruth using our methodology is minimal. If character, word or zone groundtruth is available for any real document, the registration algorithm can be used to generate the corresponding groundtruth for a rescanned version of the document  相似文献   

13.
We present here an enhanced algorithm (e-PCP) for skew detection in scanned documents, based on the work on Piecewise Covering by Parallelogram (PCP) for robust determination of skew angles [C.-H. Chou, S.-Y. Chu, F. Chang, Estimation of skew angles for scanned documents based on piecewise covering by parallelograms, Pattern Recognition 40 (2007) 443-455]. Our algorithm achieves even better robustness for detection of skew angle than the original PCP algorithm. We have shown accurate determination of skew angles in document images where the original PCP algorithm fails. Further, the increased robustness of performance is achieved with reduced number of computation compared to the originally proposed PCP algorithm. The e-PCP algorithm also outputs a confidence measure which is important in automated systems to filter cases where the estimated skew angle may not be very accurate and thus can be handled by manual intervention. The proposed algorithm was tested extensively on all categories of real time documents and comparisons with PCP method is also provided. Useful details regarding faster execution of the proposed algorithm is provided in Appendix.  相似文献   

14.
一种快速的文本倾斜检测方法   总被引:2,自引:0,他引:2  
文本的倾斜检测是将文本转换成数字形式的过程中的第一步工作,也是很重要的一步工作。因为后续的很多工作都是基于摆正的文本。文章提出了一种全新的倾斜检测与纠正方法。其特点在于:一、与文本的纹理无关,从而适应各种图文混排及各种书写方向并存等复杂情形;二、运算量小,只需进行一次旋转和四次对图像的部分投影。  相似文献   

15.
Skew estimation and page segmentation are the two closely related processing stages for document image analysis. Skew estimation needs proper page segmentation, especially for document images with multiple skews that are common in scanned images from thick bound publications in 2-up style or postal envelopes with various printed labels. Even if only a single skew is concerned for a document image, the presence of minority regions of different skews or undefined skew such as noise may severely affect the estimation for the dominant skew. Page segmentation, on the other hand, may need to know the exact skew angle of a page in order to work properly. This paper presents a skew estimation method with built-in skew-independent segmentation functionality that is capable of handling document images with multiple regions of different skews. It is based on the convex hulls of the individual components (i.e. the smallest convex polygon that fully contains a component) and that of the component groups (i.e. the smallest convex polygon that fully contain all the components in a group) in a document image. The proposed method first extracts the convex hulls of the components, segments an image into groups of components according to both the spatial distances and size similarities among the convex hulls of the components. This process not only extracts the hints of the alignments of the text groups, but also separate noise or graphical components from that of the textual ones. To verify the proposed algorithms, the full sets of the real and the synthetic samples of the University of Washington English Document Image Database I (UW-I) are used. Quantitative and qualitative comparisons with some existing methods are also provided.  相似文献   

16.
The digitalization processes of documents produce frequently images with small rotation angles. The skew angles in document images degrade the performance of optical character recognition (OCR) tools. Therefore, skew detection of document images plays an important role in automatic document analysis systems. In this paper, we propose a Rectangular Active Contour Model (RAC Model) for content region detection and skew angle calculation by imposing a rectangular shape constraint on the zero-level set in Chan–Vese Model (C-V Model) according to the rectangular feature of content regions in document images. Our algorithm differs from other skew detection methods in that it does not rely on local image features. Instead, it uses global image features and shape constraint to obtain a strong robustness in detecting skew angles of document images. We experimented on different types of document images. Comparing the results with other skew detection algorithms, our algorithm is more accurate in detecting the skews of the complex document images with different fonts, tables, illustrations, and layouts. We do not need to pre-process the original image, even if it is noisy, and at the same time the rectangular content region of a document image is also detected.  相似文献   

17.
归纳现有的多种倾斜字符校正方法,提出一种基于Radon变换用于一般字符校正的方法。该方法对文本图像做简单的数学形态学处理,利用Radon变换求出文本倾斜的角度。实验结果表明:与传统方法相比,本算法减少了计算量,提高了校正效率,且具有较强的适用性和鲁棒性。  相似文献   

18.
Among the most commonly used compression algorithms for document images are those defined by the Consultative Committee for International Telephone and Telegraph (CCITT). CCITT Group III compression is used in all facsimile transmission by modem over analog telephone lines. CCITT Group IV is used in digital transmission and storage of document images. Sufficient readily interpretable spatial information exists in these compressed document images to enable their characterization. In particular, it is possible to locate the positions of the bottoms of both black and white structures. Using the bottoms of black structures we can determine the peak strength of their alignment in order to determine the dominant skew angle of the image. This method can be expanded, by finding minor peaks, to identify multiple skew angles in single images. The angular distributions of the peak alignments of both white and black structures are assembled to form an alignment signature. Logotypes can be designed which generate distinct alignment signatures that are detectable in the compressed representation.  相似文献   

19.
文本页面图像的图文分割与分类算法   总被引:2,自引:0,他引:2       下载免费PDF全文
为了能对包含不规则图片区和表格的倾斜文本页面图像进行图文分割与分类,提出了一种新的图文分割和分类算法。该算法先采用数学形态学和分级霍夫变换来进行文本倾斜的检测和校正;然后为了使算法能够对包含不规则图片区的文本页面图像进行处理,提出在传统的投影轮廓切割算法中,引入中点切割的过程,以便利用一系列的矩形来近似地逼近不规则的图片区。对于分割后的图像,则提出利用黑白像素比(Rbw)和近邻像素间的交叉相关性(Rcc)两个特征来作为分类的判据。实验结果证明,算法速度快、可靠性高。该算法只适用于二值图像。  相似文献   

20.
A new signal processing method is developed for estimating the skew angle in text document images. Detection of the skew angle is an important step in text processing tasks such as optical character recognition (OCR) and computerized filing. Based on a recently introduced multiline-fitting algorithm, the proposed method reformulates the skew detection problem into a special parameter-estimation framework such that a signal structure similar to the one in the field of sensor array processing is obtained. In this framework, straight lines in an image are modeled as wavefronts of propagating planar waves. Certain measurements are defined in this virtual propagation environment such that the large amount of coherency that exists between the locations of the pixels on parallel lines is exploited to enhance a subspace in the space spanned by the measurements. The well-studied techniques of sensor array processing (e.g., the ESPRIT algorithm) are then exploited to produce a closed form and high-resolution estimate for the skew angle.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号