首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
3.
The existing skew estimation techniques usually assume that the input image is of high resolution and that the detectable angle range is limited. We present a more generic solution for this task that overcomes these restrictions. Our method is based on determination of the first eigenvector of the data covariance matrix. The solution comprises image resolution reduction, connected component analysis, component classification using a fuzzy approach, and skew estimation. Experiments on a large set of various document images and performance comparison with two Hough transform-based methods show a good accuracy and robustness for our method. Received October 10, 1998 / Revised version September 9, 1999  相似文献   

4.
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removing, word segmentation, and character segmentation and recognition, stemming from the implementation of already existing algorithms as well as novel algorithms. This system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved accuracy that varies from 65.6% to 100% depending on the database and the experiment.  相似文献   

5.
An architecture for handwritten text recognition systems   总被引:1,自引:1,他引:0  
This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced in this paper: (i) pre-processing, which concerns introducing an image representation for easy manipulation of large page images and image handling procedures using the image representation; (ii) line separation, concerning text line detection and extracting images of lines of text from a page image; (iii) word segmentation, which concerns locating word gaps and isolating words from a line of text image obtained efficiently and in an intelligent manner; (iv) word recognition, concerning handwritten word recognition algorithms; and (v) linguistic post-pro- cessing, which concerns the use of linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module, which have been developed for dealing with the diversity of handwriting in its various aspects with a goal of system reliability and robustness, are described in this paper. Preliminary experiments show promising results in terms of speed and accuracy. Received October 30, 1998 / Revised January 15, 1999  相似文献   

6.
A Document Skew Detection Method Using the Hough Transform   总被引:4,自引:0,他引:4  
Document image processing has become an increasingly important technology in the automation of office documentation tasks. Automatic document scanners such as text readers and OCR (Optical Character Recognition) systems are an essential component of systems capable of those tasks. One of the problems in this field is that the document to be read is not always placed correctly on a flatbed scanner. This means that the document may be skewed on the scanner bed, resulting in a skewed image. This skew has a detrimental effect on document on document analysis, document understanding, and character segmentation and recognition. Consequently, detecting the skew of a document image and correcting it are important issues in realising a practical document reader. In this paper we describe a new algorithm for skew detection. We then compare the performance and results of this skew detection algorithm to other publidhed methods form O'Gorman, Hinds, Le, Baird, Posel and Akuyama. Finally, we discuss the theory of skew detection and the different apporaches taken to solve the problem of skew in documents. The skew correction algorithm we propose has been shown to be extremenly fast, with run times averaging under 0.25 CPU seconds to calculate the angle on the DEC 5000/20 workstation. Received: 21 November 1998, Received in revised form: 25 August 1999, Accepted: 20 October 1999  相似文献   

7.
Document representation and its application to page decomposition   总被引:6,自引:0,他引:6  
Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves techniques of thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system greatly depends on the correctness of page segmentation and labeling of different regions such as text, tables, images, drawings, and rulers. We use the traditional bottom-up approach based on the connected component extraction to efficiently implement page segmentation and region identification. A new document model which preserves top-down generation information is proposed based on which a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has a high accuracy and takes approximately 1.4 seconds on a SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler) for a 2550×3300 image of a typical journal page scanned at 300 dpi. This method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise  相似文献   

8.
Extracting curved text lines using local linearity of the text line   总被引:1,自引:0,他引:1  
In order to enhance the ability of document analysis systems, we need a text line extraction method which can handle not only straight text lines but also text lines in various shapes. This paper proposes a new method called Extended Linear Segment Linking (ELSL for short), which is able to extract text lines in arbitrary orientations and curved text lines. We also consider the existence of both horizontally and vertically printed text lines on the same page. The new method can produce text line candidates for multiple orientations. We verify the ability of the method by some experiments as well. Received December 21, 1998 / Revised version September 2, 1999  相似文献   

9.
Document image segmentation is the first step in document image analysis and understanding. One major problem centres on the performance analysis of the evolving segmentation algorithms. The use of a standard document database maintained at the Universities/Research Laboratories helps to solve the problem of getting authentic data sources and other information, but some methodologies have to be used for performance analysis of the segmentation. We describe a new document model in terms of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid and mismatched regions. Received July 14, 2000 / Revised June 12, 2001[-1mm]  相似文献   

10.
Automatic character recognition and image understanding of a given paper document are the main objectives of the computer vision field. For these problems, a basic step is to isolate characters and group words from these isolated characters. In this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation, and density) of characters and propose a characteristic value for classification using the run-length frequency of the image component. In the context of word grouping, previous works have largely been concerned with words which are placed on a horizontal or vertical line. Our word grouping algorithm can group words which are on inclined lines, intersecting lines, and even curved lines. To do this, we introduce the 3D neighborhood graph model which is very useful and efficient for character classification and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D space according to the area of the bounding box and positional information from the document. We conducted tests with more than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental documents. Received August 3, 2001 / Accepted August 8, 2001  相似文献   

11.
为解决朝鲜语古籍数字化中朝汉文种混排字符切分困难的问题,提出一种朝鲜语古籍图像的文字切分算法。针对古籍列与列之间存在不连续间隔线、倾斜或者粘连等问题,提出一种基于连通域投影的列切分方法。利用连通域的删除、合并、拆分等操作对文字进行切分。使用一种多步切分法完成了具有文字大小不一,横向、纵向混合排版特点图像的字符切分工作。对于粘连字,采用改进的滴水算法进行有效切分。实验结果表明所提出的算法能够很好地完成朝、汉文种混排,文字大小不一,排版情况复杂的朝鲜语古籍图像的文字切分工作。该算法的列切分准确率为97.69%,字切分准确率为87.79%。  相似文献   

12.
Transforming paper documents into XML format with WISDOM++   总被引:1,自引:1,他引:0  
The transformation of scanned paper documents to a form suitable for an Internet browser is a complex process that requires solutions to several problems. The application of an OCR to some parts of the document image is only one of the problems. In fact, the generation of documents in HTML format is easier when the layout structure of a page has been extracted by means of a document analysis process. The adoption of an XML format is even better, since it can facilitate the retrieval of documents in the Web. Nevertheless, an effective transformation of paper documents into this format requires further processing steps, namely document image classification and understanding. WISDOM++ is a document processing system that operates in five steps: document analysis, document classification, document understanding, text recognition with an OCR, and transformation into HTML/XML format. The innovative aspects described in the paper are: the preprocessing algorithm, the adaptive page segmentation, the acquisition of block classification rules using techniques from machine learning, the layout analysis based on general layout principles, and a method that uses document layout information for conversion to HTML/XML formats. A benchmarking of the system components implementing these innovative aspects is reported. Received June 15, 2000 / Revised November 7, 2000  相似文献   

13.
A new thresholding method, called the noise attribute thresholding method (NAT), for document image binarization is presented in this paper. This method utilizes the noise attribute features extracted from the images to make the selection of threshold values for image thresholding. These features are based on the properties of noise in the images and are independent of the strength of the signals (objects and background) in the image. A simple noise model is given to explain these noise properties. The NAT method has been applied to the problem of removing text and figures printed on the back of the paper. Conventional global thresholding methods cannot solve this kind of problem satisfactorily. Experimental results show that the NAT method is very effective. Received July 05, 1999 / Revised July 07, 2000  相似文献   

14.
An efficient algorithm is presented in this paper for correcting skew of text lines in scanned document images. In this method, the cross-correlation between two lines in the image with a fixed distance is calculated. The correlation functions for all pairs of lines in the image are accumulated. The shift for which the accumulated cross-correlation function takes the maximum is then used for determining the skew angle. The image is rotated in the opposite direction for skew correction. The correlation function can be calculated without multiplications for binary images, thus the algorithm can be very efficiently implemented. The method can be used directly for gray-scale and color images as well as binary images. It has been tested on scanned document images with good results.  相似文献   

15.
基于视窗的OCR页面图像倾斜检测方法   总被引:2,自引:0,他引:2       下载免费PDF全文
文档在扫描输入过程中,所生成的页面图像一般都存在一定的角度倾斜,当页面图像倾斜角度过大时,将对进一步的版面分析以及字符识别产生不良影响。为了快速准确地检测页面图像倾斜角度和降低计算量,提出了一种基于视窗变换的页面图像倾斜检测方法,该算法首先对视窗中的文字及图片的细节部分进行模糊,然后对其边沿进行直线拟合,以便快速检测页面图像倾斜角度。实验结果表明,该方法能快速准确地检测出各类页面图像的倾斜角度,并具有良好的适应性。  相似文献   

16.
Performance evaluation is crucial for improving the performance of OCR systems. However, this is trivial and sophisticated work to do by hand. Therefore, we have developed an automatic performance evaluation system for a printed Chinese character recognition (PCCR) system. Our system is characterized by using real-world data as test data and automatically obtaining the performance of the PCCR system by comparing the correct text and the recognition result of the document image. In addition, our performance evaluation system also provides some evaluation of performance for the segmentation module, the classification module, and the post-processing module of the PCCR system. For this purpose, a segmentation error-tolerant character-string matching algorithm is proposed to obtain the correspondence between the correct text and the recognition result. The experiments show that our performance evaluation system is an accurate and powerful tool for studying deficiencies in the PCCR system. Although our approach is aimed at the PCCR system, the idea also can be applied to other OCR systems.  相似文献   

17.
Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.  相似文献   

18.
The performance of the algorithms for the extraction of primitives for the interpretation of line drawings is usually affected by the degradation of the information contained in the document due to factors such as low print contrast, defocusing, skew, etc. In this paper, we are proposing two algorithms for the extraction of primitives with good performance under degradation. The application of the algorithms is restricted to line drawings composed of horizontal and vertical lines. The performance of the algorithms has been evaluated by using a protocol described in the literature. Received: 6 August 1996 / Accepted: 16 July 1997  相似文献   

19.
谢凤英  姜志国  汪雷 《计算机应用》2006,26(7):1587-1589
针对扫描背景不定且含有图表信息的复杂文本图像,提出了一种有效的倾斜检测方法。该方法首先通过对梯度图像的统计分析,自适应地选取到了包含文字的特征子区;在特征子区内,论文把文字行间的空白条带看作一条隐含的线,用优化理论计算出空白条带的倾斜角度,这也就是文本的倾斜角度。实验结果表明,该倾斜检测方法不受扫描背景、边界大小、文本布局及行间距等情况的限制,具有速度快、精度高、适应性强的特点。  相似文献   

20.
Dot-matrix text recognition is a difficult problem, especially when characters are broken into several disconnected components. We present a dot-matrix text recognition system which uses the fact that dot-matrix fonts are fixed-pitch, in order to overcome the difficulty of the segmentation process. After finding the most likely pitch of the text, a decision is made as to whether the text is written in a fixed-pitch or proportional font. Fixed-pitch text is segmented using a pitch-based segmentation process that can successfully segment both touching and broken characters. We report performance results for the pitch estimation, fixed-pitch decision and segmentation, and recognition processes. Received October 18, 1999 / Revised April 21, 2000  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号