The large volume of mail and the increased cost of handling it have made postal automation an important domain for pattern recognition and computer vision research. A substantial amount of work is being done to design an automatic mail sorting system that can read and interpret the destination address on a mail piece and direct it to the appropriate bin. Robust optical character recognition (OCR) systems are now available that can read printed characters with great accuracy (> 99%). However, in order to read the destination address, the region in the image containing the address must first be located. Even though several approaches to address block location have been proposed in the literature, it remains a difficult problem. A simple method is presented for automatically identifying regions in envelope images that are candidates for being the destination address. The envelope image is considered to contain different textured regions, one of which corresponds to the text content in the image. Thus, a texture-based segmentation method is used to identify the regions of text in the image. The method for texture discrimination is based on Gabor filters, which have been used successfully for a variety of texture classification and segmentation tasks. It is shown that only a small number of even-symmetric Gabor filters are needed in this application. The success of the texture-based segmentation algorithm for identifying address blocks is demonstrated on a number of test images. These results also demonstrate the invariance of the method to the orientation of text in the envelope image and to variations in the size and font of the text.
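
As a rough illustration of the even-symmetric Gabor filtering mentioned in this abstract, the sketch below builds a cosine-phase Gabor kernel and measures the mean absolute response on a synthetic striped image. It is not the authors' implementation: the function names, the isotropic Gaussian envelope, the kernel size, and the stripe image are all assumptions chosen for brevity.

```python
import math

def even_gabor_kernel(size, freq, theta, sigma):
    """Even-symmetric (cosine-phase) Gabor kernel as a list of rows.

    Assumption: an isotropic Gaussian envelope with standard deviation sigma.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the cosine varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        kernel.append(row)
    return kernel

def filter_energy(image, kernel):
    """Mean absolute correlation response over valid positions (no padding)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            resp = sum(kernel[a][b] * image[i + a][j + b]
                       for a in range(k) for b in range(k))
            total += abs(resp)
            count += 1
    return total / count

# Hypothetical test pattern: vertical stripes with a period of 4 pixels.
stripes = [[1.0 if (j % 4) < 2 else 0.0 for j in range(16)] for _ in range(16)]
tuned = even_gabor_kernel(9, 0.25, 0.0, 2.0)              # matches the stripes
orthogonal = even_gabor_kernel(9, 0.25, math.pi / 2, 2.0)  # rotated by 90 degrees
print(filter_energy(stripes, tuned), filter_energy(stripes, orthogonal))
```

The filter tuned to the stripe frequency and orientation responds much more strongly than the rotated one, which is the property a texture-based segmenter would threshold or cluster on.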

This study presents a new method, the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The proposed approach first decomposes the document image into distinct object planes in order to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This process consists of two stages: localized histogram multilevel thresholding, and multi-plane region matching and assembling. A text extraction procedure is then applied to the resulting planes to detect and extract textual objects with different characteristics in the respective planes. The proposed approach processes document images regionally and adaptively according to their local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes and characters with gradational illumination, are well preserved. This also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled well. Experimental results on real-life complex document images demonstrate that the proposed approach is effective in extracting textual objects with various illuminations, sizes, and font styles from various types of complex document images.

Uncertainties exist in any image segmentation problem. They arise from gray-level and spatial ambiguities in an image. As a result, accurately segmenting text regions from non-text regions (graphics/images) in mixed and complex documents is a fairly difficult problem. In this paper, we propose a novel text region segmentation method based on the digital shearlet transform (DST). The method is capable of handling the uncertainties arising in the segmentation process. To capture the anisotropic features of the text regions, the proposed method uses the DST coefficients as input features to a segmentation block. This block is designed using the neutrosophic set (NS) to manage the uncertainty in the process. The proposed method is verified extensively in experiments, and its performance is compared both quantitatively and qualitatively with that of some state-of-the-art techniques on a benchmark dataset.

Document layout analysis, or page segmentation, is the task of decomposing document images into many different regions such as texts, images, separators, and tables. It is still a challenging problem due to the variety of document layouts. In this paper, we propose a novel hybrid method with three main stages to deal with this problem. In the first stage, text and non-text elements are classified using a minimum homogeneity algorithm, which combines connected component analysis with a multilevel homogeneity structure. In the second stage, a new homogeneity structure is combined with adaptive mathematical morphology in the text document to obtain a set of text regions. In addition, the non-text elements in the non-text document are classified further to obtain separator regions, table regions, image regions, and so on. In the final stage, a region refinement and noise detection process, all regions in both the text document and the non-text document are refined to eliminate noise and obtain the geometric layout of each region. The proposed method has been tested on the dataset of the ICDAR2009 page segmentation competition and on many other databases in different languages. The results of these tests show that the proposed method achieves higher accuracy than other methods, which demonstrates its effectiveness.

Document segmentation is a process that aims to filter documents while identifying certain regions of interest. Generally, the regions of interest include text, graphics (image-occupied regions), and the background. This paper presents a novel top-bottom approach to document segmentation using texture features extracted from the selected documents. A mask of suitable size is used to summarize textural features, and statistical parameters are captured as blocks in document images. Four textural features are extracted from the masks using the gray level co-occurrence matrix (GLCM): entropy, contrast, energy, and homogeneity. In addition, two statistical parameters are extracted from the corresponding masks: the modal and median pixel values. The extracted attributes allow each mask or block to be classified as text, graphics, or background. A feedforward network is trained on the six extracted attributes, using documents obtained from a public database; an error rate of 15.77% is achieved. It is shown that this approach produces promising performance in segmenting documents, and it is expected to be efficient for content-based information retrieval systems. Detection of duplicate documents within large databases is another potential area of application.
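
The four GLCM features named in this abstract can be sketched as follows for a single block. This is an illustrative sketch only: the abstract does not give the pixel offset or the exact feature formulas, so the horizontal offset `(dx=1, dy=0)`, the base-2 entropy, energy as the sum of squared probabilities, and homogeneity as `p / (1 + |i - j|)` are assumptions, as is the function name `glcm_features`.

```python
import math
from collections import Counter

def glcm_features(block, dx=1, dy=0):
    """Entropy, contrast, energy and homogeneity of one gray-level block.

    Assumptions: a single pixel offset (dx, dy) and a non-symmetric,
    normalized co-occurrence matrix stored sparsely in a Counter.
    """
    h, w = len(block), len(block[0])
    counts = Counter()
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[(block[y][x], block[y2][x2])] += 1
    total = sum(counts.values())
    feats = {"entropy": 0.0, "contrast": 0.0, "energy": 0.0, "homogeneity": 0.0}
    for (i, j), n in counts.items():
        p = n / total  # joint probability of gray levels i and j co-occurring
        feats["entropy"] -= p * math.log2(p)
        feats["contrast"] += p * (i - j) ** 2
        feats["energy"] += p * p
        feats["homogeneity"] += p / (1.0 + abs(i - j))
    return feats

# Hypothetical blocks: a flat background patch and a checkerboard texture.
flat = [[5] * 4 for _ in range(4)]
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
print(glcm_features(flat))
print(glcm_features(checker))
```

A flat block yields zero entropy and contrast with maximal energy and homogeneity, while a busier block moves every feature in the opposite direction; feature vectors of this kind are what a block classifier would be trained on.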

Marginal noise is a common phenomenon in document analysis that results from scanning thick or skewed documents. It usually appears in the form of a large, dark region around the margin of document images. Marginal noise might cover meaningful document objects, such as text, graphics, and forms. The overlap of marginal noise with meaningful objects makes it difficult to segment and recognize document objects. This paper proposes a novel approach to removing marginal noise. The proposed approach consists of two steps: marginal noise detection and marginal noise deletion. Marginal noise detection reduces the original document image to a smaller image and then finds marginal noise regions according to the shape, length, and location of the split blocks. After the marginal noise regions are detected, different removal methods are applied. A local thresholding method is proposed for removing marginal noise in gray-scale document images, whereas a region growing method is devised for binary document images. Experiments with a wide variety of test samples show the feasibility and effectiveness of the proposed approach in removing marginal noise.

Text extraction in mixed-type documents is a necessary pre-processing stage for many document applications. In mixed-type color documents, text, drawings, and graphics appear in millions of different colors. In many cases, text regions are overlaid onto drawings or graphics. In this paper, a new method to automatically detect and extract text in mixed-type color documents is presented. The proposed method is based on a combination of an adaptive color reduction (ACR) technique and a page layout analysis (PLA) approach. The ACR technique is used to obtain the optimal number of colors and to convert the document to these principal colors. Then, using the principal colors, the document image is split into separable color planes. Thus, binary images are obtained, each one corresponding to a principal color. The PLA technique is applied independently to each of the color planes and identifies the text regions. A merging procedure is applied in the final stage to merge the text regions derived from the color planes and to produce the final document. Several experimental and comparative results, exhibiting the performance of the proposed technique, are also presented.

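Image/text segmentation algorithm for color scanned document images   (total citations: 1; self-citations: 0; citations by others: 1)
To address background noise and texture interference from text regions in color scanned document images, a method is proposed that uses image processing techniques, combined with the characteristics of color document images themselves, to locate and segment the illustration regions of a document. First, a set of multi-scale feature thumbnails is generated in which the texture information of text regions is attenuated. Next, a connectivity-based labeling segmentation method removes the text-region information and determines the image regions. Finally, the information from the multi-scale thumbnails is fused to carry out the image/text segmentation. Experimental results show that the method is effective for improving the compression ratio of scanned document images.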

This paper presents a new knowledge-based system for extracting and identifying text-lines from various real-life mixed text/graphics compound document images. The proposed system first decomposes the document image into distinct object planes to separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. A knowledge-based text extraction and identification method then obtains the text-lines with different characteristics in each plane. The proposed system offers high flexibility and expandability, since coping with various types of real-life complex document images requires only updating its rules. Experimental and comparative results demonstrate the effectiveness of the proposed knowledge-based system and its advantages in extracting text-lines with a large variety of illumination levels, sizes, and font styles from various types of mixed and overlapping text/graphics complex compound document images.

Document representation and its application to page decomposition   (total citations: 6; self-citations: 0; citations by others: 6)
Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves techniques of thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system greatly depends on the correctness of page segmentation and of the labeling of different regions such as text, tables, images, drawings, and rulers. We use the traditional bottom-up approach based on connected component extraction to implement page segmentation and region identification efficiently. A new document model that preserves top-down generation information is proposed, on the basis of which a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has high accuracy and takes approximately 1.4 seconds on an SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler), for a 2550×3300 image of a typical journal page scanned at 300 dpi. The method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise.
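
The connected component extraction that underlies this kind of bottom-up page segmentation can be sketched as a breadth-first labeling pass that returns one bounding box per component. This is a generic illustration, not the algorithm from the paper: 4-connectivity, the `(top, left, bottom, right)` box format, and the function name `connected_components` are assumptions.

```python
from collections import deque

def connected_components(binary):
    """Bounding boxes (top, left, bottom, right) of 4-connected foreground
    regions in a binary image given as a list of rows of 0/1 values."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a breadth-first search from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Hypothetical 3x4 page fragment with two separate ink blobs.
page = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
print(connected_components(page))
```

In a bottom-up pipeline, boxes like these would then be merged into lines and blocks according to their sizes and spacing before regions are labeled.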

Parameter-free geometric document layout analysis   (total citations: 1; self-citations: 0; citations by others: 1)
Automatic transformation of paper documents into electronic documents requires geometric document layout analysis at the first stage. However, variations in character font sizes, text line spacing, and document layout structures have for many years made it difficult to design a general-purpose document layout analysis algorithm. The use of some parameters has therefore been unavoidable in previous methods. The authors propose a parameter-free method for segmenting document images into maximal homogeneous regions and identifying them as texts, images, tables, and ruling lines. A pyramidal quadtree structure is constructed for multiscale analysis, and a periodicity measure is suggested to find a periodical attribute of text regions for page segmentation. To obtain robust page segmentation results, a confirmation procedure using texture analysis is applied only to ambiguous regions. Based on the proposed periodicity measure, multiscale analysis, and confirmation procedure, a robust method for geometric document layout analysis is developed that is independent of character font sizes, text line spacing, and document layout structures. The proposed method was evaluated on the document database from the University of Washington and on the MediaTeam Document Database. The results of these tests show that the proposed method provides more accurate results than previous ones.

Learning texture discrimination masks   (total citations: 6; self-citations: 0; citations by others: 6)
A neural network texture classification method is proposed in this paper. The approach is introduced as a generalization of the multichannel filtering method. Instead of using a general filter bank, a neural network is trained to find a minimal set of specific filters, so that both the feature extraction and classification tasks are performed by the same unified network. The authors compute the error rates for different network parameters and show the convergence speed of the training and node pruning algorithms. The proposed method is demonstrated in several texture classification experiments. It is successfully applied to the tasks of locating barcodes in images and segmenting a printed page into text, graphics, and background. Compared with the traditional multichannel filtering method, the neural network approach allows one to perform the same texture classification or segmentation task more efficiently. Extensions of the method, as well as its limitations, are discussed in the paper.

Segmenting an image composed of different kinds of texture fields is difficult: when the image contains similar and/or nonstationary texture fields, it is hard to discriminate the texture fields exactly and to decide the optimum number of segmentation areas. In this paper we formulate the segmentation problem for such images as an optimization problem and adopt the evolutionary strategy of genetic algorithms to cluster small regions in a feature space. The purpose of this paper is to demonstrate the efficiency of genetic algorithms for texture segmentation and to develop an automatic texture segmentation method.

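Segmenting color images composed of different kinds of texture regions remains difficult. When an image contains similar and/or nonstationary texture regions, it is hard to determine the exact texture regions and the optimal number of segmentation regions. An image color segmentation method based on quantum-behaved particle swarm optimization (QPSO) is described; it treats image segmentation as an optimization problem and uses the evolutionary strategy of QPSO to cluster regions in the color feature space. QPSO has few parameters and strong randomness, and it can cover the entire solution space, which guarantees global convergence of the algorithm. Segmentation results for three images are given, demonstrating that QPSO performs well in automatic, unsupervised texture segmentation.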

A bottom-up approach to segmenting a scanned document into background, text, and image regions is considered. In the first step, the image is partitioned into blocks, a series of texture features is computed for each block, and the block type is determined on the basis of these features. Different variants of block arrangement and size, 26 texture variables, and four block type classification algorithms have been considered. In the second step, the block type is corrected on the basis of adjacent region analysis. The error matrix and the ICDAR 2007 criterion are used to evaluate the results.

Chinese text location under complex background using Gabor filter and SVM   (total citations: 1; self-citations: 0; citations by others: 1)
For Chinese text location under complex backgrounds, this paper presents a novel method that combines Gabor filters and a support vector machine (SVM). It is based on the fact that Chinese characters are composed of four kinds of strokes. By extracting four kinds of stroke features with Gabor filters, the Chinese text location problem can be transformed into a texture classification problem, which an SVM classifier can solve. The proposed method is composed of two phases. First, Gabor filters with different scales and orientations are employed to obtain four texture images representing the strokes of Chinese text in the horizontal line, top-down vertical line, left-downward slope line, and short pausing stroke directions. Then, the text regions and background regions in the four texture images are used to train four SVM classifiers, one for the texture in each direction. The classifiers are integrated into an SVM classification network that produces the final classification result, with the sum of the weights determining whether a block is a text region. Experiments are conducted on a large number of typical images with different texts and different fonts. Compared with some existing methods, the proposed approach achieves better results for Chinese text location.

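Current printed-document identification methods are limited by the requirement that the samples contain identical characters. To address this, a printed-document identification method based on character image segmentation is proposed. Character images are segmented with the k-means algorithm, and local binary pattern texture features are extracted separately from the different regions, which eliminates the influence of character structure on the identification result. The classification performance of single-region feature sets and of combined feature sets is studied. Experimental results show that the method achieves high identification accuracy even when the samples share no identical characters.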

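Objective: Real-world textures are often diverse in type, variable in form, and complex in structure, which directly affects the accuracy of texture image segmentation. Traditional unsupervised texture image segmentation algorithms have certain limitations and cannot extract stable texture features well. This paper proposes a texture feature extraction algorithm for complex texture images based on Gabor filters and an improved LTP (local ternary pattern) operator. Method: Gabor filters and an extended LTP operator are used to extract, respectively, the texture features of identical or similar texture patterns and the difference features of textures, and these features are incorporated into a level set framework to segment the texture image. Results: Experiments show that, for images with large variations in texture orientation and scale, texture images with complex backgrounds, and images with weak texture patterns, the overall segmentation results of the proposed method are clearly better than those of traditional texture segmentation methods such as Gabor filters, the structure tensor, the extended structure tensor, and the local similarity factor. The proposed method also gives better segmentation results than an LTP-based method. In terms of quantitative measures, the segmentation accuracy of the proposed method is compared with that of various unsupervised texture segmentation methods; on typical texture images, the proposed method reaches an accuracy above 97%, higher than the other methods. Conclusion: An unsupervised multi-feature texture image segmentation method combining Gabor filters and an extended LTP operator is proposed. It extracts the features of similar texture patterns and the difference features of textures well, these features integrate well into the level set framework, and good segmentation results are obtained on complex real-world texture images.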
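
For readers unfamiliar with the LTP operator this abstract builds on, the sketch below computes the basic local ternary pattern for one pixel, split into the usual "upper" and "lower" binary codes. It shows only the standard operator, not the improved or extended variant proposed in the paper; the neighbor ordering, the threshold value, and the function name `ltp_codes` are assumptions.

```python
def ltp_codes(image, y, x, t):
    """Basic local ternary pattern of the pixel at (y, x) with threshold t.

    A neighbor is coded +1 if it is at least t above the center, -1 if it is
    at least t below, and 0 otherwise. The ternary code is returned as two
    8-bit integers: `upper` marks the +1 positions, `lower` the -1 positions.
    Assumption: neighbors are visited clockwise from the top-left corner.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[y][x]
    upper = lower = 0
    for bit, (dy, dx) in enumerate(offsets):
        p = image[y + dy][x + dx]
        if p >= center + t:
            upper |= 1 << bit
        elif p <= center - t:
            lower |= 1 << bit
    return upper, lower

# Hypothetical 3x3 gray-level patch; the center pixel has value 54.
patch = [
    [54, 57, 12],
    [50, 54, 13],
    [52, 60, 90],
]
print(ltp_codes(patch, 1, 1, 5))
```

Histograms of the two codes over a window are what typically serve as the texture descriptor; the tolerance band of width `t` around the center value is what makes LTP less sensitive to noise in near-uniform regions than a plain binary pattern.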
