Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
This study presents a new method, namely the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The proposed multi-plane segmentation approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This process consists of two stages: localized histogram multilevel thresholding, and multi-plane region matching and assembling. A text extraction procedure is then applied to the resultant planes to detect and extract textual objects with different characteristics in their respective planes. The proposed approach processes document images regionally and adaptively according to their local features. Hence, detailed characteristics of the extracted textual objects, particularly small characters with thin strokes and gradational illumination of characters, are well preserved. Moreover, the approach easily handles background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture. Experimental results on real-life complex document images demonstrate that the proposed approach is effective in extracting textual objects with various illuminations, sizes, and font styles from many types of complex document images.
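The first stage above hinges on thresholding each local region independently rather than globally. As a rough illustration (not the authors' algorithm), the sketch below applies a per-block Otsu threshold to split a grayscale page into a dark plane and a light plane; the block size and the two-plane simplification are assumptions:

```python
import numpy as np

def otsu_threshold(block: np.ndarray) -> int:
    """Classic Otsu threshold on a grayscale block (values 0-255)."""
    hist = np.bincount(block.ravel(), minlength=256).astype(float)
    probs = hist / hist.sum()
    omega = np.cumsum(probs)                    # class-0 probability
    mu = np.cumsum(probs * np.arange(256))      # class-0 cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0        # uniform blocks yield 0/0
    return int(np.argmax(sigma_b))

def localized_threshold_planes(img: np.ndarray, block: int = 32):
    """Split img into a dark plane and a light plane using a per-block
    Otsu threshold (a two-level stand-in for the paper's multilevel
    localized thresholding)."""
    dark = np.zeros_like(img, dtype=bool)
    h, w = img.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            b = img[y:y + block, x:x + block]
            t = otsu_threshold(b)
            dark[y:y + block, x:x + block] = b <= t
    return dark, ~dark
```

A uniform block produces a threshold of 0, so blank regions correctly fall entirely into the light plane.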

2.
3.
Recent remarkable progress in computer systems and printing devices has made it easier to produce printed documents with various designs. Text characters are often printed on colored backgrounds, and sometimes on complex backgrounds such as photographs, computer graphics, etc. Some methods have been developed for extracting character patterns from document images and scene images with complex backgrounds. However, previous methods are suitable only for extracting rather large characters, and often fail to extract small characters with thin strokes. This paper proposes a new method by which character patterns can be extracted from document images with complex backgrounds. The method is based on local multilevel thresholding, pixel labeling, and region growing. This framework is very useful for extracting character patterns from badly illuminated document images. The performance in extracting small character patterns has been improved by suppressing the influence of mixed-color pixels around character edges. Experimental results show that the method is capable of extracting very small character patterns from main text blocks in various documents, separating characters from complex backgrounds, as long as the character strokes are thicker than about 1.5 pixels. Received July 23, 2001 / Accepted November 5, 2001

4.
A multimedia document is composed of different media objects. ISO's Open Document Architecture (ODA) proposes a standard multimedia document model. However, the current ODA profile includes only static media, e.g. text, geometric graphics, and images. Because future multimedia documents will include not only static media but also continuous media, e.g. video and audio, continuous-media document parts should be added to form a complete multimedia document model. In this paper, we propose a multimedia document model derived from ODA's concepts. The proposed model is based on the object-oriented approach. Objects in the proposed document model are divided into two types: data objects and pseudo objects. Data objects are the data structures of a document; pseudo objects are used to manage data objects. Based on the proposed model, a multimedia document authoring and presenting system (MMDS) has also been developed on SUN SPARC workstations running the Solaris 2.X operating system.

5.
6.
Segmentation and classification of mixed text/graphics/image documents (Total citations: 2; self-citations: 0; citations by others: 2)
In this paper, a feature-based document analysis system is presented which utilizes domain knowledge to segment and classify mixed text/graphics/image documents. In our approach, we first perform a run-length smearing operation followed by a stripe-merging procedure to segment the blocks embedded in a document. The classification task is then performed based on the domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge proves effective in accelerating segmentation and decreasing classification error. The experimental study reveals the feasibility of the new technique in segmenting and classifying mixed text/graphics/image documents.
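The run-length smearing step mentioned above is the classic RLSA operation: short background gaps between foreground pixels are filled so that characters merge into blocks. A minimal sketch with illustrative thresholds; the paper's stripe merging and knowledge-based classification are not reproduced:

```python
import numpy as np

def smear_rows(binary: np.ndarray, thresh: int) -> np.ndarray:
    """Horizontal RLSA: fill background runs of length <= thresh that
    lie between two foreground pixels in each row."""
    out = binary.copy()
    for row in out:
        fg = np.flatnonzero(row)
        for a, b in zip(fg[:-1], fg[1:]):
            if 0 < b - a - 1 <= thresh:
                row[a:b] = 1
    return out

def rlsa(binary: np.ndarray, h_thresh: int = 30, v_thresh: int = 15):
    """Run-length smearing: horizontal and vertical smears ANDed
    together, merging characters and lines into block segments."""
    horiz = smear_rows(binary, h_thresh)
    vert = smear_rows(binary.T, v_thresh).T
    return horiz & vert
```

On real pages the thresholds are chosen relative to character size and line spacing so that the AND of the two smears covers whole text blocks.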

7.
Marginal noise is a common phenomenon in document analysis, resulting from the scanning of thick or skewed documents. It usually appears as a large dark region around the margin of document images. Marginal noise may cover meaningful document objects, such as text, graphics, and forms, and this overlapping makes segmentation and recognition of document objects difficult. This paper proposes a novel approach to remove marginal noise, consisting of two steps: marginal noise detection and marginal noise deletion. Detection first reduces the original document image to a smaller image and then finds marginal noise regions according to the shape, length, and location of the split blocks. After the detection of marginal noise regions, different removal methods are applied: a local thresholding method is proposed for removing marginal noise from gray-scale document images, whereas a region growing method is devised for binary document images. Experiments with a wide variety of test samples demonstrate the feasibility and effectiveness of the proposed approach in removing marginal noise.
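A crude sketch of the detect-and-delete idea, assuming binary input: scan inward from each border and blank out rows/columns whose ink density marks them as margin noise. The density and margin-width parameters are illustrative, and the paper's reduced-image block analysis and gray-scale local thresholding are not reproduced:

```python
import numpy as np

def remove_marginal_noise(binary: np.ndarray,
                          dark_ratio: float = 0.7,
                          max_margin_frac: float = 0.2) -> np.ndarray:
    """Blank out border columns/rows whose foreground density exceeds
    dark_ratio, stopping at the first clean line or at a maximum
    margin width (a toy stand-in for the paper's two-step approach)."""
    out = binary.copy()
    h, w = out.shape

    def clean_cols(order):
        limit = int(w * max_margin_frac)
        for i, x in enumerate(order):
            if i >= limit or out[:, x].mean() < dark_ratio:
                break
            out[:, x] = 0

    def clean_rows(order):
        limit = int(h * max_margin_frac)
        for i, y in enumerate(order):
            if i >= limit or out[y, :].mean() < dark_ratio:
                break
            out[y, :] = 0

    clean_cols(range(w))                # left margin
    clean_cols(range(w - 1, -1, -1))    # right margin
    clean_rows(range(h))                # top margin
    clean_rows(range(h - 1, -1, -1))    # bottom margin
    return out
```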

8.
This paper presents a new approach for text-line segmentation based on Block Covering which solves the problem of overlapping and multi-touching components. Block Covering is the core of a system which processes a set of ancient Arabic documents from historical archives. The system is designed for separating text-lines even if they are overlapping and multi-touching. We exploit the Block Covering technique in three steps: a new fractal analysis (Block Counting) for document classification, a statistical analysis of block heights for block classification and a neighboring analysis for building text-lines. The Block Counting fractal analysis, associated with a fuzzy C-means scheme, is performed on document images in order to classify them according to their complexity: tightly (closely) spaced documents (TSD) or widely spaced documents (WSD). An optimal Block Covering is applied on TSD documents which include overlapping and multi-touching lines. The large blocks generated by the covering are then segmented by relying on the statistical analysis of block heights. The final labeling into text-lines is based on a block neighboring analysis. Experimental results provided on images of the Tunisian Historical Archives reveal the feasibility of the Block Covering technique for segmenting ancient Arabic documents.
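The Block Counting measure can be sketched as a standard box-counting (fractal) dimension estimate: count the occupied blocks at several block sizes and fit the log-log slope. The size set is illustrative, and the fuzzy C-means TSD/WSD classification built on top of it is not shown:

```python
import numpy as np

def block_count(binary: np.ndarray, s: int) -> int:
    """Number of s-by-s blocks containing at least one foreground pixel."""
    h, w = binary.shape
    count = 0
    for y in range(0, h, s):
        for x in range(0, w, s):
            if binary[y:y + s, x:x + s].any():
                count += 1
    return count

def block_counting_dimension(binary: np.ndarray,
                             sizes=(1, 2, 4, 8)) -> float:
    """Box-counting estimate: least-squares slope of log N(s)
    against log(1/s)."""
    ns = np.array([block_count(binary, s) for s in sizes], dtype=float)
    logs = np.log(1.0 / np.array(sizes, dtype=float))
    return float(np.polyfit(logs, np.log(ns), 1)[0])
```

A fully inked page approaches dimension 2, while sparse, widely spaced ink yields lower values, which is the kind of complexity signal the document classifier relies on.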

9.
Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentation. A new painting technique is employed to smear the foreground portion of the document image. The painting technique enhances the separability between the foreground and background portions enabling easy detection of text-lines. A dilation operation is employed on the foreground portion of the painted image to obtain a single component for each text-line. Thinning of the background portion of the dilated image and subsequently some trimming operations are performed to obtain a number of separating lines, called candidate line separators. By using the starting and ending points of the candidate line separators and analyzing the distances among them, related candidate line separators are connected to obtain segmented text-lines. Furthermore, the problems of overlapping and touching components are addressed using some novel techniques. We tested the proposed scheme on text-pages of English, French, German, Greek, Persian, Oriya and Bangla and remarkable results were obtained.
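As a greatly simplified stand-in for the painting/dilation/thinning pipeline above, the sketch below segments text-lines from a horizontal projection profile, cutting at background gaps of a minimum height. It handles none of the overlapping, touching, or curled cases the paper targets; `min_gap` is an assumed parameter:

```python
import numpy as np

def segment_text_lines(binary: np.ndarray, min_gap: int = 2):
    """Return (start_row, end_row) pairs for text-lines, splitting at
    runs of at least min_gap empty rows in the projection profile."""
    profile = binary.sum(axis=1)
    lines, start, gap = [], None, 0
    for y, ink in enumerate(profile):
        if ink > 0:
            if start is None:
                start = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                lines.append((start, y - gap + 1))
                start, gap = None, 0
    if start is not None:
        lines.append((start, len(profile)))
    return lines
```

Projection profiles fail exactly where this paper's method is needed: when adjacent lines overlap vertically, the profile never drops to zero and the simple cut rule finds no separator.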

10.
11.
A Conditional-Random-Field-Based Method for Extracting Text from Images with Complex Backgrounds (Total citations: 1; self-citations: 0; citations by others: 1)
Addressing the problem of text extraction from images with complex backgrounds, this paper proposes a conditional random field (CRF) based text extraction method. The method combines various features effectively while also taking contextual features into account, and can thus extract textual information from complex images reliably. The effects of different color spaces and different features on extraction performance are analyzed and compared. Experimental results demonstrate the effectiveness of the method.

12.
Text extraction from mixed-type documents is a necessary pre-processing stage for many document applications. In mixed-type color documents, text, drawings, and graphics appear in millions of different colors, and text regions are often overlaid on drawings or graphics. In this paper, a new method to automatically detect and extract text in mixed-type color documents is presented. The proposed method is based on a combination of an adaptive color reduction (ACR) technique and a page layout analysis (PLA) approach. The ACR technique is used to obtain the optimal number of colors and to reduce the document to these principal colors. Using the principal colors, the document image is then split into separable color planes, yielding one binary image per principal color. The PLA technique is applied independently to each color plane to identify the text regions. A merging procedure is applied in the final stage to merge the text regions derived from the color planes and to produce the final document. Several experimental and comparative results exhibiting the performance of the proposed technique are also presented.
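The split into per-color binary planes can be illustrated with a toy quantizer. The sketch below uses a small k-means in place of the paper's adaptive color reduction technique; the number of colors `k`, the iteration count, and the random initialization are all assumptions:

```python
import numpy as np

def kmeans_colors(pixels: np.ndarray, k: int = 3,
                  iters: int = 10, seed: int = 0):
    """Tiny k-means color quantizer, a stand-in for the paper's ACR
    network; returns (centers, per-pixel labels)."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = pixels[labels == j].mean(axis=0)
    return centers, labels

def color_planes(img: np.ndarray, k: int = 3):
    """Split an RGB image into k binary planes, one per principal color."""
    pixels = img.reshape(-1, 3).astype(float)
    centers, labels = kmeans_colors(pixels, k)
    planes = [(labels == j).reshape(img.shape[:2]) for j in range(k)]
    return centers, planes
```

Each returned plane is a binary image on which a layout analysis pass can then run independently, as the abstract describes.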

13.
Text segmentation using Gabor filters for automatic document processing (Total citations: 24; self-citations: 0; citations by others: 24)
There is considerable interest in designing automatic systems that will scan a given paper document and store it on electronic media for easier storage, manipulation, and access. Most documents contain graphics and images in addition to text. Thus, the document image has to be segmented to identify the text regions, so that OCR techniques may be applied only to those regions. In this paper, we present a simple method for document image segmentation in which text regions in a given document image are automatically identified. The proposed segmentation method for document images is based on a multichannel filtering approach to texture segmentation. The text in the document is considered as a textured region. Nontext contents in the document, such as blank spaces, graphics, and pictures, are considered as regions with different textures. Thus, the problem of segmenting document images into text and nontext regions can be posed as a texture segmentation problem. Two-dimensional Gabor filters are used to extract texture features for each of these regions. These filters have been extensively used earlier for a variety of texture segmentation tasks. Here we apply the same filters to the document image segmentation problem. Our segmentation method does not assume any a priori knowledge about the content or font styles of the document, and is shown to work even for skewed images and handwritten text. Results of the proposed segmentation method are presented for several test images which demonstrate the robustness of this technique. This work was supported by the National Science Foundation under NSF grant CDA-88-06599 and by a grant from E. I. du Pont de Nemours & Company.
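A Gabor texture channel of the kind used here can be sketched as follows: an oriented sinusoid under a Gaussian envelope, convolved with the image, whose response magnitude serves as a texture feature. The kernel parameters (sigma, size) are illustrative, not the paper's:

```python
import numpy as np

def gabor_kernel(freq: float, theta: float,
                 sigma: float = 3.0, size: int = 15) -> np.ndarray:
    """Real part of a 2D Gabor filter with radial frequency `freq`
    (cycles/pixel) and orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def gabor_energy(img: np.ndarray, freq: float, theta: float) -> np.ndarray:
    """Texture-feature channel: magnitude of the Gabor-filtered image,
    computed by circular convolution via the FFT."""
    k = gabor_kernel(freq, theta)
    pad = np.zeros_like(img, dtype=float)
    pad[:k.shape[0], :k.shape[1]] = k
    resp = np.fft.ifft2(np.fft.fft2(img.astype(float)) * np.fft.fft2(pad)).real
    return np.abs(resp)
```

A bank of such channels at several frequencies and orientations gives each pixel a feature vector; text regions respond strongly at the character-stroke frequency, which is what makes texture-based text/nontext segmentation possible.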

14.
As sharing documents through the World Wide Web has been constantly increasing, the need to create hyperdocuments that are accessible and retrievable via the internet, in formats such as HTML and SGML/XML, has also been rising rapidly. Nevertheless, little work has been done on the conversion of paper documents into hyperdocuments, and most existing studies have concentrated on the direct conversion of single-column document images containing only text and image objects. In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table-of-contents page based on logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that, using the proposed methods, the corresponding HTML documents can be generated with the same visual layout as the document images, and a structured table-of-contents page can also be produced, with the hierarchically ordered section titles hyperlinked to the contents.

15.
When a network contains objects of different types, the relationships between objects become diverse and the network structure becomes more complex. To address this heterogeneity, a neural-network-based method for learning vector representations of heterogeneous networks is proposed. For heterogeneous networks containing two object types, images and text, a multi-layer convolutional network maps images into a latent feature space, and a fully connected neural network maps text objects into the same space. Within this feature space, image-image, text-text, and image-text similarities are all computed with the same distance measure. Experiments applying the proposed method to several heterogeneous-network tasks show that it is effective.

16.
Layout extraction of mixed-mode documents (Total citations: 2; self-citations: 0; citations by others: 2)
Proper processing and efficient representation of the digitized images of printed documents require the separation of the various information types: text, graphics, and image elements. For most applications it is sufficient to separate text and nontext, because text contains the most information. This paper describes the implementation and performance of a robust algorithm for text extraction and segmentation that is completely independent of text orientation and can deal with text in various font styles and sizes. Text objects can be nested in nontext areas, and inverse printing can also be analyzed. It should be mentioned that the classification is based only on rough image features, and individual characters are not recognized. The three main processing steps of the system are the generation of connected components, neighborhood analysis, and generation of text lines and blocks. As output, connected components are classified as text or nontext. Text components are grouped as characters, words, lines, and blocks. Nontext objects are accumulated as a separate nontext block.
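The first processing step, generation of connected components, can be sketched as below. The toy size-based text/nontext rule is only a placeholder for the paper's neighborhood analysis and line/block grouping, and the area threshold is an assumption:

```python
import numpy as np
from collections import deque

def connected_components(binary: np.ndarray):
    """4-connected component labeling by BFS flood fill.
    Returns (label image, number of components)."""
    labels = np.zeros(binary.shape, dtype=int)
    h, w = binary.shape
    cur = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not labels[sy, sx]:
                cur += 1
                labels[sy, sx] = cur
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not labels[ny, nx]):
                            labels[ny, nx] = cur
                            q.append((ny, nx))
    return labels, cur

def classify_components(labels: np.ndarray, n: int,
                        max_text_area: int = 100):
    """Toy rule: small components are 'text', large ones 'nontext'."""
    return {c: ("text" if (labels == c).sum() <= max_text_area
                else "nontext")
            for c in range(1, n + 1)}
```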

17.
18.
Clustering of related or similar objects has long been regarded as a potentially useful means of helping users navigate an information space such as a document collection. Many clustering algorithms and techniques have been developed and implemented, but as the sizes of document collections have grown, these techniques have not scaled to large collections because of their computational overhead. To solve this problem, the proposed system concentrates on an interactive text clustering methodology: probability-based, topic-oriented, semi-supervised document clustering. As the web and many documents contain both text and large numbers of images, the proposed system also employs content-based image retrieval (CBIR) for image clustering to complement the document clustering approach. It suggests two kinds of indexing keys, major colour sets (MCS) and distribution block signatures (DBS), to prune away images irrelevant to a given query image. Major colour sets capture colour information, while distribution block signatures capture spatial information. After successively applying these filters to a large database, only a small number of high-potential candidates somewhat similar to the query image remain. The system then uses a quad modelling method (QM) to set the initial weights of two-dimensional cells in the query image according to each major colour, and retrieves more similar images through a similarity association function based on these weights. The system's efficiency is evaluated by implementing and testing the clustering results with the DBSCAN and K-means clustering algorithms. Experiments show that the proposed document clustering algorithm performs with an average efficiency of 94.4% across various document categories.
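The MCS pruning filter can be sketched as follows. This is an interpretation, not the paper's exact definition: colours are quantized per channel and the most frequent quantized colours form the key, so candidates sharing no major colour with the query are discarded before any expensive similarity computation. The quantization levels and set size are assumptions:

```python
import numpy as np

def major_colour_set(img: np.ndarray, levels: int = 4, top: int = 3):
    """MCS-style indexing key: quantize each RGB channel to `levels`
    bins and keep the `top` most frequent quantized colours."""
    q = img.astype(int) * levels // 256
    codes = q[..., 0] * levels * levels + q[..., 1] * levels + q[..., 2]
    vals, counts = np.unique(codes, return_counts=True)
    order = np.argsort(counts)[::-1]
    return set(vals[order[:top]].tolist())

def mcs_prune(query_img: np.ndarray, candidates):
    """Keep indices of candidates whose major colour set overlaps
    the query's; the rest are pruned away cheaply."""
    qset = major_colour_set(query_img)
    return [i for i, img in enumerate(candidates)
            if qset & major_colour_set(img)]
```

A second, spatial filter (the DBS stage) would then run on the survivors, mirroring the successive-filter design the abstract describes.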

19.
To address complex backgrounds and uncertain text orientation in natural scene images, a multi-oriented scene text detection method is proposed. First, candidate character regions are extracted with a colour-enhanced maximally stable extremal regions (C-MSER) method, and non-character regions are eliminated with heuristic rules and a LIBSVM classifier. A position-colour model is then designed to recover characters that were mistakenly filtered out, and the centers of character regions are used to fit and estimate the skew angle of each text line. Finally, a CNN classifier produces the refined results. Tested on two standard datasets (ICDAR2011 and ICDAR2013), the method achieves f-scores of 0.81 and 0.82, respectively, demonstrating its effectiveness.

20.
Document images often suffer from different types of degradation that make document image binarization a challenging task. This paper presents a document image binarization technique that accurately segments text from badly degraded document images. The proposed technique is based on the observations that text documents usually have a background of uniform color and texture, and that the document text has a different intensity level from the surrounding background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated for by using the estimated background surface. Text stroke edges are further detected from the compensated document image by using the L1-norm image gradient. Finally, the document text is segmented by a local threshold estimated from the detected text stroke edges. The proposed technique was submitted to the recent Document Image Binarization Contest (DIBCO) held under the framework of ICDAR 2009 and achieved the top performance among 43 algorithms submitted by 35 international research groups.
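A much-simplified version of the estimate-background-then-threshold-locally idea: here a local maximum filter stands in for the paper's iterative polynomial smoothing, and a fixed darkness margin `delta` stands in for its edge-based threshold estimation. Both substitutions and the window size are assumptions:

```python
import numpy as np

def estimate_background(img: np.ndarray, win: int = 15) -> np.ndarray:
    """Background surface via a local max filter: text is darker than
    the background, so the local maximum approximates the page surface
    (a cheap stand-in for iterative polynomial smoothing)."""
    h, w = img.shape
    bg = np.empty((h, w), dtype=float)
    r = win // 2
    for y in range(h):
        for x in range(w):
            patch = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            bg[y, x] = patch.max()
    return bg

def binarize(img: np.ndarray, win: int = 15, delta: float = 30.0):
    """Compensate the estimated background, then threshold: a pixel is
    text if it is at least `delta` darker than the local background."""
    bg = estimate_background(img, win)
    return (bg - img.astype(float)) >= delta
```

Because the threshold is relative to the estimated surface, a smooth illumination gradient across the page does not flip background pixels to text, which is the point of the compensation step.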
