期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A multi-plane approach for text segmentation of complex document images

Yen-Lin Chen Author Vitae 《Pattern recognition》2009,42(7):1419-1444

This study presents a new method, namely the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The proposed multi-plane segmentation approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This process consists of two stages—localized histogram multilevel thresholding and multi-plane region matching and assembling. Then a text extraction procedure is applied on the resultant planes to detect and extract textual objects with different characteristics in the respective planes. The proposed approach processes document images regionally and adaptively according to their respective local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes, as well as gradational illuminations of characters, can be well-preserved. Moreover, this way also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled easily and well. Experimental results on real-life complex document images demonstrate that the proposed approach is effective in extracting textual objects with various illuminations, sizes, and font styles from various types of complex document images. 相似文献

2.

A new approach on search for similar documents with multiple categories using fuzzy clustering

R&#x;dvan Saraolu Kemal Tütüncü Novruz Allahverdi 《Expert systems with applications》2008,34(4):2545-2554

Searching for similar document has an important role in text mining and document management. In whether similar document search or in other text mining applications generally document classification is focused and class or category that the documents belong to is tried to be determined. The aim of the present study is the investigation of the case which includes the documents that belong to more than one category. The system used in the present study is a similar document search system that uses fuzzy clustering. The situation of belonging to more than one category for the documents is included by this system. The proposed approach consists of two stages to solve multicategories problem. The first stage is to find out the documents belonging to more than one category. The second stage is the determination of the categories to which these found documents belong to. For these two aims -threshold Fuzzy Similarity Classification Method (-FSCM) and Multiple Categories Vector Method (MCVM) are proposed as written order. Experimental results showed that proposed system can distinguish the documents that belong to more than one category efficiently. Regarding to the finding which documents belong to which classes, proposed system has better performance and success than the traditional approach. 相似文献

3.

Shape based local thresholding for binarization of document images

Jichuan ShiNilanjan Ray Hong Zhang 《Pattern recognition letters》2012,33(1):24-32

This paper presents a novel local threshold algorithm for the binarization of document images. Stroke width of handwritten and printed characters in documents is utilized as the shape feature. As a result, in addition to the intensity analysis, the proposed algorithm introduces the stroke width as shape information into local thresholding. Experimental results for both synthetic and practical document images show that the proposed local threshold algorithm is superior in terms of segmentation quality to the threshold approaches that solely use intensity information. 相似文献

4.

A knowledge-based system for extracting text-lines from mixed and overlapping text/graphics compound document images

Yen-Lin ChenZeng-Wei Hong Cheng-Hung Chuang 《Expert systems with applications》2012,39(1):494-507

This paper presents a new knowledge-based system for extracting and identifying text-lines from various real-life mixed text/graphics compound document images. The proposed system first decomposes the document image into distinct object planes to separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. A knowledge-based text extraction and identification method obtains the text-lines with different characteristics in each plane. The proposed system offers high flexibility and expandability by merely updating new rules to cope with various types of real-life complex document images. Experimental and comparative results prove the effectiveness of the proposed knowledge-based system and its advantages in extracting text-lines with a large variety of illumination levels, sizes, and font styles from various types of mixed and overlapping text/graphics complex compound document images. 相似文献

5.

Marginal noise removal of document images

Kuo-Chin FanAuthor Vitae Yuan-Kai WangAuthor VitaeTsann-Ran LayAuthor Vitae 《Pattern recognition》2002,35(11):2593-2611

Marginal noise is a common phenomenon in document analysis which results from the scanning of thick documents or skew documents. It usually appears in the front of a large and dark region around the margin of document images. Marginal noise might cover meaningful document objects, such as text, graphics and forms. The overlapping of marginal noise with meaningful objects makes it difficult to perform the task of segmentation and recognition of document objects. This paper proposes a novel approach to remove marginal noise. The proposed approach consists of two steps which are marginal noise detection and marginal noise deletion. Marginal noise detection will reduce an original document image into a smaller image, and then find marginal noise regions according to the shape length and location of the split blocks. After the detection of marginal noise regions, different removal methods are performed. A local thresholding method is proposed for the removal of marginal noise in gray-scale document images, whereas a region growing method is devised for binary document images. Experimenting with a wide variety of test samples reveals the feasibility and effectiveness of our proposed approach in removing marginal noises. 相似文献

6.

Correcting show-through effects on scanned color document images by multiscale analysis

Hirobumi Takeshi 《Pattern recognition》2003,36(12):2835-2847

This paper describes a new approach to restoring scanned color document images where the backside image shows through the paper sheet. A new framework is presented for correcting show-through components using digital image processing techniques. First, the foreground components on the front side are separated from the background and backside components through locally adaptive binarization for each color component and edge magnitude thresholding. Background colors are estimated locally through color thresholding to generate a restored image, and then corrected adaptively through multi-scale analysis along with comparison of edge distributions between the original and the restored image. The proposed method does not require specific input devices or the backside to be input; it is able to correct unneeded image components through analysis of the front side image alone. Experimental results are given to verify effectiveness of the proposed method. 相似文献

7.

Comprehension effects of signalling relationships between documents in search engines

Ladislao Salmerón Laura Gil Ivar Bråten Helge Strømsø 《Computers in human behavior》2010

A key task for students learning about a complex topic from multiple documents on the web is to establish the existing rhetorical relations between the documents. Traditional search engines such as Google® display the search results in a listed format, without signalling any relationship between the documents retrieved. New search engines such as Kartoo® go a step further, displaying the results as a constellation of documents, in which the existing relations between pages are made explicit. This presentation format is based on previous studies of single-text comprehension, which demonstrate that providing a graphical overview of the text contents and their relation boosts readers’ comprehension of the topic. We investigated the assumption that graphical overviews can also facilitate multiple-documents comprehension. The present study revealed that undergraduate students reading a set of web pages on climate change comprehended them better when using a search engine that makes explicit the relationships between documents (i.e. Kartoo-like) than when working with a list-like presentation of the same documents (i.e. Google-like). The facilitative effect of a graphical-overview interface was reflected in inter-textual inferential tasks, which required students to integrate key information between documents, even after controlling for readers’ topic interest and background knowledge. 相似文献

8.

Adaptive inverse halftoning for scanned document images through multiresolution and multiscale analysis

Hirobumi Nishida^{Author Vitae} 《Pattern recognition》2005,38(2):251-260

This paper describes an efficient algorithm for inverse halftoning of scanned document images to resolve problems with interference patterns such as moiré and graininess when the images are displayed or printed out. The algorithm is suitable for software implementation and useful for high quality printing or display of scanned document images delivered via networks from unknown scanners. A multi-resolution approach is used to achieve practical processing speed under software implementation. Through data-driven, adaptive, multi-scale processing, the algorithm can cope with a variety of input devices and requires no information on the halftoning method or properties (such as coefficients in dither matrices, filter coefficients of error diffusion kernels, screen angles, or dot frequencies). Effectiveness of the new algorithm is demonstrated through real examples of scanned document images, as well as quantitative evaluations with synthetic data. 相似文献

9.

Problem-adaptable document analysis and understanding for high-volume applications

Bertin?Klein Email author Andreas?R.?Dengel 《International Journal on Document Analysis and Recognition》2003,6(3):167-180

相似文献

10.

Adaptive thresholding algorithm: Efficient computation technique based on intelligent block detection for degraded document images

Yu-Ting Pai Author VitaeAuthor Vitae Shanq-Jang Ruan^{Author Vitae} 《Pattern recognition》2010,43(9):3177-3187

Document image binarization involves converting gray level images into binary images, which is a feature that has significantly impacted many portable devices in recent years, including PDAs and mobile camera phones. Given the limited memory space and the computational power of portable devices, reducing the computational complexity of an embedded system is of priority concern. This work presents an efficient document image binarization algorithm with low computational complexity and high performance. Integrating the advantages of global and local methods allows the proposed algorithm to divide the document image into several regions. A threshold surface is then constructed based on the diversity and the intensity of each region to derive the binary image. Experimental results demonstrate the effectiveness of the proposed method in providing a promising binarization outcome and low computational cost. 相似文献

11.

An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows 总被引：1，自引：0，他引：1

Bilal Bataineh Siti Norul Huda Sheikh Abdullah Khairuddin Omar 《Pattern recognition letters》2011,32(14):1805-1813

Binary image representation is essential format for document analysis. In general, different available binarization techniques are implemented for different types of binarization problems. The majority of binarization techniques are complex and are compounded from filters and existing operations. However, the few simple thresholding methods available cannot be applied to many binarization problems. In this paper, we propose a local binarization method based on a simple, novel thresholding method with dynamic and flexible windows. The proposed method is tested on selected samples called the DIBCO 2009 benchmark dataset using specialized evaluation techniques for binarization processes. To evaluate the performance of our proposed method, we compared it with the Niblack, Sauvola and NICK methods. The results of the experiments show that the proposed method adapts well to all types of binarization challenges, can deal with higher numbers of binarization problems and boosts the overall performance of the binarization. 相似文献

12.

Layout extraction of mixed mode documents 总被引：2，自引：0，他引：2

Frank Hönes Jürgen Lichter 《Machine Vision and Applications》1994,7(4):237-246

Proper processing and efficient representation of the digitized images of printed documents require the separation of the various information types: text, graphics, and image elements. For most applications it is sufficient to separate text and nontext, because text contains the most information. This paper describes the implementation and performance of a robust algorithm for text extraction and segmentation that is completely independent of text orientation and can deal with text in various font styles and sizes. Text objects can be nested in nontext areas, and inverse printing can also be analyzed. It should be mentioned that the classification is based only on rough image features, and individual characters are not recognized. The three main processing steps of the system are the generation of connected components, neighborhood analysis, and generation of text lines and blocks. As output, connected components are classified as text or nontext. Text components are grouped as characters, words, lines, and blocks. Nontext objects are accumulated as a separate nontext block. 相似文献

13.

Extraction and recognition of artificial text in multimedia documents 总被引：1，自引：0，他引：1

C.?Wolf jolion}@rfv.insa-lyon.fr" title="{wolf jolion}@rfv.insa-lyon.fr" itemprop="email" data-track="click" data-track-action="Email author" data-track-label="">Email author J.-M.?Jolion 《Pattern Analysis & Applications》2004,6(4):309-326

Abstract The systems currently available for contentbased image and video retrieval work without semantic knowledge, i. e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e. g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.An erratum to this article can be found at 相似文献

14.

Development of a patent document classification and search platform using a back-propagation network 总被引：1，自引：0，他引：1

Amy J.C. Trappey Fu-Chiang Hsu Charles V. Trappey Chia-I. Lin 《Expert systems with applications》2006,31(4):755-765

In order to process large numbers of explicit knowledge documents such as patents in an organized manner, automatic document categorization and search are required. In this paper, we develop a document classification and search methodology based on neural network technology that helps companies manage patent documents more effectively. The classification process begins by extracting key phrases from the document set by means of automatic text processing and determining the significance of key phrases according to their frequency in text. In order to maintain a manageable number of independent key phrases, correlation analysis is applied to compute the similarities between key phrases. Phrases with higher correlations are synthesized into a smaller set of phrases. Finally, the back-propagation network model is adopted as a classifier. The target output identifies a patent document’s category based on a hierarchical classification scheme, in this case, the international patent classification (IPC) standard. The methodology is tested using patents related to the design of power hand-tools. Related patents are automatically classified using pre-trained neural network models. In the prototype system, two modules are used for patent document management. The automatic classification module helps the user classify patent documents and the search module helps users find relevant and related patent documents. The result shows an improvement in document classification and identification over previously published methods of patent document management. 相似文献

15.

Associative Naïve Bayes classifier: Automated linking of gene ontology to medline documents

Hyunki Kim Author Vitae Author Vitae 《Pattern recognition》2009,42(9):1777-270

We demonstrate a text-mining method, called associative Naïve Bayes (ANB) classifier, for automated linking of MEDLINE documents to gene ontology (GO). The approach of this paper is a nontrivial extension of document classification methodology from a fixed set of classes C={c₁,c₂,…,c_n} to a knowledge hierarchy like GO. Due to the complexity of GO, we use a knowledge representation structure. With that structure, we develop the text mining classifier, called ANB classifier, which automatically links Medline documents to GO. To check the performance, we compare our datasets under several well-known classifiers: NB classifier, large Bayes classifier, support vector machine and ANB classifier. Our results, described in the following, indicate its practical usefulness. 相似文献

16.

Color reconstruction in digital cameras: optimization for document images

C. Dance L. Fan 《International Journal on Document Analysis and Recognition》2005,7(2-3):138-146

Digital cameras normally sample one color at each pixel. Missing colors are obtained by spatial interpolation, decreasing resolution relative to images acquired with a greyscale sensor. The consequence for document imaging is higher text recognition error rates. This paper introduces the horizontal-vertical regression (HVR) method for document-optimized color reconstruction. HVR exploits a local two-color approximation, making spatial interpolation unnecessary. Comparison with the best alternative reconstruction methods indicates large reductions in error rates for text resulting from HVR, as well as improvements in intermediate color and binary images.Received: 11 June 2003, Accepted: 6 March 2004, Published online: 2 February 2005C. Dance: Correspondence to 相似文献

17.

Minimizer of the Reconstruction Error for multi-class document categorization

《Expert systems with applications》2014,41(3):861-868

In the present article we introduce and validate an approach for single-label multi-class document categorization based on text content features. The introduced approach uses the statistical property of Principal Component Analysis, which minimizes the reconstruction error of the training documents used to compute a low-rank category transformation matrix. Such matrix transforms the original set of training documents from a given category to a new low-rank space and then optimally reconstructs them to the original space with a minimum reconstruction error. The proposed method, called Minimizer of the Reconstruction Error (mRE) classifier, uses this property, and extends and applies it to new unseen test documents. Several experiments on four multi-class datasets for text categorization are conducted in order to test the stable and generally better performance of the proposed approach in comparison with other popular classification methods. 相似文献

18.

A fast algorithm for skew detection of document images using morphology 总被引：1，自引：0，他引：1

A.K. Das B. Chanda 《International Journal on Document Analysis and Recognition》2001,4(2):109-114

相似文献

19.

基于Word Spotting技术的蒙古文古籍图像检索中的特征选择

魏宏喜高光来《计算机应用》2011,31(11):3038-3041

设计了一个基于word spotting技术的蒙古文《甘珠尔经》图像检索的系统框架。在充分分析了蒙古文《甘珠尔经》中手写单词图像特点的基础上,提出了采用轮廓特征、投影特征和笔划穿越数目来表示单词图像。在由5500个单词图像构成的数据集上进行对比实验,确定了最佳的特征组合,平均准确率（MAP）能达到78.79%,R-Precision能达到73.01%。实验结果表明,所选择的特征是合理的、有效的。相似文献

20.

A binarization method with learning-built rules for document images produced by cameras

Chien-Hsing Chou Author Vitae 《Pattern recognition》2010,43(4):1518-1530

In this paper, we propose a novel binarization method for document images produced by cameras. Such images often have varying degrees of brightness and require more careful treatment than merely applying a statistical method to obtain a threshold value. To resolve the problem, the proposed method divides an image into several regions and decides how to binarize each region. The decision rules are derived from a learning process that takes training images as input. Tests on images produced under normal and inadequate illumination conditions show that our method yields better visual quality and better OCR performance than three global binarization methods and four locally adaptive binarization methods. 相似文献