20 similar documents found; search time: 0 ms
1.
Hidden tree Markov models for document image classification    Cited by: 3 (0 self-citations, 3 by others)
Diligenti M. Frasconi P. Gori M. 《IEEE transactions on pattern analysis and machine intelligence》2003,25(4):519-523
Classification is an important problem in image document processing and is often a preliminary step toward recognition, understanding, and information extraction. In this paper, the problem is formulated in the framework of concept learning and each category corresponds to the set of image documents with similar physical structure. We propose a solution based on two algorithmic ideas. First, we obtain a structured representation of images based on labeled XY-trees (this representation informs the learner about important relationships between image subconstituents). Second, we propose a probabilistic architecture that extends hidden Markov models for learning probability distributions defined on spaces of labeled trees. Finally, a successful application of this method to the categorization of commercial invoices is presented.
2.
3.
4.
Bakkali Souhail Ming Zuheng Coustaty Mickaël Rusiñol Marçal 《International Journal on Document Analysis and Recognition》2021,24(3):251-268
In the recent past, complex deep neural networks have received huge interest in various document understanding tasks such as...
5.
Eduardo Vellasques Robert Sabourin Eric Granger 《Expert systems with applications》2013,40(13):5240-5259
Intelligent watermarking (IW) techniques employ population-based evolutionary computing in order to optimize embedding parameters that trade off between watermark robustness and image quality for digital watermarking systems. Recent advances indicate that it is possible to decrease the computational burden of IW techniques in scenarios involving long heterogeneous streams of bi-tonal document images by recalling embedding parameters (solutions) from a memory based on a Gaussian Mixture Model (GMM) representation of optimization problems. This representation can provide ready-to-use solutions for similar optimization problem instances, avoiding the need for a costly re-optimization process. In this paper, a dual surrogate dynamic Particle Swarm Optimization (DS-DPSO) approach is proposed which employs a memory of GMMs in regression mode in order to decrease the cost of re-optimization for heterogeneous bi-tonal image streams. This approach is applied within a four-level search for near-optimal solutions, with increasing computational burden and precision. Following previous research, the first two levels use GMM re-sampling to recall solutions for recurring problems, making it possible to manage streams of heterogeneous images. Then, if the embedding parameters of an image require significant adaptation, the third level is activated. This optimization level relies on an off-line surrogate, using Gaussian Mixture Regression (GMR), to replace costly fitness evaluations during optimization. The final level also performs optimization, but GMR is employed as a costlier on-line surrogate in a worst-case scenario, providing a safeguard for the IW system. Experimental validation was performed on the OULU image data set, featuring heterogeneous image streams with varying levels of attack. In this scenario, the DS-DPSO approach has been shown to provide a comparable level of watermarking performance with a 93% reduction in computational cost compared to full re-optimization. Indeed, when significant parameter adaptation is required, fitness evaluations may be replaced with GMR.
6.
Morteza Valizadeh Ehsanollah Kabir 《International Journal on Document Analysis and Recognition》2012,15(1):57-69
In this paper, we propose a new algorithm for the binarization of degraded document images. We map the image into a 2D feature space in which the text and background pixels are separable, and then we partition this feature space into small regions. These regions are labeled as text or background using the result of a basic binarization algorithm applied on the original image. Finally, each pixel of the image is classified as either text or background based on the label of its corresponding region in the feature space. Our algorithm splits the feature space into text and background regions without using any training dataset. In addition, this algorithm does not need any parameter setting by the user and is appropriate for various types of degraded document images. The proposed algorithm demonstrated superior performance against six well-known algorithms on three datasets.
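The feature-space idea in this abstract can be sketched very roughly as follows. This is an illustrative assumption, not the paper's actual algorithm: the two features (raw intensity and a 3x3 local mean), the grid quantization, and the mean-intensity seed threshold are all stand-ins for the paper's richer choices.

```python
import numpy as np

def feature_binarize(img, bins=8):
    """Sketch: map pixels into a 2D (intensity, local-mean) feature space,
    label feature-space cells by majority vote of a crude global threshold,
    then classify every pixel by the label of its cell."""
    img = img.astype(float)
    h, w = img.shape
    # Feature 2: 3x3 local mean (edge-padded).
    p = np.pad(img, 1, mode='edge')
    local = sum(p[di:di + h, dj:dj + w]
                for di in range(3) for dj in range(3)) / 9.0
    seed = img < img.mean()                    # crude seed labels: True = text

    def cell(f):                               # quantize a feature into bins
        lo, hi = f.min(), f.max() + 1e-9
        return np.clip(((f - lo) / (hi - lo) * bins).astype(int), 0, bins - 1)

    c1, c2 = cell(img), cell(local)
    votes = np.zeros((bins, bins))
    total = np.zeros((bins, bins))
    np.add.at(votes, (c1, c2), seed)           # text votes per cell
    np.add.at(total, (c1, c2), 1)              # pixels per cell
    text_cell = votes > total / 2.0            # majority label per cell
    return text_cell[c1, c2]                   # classify pixels by their cell
```

On a synthetic image with a dark patch on a light background, the patch comes back as text and the rest as background.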
7.
Daewook Lee Joonho Kwon Weidong Yang Hyoseop Shin Jae-min Kwak Sukho Lee 《Journal of Intelligent Manufacturing》2009,20(3):273-282
XML stream filtering has gained widespread attention from the research community in recent years, and there have been many efforts to improve the performance of XML filtering systems by exploiting XML schema information. In this paper, we design and implement an XML stream filtering system, SFilter, which uses DTD or XML Schema information to improve performance. We propose a simplification step and two kinds of optimization, one static and the other dynamic. The simplification and static optimizations transform the XPath queries before building the automaton used as the index structure for filtering; the dynamic optimizations are applied at runtime, during filtering. We developed five kinds of static optimization and two kinds of dynamic optimization, and we present a novel filtering algorithm that handles the transformed XPath queries and performs the runtime optimization. Experimental results show that our system filters XML streams efficiently.
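The automaton-as-index idea behind this kind of filtering can be illustrated with a toy sketch, assuming a drastically simplified query language (linear paths of element names with `/` and `//` only, no predicates) and SAX-style start/end events; SFilter's actual query transformation and optimizations are far richer.

```python
def compile_xpath(path):
    # '/a//b/c' -> [('a', False), ('b', True), ('c', False)],
    # where True marks a step reached via the descendant axis '//'.
    steps, desc = [], False
    for part in path.split('/')[1:]:
        if part == '':
            desc = True
        else:
            steps.append((part, desc))
            desc = False
    return steps

def filter_stream(steps, events):
    """Run the step automaton over ('start', tag) / ('end', tag) events,
    as a SAX parser would emit them; True as soon as the path matches."""
    stack, active = [], {0}           # active = automaton states at this depth
    for kind, tag in events:
        if kind == 'start':
            stack.append(active)
            nxt = set()
            for s in active:
                t, desc = steps[s]
                if desc:
                    nxt.add(s)        # '//' lets the state keep waiting deeper
                if t == tag:
                    if s + 1 == len(steps):
                        return True   # full path matched
                    nxt.add(s + 1)
            active = nxt
        else:
            active = stack.pop()      # leaving an element restores the states
    return False
```

For example, `/a//c` matches the event stream for `<a><b><c>`, while `/a/c` does not, since `c` is not a direct child there.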
8.
The document spectrum for page layout analysis    Cited by: 17 (0 self-citations, 17 by others)
Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It is advantageous over many other methods in three main ways: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other layout methods.
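One small piece of the docstrum, skew estimation from nearest-neighbor angles, can be sketched as below. This is a toy version: it takes component centroids as given (the real method starts from connected components) and uses brute-force nearest-neighbor search with a simple modal-angle estimate rather than the paper's full spectrum analysis.

```python
import math

def docstrum_skew(centroids, bins=36):
    """Estimate dominant text orientation: histogram the angle from each
    component centroid to its nearest neighbour, folded into [0, 180)
    degrees, and return the modal bin's angle (5-degree bins by default)."""
    hist = [0] * bins
    for i, (x1, y1) in enumerate(centroids):
        best_d2, angle = float('inf'), 0.0
        for j, (x2, y2) in enumerate(centroids):
            if i == j:
                continue
            d2 = (x2 - x1) ** 2 + (y2 - y1) ** 2
            if d2 < best_d2:                  # brute-force nearest neighbour
                best_d2 = d2
                angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        hist[int(angle / 180.0 * bins) % bins] += 1
    peak = max(range(bins), key=lambda b: hist[b])
    return peak * 180.0 / bins                # modal angle ~ dominant skew
```

Centroids laid out on a horizontal line yield 0 degrees; a line at 45 degrees yields 45.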
9.
Information retrieval in document image databases    Cited by: 2 (0 self-citations, 2 by others)
Yue Lu Chew Lim Tan 《IEEE Transactions on Knowledge and Data Engineering》2004,16(11):1398-1410
With the rising popularity and importance of document images as an information source, information retrieval in document image databases has become a growing and challenging problem. In this paper, we propose an approach with the capability of matching partial word images to address two issues in document image retrieval: word spotting and similarity measurement between documents. First, each word image is represented by a primitive string. Then, an inexact string matching technique is utilized to measure the similarity between the two primitive strings generated from two word images. Based on the similarity, we can estimate how a word image is relevant to the other and, thereby, decide whether one is a portion of the other. To deal with various character fonts, we use a primitive string which is tolerant to serif and font differences to represent a word image. Using this technique of inexact string matching, our method is able to successfully handle the problem of heavily touching characters. Experimental results on a variety of document image databases confirm the feasibility, validity, and efficiency of our proposed approach in document image retrieval.
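The inexact string matching at the core of this approach can be sketched with a standard weighted edit distance. The paper's primitives and cost functions are richer (serif- and font-tolerant); the uniform substitution/indel costs and the normalized similarity below are illustrative assumptions showing only the DP skeleton.

```python
def inexact_match(a, b, sub_cost=1.0, indel=1.0):
    """Weighted edit distance between two primitive strings: the minimum
    total cost of substitutions, insertions, and deletions turning a into b."""
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel                    # delete all of a's prefix
    for j in range(1, n + 1):
        d[0][j] = j * indel                    # insert all of b's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j] + indel,         # deletion
                          d[i][j - 1] + indel,         # insertion
                          d[i - 1][j - 1] + cost)      # match / substitution
    return d[m][n]

def similarity(a, b):
    # Normalize the distance into [0, 1]; 1.0 means identical strings.
    return 1.0 - inexact_match(a, b) / max(len(a), len(b), 1)
```

Two identical primitive strings score 1.0; one substitution in a three-symbol string drops the score to 2/3.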
10.
B. Gatos I. Pratikakis 《Pattern recognition》2006,39(3):317-327
This paper presents a new adaptive approach for the binarization and enhancement of degraded documents. The proposed method does not require any parameter tuning by the user and can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, large signal-dependent noise, smear and strain. We follow several distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image while incorporating image up-sampling and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. After extensive experiments, our method demonstrated superior performance against four well-known techniques on numerous degraded document images.
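The central background-surface step can be sketched roughly as follows, under heavy simplification: the Wiener pre-filtering, up-sampling, and post-processing stages are omitted, the rough foreground estimate is a plain mean threshold, and the interpolation is a local windowed average of background pixels; the fraction `q` is an assumed parameter, not the paper's thresholding rule.

```python
import numpy as np

def background_surface_binarize(img, q=0.8, win=5):
    """Sketch: estimate a background intensity surface from pixels a crude
    threshold calls background, then mark as text any pixel clearly darker
    than a fraction q of that surface."""
    img = img.astype(float)
    rough_fg = img < img.mean()              # rough foreground estimation
    pad = win // 2
    pimg = np.pad(img, pad, mode='edge')
    pbg = np.pad((~rough_fg).astype(float), pad, mode='edge')
    h, w = img.shape
    surface = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            wi = pimg[i:i + win, j:j + win]
            wb = pbg[i:i + win, j:j + win]
            # Interpolate from neighbouring background intensities; fall
            # back to the window mean if the window is all foreground.
            surface[i, j] = (wi * wb).sum() / wb.sum() if wb.sum() else wi.mean()
    return img < q * surface                 # text = darker than background
```

On a light page with a dark patch, the patch is marked as text even though the background surface varies only locally.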
11.
Nawei Chen Dorothea Blostein 《International Journal on Document Analysis and Recognition》2007,10(1):1-16
Document image classification is an important step in Office Automation, Digital Libraries, and other document image analysis applications. There is great diversity in document image classifiers: they differ in the problems they solve, in the use of training data to construct class models, and in the choice of document features and classification algorithms. We survey this diverse literature using three components: the problem statement, the classifier architecture, and performance evaluation. This brings to light important issues in designing a document classifier, including the definition of document classes, the choice of document features and feature representation, and the choice of classification algorithm and learning mechanism. We emphasize techniques that classify single-page typeset document images without using OCR results. Developing a general, adaptable, high-performance classifier is challenging due to the great variety of documents, the diverse criteria used to define document classes, and the ambiguity that arises due to ill-defined or fuzzy document classes.
12.
Xiang-guo Zhao Guoren Wang Xin Bi Peizhen Gong Yuhai Zhao 《Neurocomputing》2011,74(16):2444-2451
In this paper, we describe an XML document classification framework based on extreme learning machine (ELM). On the basis of Structured Link Vector Model (SLVM), an optimized Reduced Structured Vector Space Model (RS-VSM) is proposed to incorporate structural information into feature vectors more efficiently and optimize the computation of document similarity. We apply ELM in the XML document classification to achieve good performance at extremely high speed compared with conventional learning machines (e.g., support vector machine). A voting-ELM algorithm is then proposed to improve the accuracy of ELM classifier. Revoting of Equal Votes (REV) method and Revoting of Confusing Classes (RCC) method are also proposed to postprocess the voting result of v-ELM and further improve the performance. The experiments conducted on real world classification problems demonstrate that the voting-ELM classifiers presented in this paper can achieve better performance than ELM algorithms with respect to precision, recall and F-measure.
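The ELM and voting-ELM ingredients can be sketched compactly. This is a generic illustration, not the paper's RS-VSM pipeline: random sigmoid hidden layer, closed-form least-squares output weights, and a plain majority vote over independently seeded machines (the REV/RCC revoting refinements are omitted).

```python
import numpy as np

def train_elm(X, y, hidden=20, seed=0):
    """Basic ELM: random input weights (never trained), sigmoid hidden
    layer, output weights solved in closed form by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))          # hidden activations
    T = np.eye(int(y.max()) + 1)[y]                 # one-hot targets
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)    # closed-form solve
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta).argmax(axis=1)

def vote_elm(X, y, X_test, machines=5):
    # Majority vote over independently initialised ELMs (the v-ELM idea).
    votes = np.stack([predict_elm(train_elm(X, y, seed=s), X_test)
                      for s in range(machines)])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Training is a single matrix solve per machine, which is where the "extremely high speed" relative to iteratively trained classifiers comes from.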
13.
XML's increasing diffusion makes efficient XML query processing and indexing all the more critical. Given the semistructured nature of XML documents, however, general query processing techniques won't work. Researchers have proposed several specialized indexing methods that offer query processors efficient access to XML documents, although none are yet fully implemented in commercial products. In this article the classification of XML indexing techniques identifies current practices and trends, offering insight into how developers can improve query processing and select the best solution for particular contexts.
14.
Tayo Obafemi-Ajayi Gady Agam Ophir Frieder 《International Journal on Document Analysis and Recognition》2010,13(1):1-17
The fast evolution of scanning and computing technologies in recent years has led to the creation of large collections of scanned historical documents. It is almost always the case that these scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to learn local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system, we have labeled a subset of the Frieder diaries collection (the diaries of Rabbi Dr. Avraham Abba Frieder). This labeled subset was then used to train classifiers based on lookup tables in conjunction with the approximate nearest neighbor algorithm. The resulting algorithm is highly efficient and effective. Experimental evaluation results are provided using the Frieder diaries collection.
15.
Adaptive document block segmentation and classification    Cited by: 3 (0 self-citations, 3 by others)
Shih F.Y. Shy-Shyan Chen 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》1996,26(5):797-802
This paper presents an adaptive block segmentation and classification technique for daily-received office documents having complex layout structures such as multiple columns and mixed-mode contents of text, graphics, and pictures. First, an improved two-step block segmentation algorithm is performed based on run-length smoothing for decomposing any document into single-mode blocks. Then, a rule-based block classification is used for classifying each block into the text, horizontal/vertical line, graphics, or picture type. The document features and rules used are independent of character font and size and the scanning resolution. Experimental results show that our algorithms are capable of correctly segmenting and classifying different types of mixed-mode printed documents.
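The run-length smoothing at the heart of this kind of block segmentation is easy to sketch in one dimension. In the classic scheme this is applied along rows and columns with different thresholds and the two results are ANDed; the threshold value here is arbitrary, and the full two-step algorithm in the paper adds more than this.

```python
def rlsa_1d(row, threshold):
    """Run-length smoothing on one scanline: fill runs of 0s (white) no
    longer than `threshold` that are flanked by 1s (black), so nearby
    black pixels merge into solid blocks."""
    out = list(row)
    n = len(out)
    i = 0
    while i < n:
        if out[i] == 0:
            j = i
            while j < n and out[j] == 0:
                j += 1                       # j = end of this white run
            # Fill only interior gaps (flanked by black) that are short.
            if 0 < i and j < n and (j - i) <= threshold:
                for k in range(i, j):
                    out[k] = 1
            i = j
        else:
            i += 1
    return out
```

For example, with a threshold of 2, a two-pixel gap between black pixels is filled while a four-pixel gap is left open.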
16.
The bag-of-words approach to text document representation typically results in vectors of the order of 5000–20,000 components as the representation of documents. To make effective use of various statistical classifiers, it may be necessary to reduce the dimensionality of this representation. We point out deficiencies in class discrimination of two popular such methods: Latent Semantic Indexing (LSI), and sequential feature selection according to some relevant criterion. As a remedy, we suggest feature transforms based on Linear Discriminant Analysis (LDA). Since LDA requires operating both with large and dense matrices, we propose an efficient intermediate dimension reduction step using either a random transform or LSI. We report good classification results with the combined feature transform on a subset of the Reuters-21578 database. Drastic reduction of the feature vector dimensionality from 5000 to 12 actually improves the classification performance.
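The two-stage transform (random projection, then LDA) can be sketched as below for the two-class case. The Gaussian random matrix, the ridge term, and the Fisher-direction formulation are illustrative assumptions; the paper works with multi-class LDA and much larger vocabularies.

```python
import numpy as np

def rp_then_lda(X, y, inter_dim=5, seed=0):
    """Sketch: a random projection shrinks bag-of-words vectors to
    `inter_dim` components, then a two-class Fisher LDA direction is
    fitted in the reduced space (ridge term for numerical stability)."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], inter_dim)) / np.sqrt(inter_dim)
    Z = X @ R                                       # intermediate reduction
    m0, m1 = Z[y == 0].mean(0), Z[y == 1].mean(0)
    Sw = np.cov(Z[y == 0].T) + np.cov(Z[y == 1].T)  # within-class scatter
    w = np.linalg.solve(Sw + 1e-6 * np.eye(inter_dim), m1 - m0)
    return R, w

def project(X, R, w):
    # One discriminant score per document; class 1 should score higher.
    return (X @ R) @ w
```

The random projection makes the scatter matrices small and dense enough for LDA to handle, which is exactly the role the abstract assigns to the intermediate step.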
17.
This paper presents a document classifier based on text content features and its application to email classification. We test the validity of a classifier which uses Principal Component Analysis Document Reconstruction (PCADR), where the idea is that principal component analysis (PCA) can optimally compress only the kind of documents (in our experiments, email classes) that were used to compute the principal components (PCs), and that for other kinds of documents the compression will not perform well using only a few components. Thus, the classifier computes the PCA separately for each document class; when a new instance arrives to be classified, it is projected onto each class's set of computed PCs and then reconstructed using those same PCs. The reconstruction error is computed, and the classifier assigns the instance to the class with the smallest error or divergence from the class representation. We test this approach in email filtering by distinguishing between two message classes (e.g., spam from ham, or phishing from ham). The experiments show that PCADR obtains very good results on the different validation datasets employed, reaching better performance than the popular Support Vector Machine classifier.
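The PCADR scheme as described (per-class PCA, classify by smallest reconstruction error) can be sketched directly; the SVD-based PCA and the Euclidean error norm below are standard choices assumed for illustration, and the class names and API are hypothetical.

```python
import numpy as np

class PCADR:
    """Sketch of PCA Document Reconstruction: fit a separate PCA per class,
    reconstruct a new document with each class's components, and assign it
    to the class with the smallest reconstruction error."""
    def __init__(self, n_components=2):
        self.k = n_components
        self.models = {}

    def fit(self, X, y):
        for c in set(y):
            Xc = X[y == c]
            mu = Xc.mean(0)
            # Top-k principal components via SVD of the centred class data.
            _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
            self.models[c] = (mu, Vt[:self.k])
        return self

    def predict(self, X):
        # Reconstruction error of each row under each class's PCA model.
        errs = {c: np.linalg.norm(X - mu - (X - mu) @ V.T @ V, axis=1)
                for c, (mu, V) in self.models.items()}
        classes = sorted(errs)
        E = np.stack([errs[c] for c in classes], axis=1)
        return np.array(classes)[E.argmin(1)]
```

A point lying in class 0's subspace reconstructs with near-zero error under class 0's PCs and a large error under class 1's, so the argmin recovers the class.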
18.
We present a simple and yet effective approach for document classification to incorporate rationales elicited from annotators into the training of any off-the-shelf classifier. We empirically show on several document classification datasets that our classifier-agnostic approach, which makes no assumptions about the underlying classifier, can effectively incorporate rationales into the training of multinomial naïve Bayes, logistic regression, and support vector machines. In addition to being classifier-agnostic, we show that our method has comparable performance to previous classifier-specific approaches developed for incorporating rationales and feature annotations. Additionally, we propose and evaluate an active learning method tailored specifically for the learning with rationales framework.
19.
Soumyadeep Dey Jayanta Mukherjee Shamik Sural 《International Journal on Document Analysis and Recognition》2016,19(4):351-368
Segmentation of a document image plays an important role in automatic document processing. In this paper, we propose a consensus-based clustering approach for document image segmentation. In this method, the foreground regions of a document image are grouped into a set of primitive blocks, and a set of features is extracted from them. Similarities among the blocks are computed on each feature using a hypothesis test-based similarity measure. Based on the consensus of these similarities, clustering is performed on the primitive blocks. This clustering approach is used iteratively with a classifier to label each primitive block. Experimental results show the effectiveness of the proposed method. It is further shown in the experimental results that the dependency of classification performance on the training data is significantly reduced.
20.
E. Appiani F. Cesarini A.M. Colla M. Diligenti M. Gori S. Marinai G. Soda 《International Journal on Document Analysis and Recognition》2001,4(2):69-83
In this paper a system for the analysis and automatic indexing of imaged documents for high-volume applications is described. This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine which overcomes the bottleneck of document profiling, bypassing some limitations of existing pre-defined indexing schemes. The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled users to define the indexes relevant to the document domains of their interest by simply presenting visual examples, and in applying reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining to the specific document class. Experimental results are encouraging overall; in particular, document classification results fulfill the requirements of high-volume applications. Integration into production lines is under way.
Received March 30, 2000 / Revised June 26, 2001