Similar literature
1.
This article proposes an approach to predict the result of binarization algorithms on a given document image according to its state of degradation. Indeed, historical documents suffer from different types of degradation which result in binarization errors. We intend to characterize the degradation of a document image by using different features based on the intensity, quantity and location of the degradation. These features allow us to build prediction models of binarization algorithms that are very accurate according to $R^2$ values and p values. The prediction models are used to select the best binarization algorithm for a given document image. Obviously, this image-by-image strategy improves the binarization of the entire dataset.

2.
Document binarization is an important technique in document image analysis and recognition. Generally, binarization methods are ineffective for degraded images. Several binarization methods have been proposed; however, none of them are effective for historical and degraded document images. In this paper, a new binarization method is proposed for degraded document images. The proposed method is based on the variance between pixel contrast and consists of four stages: pre-processing, geometrical feature extraction, feature selection, and post-processing. The proposed method was evaluated based on several visual and statistical experiments. The experiments were conducted using five International Document Image Binarization Contest benchmark datasets specialized for binarization testing. The results were compared with those of five adaptive binarization methods: Niblack, Sauvola thresholding, Sauvola compound algorithm, NICK, and Bataineh. The results show that the proposed method performs better than the other methods in all binarization cases.

3.
In this work, a multi-scale binarization framework is introduced, which can be used along with any adaptive threshold-based binarization method. This framework is able to improve the binarization results and to restore weak connections and strokes, especially in the case of degraded historical documents. This is achieved thanks to the localized nature of the framework in the spatial domain. The framework requires several binarizations on different scales, which is addressed by the introduction of fast grid-based models. This enables us to explore high scales which are usually unreachable for traditional approaches. In order to expand our set of adaptive methods, an adaptive modification of Otsu's method, called AdOtsu, is introduced. In addition, in order to restore document images suffering from bleed-through degradation, we combine the framework with recursive adaptive methods. The framework shows promising performance in subjective and objective evaluations performed on available datasets.
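
The abstract above names AdOtsu only as an adaptive modification of Otsu's method and gives no further detail. For orientation, the following is a minimal sketch of the classic global Otsu threshold that it builds on, not of AdOtsu itself. The function name `otsu_threshold` and the flat list of integer gray levels as input are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Classic global Otsu threshold (illustrative sketch, not AdOtsu).

    Returns the gray level t that maximizes the between-class variance
    when pixels <= t form one class and pixels > t form the other.
    """
    # Histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Usage: dark "ink" pixels around 10, bright "paper" pixels around 200.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]  # 1 = background, 0 = text
```

A single global threshold of this kind is what adaptive schemes try to improve on when the background is not uniform across the page.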

4.
Document image binarization is a difficult task, especially for complex document images. Nonuniform background, stains, and variation in the intensity of the printed characters are some examples of challenging document features. In this work, binarization is accomplished by taking advantage of local probabilistic models and of a flexible active contour scheme. More specifically, local linear models are used to estimate both the expected stroke and the background pixel intensities. This information is then used as the main driving force in the propagation of an active contour. In addition, a curvature-based force is used to control the viscosity of the contour and leads to more natural-looking results. The proposed implementation benefits from the level set framework, which is highly successful in other contexts, such as medical image segmentation and road network extraction from satellite images. The validity of the proposed approach is demonstrated on both recent and historical document images of various types and languages. In addition, this method was submitted to the Document Image Binarization Contest (DIBCO'09), at which it placed 3rd.

5.
6.
In this paper, we present an adaptive water flow model for the binarization of degraded document images. We regard an image surface as a three-dimensional terrain and pour water on it. The water finds the valleys and fills them. Our algorithm controls the rainfall process, that is, the pouring of the water, in such a way that the water fills up to half of the valley's depth. After the rainfall stops, each wet region represents one character or a noisy component. To segment each character, we label the wet regions and regard them as blobs; since some of the blobs are noisy components, we use a multilayer perceptron to label each blob as either text or non-text. Since our algorithm classifies blobs instead of pixels, it preserves stroke connectivity. In several experiments, the proposed binarization algorithm demonstrated superior performance against six well-known algorithms on three sets of degraded document images. The main advantage of our algorithm is on document images with uneven illumination.

7.
Multimedia Tools and Applications - Vehicle License Plate Recognition (VLPR) is one of the most important aspects of applying computer techniques in Intelligent Transport Systems (ITS). They face...

8.
9.
Multimedia Tools and Applications - Binarization of document images has great importance in several applications like historical document restoration, Optical Character Recognition (OCR). It is a...

10.
Evaluation of binarization methods for document images   Cited by: 19 (self-citations: 0, citations by others: 19)
This paper presents an evaluation of eleven locally adaptive binarization methods for gray scale images with low contrast, variable background intensity and noise. Niblack's method (1986), with the postprocessing step of Yanowitz and Bruckstein's method (1989) added, performed the best and was also one of the fastest binarization methods.
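
As a rough illustration of what a locally adaptive method of this kind computes, the sketch below applies the commonly stated form of Niblack's rule, T = m + k·s, where m and s are the mean and standard deviation of the gray levels in a window around each pixel. It does not include the Yanowitz and Bruckstein postprocessing step. The function name, the 3x3 window, the value k = -0.2, and the convention 1 = background, 0 = text are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def niblack_binarize(image, window=3, k=-0.2):
    """Niblack-style local thresholding (illustrative sketch).

    image: list of rows of gray levels (dark text on bright paper).
    Each pixel is compared with T = m + k * s computed over the
    window centered on it; windows are clipped at the image border.
    Returns rows of 1 (background) and 0 (text).
    """
    height, width = len(image), len(image[0])
    r = window // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            patch = [
                image[yy][xx]
                for yy in range(max(0, y - r), min(height, y + r + 1))
                for xx in range(max(0, x - r), min(width, x + r + 1))
            ]
            threshold = mean(patch) + k * pstdev(patch)
            row.append(1 if image[y][x] >= threshold else 0)
        out.append(row)
    return out


# Usage: a bright 5x5 page with a single dark pixel in the middle.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20
result = niblack_binarize(page)
```

Because the threshold follows the local mean, the rule tolerates slowly varying backgrounds; in flat regions s is close to zero, so small noise can flip pixels, which is one reason a postprocessing step is often paired with it.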

11.
Pattern Analysis and Applications - Binarization of ancient degraded document images is a very important step for their preservation and digital use. In this paper, a new simple threshold-based...

12.
Binarization plays an important role in document image processing, especially for degraded documents. For degraded document images, adaptive binarization methods often incorporate local information to determine the binarization threshold for each individual pixel in the document image. We propose a two-stage, parameter-free, window-based method to binarize degraded document images. In the first stage, an incremental scheme is used to determine a proper window size beyond which no substantial increase in the local variation of pixel intensities is observed. In the second stage, based on the determined window size, a noise-suppressing scheme delivers the final binarized image by contrasting two binarized images produced by two adaptive thresholding schemes that incorporate the local mean gray and gradient values. Empirical results demonstrate that the proposed method is competitive with existing adaptive binarization methods and achieves better performance in precision, accuracy, and F-measure.
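
The abstract reports precision, accuracy, and F-measure without defining them. The sketch below shows the standard pixel-level definitions, assuming a binarized image and its ground truth are given as flat sequences in which 1 marks a text pixel. The function name and that labeling convention are assumptions for illustration, not details taken from the paper.

```python
def binarization_scores(predicted, truth):
    """Pixel-level precision, recall, accuracy and F-measure.

    predicted, truth: equal-length sequences of 0/1 labels,
    where 1 marks a text (foreground) pixel.
    """
    tp = fp = fn = tn = 0
    for p, t in zip(predicted, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(truth) if len(truth) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Usage: one background pixel is wrongly marked as text.
scores = binarization_scores([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])
```

Accuracy counts background pixels as well, so on pages that are mostly background it tends to look high even when strokes are poorly recovered; F-measure is less affected by that imbalance.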

13.
In this paper, we propose a novel binarization method for document images produced by cameras. Such images often have varying degrees of brightness and require more careful treatment than merely applying a statistical method to obtain a threshold value. To resolve the problem, the proposed method divides an image into several regions and decides how to binarize each region. The decision rules are derived from a learning process that takes training images as input. Tests on images produced under normal and inadequate illumination conditions show that our method yields better visual quality and better OCR performance than three global binarization methods and four locally adaptive binarization methods.

14.
This paper presents a novel local threshold algorithm for the binarization of document images. The stroke width of handwritten and printed characters in documents is utilized as the shape feature. As a result, in addition to the intensity analysis, the proposed algorithm introduces the stroke width as shape information into local thresholding. Experimental results for both synthetic and practical document images show that the proposed local threshold algorithm is superior in terms of segmentation quality to the threshold approaches that solely use intensity information.

15.
16.
Binary image representation is an essential format for document analysis. In general, different binarization techniques are implemented for different types of binarization problems. The majority of binarization techniques are complex and are compounded from filters and existing operations. However, the few simple thresholding methods available cannot be applied to many binarization problems. In this paper, we propose a local binarization method based on a simple, novel thresholding method with dynamic and flexible windows. The proposed method is tested on selected samples from the DIBCO 2009 benchmark dataset using specialized evaluation techniques for binarization processes. To evaluate the performance of our proposed method, we compared it with the Niblack, Sauvola and NICK methods. The results of the experiments show that the proposed method adapts well to all types of binarization challenges, can deal with higher numbers of binarization problems and boosts the overall performance of the binarization.

17.
A new thresholding method, called the noise attribute thresholding method (NAT), for document image binarization is presented in this paper. This method utilizes the noise attribute features extracted from the images to make the selection of threshold values for image thresholding. These features are based on the properties of noise in the images and are independent of the strength of the signals (objects and background) in the image. A simple noise model is given to explain these noise properties. The NAT method has been applied to the problem of removing text and figures printed on the back of the paper. Conventional global thresholding methods cannot solve this kind of problem satisfactorily. Experimental results show that the NAT method is very effective. Received July 05, 1999 / Revised July 07, 2000

18.
Multimedia Tools and Applications - Word spotting in handwritten document images is a field of immense interest due to its widespread applications. Recognition-free and recognition-based approaches...

19.
20.