Related Articles
 20 similar articles found (search time: 15 ms)
1.
2.
Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by non-planar document shape and perspective projection, which cause current OCR technologies to fail. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates 3D document shape from texture flow information obtained directly from the image, without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many camera-based document analysis applications, especially mobile ones. Experiments show that our method produces results that are significantly more OCR-compatible than the original images.

3.
Geometric rectification of visual document images   Total citations: 2 (self: 0, other: 2)
田学东, 马兴杰, 韩磊, 刘海博. Journal of Computer Applications (《计算机应用》), 2007, 27(12): 3045-3047
When documents are photographed with digital cameras or similar devices, the captured images often exhibit various geometric distortions. These distortions can cause the layout-analysis and segmentation algorithms in recognition software to fail, leaving the document image unrecognizable. For ordinary recognition software to handle camera-captured document images, geometric rectification is therefore necessary. This paper classifies geometric distortions by their causes and proposes a corresponding rectification algorithm for each class of distortion. Experimental results show that both the classification scheme and the corresponding rectification algorithms perform well.

4.
5.
International Journal on Document Analysis and Recognition (IJDAR) - Nowadays, with the development of electronic devices, more and more attention has been paid to camera-based text processing....

6.
International Journal on Document Analysis and Recognition (IJDAR) - Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern...

7.
Several methods for segmentation of document images (maps, drawings, etc.) are explored. The segmentation operation is posed as a statistical classification task with two pattern classes: print and background. A number of classification strategies are available. All require some prior information about the distribution of gray levels for the two classes. Training (either supervised or unsupervised) is employed to form these initial density estimates. Automatic updating of the class-conditional densities is performed within subregions of the image to adapt these global density estimates to the local image area. After local class-conditional densities have been obtained, each pixel is classified within the window using several techniques: a noncontextual Bayes classifier, Besag's classifier, relaxation, Owen and Switzer's classifier, and Haslett's classifier. Four test images were processed. In two of these, the relaxation method performed best, and in the other two, the noncontextual method performed best. Automatic updating improved the results for both classifiers.
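As a toy illustration of the noncontextual Bayes pixel classifier summarized above (a minimal sketch only: the Gaussian class-conditional densities, their parameters, and the prior here are illustrative stand-ins, and the paper's local, window-based density updating is not reproduced):

```python
import numpy as np

def bayes_segment(img, mu_print, sigma_print, mu_bg, sigma_bg, prior_print=0.1):
    """Assign each pixel to the class (print vs. background) with the
    higher posterior under Gaussian class-conditional gray-level
    densities. Returns a boolean mask (True = print)."""
    def loglik(x, mu, sigma):
        # log of a Gaussian density, dropping the constant term
        return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)

    x = img.astype(float)
    score_print = loglik(x, mu_print, sigma_print) + np.log(prior_print)
    score_bg = loglik(x, mu_bg, sigma_bg) + np.log(1.0 - prior_print)
    return score_print > score_bg
```

In the paper these densities are re-estimated per subregion; here they are fixed global parameters purely for illustration.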

8.
Many preprocessing techniques intended to normalize artifacts and clean noise induce anomalies, in part due to the discretized nature of the document image and in part due to inherent ambiguity in the input image relative to the desired transformation. The potentially deleterious effects of common preprocessing methods are illustrated through a series of dramatic albeit contrived examples and then shown to affect real applications of ongoing interest to the community through three writer identification experiments conducted on Arabic handwriting. Retaining ruling lines detected by multi-line linear regression instead of repairing strokes broken by deleting ruling lines reduced the error rate by 4.5 %. Exploiting word position relative to detected rulings instead of ignoring it decreased errors by 5.5 %. Counteracting page skew by rotating extracted contours during feature extraction instead of rectifying the page image reduced the error by 1.4 %. All of these accuracy gains are shown to be statistically significant. Analogous methods are advocated for other document processing tasks as topics for future research.

9.
In this paper, we propose a metric rectification method to restore an image from a single camera-captured document image. The core idea is to construct an isometric image mesh by exploiting the geometry of the page surface and camera. Our method uses a general cylindrical surface (GCS) to model the curved page shape. Under a few proper assumptions, the printed horizontal text lines are shown to be line-convergent symmetric. This property is then used to constrain the estimation of various model parameters under perspective projection. We also introduce a paraperspective projection to approximate the nonlinear perspective projection. A set of closed-form formulas is thus derived for estimating the GCS directrix and the document aspect ratio. Our method provides a straightforward framework for image metric rectification. It is insensitive to camera positions, viewing angles, and the shapes of document pages. To evaluate the proposed method, we conducted comprehensive experiments on both synthetic and real-captured images. The results demonstrate the efficiency of our method. We also carried out a comparative experiment on the public CBDAR2007 data set. The experimental results show that our method outperforms the state-of-the-art methods in terms of OCR accuracy and rectification errors.

10.
Marginal noise is a common phenomenon in document analysis that results from scanning thick or skewed documents. It usually appears as a large, dark region around the margin of the document image. Marginal noise may cover meaningful document objects, such as text, graphics, and forms, and its overlap with meaningful objects makes segmentation and recognition of document objects difficult. This paper proposes a novel approach to removing marginal noise, consisting of two steps: marginal noise detection and marginal noise deletion. Detection first reduces the original document image to a smaller image and then locates marginal noise regions according to the shape, length, and location of the split blocks. After the marginal noise regions are detected, different removal methods are applied: a local thresholding method is proposed for removing marginal noise from gray-scale document images, whereas a region-growing method is devised for binary document images. Experiments on a wide variety of test samples demonstrate the feasibility and effectiveness of the proposed approach in removing marginal noise.
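A crude sketch of the detection idea (illustrative only: it scans columns inward from the page edges for near-solid dark bands, rather than reproducing the paper's reduced-image and split-block analysis):

```python
import numpy as np

def detect_margin_noise_cols(binary, dark_ratio=0.9):
    """Marginal noise tends to appear as columns near the page edge
    that are almost entirely dark. Scan inward from each side and skip
    columns whose dark-pixel ratio exceeds dark_ratio.
    binary: 2-D bool array, True = dark pixel.
    Returns (left, right): content assumed in columns [left, right)."""
    h, w = binary.shape
    col_ratio = binary.mean(axis=0)  # fraction of dark pixels per column
    left = 0
    while left < w and col_ratio[left] >= dark_ratio:
        left += 1
    right = w
    while right > left and col_ratio[right - 1] >= dark_ratio:
        right -= 1
    return left, right
```

The threshold `dark_ratio` is an assumed parameter, not one taken from the paper.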

11.
Tuan D. Pham. Pattern Recognition, 2003, 36(12): 3023-3025
A fast and effective algorithm is developed for detecting logos in grayscale document images. The computational schemes involve segmentation, and the calculation of the spatial density of the defined foreground pixels. The detection does not require training and is unconstrained in the sense that the presence of a logo in a document image can be detected under scaling, rotation, translation, and noise. Several tests on different electronic document forms such as letters, faxes, and billing statements are carried out to illustrate the performance of the method.
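The spatial-density cue can be sketched as follows (a hedged illustration, not the paper's algorithm: it simply slides a block over a binary foreground mask and reports the densest block, on the intuition that logos are denser than surrounding text; the block size and stride are assumptions):

```python
import numpy as np

def densest_block(fg_mask, block=32):
    """Slide a block-by-block window (half-block stride) over a binary
    foreground mask and return the top-left corner and density of the
    block with the highest foreground-pixel density."""
    h, w = fg_mask.shape
    best, best_pos = -1.0, (0, 0)
    for r in range(0, h - block + 1, block // 2):
        for c in range(0, w - block + 1, block // 2):
            d = fg_mask[r:r + block, c:c + block].mean()
            if d > best:
                best, best_pos = d, (r, c)
    return best_pos, best
```

A real detector would follow this with shape checks on the candidate region; this fragment shows only the density scan.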

12.
13.
Document images photographed with smartphones are often geometrically distorted, which hampers text recognition and subsequent image processing, and existing correction methods for distorted document images either handle only a single type of distortion or produce unsatisfactory results. To address these problems, a correction method for distorted document images based on reprojection-error minimization is proposed. The method first detects text-region contours and merges them to obtain text-line connected components. It then uses principal component analysis (PCA) to...

14.
Evaluation of binarization methods for document images   Total citations: 19 (self: 0, other: 19)
This paper presents an evaluation of eleven locally adaptive binarization methods for gray-scale images with low contrast, variable background intensity, and noise. Niblack's method (1986), augmented with the postprocessing step of Yanowitz and Bruckstein's method (1989), performed best and was also one of the fastest binarization methods.
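Niblack's rule sets a per-pixel threshold from local statistics, T = m + k·s (local mean plus k times local standard deviation). A minimal sketch follows; the window size and k are illustrative defaults, not the values used in the evaluation, and the Yanowitz–Bruckstein postprocessing step is not included:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(img, window=15, k=-0.2):
    """Local adaptive binarization in the spirit of Niblack (1986).
    img: 2-D gray-level array. Returns a bool mask, True = print
    (pixels darker than the local threshold mean + k * std)."""
    x = img.astype(float)
    mean = uniform_filter(x, size=window)
    sq_mean = uniform_filter(x ** 2, size=window)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    thresh = mean + k * std
    return x < thresh
```

With a negative k, the threshold drops below the local mean, which suppresses noise in smooth background regions.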

15.
16.
Correcting for variable skew in document images   Total citations: 1 (self: 0, other: 1)
The proliferation of inexpensive sheet-feed scanners, particularly in fax machines, has led to a need to correct for uneven paper feed rates during digitization if the images produced by these scanners are to be further analyzed. We develop a technique for detecting and compensating for this type of image distortion. The technique relies on detecting multiple prominent skew angles in the document image along with their vertical positions on the page, rotating the image by each of those angles, and sampling the rotated images to allow reconstruction of the entire page image. Received: 28 November 2002; Accepted: 16 April 2003; Published online: 12 September 2003. Correspondence to: A. Lawrence Spitz
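A common way to detect one prominent skew angle (a sketch of the general projection-profile idea, not Spitz's specific detector) is to score candidate rotations by the sharpness of the horizontal projection profile, which peaks when text lines are level; the angle range and step here are assumptions:

```python
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(fg, angles=np.arange(-5, 5.25, 0.25)):
    """Return the candidate rotation (degrees) that maximizes the
    variance of the horizontal projection profile, i.e. the rotation
    that best levels the text lines. fg: 2-D array, text pixels > 0."""
    best_angle, best_score = 0.0, -1.0
    for a in angles:
        rot = rotate(fg.astype(float), a, reshape=False, order=1)
        profile = rot.sum(axis=1)       # row sums: sharp when lines are level
        score = profile.var()
        if score > best_score:
            best_angle, best_score = float(a), score
    return best_angle
```

The paper's method goes further, detecting multiple skew angles with their vertical positions; this fragment recovers only a single global angle.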

17.
The problem addressed in this paper is the automatic extraction of names from a document image. Our approach relies on the combination of two complementary analyses. First, the image-based analysis exploits visual clues to select the regions of interest in the document. Second, the text-based analysis searches for name patterns and low-level textual word features. The two analyses are then combined at the word level through a neural-network fusion scheme. Results reported on degraded documents, such as facsimiles and photocopied technical journals, demonstrate the benefit of the combined approach.

18.
The paper presents a clutter detection and removal algorithm for complex document images. This distance-transform-based technique aims to remove irregular and independent unwanted clutter while preserving the text content. The novelty of this approach lies in its approximation of the clutter–content boundary when the clutter is attached to the content in irregular ways. As an intermediate step, a residual image is created, which forms the basis for clutter detection and removal. Clutter detection and removal are independent of the clutter's position, size, shape, and connectivity with the text. The method is tested on a collection of highly degraded and noisy, machine-printed and handwritten Arabic and English documents, and results show pixel-level accuracies of 99.18 % and 98.67 % for clutter detection and removal, respectively. The approach is also extended to documents containing a mix of clutter and salt-and-pepper noise.
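The distance-transform intuition can be sketched briefly (a hedged illustration, not the paper's residual-image construction): text strokes are thin, so every stroke pixel stays close to the background, while thick clutter blobs contain pixels far from any background pixel. Thresholding the distance transform therefore yields clutter seeds; the stroke-width value is an assumed parameter:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def clutter_seeds(fg, stroke_width=3):
    """Flag foreground pixels whose Euclidean distance to the nearest
    background pixel exceeds the expected stroke half-width; such
    pixels can only occur inside thick (clutter-like) blobs.
    fg: 2-D bool array, True = foreground."""
    dist = distance_transform_edt(fg)
    return dist > stroke_width
```

A full pipeline would grow these seeds back out to the clutter boundary before deletion; only the seeding step is shown.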

19.
As the sharing of documents through the World Wide Web has been steadily increasing, the need to convert them into hyperdocuments, in formats such as HTML and SGML/XML, so that they are accessible and retrievable via the Internet has also been rising rapidly. Nevertheless, little work has been done on the conversion of paper documents into hyperdocuments, and most of these studies have concentrated on the direct conversion of single-column document images containing only text and image objects. In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table-of-contents page based on logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that, using the proposed methods, the corresponding HTML documents can be generated with the same visual layout as the document images, and a structured table-of-contents page can also be produced, with the hierarchically ordered section titles hyperlinked to the contents.

20.
A morphology-based radial distortion correction algorithm for document images   Total citations: 1 (self: 0, other: 1)
常骏, 苗立刚. Journal of Computer Applications (《计算机应用》), 2010, 30(4): 950-952
Document images captured with handheld cameras exhibit varying degrees of lens distortion. Based on the text-line information in document images, this paper proposes a lens-distortion correction algorithm built on mathematical morphology. First, the document image is segmented with an adaptive thresholding method, and the connected components are clustered into text lines by a morphological closing operation. Then a quadratic polynomial model is fitted to the centerline of each text line, and an objective function for radial distortion correction is constructed. This objective function maps the curves corresponding to the centerlines to straight lines, thereby recovering the lens-distortion parameters of the document image. Experimental results show that the algorithm effectively corrects radial distortion of varying severity in document images.
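The centerline-fitting step can be sketched as follows (an illustrative fragment only: it shows the quadratic fit to one text line's center points, not the morphological clustering or the distortion-parameter optimization). After correction, a well-rectified line should fit with a quadratic coefficient near zero:

```python
import numpy as np

def fit_centerline(xs, ys):
    """Fit a quadratic y = a*x^2 + b*x + c to the center points of one
    text line; |a| serves as a simple curvature score (a ~ 0 means the
    line is straight). Returns (a, b, c)."""
    a, b, c = np.polyfit(xs, ys, 2)
    return a, b, c
```

In the paper, such fitted curves feed the objective function that maps them back to straight lines to solve for the distortion parameters.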
