Similar literature
 20 similar articles found (search time: 31 ms)
1.
Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex background, make the problem of automatic text extraction extremely challenging. While comprehensive surveys of related problems such as face detection, document analysis, and image and video indexing can be found, the problem of text information extraction is not well surveyed. A large number of techniques have been proposed to address this problem, and the purpose of this paper is to classify and review these algorithms, discuss benchmark data and performance evaluation, and point out promising directions for future research.

2.
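An improved skew detection method for Chinese document images   Cited by: 4 (self-citations: 0, citations by others: 4)
Sun Nan (孙楠), Liu Zhiwen (刘志文). 《计算机仿真》 (Computer Simulation), 2006, 23(9): 184-187
When an image acquisition device converts a paper document into a document image, the image is often skewed to some degree, which can cause subsequent layout understanding and OCR algorithms to fail. This paper proposes a nearest-neighbor-based method for detecting the skew angle of Chinese document images and uses least squares to reduce the error of the skew estimate, which greatly improves computation speed and makes the algorithm more robust; compared with existing methods, it is faster and detects skew more accurately. The algorithm was implemented in Visual C++, and tests on 100 skewed Chinese document images from a test set show that the method is accurate and adaptable.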
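
The least-squares step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some earlier stage (for example the nearest-neighbor grouping) has already produced the centroids of the characters on one text line, and the names `estimate_skew_degrees` and `centroids` are hypothetical.

```python
import math

def estimate_skew_degrees(points):
    """Fit y = a*x + b by ordinary least squares; return atan(a) in degrees."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squares and cross products around the means
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; slope is undefined")
    slope = sxy / sxx
    return math.degrees(math.atan(slope))

# Hypothetical character centroids on a single text line tilted by 5 degrees
true_angle = math.radians(5.0)
centroids = [(x, x * math.tan(true_angle) + 10.0) for x in range(0, 200, 20)]
skew = estimate_skew_degrees(centroids)
print(round(skew, 3))
```

Fitting one line through many centroids averages out the position error of individual characters, which is one plausible reading of how least squares reduces the estimation error.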

3.
Algorithms are proposed for the detection, localization (i.e., position determination in the observed scene), and recognition of optoelectronic images of a group of plants with a contrasting image background intended for use in inertial sighting systems for the navigation and guidance of aircraft [1, 2]. The structure and parameters of the developed algorithms are presented. The parameters of the quality of the processes of detection, localization, and recognition of halftone electro-optical images of a group of ground objects are given.

4.
As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from cluttered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval. First, we propose a novel multiscale approach to jointly detecting and segmenting signatures from document images. Rather than focusing on local features that typically have large variations, our approach captures the structural saliency using a signature production model and computes the dynamic curvature of 2D contour fragments over multiple scales. This detection framework is general and computationally tractable. Second, we treat the problem of signature retrieval in the unconstrained setting of translation, scale, and rotation invariant nonrigid shape matching. We propose two novel measures of shape dissimilarity based on anisotropic scaling and registration residual error and present a supervised learning framework for combining complementary shape information from different dissimilarity metrics using LDA. We quantitatively study state-of-the-art shape representations, shape matching algorithms, measures of dissimilarity, and the use of multiple instances as query in document image retrieval. We further demonstrate our matching techniques in offline signature verification. Extensive experiments using large real-world collections of English and Arabic machine-printed and handwritten documents demonstrate the excellent performance of our approaches.

5.
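Li Yang (李样), Wang Jianguo (王建国). 《计算机科学》 (Computer Science), 2009, 36(7): 284-287
Face recognition is an important application area of pattern recognition. Building on an understanding of the face recognition algorithms in wide use today, this paper proposes an optimized algorithm, based on the fusion of multiple methods, for detecting and locating the pure face region in color images. The algorithm first narrows the face search region in the color image using skin-color detection and preprocessing. It then applies a method based on object region orientation to detect faces at arbitrary in-plane rotation angles, and computes the rotation angle a second time to determine the rotation of the face region accurately. Finally, it uses integral projection functions to find the positions of both eyes in the face candidate region and, combined with the proportional relationships among facial features, accurately determines the position of the pure face. The algorithm also handles faces turned sideways. Experiments show that the method performs well on faces with arbitrary in-plane rotation and on sideways-turned faces in which both eyes are visible, and that it is robust to different lighting conditions.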

6.
Document images often suffer from different types of degradation that render document image binarization a challenging task. This paper presents a document image binarization technique that accurately segments the text from badly degraded document images. The proposed technique is based on the observations that text documents usually have a document background of uniform color and texture and that the document text within it has a different intensity level from the surrounding document background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated by using the estimated document background surface. The text stroke edge is further detected from the compensated document image by using the L1-norm image gradient. Finally, the document text is segmented by a local threshold that is estimated based on the detected text stroke edges. The proposed technique was submitted to the recent document image binarization contest (DIBCO) held under the framework of ICDAR 2009 and achieved the top performance among 43 algorithms submitted by 35 international research groups.

7.
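Object detection in aerial remote sensing images aims to locate and recognize objects of interest. It is a key technology for the intelligent interpretation of aerial remote sensing images and has important applications in intelligence reconnaissance, disaster rescue, resource exploration, and other fields. However, aerial remote sensing images are large, their objects are small, densely packed, arbitrarily oriented, often occluded, and unevenly distributed across classes, and their backgrounds are complex, so object detection in such images remains highly challenging. Methods based on deep convolutional neural networks have attracted growing attention for their high accuracy and fast processing. To advance deep-learning-based object detection for aerial remote sensing images, this paper systematically reviews and summarizes current mainstream detection methods, particularly those proposed in 2020-2022. It first traces the evolution of deep-learning-based object detection methods, then analyzes and summarizes representative algorithms among the methods based on convolutional neural networks and on Transformers. Next, it summarizes the improvement strategies proposed for different remote sensing application scenarios, analyzes the ideas and characteristics of typical algorithms, introduces the existing public datasets for object detection in aerial remote sensing images, and presents experimental comparisons of typical algorithms. Finally, it identifies the open problems in current research and looks ahead to future research and development trends.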

8.
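To address the shortcomings of current watermarking algorithms for image tamper localization and recovery in localization precision and recovery performance, a precise three-watermark algorithm for image tamper localization and recovery is proposed. Building on the least significant bit (LSB) method, the algorithm uses binary coding to generate three watermarks (a detection watermark, a localization watermark, and a recovery watermark) and embeds them in the low-order bits of the image. Tamper detection and tamper recovery use the block-based detection and recovery watermarks, while precise tamper localization uses the single-pixel localization watermark. Simulation experiments show that the algorithm localizes tampering with high precision in luminance and RGB images of arbitrary size and recovers tampered content well.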
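
As background for the LSB method that the abstract above builds on, here is a minimal sketch of plain LSB embedding and per-pixel mismatch checking. It is not the three-watermark scheme itself: the functions `embed_lsb`, `extract_lsb`, and `tampered_positions`, and the sample pixel values, are hypothetical, and the watermark bits are assumed to be known to the verifier.

```python
def embed_lsb(pixels, bits):
    """Write one watermark bit into the least significant bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the pixel sequence")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | bit  # clear the LSB, then set it
    return marked

def extract_lsb(pixels, count):
    """Read back the first `count` embedded bits."""
    return [p & 1 for p in pixels[:count]]

def tampered_positions(pixels, bits):
    """Indices where the stored LSB no longer matches the expected bit."""
    return [i for i, (p, b) in enumerate(zip(pixels, bits)) if (p & 1) != b]

# Hypothetical 8-bit gray values and watermark bits
pixels = [200, 201, 37, 38, 120, 121, 255, 0]
bits = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(pixels, bits)
recovered = extract_lsb(marked, len(bits))

attacked = list(marked)
attacked[3] = 38  # simulate a modification of one pixel
suspects = tampered_positions(attacked, bits)
print(marked, recovered, suspects)
```

Embedding changes each pixel by at most one gray level, which is why LSB watermarks are visually unobtrusive. A single bit per pixel misses any modification that happens to preserve the LSB, which suggests why a practical scheme would combine several watermarks.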

9.
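Objective: Digital video region tampering means that a key region of the video frames is covered or replaced; after image editing and inpainting, the traces of the modification are hard to see with the naked eye. The key regions of video frames carry the key semantic information of the video sequence, so if the tampering is a malicious forgery it can have very serious consequences. Research on detecting and localizing video region tampering therefore has significant research value and application prospects. Methods: Copy-paste tamper detection for digital images has made considerable progress and produced many results, but copy-paste forensic algorithms for digital images cannot be applied directly to the detection and localization of video region tampering. This is an emerging direction within passive digital video forensics, and a growing number of researchers have taken it up in recent years. At present, the field still lacks solid theoretical support and general-purpose detection and localization algorithms. Based on a broad survey of recent results, this paper introduces the concept and importance of passive forensics for video region tampering and divides the existing algorithms into four categories: algorithms based on noise patterns, on pixel correlation, on video content features, and on abstract statistical features. It then compares and analyzes these detection and localization algorithms and introduces existing video region tampering software and algorithms as well as test databases for tamper detection algorithms. Finally, it summarizes the open problems and challenges in the field and looks ahead to future research trends. Results: Eighteen algorithms from 20 papers were selected; the principle of each is described and the algorithms are compared. Most of the algorithms claim to detect and localize suspicious tampered regions, but they differ in detection and localization accuracy and in computational complexity. Algorithms based on spatiotemporal pixel correlation analysis detect and localize well and support the detection and localization of moving-object removal in videos with moving backgrounds. The algorithm based on optical-flow smoothness anomalies and the algorithm based on moving-object detection were both compared on public video tampering test databases, and both detect and localize well. The ensemble classification algorithm based on steganalysis feature extraction can localize tampering only in the temporal domain, not at the finer spatial level, but it offers a new idea and a possible direction for large-scale, machine-learning-based video tampering forensics. Conclusion: Because video coding and compression introduce noise, and because video region tampering tools and techniques keep improving, the detection and localization of video region tampering remains a highly challenging topic. In the coming years, algorithms based on video content features and abstract statistical features may be combined with deep learning and developed further, and the related theoretical algorithms, system models, and evaluation standards will gradually mature.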

10.
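A new skew detection and correction algorithm for text images   Cited by: 3 (self-citations: 0, citations by others: 3)
A document may become skewed during scanning, and many character recognition and layout analysis algorithms are highly sensitive to skew, so skew detection and correction of text images is an indispensable step in document analysis. A new method for correcting skewed text images is proposed. It first obtains the bounding box of the document image, takes the minimum bounding box area as the goal of skew correction, and uses a genetic algorithm to search for that minimum. Experimental results show that the algorithm detects the skew angle with high accuracy.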
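
The bounding-box criterion in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the published method: it replaces the genetic algorithm with a plain grid search over candidate angles, and the helper names (`rotate`, `bbox_area`, `best_correction_angle`) and the test rectangle are hypothetical.

```python
import math

def rotate(points, angle_deg):
    """Rotate 2D points counter-clockwise about the origin."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def bbox_area(points, angle_deg):
    """Area of the axis-aligned bounding box after rotating by angle_deg."""
    rotated = rotate(points, angle_deg)
    xs = [x for x, _ in rotated]
    ys = [y for _, y in rotated]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def best_correction_angle(points, lo=-10.0, hi=10.0, step=0.5):
    """Grid search (a stand-in for the genetic algorithm) for the rotation
    that minimizes the bounding-box area."""
    steps = int(round((hi - lo) / step))
    candidates = [lo + k * step for k in range(steps + 1)]
    return min(candidates, key=lambda a: bbox_area(points, a))

# Corners of a hypothetical 100 x 40 text block, skewed by +3 degrees
corners = [(0, 0), (100, 0), (100, 40), (0, 40)]
skewed = rotate(corners, 3.0)

correction = best_correction_angle(skewed)
print(correction, round(bbox_area(skewed, correction), 3))
```

A grid search costs one area evaluation per candidate angle; a genetic algorithm, as in the abstract, is one way to explore the same objective with fewer evaluations at fine angular resolution.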

11.
In this paper, we propose a new algorithm for the binarization of degraded document images. We map the image into a 2D feature space in which the text and background pixels are separable, and then we partition this feature space into small regions. These regions are labeled as text or background using the result of a basic binarization algorithm applied to the original image. Finally, each pixel of the image is classified as either text or background based on the label of its corresponding region in the feature space. Our algorithm splits the feature space into text and background regions without using any training dataset. In addition, this algorithm does not need any parameter setting by the user and is appropriate for various types of degraded document images. The proposed algorithm outperformed six well-known algorithms on three datasets.

12.
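To address the effect of complex background information on salient object detection in natural images, a saliency detection method is proposed that uses background information for prediction and Bayesian model selection for optimization. First, to extract complete prior information, a prior saliency map is generated from the connectivity between background information and the image boundary and from an assessment of whether the image boundary is background. Second, to reduce interference from background information, corner detection is applied to the saliency map generated by the manifold ranking algorithm, and the more accurate salient points are selected to construct a convex hull. Finally, Bayesian model selection is used to suppress background information that shares features with the salient object. Tests on two public datasets and a comparison with four well-performing saliency detection algorithms show that the proposed algorithm improves the accuracy of saliency detection and the completeness of the detected regions.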

13.
The current trend in object detection and localization is to learn predictions with high capacity deep neural networks trained on a very large amount of annotated data and using a high amount of processing power. In this work, we particularly target the detection of text in document images and we propose a new neural model which directly predicts object coordinates. The particularity of our contribution lies in the local computations of predictions with a new form of local parameter sharing which keeps the overall amount of trainable parameters low. Key components of the model are spatial 2D-LSTM recurrent layers which convey contextual information between the regions of the image. We show that this model is more powerful than the state of the art in applications where training data are not as abundant as in the classical configuration of natural images and Imagenet/Pascal-VOC tasks. The proposed model also facilitates the detection of many objects in a single image and can deal with inputs of variable sizes without resizing. To enhance the localization precision of the coordinate regressor, we limit the amount of information produced by the local model components and propose two different regression strategies: (i) separately predict lower-left and upper-right corners of each object bounding box, followed by combinatorial pairing; (ii) only predict the left side of the objects and estimate the right position jointly with text recognition. These strategies lead to good full-page text recognition results in heterogeneous documents. Experiments have been performed on a document analysis task, the localization of the text lines in the Maurdor dataset.

14.
Document digitization frequently produces images with small rotation angles. The skew angles in document images degrade the performance of optical character recognition (OCR) tools. Therefore, skew detection of document images plays an important role in automatic document analysis systems. In this paper, we propose a Rectangular Active Contour Model (RAC Model) for content region detection and skew angle calculation by imposing a rectangular shape constraint on the zero-level set in the Chan–Vese Model (C-V Model) according to the rectangular feature of content regions in document images. Our algorithm differs from other skew detection methods in that it does not rely on local image features. Instead, it uses global image features and a shape constraint to achieve strong robustness in detecting skew angles of document images. We experimented on different types of document images. Compared with other skew detection algorithms, our algorithm is more accurate in detecting the skew of complex document images with different fonts, tables, illustrations, and layouts. We do not need to pre-process the original image, even if it is noisy, and the rectangular content region of a document image is detected at the same time.

15.
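Generative adversarial networks (GANs) have developed rapidly and have been applied successfully in image generation, image editing, and other areas. However, if these techniques are used to forge identities or produce fake news, they pose serious security risks. Researchers in multimedia forensics have proposed a variety of passive forensic and anti-forensic methods for GAN-generated images, but a systematic survey is still lacking. To fill this gap, this paper first describes the research background and significance of the field and then analyzes the differences between natural image acquisition and GAN image generation. On this theoretical basis, it describes in detail the existing passive forensic techniques for GAN-generated images, including GAN-generated image detection algorithms, GAN model attribution algorithms, and other related forensic problems. It also introduces GAN-based anti-forensic techniques for different application scenarios. Finally, it analyzes through experiments the challenges facing current passive forensics of GAN-generated images. From its theoretical and experimental analysis of existing techniques, the paper concludes that passive forensic techniques for GAN-generated images have formed distinct technical routes in the spatial and frequency domains and handle the forensic problems of simple scenarios reasonably well, and that GAN-based anti-forensic techniques can already hide common forensic traces effectively. However, research in the field still has many limitations: 1) forensic and anti-forensic techniques are insufficiently interpretable; 2) forensic techniques are weak in robustness and generalization; 3) anti-forensic techniques lack resistance to analysis coordinated across multiple feature domains. These problems and challenges call for further in-depth study.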

16.
Change detection in synthetic aperture radar (SAR) images can be formulated as a matrix factorisation model that detects changes based on the foreground information in the image. However, such methods cannot obtain satisfactory results in the change detection of SAR images because reliable background data are often not available. In this article, we propose a matrix factorisation model based on a naïve Bayes classifier to explore the low-rank and sparse information and then detect the changes in SAR images. The factorisation model of the low-rank and sparse matrix extracts both background and foreground information from images: the background information is recovered from the low-rank matrix and the foreground information from the sparse matrix. Then, by computing the mean and variance matrices of the unchanged and changed region information, we obtain the statistical features. The statistical features are then used to build a naïve Bayes classifier, which is used to distinguish the change detection results, all based on the acquired data distribution. Experiments on four real data sets indicate that the approach performs better than some other state-of-the-art algorithms.

17.
Topics often transition among the documents in a document collection. To improve the accuracy of topic detection and tracking (TDT) algorithms in discovering topics or classifying documents, it is necessary to make full use of this kind of topic transition information. However, TDT algorithms usually find topics based on topic models, such as LDA, pLSI, etc., which are a kind of mixture model and make topic transitions difficult to represent and implement. A topic transition model representation based on the hidden Markov model is presented, and learning the topic transitions from documents is discussed. Based on the model, two TDT algorithms incorporating topic transition, one for topic discovery and one for document classification, are provided to show the application of the proposed model. Experiments on two real-world document collections are done with the two algorithms, and a performance comparison with other similar algorithms shows that the accuracy can reach 93% for topic discovery in Reuters-21578 and 97.3% in document classification. Furthermore, the topic transitions discovered by the algorithm on a dataset collected from a BBS website are consistent with the manual analysis results.

18.
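The key to salient object detection is to highlight the foreground region accurately, and most traditional methods perform poorly on images with complex backgrounds. To address this problem, a salient object detection method based on foreground enhancement and background suppression is proposed. First, simple linear iterative clustering (SLIC) segments the image into multiple superpixel regions; inter-region contrast and boundary information are used to obtain the salient regions and the background seeds of the image, respectively, and two saliency maps are computed, one based on inter-region contrast and one based on the background. Then, Seam Carving and graph-based image segmentation are applied to the two maps to separate salient from non-salient regions, yielding the foreground-enhancement and background-suppression templates. Finally, the two saliency maps and the templates are fused to obtain the final saliency map. Validation on the public MSRA 1000 dataset shows that the proposed algorithm achieves better precision and recall than seven mainstream algorithms.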

19.
Projection methods have been used in the analysis of bitonal document images for different tasks such as page segmentation and skew correction for more than two decades. However, these algorithms are sensitive to the presence of border noise in document images. Border noise can appear along the page border due to scanning or photocopying. Over the years, several page segmentation algorithms have been proposed in the literature. Some of these algorithms have come into widespread use due to their high accuracy and robustness with respect to border noise. This paper addresses two important questions in this context: 1) Can existing border noise removal algorithms clean up document images to a degree required by projection methods to achieve competitive performance? 2) Can projection methods reach the performance of other state-of-the-art page segmentation algorithms (e.g., Docstrum or Voronoi) for documents where border noise has successfully been removed? We perform extensive experiments on the University of Washington (UW-III) data set with six border noise removal methods. Our results show that although projection methods can achieve the accuracy of other state-of-the-art algorithms on the cleaned document images, existing border noise removal techniques cannot clean up documents captured under a variety of scanning conditions to the degree required to achieve that accuracy.
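
To make the sensitivity to border noise described above concrete, here is a minimal sketch of a horizontal projection profile on a tiny bitonal image. It is a generic illustration, not any of the algorithms evaluated in the paper; the functions `horizontal_profile` and `text_line_spans` and the 6 x 6 sample page are hypothetical.

```python
def horizontal_profile(image):
    """Number of ink pixels (value 1) in each row of a bitonal image."""
    return [sum(row) for row in image]

def text_line_spans(image, min_ink=1):
    """Return (first_row, last_row) for each run of rows containing ink."""
    spans = []
    start = None
    for y, ink in enumerate(horizontal_profile(image)):
        if ink >= min_ink and start is None:
            start = y                    # a text line begins
        elif ink < min_ink and start is not None:
            spans.append((start, y - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        spans.append((start, len(image) - 1))
    return spans

# Hypothetical page: two text lines separated by a blank row
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
clean_spans = text_line_spans(page)

# Simulated border noise: a dark strip down the left edge
noisy = [[1] + row[1:] for row in page]
noisy_spans = text_line_spans(noisy)
print(clean_spans, noisy_spans)
```

On the clean page the blank rows split the profile into two lines; with the dark strip every row contains ink, the gaps vanish, and the whole page collapses into one span.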

20.
《Real》2001,7(4):367-380
The aim of this article is to outline the issues involved in the application of machine vision to the automatic extraction and registration of watermarks from continuous web paper. The correct identification and localization of watermarks are key issues in paper manufacturing. As well as requiring the position of the watermark for defect detection and classification, it is necessary to ensure its position on the paper prior to the cutting process. Two paper types are discussed, with and without laid and chain lines (these lines appear as a complex periodic background to the watermark and further complicate the segmentation process). We will examine both morphological and Fourier approaches to the watermark segmentation process, concentrating specifically on those images with complex backgrounds. Finally, we detail a system design suitable for real-time implementation.
