Similar Articles
20 similar articles found.
1.
Extraction of foreground content from complex-background document images is difficult because the background texture and color, and the foreground font, size, color, and tilt, are not known in advance. In this work, we propose an RGB color model for input complex color document images, together with an algorithm that detects text regions using Gabor filters and then extracts the text using the color feature luminance. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected from Gabor features; because of the complex background, some high-frequency non-text objects are also detected as text objects. In stage 2, many of these false text objects are dropped by connected-component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text: the color feature luminance is extracted from the input color document image, and the threshold value is derived automatically from it. The approach handles both printed and handwritten color document images with foreground text in any color, font, size, and orientation. For experimental evaluation, we considered a variety of document images with non-uniform/uniform textured and multicolored backgrounds. Segmentation of foreground text is evaluated with a commercially available OCR; the results show better recognition accuracy of foreground characters in processed document images than in unprocessed ones.
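The stage-3 binarization described above can be sketched in a few lines. The BT.601 luma weights and the mean-based automatic threshold below are illustrative assumptions, not the authors' exact derivation:

```python
# Minimal sketch of luminance-based binarization of a text segment.
# Assumptions: BT.601 luma as the "color feature luminance", and the
# global mean of that feature as the automatically derived threshold.

def luminance(rgb_pixel):
    """ITU-R BT.601 luma from an (R, G, B) triple."""
    r, g, b = rgb_pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize_by_luminance(image):
    """image: 2-D list of (R, G, B) tuples -> 2-D list of 0/1 (1 = foreground)."""
    lum = [[luminance(p) for p in row] for row in image]
    flat = [v for row in lum for v in row]
    threshold = sum(flat) / len(flat)  # automatic global threshold
    # dark pixels (below the threshold) are taken as foreground text
    return [[1 if v < threshold else 0 for v in row] for row in lum]
```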

2.
Numerous techniques have previously been proposed for single-stage thresholding of document images to separate written or printed information from the background. A new thresholding structure called the decompose algorithm is proposed and compared against existing single-stage algorithms. The decompose algorithm uses local feature vectors to find the best approach for thresholding a local area. Instead of employing a single thresholding algorithm, it automatically selects an appropriate algorithm for each type of subregion of the document. The original image is recursively broken down into subregions using quad-tree decomposition until a suitable thresholding method can be applied to each subregion. The algorithm has been trained on 300 historical images and evaluated on 300 'difficult' document images containing considerable background noise or variation in contrast and illumination. Quantitative analysis measuring text recall and qualitative assessment of processed document image quality are reported. The decompose algorithm is shown to be effective on historical images of varying quality.
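The recursive quad-tree decomposition can be sketched as follows; the contrast-range stopping rule is an illustrative stand-in for the paper's local feature vectors:

```python
# Sketch of quad-tree decomposition of a grayscale image into subregions.
# Assumption: a region is "suitable for a single thresholding method" when
# its intensity range is below contrast_limit (a stand-in criterion).

def quadtree_regions(gray, x0, y0, w, h, min_size=2, contrast_limit=100):
    """Recursively split the region [x0, x0+w) x [y0, y0+h) of a 2-D list
    until each subregion is low-contrast or too small; return (x, y, w, h)."""
    vals = [gray[y][x] for y in range(y0, y0 + h) for x in range(x0, x0 + w)]
    if (max(vals) - min(vals) <= contrast_limit) or w <= min_size or h <= min_size:
        return [(x0, y0, w, h)]
    hw, hh = w // 2, h // 2
    regions = []
    # four quadrants; odd sizes give the right/bottom quadrants the remainder
    for dx, dy, sw, sh in [(0, 0, hw, hh), (hw, 0, w - hw, hh),
                           (0, hh, hw, h - hh), (hw, hh, w - hw, h - hh)]:
        regions += quadtree_regions(gray, x0 + dx, y0 + dy, sw, sh,
                                    min_size, contrast_limit)
    return regions
```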

3.
To efficiently compress rasterized compound documents, an encoder must be content-adaptive. Content adaptivity may be achieved by a layered approach, in which a compound image is segmented into layers so that appropriate encoders can compress each layer individually. A major factor in using standard encoders efficiently is matching the layers' characteristics to those of the encoders by using data-filling techniques to fill in the initially sparse layers. In this work, we review data-filling methods and also propose a suboptimal non-linear projections scheme that efficiently matches the baseline JPEG coder in compressing background layers, leading to smaller files with better image quality.

4.
Side match and overlap match vector quantizers for images
A class of vector quantizers with memory, known in the image coding framework as finite state vector quantizers (FSVQs), is investigated. Two FSVQ designs, side match vector quantizers (SMVQs) and overlap match vector quantizers (OMVQs), are introduced. These designs exploit the 2-D spatial contiguity of pixel vectors as well as the high spatial correlation of pixels in typical gray-level images. SMVQ and OMVQ try to minimize the granular noise that causes visible pixel-block boundaries in ordinary VQ. For 512 × 512 gray-level images, SMVQ and OMVQ can achieve communication-quality reproduction at an average of 1/2 b/pixel per image frame, with acceptable reproduction quality. Because block boundaries are less visible, the perceived improvement in quality over ordinary VQ is even greater. Owing to the structure of SMVQ and OMVQ, simple variable-length noiseless codes can achieve as much as 60% bit-rate reduction over fixed-length noiseless codes.

5.
Binarization is an important step in text image recognition. Text images degraded by uneven illumination or water stains on the document are harder to recognize. This paper improves a local thresholding algorithm. First, the image is projected horizontally and the layout is roughly partitioned at the minima of the projection histogram. A global threshold method is then used to estimate a more accurate character stroke width for each region, from which an appropriate window size is derived adaptively. The image is binarized using a contrast map together with local thresholds, and OTSU thresholding is incorporated to remove the pseudo-contours produced by the original algorithm. Experiments and analysis show that the improved method clearly eliminates the foreground-pixel misclassification caused by uneven stroke widths and varying character sizes.
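The OTSU step mentioned above refers to Otsu's classic global threshold, which maximizes the between-class variance over the gray-level histogram; a minimal sketch:

```python
# Sketch of Otsu's global thresholding over a gray-level histogram.
# Returns the gray level t that maximizes between-class variance when the
# values are split into {v <= t} and {v > t}.

def otsu_threshold(gray_values, levels=256):
    """gray_values: iterable of ints in [0, levels); returns the Otsu threshold."""
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(hist) and sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_b = 0.0   # running sum of level * count for the background class
    w_b = 0       # running background pixel count
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                 # background mean
        m_f = (sum_all - sum_b) / w_f     # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```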

6.
Image-based rendering (IBR) systems create photorealistic views of complex 3D environments by resampling large collections of images captured in the environment. The quality of the resampled images increases significantly with image-capture density. A significant challenge in interactive IBR systems is therefore to provide both fast image access along arbitrary viewpoint paths and efficient storage of large image data sets. We describe a spatial image hierarchy combined with an image compression scheme that meets the requirements of interactive IBR walkthroughs. By using image warping and exploiting image coherence over the image-capture plane, we achieve compression performance similar to traditional motion-compensated schemes such as MPEG, yet allow image access along arbitrary paths. Furthermore, by exploiting graphics hardware for image resampling, we achieve interactive rates during IBR walkthroughs.

7.
This paper presents a modified JPEG coder applied to the compression of mixed documents (containing text, natural images, and graphics) for printing purposes. The modified coder takes advantage of the distinct perceptually significant regions in these documents to achieve higher perceptual quality than the standard JPEG coder. The region adaptivity is performed via classified thresholding while remaining fully compliant with the baseline standard. A computationally efficient classification algorithm is presented, and the improved performance of the classified JPEG coder is verified.

8.
Image segmentation is an important component of any document image analysis system. While many segmentation algorithms exist in the literature, very few (i) allow users to specify the physical style, and (ii) incorporate user-specified style information into the objective function that the algorithm minimizes. We describe a segmentation algorithm that models a document's physical structure as a hierarchy in which each node describes a region of the document using a stochastic regular grammar. The exact form of the hierarchy and the stochastic language is specified by the user, while the probabilities associated with the transitions are estimated from ground-truth data. We demonstrate the segmentation algorithm on images of bilingual dictionaries.

9.
Printers usually generate a limited number of colors and cannot produce continuous-tone color images. Traditional error-diffusion algorithms are used to address this problem; compared with other approaches, error diffusion generally produces halftoned images of better quality. However, smeared edges and textures may occur in these halftoned images. To produce higher-quality halftones, the artifacts due to unstable images, dot overlap, and error diffusion must be eliminated or reduced. In this paper, we show that unstable images can be eliminated or reduced by using a proper color-difference formula to select the reproduction colors, even when vector error diffusion is performed in the RGB domain. We also present a method that uses different filters to halftone different components of a color, which can yield clearer and sharper edges in halftoned color images. Unexpected colors may be generated by dot overlap in the printing process, and we present a method to eliminate this color distortion during error diffusion. Halftoning a color image with the proposed edge-enhanced error-diffusion algorithm has the following characteristics: unstable images do not exist, the color error caused by dot overlap is corrected, and smeared edges are sharpened.
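For reference, scalar error diffusion in its best-known form (Floyd–Steinberg) looks like the following; this is a grayscale sketch, not the paper's vector RGB variant:

```python
# Sketch of Floyd-Steinberg error diffusion on a grayscale image.
# Each pixel is quantized to 0 or 255 and the quantization error is
# distributed to the four unvisited neighbours with weights 7/16, 3/16,
# 5/16, 1/16 (the classic Floyd-Steinberg kernel).

def floyd_steinberg(gray):
    """gray: 2-D list of 0..255 values; returns a 2-D list of 0/255 halftones."""
    h, w = len(gray), len(gray[0])
    img = [row[:] for row in gray]  # working copy; the input is left intact
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = img[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            err = old - new
            # diffuse the error to right, lower-left, lower, lower-right
            if x + 1 < w:
                img[y][x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                img[y + 1][x - 1] += err * 3 / 16
            if y + 1 < h:
                img[y + 1][x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                img[y + 1][x + 1] += err * 1 / 16
    return out
```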

10.
In this paper, we explore H.264/AVC operating in intraframe mode to compress a mixed image, i.e., one composed of text, graphics, and pictures. Even though mixed-content (compound) documents usually require multiple compressors, we apply a single compressor for both text and pictures; distortion is simply weighted differently between text and picture regions. Our approach uses a segmentation-driven adaptation strategy to change the H.264/AVC quantization parameter on a macroblock-by-macroblock basis, diverting bits from pictorial regions to text in order to keep text edges sharp. We show results of this segmentation-driven quantizer adaptation method applied to compressing documents. The reconstructed images have better text sharpness than straight unadapted coding, at negligible visual loss in pictorial regions. Our results also highlight that H.264/AVC-INTRA outperforms coders such as JPEG-2000 as a single coder for compound images.

11.
Binarization is an important step in optical character recognition (OCR) and directly affects its success rate. Current local binarization algorithms based on luminance segmentation perform well but are complex and computationally expensive, while fast binarization algorithms are simple but noise-sensitive. Low-luminance images generally contain non-negligible noise, and their text has low contrast. To capture low-contrast text, a fast binarization algorithm must be sensitive to luminance gradients, which in turn produces broken or missing strokes and heavy background noise in its output. To achieve fast, high-quality binarization, this paper applies non-local means filtering to suppress noise while avoiding over-smoothing the image, uses an improved Bradley algorithm to extract low-contrast text and resolve stroke breakage, and finally applies dilation and erosion to suppress binarization noise. The method works on images with non-uniform low or high luminance. Experiments show that under non-uniform high luminance it performs the same as other fast binarization algorithms, while under non-uniform low luminance it extracts more text with fewer broken strokes and less noise. The OCR recall of its binarization results reaches 93.5%.
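The Bradley algorithm referenced above thresholds each pixel against the mean of a local window, computed in O(1) per pixel via an integral image; a minimal sketch of the unmodified algorithm (the window size and sensitivity t here are illustrative defaults):

```python
# Sketch of the Bradley-Roth adaptive threshold: a pixel becomes
# foreground (1) when it is darker than (1 - t) times the mean of a
# window centred on it. An integral image makes each window mean O(1).

def bradley_threshold(gray, window=3, t=0.15):
    """gray: 2-D list of 0..255 values; returns a 2-D list of 0/1 (1 = text)."""
    h, w = len(gray), len(gray[0])
    # integral[y][x] = sum of gray[0..y-1][0..x-1]
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            integral[y + 1][x + 1] = (gray[y][x] + integral[y][x + 1]
                                      + integral[y + 1][x] - integral[y][x])
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y1, y2 = max(0, y - half), min(h - 1, y + half)
            x1, x2 = max(0, x - half), min(w - 1, x + half)
            area = (y2 - y1 + 1) * (x2 - x1 + 1)
            s = (integral[y2 + 1][x2 + 1] - integral[y1][x2 + 1]
                 - integral[y2 + 1][x1] + integral[y1][x1])
            out[y][x] = 1 if gray[y][x] * area <= s * (1 - t) else 0
    return out
```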

12.
Compound document images contain graphic or textual content along with pictures. They are a very common form of document, found in magazines, brochures, Web sites, and elsewhere. We focus on the mixed raster content (MRC) multilayer approach to compound image compression and study block thresholding as a means of segmenting an image for MRC. We attempt to optimize the block threshold in a rate-distortion sense, and we also present a fast algorithm that approximates the optimized method. Extensive results are presented, including rate-distortion curves, segmentation masks, and reconstructed images, showing the performance of the proposed algorithm.

13.
Text extraction is an important initial step in digitizing historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. Text extraction is treated as a text-area detection and location problem. The images are divided into equal blocks, and the blocks are filtered using the categories of connected components and the corner-point density. By analyzing the filtered blocks' projections, the approximate text areas can be located and the text regions extracted. Experiments on a dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
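Projection-based text location of the kind described above can be sketched as a row-wise profile plus run detection; this is a generic illustration, not the authors' block-level variant:

```python
# Sketch of projection-profile text-row location on a binary image
# (1 = foreground). Text areas show up as contiguous runs of rows with
# non-zero foreground counts.

def horizontal_projection(binary):
    """Row-wise foreground pixel counts of a 2-D 0/1 image."""
    return [sum(row) for row in binary]

def text_row_ranges(profile):
    """Return (start, end) row index pairs for contiguous non-zero runs."""
    ranges, start = [], None
    for i, v in enumerate(profile + [0]):  # sentinel closes a trailing run
        if v and start is None:
            start = i
        elif not v and start is not None:
            ranges.append((start, i - 1))
            start = None
    return ranges
```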

14.
Considering the characteristics of the ocean background and ship targets in satellite SAR images, reference [1] proposed a new method for detecting ship targets in satellite SAR ocean images based on wavelet multiresolution analysis. Building on this, and noting that clutter under different sea states follows different probability density distributions, this paper discusses the performance of wavelet-based ship detection in ocean SAR images under complex clutter backgrounds, gives the detection performance under different sea states, compares it with the traditional threshold detection method, and presents false-alarm probability curves under different signal-to-clutter ratios. Simulation results show that the method is practical and effective.

15.
Segmentation of images into foreground (an actor) and background is required for many motion picture special effects. To produce these shots, the unwanted background must be removed so that none of it appears in the final composite shot. The standard approach requires the background to be a blue screen. Systems capable of segmenting actors from more natural backgrounds have been proposed, but many are not readily adaptable to the resolutions involved in motion picture imaging. An algorithm is presented that requires minimal human interaction to segment motion-picture-resolution images, and its results are quantitatively compared with alternative approaches. Adaptations to the algorithm that enable segmentation even when the foreground is lit from behind are described. Segmentation of image sequences normally requires manually creating a separate hint image for each frame of a sequence; an algorithm is presented that generates such hint images automatically, so that only a single input is required for an entire sequence. Results show that this algorithm successfully generates hint images where an alternative approach fails.

16.
Microwave imaging by in-line holography is demonstrated experimentally. The twin image, reference level and autocorrelation contributions to the required image are suppressed by digital processing. An extension of the technique which provides better image reconstruction is introduced. Results are compared with images obtained from direct phase and amplitude recording.

17.
CFAR is a common target detection algorithm for synthetic aperture radar images. When statistically modeling the background clutter, CFAR must account for non-background pixels, such as targets, mixed into the modeling samples. This paper proposes a new CFAR detection algorithm in which the target window and the background window are one and the same: before CFAR detection, target pre-screening removes degrading factors from the modeling samples, and the remaining samples in the background window are modeled with a generalized Gamma distribution. Compared with the traditional Gaussian-model CFAR, the proposed algorithm accounts for the impact of degrading factors on modeling accuracy; the new sliding-window model is simpler in structure, its detection results have a low false-alarm rate, and it does not miss closely spaced targets.
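For background, a plain 1-D cell-averaging CFAR detector (the kind of baseline the paper improves on) can be sketched as follows; the guard/training sizes and the scale factor here are illustrative, and real SAR CFAR operates on 2-D windows:

```python
# Sketch of 1-D cell-averaging CFAR: a cell is declared a detection when
# it exceeds scale * (mean of the training cells outside a guard band).

def ca_cfar(signal, guard=1, train=3, scale=3.0):
    """signal: list of intensities; returns indices of detected cells."""
    n = len(signal)
    hits = []
    for i in range(n):
        train_cells = []
        # training cells to the left of the guard band
        for j in range(i - guard - train, i - guard):
            if 0 <= j < n:
                train_cells.append(signal[j])
        # training cells to the right of the guard band
        for j in range(i + guard + 1, i + guard + train + 1):
            if 0 <= j < n:
                train_cells.append(signal[j])
        if not train_cells:
            continue
        noise = sum(train_cells) / len(train_cells)  # clutter estimate
        if signal[i] > scale * noise:
            hits.append(i)
    return hits
```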

18.
Stitching of large batches of aerial far-infrared images with non-uniform overlap rates
高琰, 肖小月, 李小虎, 朱洪, 唐琎, 郭璠. 《红外与激光工程》 (Infrared and Laser Engineering), 2022, 51(7): 20210611-1-20210611-12
Far-infrared images suffer from low resolution and contain many repetitive and sparse structures, which makes stitching large batches of them challenging. For far-infrared image stitching in aerial scenes, this paper divides image alignment into sequential single-column alignment and multi-column alignment. Within a column, homography matrices are computed using non-maximum suppression and chained through their transfer relation for stitching. Between columns, the images are divided into grids, and the weights of the transformation matrices are optimized grid by grid by combining regional similarity transforms and local homography transforms; stitching is then achieved by recursively propagating the grid transforms. The proposed method avoids the feature-vanishing problem that arises when stitching many consecutive far-infrared images and adapts to inconsistent vertical overlap rates across columns, ultimately enabling the stitching of large batches of far-infrared images.
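The transfer relation used for single-column stitching amounts to composing pairwise homographies; a minimal sketch with pure-Python 3 × 3 matrices, assuming a left-multiplied chaining convention (an illustrative choice, not necessarily the authors'):

```python
# Sketch of chaining pairwise homographies along an image sequence.
# If pairwise[k] maps image k+1 into the frame of image k, composing them
# maps the last image all the way into the frame of image 0.

def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def chain_homographies(pairwise):
    """Compose a list of pairwise 3x3 homographies into one transform."""
    h = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity
    for m in pairwise:
        h = mat_mul(h, m)
    return h
```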

19.
This paper proposes an image segmentation method based on threshold selection by Gaussian mixture model fitting and region growing. First, character strokes are sampled with a stroke-direction operator; a Gaussian mixture model is then fitted to the gray-level histogram of the samples to determine the optimal segmentation threshold; finally, the standard deviation of the samples serves as the criterion in the growing rule used to segment the characters. The algorithm is computationally light and has advantages in both real-time performance and segmentation accuracy; it leaves very few residual background pixels while extracting the target, which simplifies the subsequent recognition step.
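Once a two-component Gaussian mixture has been fitted to the histogram, a segmentation threshold can be taken where the two weighted components intersect; a sketch of that step only (the fitting itself, and whether the authors use exactly this rule, are not specified here):

```python
import math

# Sketch: the gray level x between the two component means where
# w1*N(x; mu1, s1) = w2*N(x; mu2, s2). Taking logs gives a quadratic in x.

def gmm_threshold(mu1, s1, w1, mu2, s2, w2):
    """Intersection of two weighted Gaussian densities, taken between the means."""
    a = 1 / (2 * s2**2) - 1 / (2 * s1**2)
    b = mu1 / s1**2 - mu2 / s2**2
    c = (mu2**2 / (2 * s2**2) - mu1**2 / (2 * s1**2)
         + math.log((w1 * s2) / (w2 * s1)))
    if abs(a) < 1e-12:           # equal variances: the equation is linear
        return -c / b
    disc = math.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    lo, hi = min(mu1, mu2), max(mu1, mu2)
    # pick the root lying between the two component means
    return next(r for r in roots if lo <= r <= hi)
```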

20.
To address large-area non-destructive testing of microstructures, a low-overlap 3-D structure stitching method is proposed. First, based on the experimental parameters, the structural feature extraction region is restricted to the areas that overlap during measurement, reducing the likelihood of false matches and improving computational efficiency. Within these regions, feature points are extracted with the SIFT (scale-invariant feature transform) algorithm. In the feature-matching stage, a method is further proposed to narrow the search range for matching point pairs using the measurement system parameters, improving matching reliability. Finally, based on the local continuity of the overlap region, a correction matrix is computed to correct the misalignment of stitched structures caused by environmental disturbances during measurement. In experiments, the method was compared with the stitching functions of current commercial instruments. The experiments show that it works not only on feature-rich structures but also on highly similar array structures, achieving effective stitching at an overlap rate of 6%.

