20 similar documents found (search time: 156 ms)
1.
Extracting foreground content from document images with complex backgrounds is difficult because the background texture and color, and the foreground font, size, color, and tilt, are not known in advance. In this work, we propose an RGB color model for input complex color document images, together with an algorithm that detects text regions using Gabor filters and then extracts the text using the color feature luminance. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected based on the Gabor features. Because of the complex background, some high-frequency non-text objects in the background are also detected as text objects in this stage. In stage 2, many of these false text objects are dropped by connected component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text. The color feature luminance is extracted from the input color document image, and the threshold value is derived automatically from this feature. The proposed approach handles both printed and handwritten color document images with foreground text in any color, font, size, and orientation. For experimental evaluation, we considered a variety of document images with non-uniform/uniform textured and multicolored backgrounds. Segmentation of the foreground text is evaluated with a commercially available OCR engine; the results show better recognition accuracy for foreground characters in the processed document images than in the unprocessed ones.
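As a rough illustration of the stage-1 idea (not the authors' exact filter bank), the sketch below scores pixels by their maximal Gabor energy over a few orientations; the kernel parameters and the energy threshold are assumptions chosen for illustration.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, sigma=4.0, size=21):
    """Real-valued Gabor kernel at the given frequency and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr)

def text_candidate_mask(gray, freq=0.2, n_orient=4, energy_thresh=0.15):
    """Mark pixels whose maximal Gabor response over orientations is high.

    High local Gabor energy is typical of text strokes, but textured
    backgrounds respond too -- hence the stage-2 false-positive filtering.
    """
    gray = gray.astype(np.float64) / 255.0
    responses = []
    for k in range(n_orient):
        kern = gabor_kernel(freq, theta=k * np.pi / n_orient)
        responses.append(np.abs(fftconvolve(gray, kern, mode="same")))
    energy = np.max(responses, axis=0)
    return energy > energy_thresh * energy.max()

Connected-component analysis (stage 2) would then discard components whose size or aspect ratio is implausible for characters.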
2.
Numerous techniques have previously been proposed for single-stage thresholding of document images to separate the written or printed information from the background. A new thresholding structure called the decompose algorithm is proposed and compared against existing single-stage algorithms. The decompose algorithm uses local feature vectors to analyse a local area and find the best approach to threshold it. Instead of employing a single thresholding algorithm, an appropriate algorithm is selected automatically for each type of subregion of the document. The original image is recursively broken down into subregions using quad-tree decomposition until a suitable thresholding method can be applied to each subregion. The algorithm has been trained on 300 historical images and evaluated on 300 'difficult' document images containing considerable background noise or variation in contrast and illumination. Quantitative analysis of the results by measuring text recall, and qualitative assessment of processed document image quality, are reported. The decompose algorithm is shown to be effective at thresholding historical images of varying quality.
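A minimal sketch of the recursive decomposition, assuming local variance as the homogeneity feature and Otsu as the only per-leaf thresholder (the paper instead selects among several algorithms using trained feature vectors):

import numpy as np
from skimage.filters import threshold_otsu

def quadtree_binarize(img, var_limit=900.0, min_size=32):
    """Quad-tree decomposition: split inhomogeneous regions, Otsu the leaves."""
    out = np.zeros_like(img, dtype=bool)

    def recurse(y0, y1, x0, x1):
        region = img[y0:y1, x0:x1]
        homogeneous = region.var() <= var_limit
        too_small = min(y1 - y0, x1 - x0) <= min_size
        if homogeneous or too_small:
            if region.min() < region.max():      # Otsu needs >1 gray level
                out[y0:y1, x0:x1] = region < threshold_otsu(region)
            return
        ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
        recurse(y0, ym, x0, xm); recurse(y0, ym, xm, x1)
        recurse(ym, y1, x0, xm); recurse(ym, y1, xm, x1)

    recurse(0, img.shape[0], 0, img.shape[1])
    return out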
3.
《Signal Processing: Image Communication》2005,20(5):487-502
To efficiently compress rasterized compound documents, an encoder must be content-adaptive. Content adaptivity may be achieved by a layered approach, in which a compound image is segmented into layers so that an appropriate encoder can compress each layer individually. A major factor in using standard encoders efficiently is matching the layers' characteristics to those of the encoders by using data-filling techniques to fill in the initially sparse layers. In this work we review data-filling methods and also propose a sub-optimal non-linear projections scheme that efficiently matches the baseline JPEG coder when compressing background layers, leading to smaller files with better image quality.
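To see why data filling matters, consider the background layer: pixels covered by foreground text are "don't care" values, and leaving them as sharp holes wastes bits. A simple (and deliberately crude) filler replaces them with an iterated local average so the layer becomes smooth and JPEG-friendly; the paper's non-linear projections scheme is more sophisticated than this sketch, whose window size and iteration count are assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

def fill_background_layer(bg, fg_mask, window=15, iters=20):
    """Replace masked (foreground) pixels by an iterated local average.

    bg      : background layer, float array
    fg_mask : True where the pixel belongs to the foreground layer
    The smooth fill costs few bits under JPEG and is discarded at decode
    time, because the mask tells the decoder which pixels are foreground.
    """
    filled = bg.astype(np.float64).copy()
    filled[fg_mask] = filled[~fg_mask].mean()   # crude initialisation
    for _ in range(iters):
        smoothed = uniform_filter(filled, size=window)
        filled[fg_mask] = smoothed[fg_mask]     # only "don't care" pixels move
    return filled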
4.
Side match and overlap match vector quantizers for images
A class of vector quantizers with memory, known in the image coding framework as finite state vector quantizers (FSVQs), is investigated. Two FSVQ designs are introduced: side match vector quantizers (SMVQs) and overlap match vector quantizers (OMVQs). These designs take advantage of the 2-D spatial contiguity of pixel vectors as well as the high spatial correlation of pixels in typical gray-level images. SMVQ and OMVQ try to minimize the granular noise that causes visible pixel block boundaries in ordinary VQ. For 512 by 512 gray-level images, SMVQ and OMVQ can achieve communication-quality reproduction at an average of 1/2 b/pixel per image frame, and acceptable-quality reproduction at lower rates. Because block boundaries are less visible, the perceived improvement in quality over ordinary VQ is even greater. Owing to the structure of SMVQ and OMVQ, simple variable-length noiseless codes can achieve as much as a 60% bit rate reduction over fixed-length noiseless codes.
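A sketch of the side-match state selection, under the simplifying assumption of a full codebook scan: the encoder ranks codewords by how well their top and left borders continue the already-reconstructed neighbors, and only a small "state codebook" of the best matches is actually indexed, which is what shrinks the rate. The state size is an assumption.

import numpy as np

def side_match_costs(codebook, top_block, left_block):
    """Border mismatch between each codeword and the reconstructed
    top/left neighbor blocks.

    codebook : (N, B, B) array of B-by-B codewords
    """
    top_err = ((codebook[:, 0, :] - top_block[-1, :]) ** 2).sum(axis=1)
    left_err = ((codebook[:, :, 0] - left_block[:, -1]) ** 2).sum(axis=1)
    return top_err + left_err

def encode_block(block, codebook, top_block, left_block, state_size=32):
    """Pick the best codeword inside the side-match state codebook.

    Indices now address only state_size candidates instead of the full
    codebook, so fewer bits (or shorter variable-length codes) suffice.
    """
    state = np.argsort(side_match_costs(codebook, top_block, left_block))[:state_size]
    errs = ((codebook[state] - block) ** 2).sum(axis=(1, 2))
    return state[np.argmin(errs)]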
5.
Binarization is an important step in text image recognition, and degradation from uneven illumination or stains on the document makes recognition harder. This paper improves a local thresholding algorithm. First, the image is projected horizontally and the page layout is coarsely partitioned at the minima of the projection histogram. A global thresholding method then estimates a more accurate character stroke width for each region, from which an appropriate window size is derived adaptively. The image is then binarized using a contrast map and local thresholds, and OTSU thresholding is incorporated to remove the pseudo-contours produced by the original algorithm. Experiments and analysis show that the improved method markedly reduces the foreground-pixel misclassification caused by uneven stroke widths and varying character sizes.
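A rough sketch of the adaptive-window idea, assuming a Niblack/Sauvola-style local threshold and a crude run-length stroke-width estimate (the paper layers contrast maps and OTSU-based pseudo-contour removal on top of this); the k parameter and the window-to-stroke ratio are assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

def estimate_stroke_width(rough_mask):
    """Median horizontal run length of foreground pixels -- a crude proxy
    for character stroke width, computed per layout region."""
    runs = []
    for row in rough_mask.astype(np.int8):
        edges = np.diff(np.concatenate(([0], row, [0])))
        runs.extend(np.flatnonzero(edges == -1) - np.flatnonzero(edges == 1))
    return int(np.median(runs)) if runs else 3

def local_binarize(gray, window, k=0.2):
    """Niblack-style local threshold: mean - k * std over an adaptive window."""
    g = gray.astype(np.float64)
    mean = uniform_filter(g, size=window)
    sq_mean = uniform_filter(g * g, size=window)
    std = np.sqrt(np.maximum(sq_mean - mean**2, 0.0))
    return g < mean - k * std

# Window derived from stroke width, e.g. a few strokes wide (an assumption):
# rough = gray < gray.mean(); window = 4 * estimate_stroke_width(rough) + 1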
6.
This paper presents a modified JPEG coder applied to the compression of mixed documents (containing text, natural images, and graphics) for printing purposes. The modified coder takes advantage of the distinct perceptually significant regions in these documents to achieve higher perceptual quality than the standard JPEG coder. Region adaptivity is achieved via classified thresholding while remaining fully compliant with the baseline standard. A computationally efficient classification algorithm is presented, and the improved performance of the classified JPEG coder is verified.
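The key point is that coefficient thresholding happens only in the encoder, so any baseline decoder still works. A minimal sketch, assuming a two-class (text vs. picture) split by AC energy with per-class dead-zone thresholds chosen purely for illustration:

import numpy as np
from scipy.fft import dctn

def classify_block(block):
    """Label an 8x8 block 'text' if its AC energy is high (strong edges)."""
    coeffs = dctn(block.astype(np.float64) - 128.0, norm="ortho")
    ac_energy = (coeffs**2).sum() - coeffs[0, 0] ** 2
    return "text" if ac_energy > 4000.0 else "picture"  # cutoff is an assumption

def threshold_coeffs(coeffs, block_class):
    """Zero small AC coefficients; prune harder in picture blocks so the
    saved bits preserve sharpness where it matters perceptually."""
    t = 2.0 if block_class == "text" else 10.0
    pruned = np.where(np.abs(coeffs) < t, 0.0, coeffs)
    pruned[0, 0] = coeffs[0, 0]              # never touch the DC term
    return pruned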
7.
Jim Z. C. Lai, Chia-Chi Chen 《Journal of Visual Communication and Image Representation》2003,14(4):389-404
Printers usually generate a limited number of colors and cannot produce continuous-tone color images; error-diffusion algorithms are traditionally used to address this. Compared with other approaches, error diffusion generally produces halftoned images of better quality, but smeared edges and textures may occur in these images. To produce halftoned images of higher quality, these artifacts, which stem from unstable images, dot overlap, and the error diffusion itself, must be eliminated or reduced. In this paper, we show that unstable images can be eliminated or reduced by using a proper color difference formula to select the reproduction colors, even when vector error diffusion is performed in the RGB domain. We also present a method that uses different filters to halftone different components of a color, which can yield clearer and sharper edges in halftoned color images. Unexpected colors may be generated by dot overlap in the printing process; we present a method to eliminate this color distortion within the error-diffusion process. Halftoning a color image with the proposed error-diffusion algorithm with edge enhancement has the following characteristics: unstable images do not occur, the color error caused by dot overlap is corrected, and smeared edges are sharpened.
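A minimal sketch of vector error diffusion with Floyd-Steinberg weights: the error is diffused in RGB, while the reproduction color is selected with a color-difference measure rather than plain RGB distance, which is the stabilizing choice the paper argues for. The luma-weighted distance here only stands in for a proper formula such as CIE deltaE.

import numpy as np

# Floyd-Steinberg diffusion weights: (dy, dx, weight)
FS = [(0, 1, 7 / 16), (1, -1, 3 / 16), (1, 0, 5 / 16), (1, 1, 1 / 16)]

def color_distance(c, palette):
    """Stand-in for a perceptual color-difference formula (e.g. deltaE):
    luma-weighted RGB distance."""
    w = np.array([0.299, 0.587, 0.114])
    return (w * (palette - c) ** 2).sum(axis=1)

def vector_error_diffuse(img, palette):
    """Halftone an RGB float image (H, W, 3) to a limited palette (N, 3)."""
    work = img.astype(np.float64).copy()
    h, wdt = work.shape[:2]
    out = np.zeros((h, wdt), dtype=np.int64)
    for y in range(h):
        for x in range(wdt):
            idx = int(np.argmin(color_distance(work[y, x], palette)))
            out[y, x] = idx
            err = work[y, x] - palette[idx]   # diffuse the full RGB error
            for dy, dx, wgt in FS:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < wdt:
                    work[yy, xx] += wgt * err
    return out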
8.
Image segmentation is an important component of any document image analysis system. While many segmentation algorithms exist in the literature, very few i) allow users to specify the physical style, and ii) incorporate that user-specified style information into the objective function being minimized. We describe a segmentation algorithm that models a document's physical structure as a hierarchy in which each node describes a region of the document using a stochastic regular grammar. The exact form of the hierarchy and the stochastic language is specified by the user, while the probabilities associated with the transitions are estimated from groundtruth data. We demonstrate the segmentation algorithm on images of bilingual dictionaries.
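To make the model concrete, here is a hypothetical sketch (region names and probabilities invented for illustration) of the kind of user-specified hierarchy described: each node pairs a region type with a regular-grammar production over its sub-regions, while the transition probabilities would in practice be estimated from groundtruth rather than set by hand.

from dataclasses import dataclass, field

@dataclass
class RegionNode:
    """One node of the document hierarchy: a region type plus a stochastic
    regular grammar over the sequence of its sub-regions."""
    name: str
    production: list                      # ordered child region types
    transition_probs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

# Hypothetical style spec for one bilingual-dictionary entry.
entry = RegionNode(
    name="entry",
    production=["headword", "pronunciation", "translation"],
    transition_probs={                    # would be estimated from groundtruth
        ("headword", "pronunciation"): 0.9,
        ("headword", "translation"): 0.1,
        ("pronunciation", "translation"): 1.0,
    },
)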
9.
In this paper, we explore H.264/AVC operating in intraframe mode to compress a mixed image, i.e., one composed of text, graphics, and pictures. Even though mixed-content (compound) documents usually require multiple compressors, we apply a single compressor to both text and pictures; distortion is simply accounted for differently in text and picture regions. Our approach is a segmentation-driven adaptation strategy that changes the H.264/AVC quantization parameter on a macroblock-by-macroblock basis, diverting bits from pictorial regions to text in order to keep text edges sharp. We show results of this segmentation-driven quantizer adaptation method applied to compressing documents. Our reconstructed images have better text sharpness than straight unadapted coding, at negligible visual loss in pictorial regions. Our results also highlight that H.264/AVC-INTRA outperforms coders such as JPEG-2000 as a single coder for compound images.
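A sketch of the segmentation-driven idea, assuming a precomputed binary text mask, an illustrative QP offset, and a coverage cutoff; the actual bit allocation in the paper is tied to the codec's rate control.

import numpy as np

def macroblock_qp_map(text_mask, base_qp=30, text_qp_offset=-8, mb=16):
    """Assign one quantization parameter per 16x16 macroblock.

    Macroblocks containing text get a lower QP (finer quantization, more
    bits), diverting rate from pictorial regions to keep text edges sharp.
    """
    h, w = text_mask.shape
    rows, cols = h // mb, w // mb
    qp = np.full((rows, cols), base_qp, dtype=np.int32)
    for r in range(rows):
        for c in range(cols):
            block = text_mask[r * mb:(r + 1) * mb, c * mb:(c + 1) * mb]
            if block.mean() > 0.05:       # any appreciable text content
                qp[r, c] = max(0, base_qp + text_qp_offset)
    return qp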
10.
Compound document images contain graphic or textual content along with pictures. They are a very common form of document, found in magazines, brochures, Web sites, etc. We focus on the mixed raster content (MRC) multilayer approach to compound image compression and study block thresholding as a means of segmenting an image for MRC. The block threshold is optimized in a rate-distortion sense, and a fast algorithm is presented to approximate the optimized method. Extensive results are presented, including rate-distortion curves, segmentation masks, and reconstructed images, showing the performance of the proposed algorithm.
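The block-thresholding step can be phrased as a per-block Lagrangian choice. A toy sketch, assuming a squared-error distortion against piecewise-constant layers and a crude bit-cost proxy (the paper optimizes against the actual MRC layer coders):

import numpy as np

def best_block_threshold(block, lam=0.1):
    """Choose the gray-level threshold for one block minimizing D + lam * R.

    Distortion D: squared error of representing each side of the split by
    its mean (as smooth FG/BG layers would). Rate R: proxy counting mask
    transitions, which dominate the binary mask's coding cost.
    """
    levels = np.unique(block)
    best = (np.inf, int(levels[0]))
    for t in levels[:-1]:
        mask = block > t
        d = ((block[mask] - block[mask].mean()) ** 2).sum() + \
            ((block[~mask] - block[~mask].mean()) ** 2).sum()
        r = np.abs(np.diff(mask.astype(np.int8), axis=1)).sum()  # mask busyness
        cost = d + lam * r
        if cost < best[0]:
            best = (cost, int(t))
    return best[1]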
11.
12.
Hillman P., Hannah J., Renshaw D. 《Vision, Image and Signal Processing, IEE Proceedings》2005,152(4):387-397
Segmentation of images into foreground (an actor) and background is required for many motion picture special effects. To produce these shots, the unwanted background must be removed so that none of it appears in the final composite. The standard approach requires the background to be a blue screen. Systems capable of segmenting actors from more natural backgrounds have been proposed, but many are not readily adaptable to the resolutions involved in motion picture imaging. An algorithm is presented that requires minimal human interaction to segment motion-picture-resolution images, and its results are quantitatively compared with alternative approaches. Adaptations that enable segmentation even when the foreground is lit from behind are described. Segmenting an image sequence normally requires manually creating a separate hint image for each frame; an algorithm is presented that generates such hint images automatically, so that only a single input is required for an entire sequence. Results show that this algorithm successfully generates hint images where an alternative approach fails.
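One plausible way to propagate a hint through a sequence, sketched under the assumption that the matte changes slowly between frames; this illustrates the idea of automatic hint generation only, not the paper's algorithm, and the margin width is an assumption.

import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

def next_hint(prev_matte, unknown_margin=7):
    """Turn frame t's matte into a hint image for frame t+1.

    Confident foreground/background come from eroding each side of the
    previous matte; the band in between is marked 'unknown' and left for
    the segmentation algorithm to resolve.
    """
    fg = prev_matte > 0.5
    sure_fg = binary_erosion(fg, iterations=unknown_margin)
    sure_bg = ~binary_dilation(fg, iterations=unknown_margin)
    hint = np.full(prev_matte.shape, 128, dtype=np.uint8)  # 128 = unknown
    hint[sure_fg] = 255
    hint[sure_bg] = 0
    return hint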
13.
Microwave imaging by in-line holography is demonstrated experimentally. The twin image, reference level and autocorrelation contributions to the required image are suppressed by digital processing. An extension of the technique which provides better image reconstruction is introduced. Results are compared with images obtained from direct phase and amplitude recording.
14.
15.
An image segmentation method based on threshold selection by Gaussian-mixture-model fitting and on region growing is proposed. First, a stroke-direction operator samples the character strokes; a Gaussian mixture model is then fitted to the gray-level histogram of the samples to determine the optimal segmentation threshold; finally, the standard deviation of the samples serves as the decision criterion of the growing rule used to segment the characters. The algorithm is computationally light and offers advantages in both real-time performance and segmentation accuracy; it extracts the target while leaving very few residual background pixels, which simplifies the subsequent target recognition step.
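A minimal sketch of the threshold-selection step, assuming a two-component mixture fitted to the sampled stroke gray levels with scikit-learn; the threshold is placed where the component responsibilities cross.

import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_threshold(samples):
    """Fit a 2-component Gaussian mixture to sampled gray levels and place
    the threshold at the decision boundary between the two components,
    searched on a grid between the component means."""
    gmm = GaussianMixture(n_components=2).fit(samples.reshape(-1, 1))
    m = np.sort(gmm.means_.ravel())
    grid = np.linspace(m[0], m[1], 256).reshape(-1, 1)
    resp = gmm.predict_proba(grid)
    crossing = np.argmin(np.abs(resp[:, 0] - resp[:, 1]))
    return float(grid[crossing, 0])

The growing stage would then accept a neighboring pixel when its gray level lies within k sample standard deviations of the seed mean, with k an assumption.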
16.
Wireless Personal Communications - Information retrieval (IR) defines the process of searching and attaining specific information resources which are related to the specific information...
17.
Propagation-front or grassfire methods are very popular in image processing because of their efficiency and their inherent geodesic nature. However, because of their random-access nature, they are inefficient on large images that cannot fit in available random access memory. We explore ways to increase the memory efficiency of two algorithms that use propagation fronts: skeletonization by influence zones and the watershed transform. Two algorithms are presented for skeletonization by influence zones: the first computes the skeletonization on surfaces without storing the enclosing volume; the second performs the skeletonization without any region reference, using only the propagation fronts. The watershed transform algorithm that was developed keeps in memory the propagation fronts and only one grey level of the image. All three algorithms use much less memory than those presented in the literature so far. Several techniques were developed in this work to minimize the cost of the underlying set operations, including fast search methods, double propagation fronts, and directional propagation.
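To illustrate why fronts are memory-friendly, here is a minimal breadth-first skeleton-by-influence-zones (SKIZ) sketch: only the current front sits in a queue and each pixel stores just its label, a property the paper pushes further by dropping even the region references. In this simplified version, ties between fronts are broken by queue order.

import numpy as np
from collections import deque

def influence_zones(labels):
    """Grow seed labels outward one front at a time (geodesic BFS).

    labels : int array, 0 = unlabeled, >0 = seed regions.
    The boundaries where differently labeled fronts meet form the SKIZ.
    """
    out = labels.copy()
    front = deque(zip(*np.nonzero(labels)))
    while front:
        y, x = front.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            if (0 <= yy < out.shape[0] and 0 <= xx < out.shape[1]
                    and out[yy, xx] == 0):
                out[yy, xx] = out[y, x]   # inherit the nearest seed's label
                front.append((yy, xx))
    return out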
18.
Spatiotriangulation with multisensor VIR/SAR images
19.
Dimitris E. Maroulis, Michalis A. Savelonas, Dimitris K. Iakovidis, Stavros A. Karkanis, Nikos Dimitropoulos 《IEEE Transactions on Information Technology in Biomedicine》2007,11(5):537-543
This paper presents a computer-aided approach for nodule delineation in thyroid ultrasound (US) images. The developed algorithm is based on a novel active contour model, named variable background active contour (VBAC), which incorporates the advantages of the level-set region-based active contour without edges (ACWE) model, offering noise robustness and the ability to delineate multiple nodules. Unlike classic active contour models, which are sensitive to intensity inhomogeneities, the proposed VBAC model considers information from variable background regions. VBAC has been evaluated on synthetic images as well as on real thyroid US images. Quantitative evaluation shows two main results: 1) higher average accuracy in the delineation of hypoechoic thyroid nodules, exceeding 91%; and 2) faster convergence compared with the ACWE model.
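For orientation, the ACWE (Chan-Vese) baseline that VBAC extends evolves a level set toward a two-phase piecewise-constant fit of the image. The sketch below is one bare-bones gradient step of that baseline model (the fixed parameters are assumptions); VBAC additionally lets the background statistics vary locally, which this sketch does not do.

import numpy as np

def acwe_step(phi, img, mu=0.2, dt=0.5, eps=1.0):
    """One gradient step of the Chan-Vese 'active contours without edges'
    energy on level set phi (foreground where phi > 0)."""
    inside = phi > 0
    c1 = img[inside].mean() if inside.any() else 0.0
    c2 = img[~inside].mean() if (~inside).any() else 0.0
    delta = eps / (np.pi * (eps**2 + phi**2))       # smoothed Dirac
    # curvature of the level set (contour length / smoothness term)
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx**2 + gy**2) + 1e-8
    div_y, _ = np.gradient(gy / norm)
    _, div_x = np.gradient(gx / norm)
    curvature = div_x + div_y
    force = delta * (mu * curvature - (img - c1) ** 2 + (img - c2) ** 2)
    return phi + dt * force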
20.
Wang GL, Zhou Y, Chen AH, Zhang PM, Liang PJ 《IEEE Transactions on Bio-Medical Engineering》2006,53(6):1195-1198
Spike sorting is the mandatory first step in analyzing multiunit recordings for studying information-processing mechanisms within the nervous system. Extracellular recordings usually contain overlapping spikes produced by several neurons adjacent to the electrode, together with unknown background noise, which complicates neural signal identification. In this paper, we propose a robust method to deal with these problems: an automatic overlap-decomposition technique based on the relaxation algorithm, requiring only simple fast Fourier transforms. The performance of the presented system was tested at various signal-to-noise ratios on synthetic data generated from real recordings.
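A rough sketch of the FFT-based template matching that underlies overlap decomposition: each unit's template is correlated with the residual in the frequency domain, the best-fitting shifted template is committed and subtracted, and the search repeats. This greedy loop is a simplified stand-in for the paper's relaxation iterations; the acceptance test and spike limit are assumptions.

import numpy as np
from scipy.signal import fftconvolve

def decompose_overlap(segment, templates, max_spikes=4):
    """Greedily explain an overlapped spike segment as shifted templates.

    segment   : 1-D recording snippet
    templates : list of 1-D unit templates, each shorter than segment
    """
    residual = segment.astype(np.float64).copy()
    found = []
    for _ in range(max_spikes):
        best_score, best_ti, best_shift = -np.inf, None, None
        for ti, tpl in enumerate(templates):
            # reversed-template convolution == cross-correlation via FFT
            corr = fftconvolve(residual, tpl[::-1], mode="valid")
            shift = int(np.argmax(corr))
            if corr[shift] > best_score:
                best_score, best_ti, best_shift = corr[shift], ti, shift
        tpl = templates[best_ti]
        if best_score < 0.5 * (tpl @ tpl):   # acceptance test (an assumption)
            break                            # residual no longer spike-like
        residual[best_shift:best_shift + len(tpl)] -= tpl
        found.append((best_ti, best_shift))
    return found, residual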