Similar Documents
20 similar documents found (search time: 15 ms)
1.
Abstract. For document images corrupted by various kinds of noise, direct binarization may produce severely blurred and degraded results. A common treatment for this problem is to pre-smooth input images using noise-suppressing filters. This article proposes an image-smoothing method for prefiltering document images before binarization. Conceptually, we propose that the range over which each pixel influences its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In addition, due to the adaptive nature of the similar-pixel definition, the proposed filter's output is more robust to different noise levels than that of existing methods. Received: October 31, 2001 / October 09, 2002 Correspondence to: L. Fan (e-mail: fanlixin@ieee.org)

2.
On fast microscopic browsing of MPEG-compressed video
MPEG has been established as a compression standard for efficient storage and transmission of digital video. However, users are limited to VCR-like (and tedious) functionalities when viewing MPEG video. The usefulness of MPEG video is presently limited by the lack of tools available for fast browsing, manipulation and processing of MPEG video. In this paper, we first address the problem of rapid access to individual shots and frames in MPEG video. We build upon the compressed-video-processing framework proposed in [1, 8], and propose new and fast algorithms based on an adaptive mixture of approximation techniques for extracting spatially reduced image sequences of uniform quality from MPEG video across different frame types and under different motion activities in the scenes. The algorithms execute faster than real time on a Pentium personal computer. We demonstrate how the reduced images facilitate fast and convenient shot- and frame-level video browsing and access, as well as shot-level editing and annotation, without the need for frequent decompression of MPEG video. We further propose methods for reducing the auxiliary data size associated with the reduced images through exploitation of spatial and temporal redundancy. We also address how the reduced images lead to computationally efficient algorithms for video analysis based on intra- and inter-shot processing for video database and browsing applications. The algorithms, tools for browsing and techniques for video processing presented in this paper have been used by many in IBM Research on more than 30 hours of MPEG-1 video for video browsing and analysis.

3.
A mathematical-morphology-based method for fractal dimension estimation
Fractal dimension estimation is the most important step in texture image segmentation algorithms based on fractal theory. Because they use a fixed regular grid partition, the common box-counting estimators and their various refinements suffer from large errors, while traditional morphological estimators improve accuracy somewhat but at high time complexity. This paper therefore proposes a variable-structuring-element morphological method for fractal dimension estimation (VSEM). The method treats a gray-scale image as a surface in 3D space, measures that surface with a set of structuring elements at different scales, and estimates the fractal dimension of the image surface from the power law relating the measurements to the scales. By choosing the structuring elements appropriately and computing the dilations at successive scales recursively, the new method overcomes the shortcomings of existing algorithms. A set of synthetic textures and a set of natural textures are used to evaluate several common fractal dimension estimators; the results show that the proposed method achieves more accurate estimates at lower time complexity. Finally, the method is applied to remote sensing image segmentation. Compared with other common fractal segmentation algorithms, using the fractal dimension estimated by this method together with local neighborhood means as features yields better segmentation results. The good performance in both the comparative analysis and the segmentation experiments shows that the proposed estimator can be applied effectively to texture image segmentation.
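For reference, the fixed-grid box-counting baseline that the abstract criticizes can be sketched in a few lines (a minimal pure-Python illustration with a hypothetical `box_counting_dimension` helper, not the VSEM morphological estimator itself):

```python
import math

def box_counting_dimension(pixels, scales=(1, 2, 4, 8, 16)):
    """Fixed-grid box counting: count the boxes of side s hit by the
    point set, then fit the slope of log N(s) against log(1/s).
    Hypothetical helper illustrating the baseline estimator."""
    pts = set(pixels)
    samples = []
    for s in scales:
        boxes = {(x // s, y // s) for (x, y) in pts}
        samples.append((math.log(1.0 / s), math.log(len(boxes))))
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    num = sum((x - mx) * (y - my) for x, y in samples)
    den = sum((x - mx) ** 2 for x, _ in samples)
    return num / den  # least-squares slope = estimated dimension

# A filled square is two-dimensional: N(s) = (64/s)^2, so the slope is 2.
square = [(x, y) for x in range(64) for y in range(64)]
print(round(box_counting_dimension(square), 2))  # → 2.0
```

The morphological estimators the paper builds on replace the box count with the area of dilations by structuring elements of growing scale, which adapts better to the gray-level surface.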

4.
The paper deals with nonuniform diffusion filtering of magnetic resonance (MR) tomograms. Alternative digital schemes for discrete implementation of the nonuniform diffusion equations are analyzed and tested. A novel locally adaptive conductance for geometry-driven diffusion (GDD) filtering is proposed. It is based on a measure of neighborhood inhomogeneity adopted from the optimal orientation detection of linear symmetry. The algorithm's performance is evaluated on a pseudoartificial 2D MR brain phantom using the signal-to-noise ratio, as well as the HC measure developed for image discrimination characterization. Three filtering methods are applied to MR images acquired by the fast 3D FLASH sequence. The results obtained are quantitatively and visually compared and discussed. Received: 24 April 1997 / Accepted: 10 November 1997

5.
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access methods such as R-trees and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply the vantage point tree (vp-tree) method. Two important problems for the vp-tree method warrant further investigation: the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset and the desired number of nearest neighbors, n. Experiments also show that searching in the vp-tree is more efficient than in the -tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are efficient and effective. Finally, we investigate the problem of selecting vantage points, propose a few alternative methods, and study their impact on the number of distance computations. Received June 9, 1998 / Accepted January 31, 2000
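A minimal vp-tree can clarify the distance-based indexing idea. The sketch below uses naive first-element vantage-point selection and a plain Euclidean distance (the paper studies smarter selection strategies and the n-nearest-neighbor generalization):

```python
import math
import random

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point, self.radius = point, radius
        self.inside, self.outside = inside, outside

def build(points, dist):
    """Build a vp-tree; the first point serves as the vantage point."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return VPNode(vp, 0.0, None, None)
    rest.sort(key=lambda p: dist(vp, p))
    mid = len(rest) // 2
    radius = dist(vp, rest[mid])  # median distance splits the ball
    return VPNode(vp, radius, build(rest[:mid], dist), build(rest[mid:], dist))

def nearest(node, q, dist, best=(None, float("inf"))):
    """1-nearest-neighbor search with triangle-inequality pruning."""
    if node is None:
        return best
    d = dist(node.point, q)
    if d < best[1]:
        best = (node.point, d)
    near, far = ((node.inside, node.outside) if d < node.radius
                 else (node.outside, node.inside))
    best = nearest(near, q, dist, best)
    if abs(d - node.radius) < best[1]:  # far side may still hold a closer point
        best = nearest(far, q, dist, best)
    return best

euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(200)]
tree = build(list(pts), euclid)
point, d = nearest(tree, (0.5, 0.5), euclid)
print(d == min(euclid(p, (0.5, 0.5)) for p in pts))  # → True
```

The pruning test `abs(d - radius) < best_distance` is exactly where the triangle inequality pays off: whole subtrees are skipped without computing any distances inside them.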

6.
目的 纹理特征提取一直是遥感图像分析领域研究的热点和难点。现有的纹理特征提取方法主要集中于研究单波段灰色遥感图像,如何提取多波段彩色遥感图像的纹理特征,是多光谱遥感的研究前沿。方法 提出了一种基于流形学习的彩色遥感图像分维数估算方法。该方法利用局部线性嵌入方法,对由颜色属性所组成的5-D欧氏超曲面进行维数简约处理;再将维数简约处理后的颜色属性用于分维数估算。结果 利用Landsat-7遥感卫星数据和GeoEye-1遥感卫星数据进行实验,结果表明,同Peleg法和Sarkar法等其他分维数估算方法相比,本文方法具有较小的拟合误差。其中,其他4种对比方法所获拟合误差E平均值分别是本文方法所获得拟合误差E平均值的26.2倍、5倍、26.3倍、5倍。此外,本文方法不仅可提供具有较好分类特性的分维数,而且还能提供相对于其他4种对比方法更加稳健的分维数。结论 在针对中低分辨率的真彩遥感图像和假彩遥感图像以及高分辨率彩色合成遥感图像方面,本文方法能够利用不同地物所具有颜色属性信息,提取出各类型地物所对应的纹理信息,有效地改善了分维数对不同地物的区分能力。这对后续研究各区域中不同类型地物的分布情况及针对不同类型地物分布特点而制定区域规划及开发具有积极意义。  相似文献   

7.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter. The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant analysis. As the recognition method is view based, the system can process face and hand recognition simultaneously in real time. The recognition features are invariant to parallel translations, so the system can tolerate unstable extractions from stereo depth information. We demonstrate that the system can discriminate between users, monitor a user's basic movements, smoothly learn an object presented by users, and communicate with users via hand signs learned in advance. Received: 31 January 2000 / Accepted: 1 May 2001 Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)
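The pyramid-construction step can be illustrated with a simple box-filter stand-in for the Gaussian smoothing-and-subsampling (a simplification; the system described uses a Gaussian filter):

```python
def downsample(image):
    """Halve an image by averaging 2x2 blocks -- a box-filter stand-in
    for Gaussian smoothing followed by subsampling."""
    h, w = len(image), len(image[0])
    return [[(image[y][x] + image[y][x + 1] +
              image[y + 1][x] + image[y + 1][x + 1]) // 4
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]

def pyramid(image, levels=3):
    """Coarse-to-fine pyramid: level 0 is the input, each level halves."""
    out = [image]
    for _ in range(levels - 1):
        out.append(downsample(out[-1]))
    return out

img = [[x * 8 for x in range(16)] for _ in range(16)]
pyr = pyramid(img)
print([len(level[0]) for level in pyr])  # → [16, 8, 4]
```

Computing disparity and flow on the coarse levels and refining attention regions on the fine level is what keeps the whole pipeline real-time.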

8.
The paper deals with the problems of staircase artifacts and low-contrast boundary smoothing in filtering magnetic resonance (MR) brain tomograms based on geometry-driven diffusion (GDD). A novel method of model-based GDD filtering of MR brain tomograms is proposed to tackle these problems. It is based on a local adaptation of the conductance that is defined for each diffusion iteration within variable limits. The local adaptation uses a neighborhood inhomogeneity measure, pixel dissimilarity, while gradient histograms of MR brain template regions are used as the variable limits for the conductance. A methodology is developed for applying a template image selected from an MR brain atlas to the model-based GDD filtering. The proposed method is tested on an MR brain phantom. The methodology is exemplified on a real MR brain tomogram with the corresponding template selected from the Brainweb. The performance of the developed algorithms is evaluated quantitatively and visually. Received: 1 September 1998 / Accepted: 20 August 2000
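For readers unfamiliar with GDD, the classic Perona-Malik scheme on which such filters build can be sketched as follows. Here the conductance parameter `k` is global; the paper's contribution is precisely to adapt the conductance locally, within variable limits derived from template gradient histograms:

```python
import math

def diffuse(image, iters=10, k=20.0, lam=0.2):
    """Classic Perona-Malik diffusion: conductance g = exp(-(d/k)^2)
    vanishes across strong edges, so smoothing stays inside regions.
    Global k here; model-based GDD adapts it per neighborhood."""
    h, w = len(image), len(image[0])
    img = [row[:] for row in image]
    for _ in range(iters):
        nxt = [row[:] for row in img]
        for y in range(h):
            for x in range(w):
                flux = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, mx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= mx < w:
                        d = img[ny][mx] - img[y][x]
                        flux += math.exp(-(d / k) ** 2) * d
                nxt[y][x] = img[y][x] + lam * flux
        img = nxt
    return img

# A noisy pixel inside the flat region is smoothed away, while the
# strong step edge between the two regions survives.
noisy = [[0.0] * 8 + [100.0] * 8 for _ in range(8)]
noisy[4][2] += 10.0
out = diffuse(noisy)
```

With `lam` at most 0.25 for a 4-neighborhood the explicit scheme stays stable, which is the kind of discretization question the paper's analysis of alternative digital schemes addresses.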

9.
In this paper, we present two novel disk failure recovery methods that utilize the inherent characteristics of video streams for efficient recovery. Whereas the first method exploits the inherent redundancy in video streams (rather than error-correcting codes) to approximately reconstruct data stored on failed disks, the second method exploits the sequentiality of video playback to reduce the overhead of online failure recovery in conventional RAID arrays. For the former approach, we present loss-resilient versions of the JPEG and MPEG compression algorithms. We present an inherently redundant array of disks (IRAD) architecture that combines these loss-resilient compression algorithms with techniques for efficient placement of video streams on disk arrays, ensuring that on-the-fly recovery does not impose any additional load on the array. Together, they enhance the scalability of multimedia servers by (1) integrating the recovery process with the decompression of video streams, thereby distributing the reconstruction process across the clients; and (2) supporting graceful degradation in the quality of recovered images with an increase in the number of disk failures. We present analytical and experimental results to show that both schemes significantly reduce the failure recovery overhead in a multimedia server.

10.
The optimized distance-based access methods currently available for multidimensional indexing in multimedia databases have been developed based on two major assumptions: a suitable distance function is known a priori, and the dimensionality of the image features is low. It is not trivial to define a distance function that best mimics human visual perception regarding image similarity measurements. Reducing high-dimensional image features with the popular principal component analysis (PCA) might not always be possible due to the non-linear correlations that may be present in the feature vectors. We propose in this paper a fast and robust hybrid method for non-linear dimensionality reduction of composite image features for indexing in large image databases. This method incorporates both PCA and non-linear neural network techniques to reduce the dimensionality of feature vectors so that an optimized access method can be applied. To incorporate human visual perception into our system, we also conducted experiments in which a number of subjects classified images into different classes for neural network training. We demonstrate that not only can our neural network system reduce the dimensionality of the feature vectors, but the reduced feature vectors can also be mapped to an optimized access method for fast and accurate indexing. Received 11 June 1998 / Accepted 25 July 2000 Published online: 13 February 2001
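The linear PCA stage can be illustrated in closed form for 2-D features (a toy sketch of the linear part only; the paper couples PCA with a neural network to handle the non-linear correlations):

```python
import math

def pca_1d(points):
    """Project 2-D feature points onto their principal axis, using the
    closed-form leading eigenvector of the 2x2 covariance matrix.
    A toy sketch of the linear PCA stage only."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)  # principal-axis angle
    ux, uy = math.cos(theta), math.sin(theta)
    return [(x - mx) * ux + (y - my) * uy for x, y in points]

# Points on the line y = 2x: one axis carries all the variance (41.25).
pts = [(float(t), 2.0 * t) for t in range(10)]
proj = pca_1d(pts)
captured = sum(p * p for p in proj) / len(proj)
print(round(captured, 2))  # → 41.25
```

When the correlation is non-linear (e.g., points on a curve), no single axis captures the variance this way, which is the motivation for adding the neural network stage.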

11.
Algorithms for coplanar camera calibration
Abstract. Coplanar camera calibration is the process of determining the extrinsic and intrinsic camera parameters from a given set of image and world points, when the world points lie on a two-dimensional plane. Noncoplanar calibration, on the other hand, involves world points that do not lie on a plane. While optimal solutions for both the camera-calibration procedures can be obtained by solving a set of constrained nonlinear optimization problems, there are significant structural differences between the two formulations. We investigate the computational and algorithmic implications of such underlying differences, and provide a set of efficient algorithms that are specifically tailored for the coplanar case. More specifically, we offer the following: (1) four algorithms for coplanar calibration that use linear or iterative linear methods to solve the underlying nonlinear optimization problem, and produce sub-optimal solutions. These algorithms are motivated by their computational efficiency and are useful for real-time low-cost systems. (2) Two optimal solutions for coplanar calibration, including one novel nonlinear algorithm. A constraint for the optimal estimation of extrinsic parameters is also given. (3) A Lyapunov type convergence analysis for the new nonlinear algorithm. We test the validity and performance of the calibration procedures with both synthetic and real images. The results consistently show significant improvements over less complete camera models. Received: 30 September 1998 / Accepted: 12 January 2000

12.
Motion picture films are susceptible to local degradations such as dust spots, as well as global deteriorations such as intensity and spatial jitter. It is obvious that motion needs to be compensated for before the detection and correction of such local and dynamic defects. Therefore, we propose a hierarchical motion estimation method ideally suited for high-resolution film sequences. This recursive block-based motion estimator relies on an adaptive search strategy and Radon projections to improve processing speed. The localization of dust particles then becomes straightforward: it is achieved by simple inter-frame differences between the current image and the motion-compensated preceding and succeeding frames. The detection of spatial and intensity jitter, however, requires a specific process that takes advantage of the high temporal correlation in the image sequence. In this paper, we present our motion compensation-based algorithms for removing dust spots, spatial jitter and intensity jitter in degraded motion pictures. Experimental results are presented showing the usefulness of our motion estimator for film restoration at reasonable computational cost. Received: 9 July 2000 / Accepted: 13 January 2002 Correspondence to: S. Boukir

13.
Fractal-based target detection in underwater acoustic images
For the problem of detecting man-made objects in underwater acoustic images, a method based on fractal analysis is presented. Because fractal models fit natural objects well but differ markedly from man-made objects, using fractal features as the primary cue allows man-made objects to be detected accurately against natural backgrounds. This paper discusses how the fractal dimension is extracted, labels regions of an underwater acoustic image as man-made-target or non-target regions according to their fractal features, and studies the method's application under a certain level of noise interference; corresponding experimental results are given. The experiments show that fractal features can separate man-made targets from natural objects with some robustness to noise, making them suitable for detecting and recognizing targets in underwater acoustic images.

14.
Standard methods for sub-pixel matching are iterative and nonlinear; they are also sensitive to false initialization and window deformation. In this paper, we present a linear method that incorporates information from neighboring pixels. Two algorithms are presented: one 'fast' and one 'robust'. They both start from an initial rough estimate of the matching. The fast one is suitable for pairs of images requiring negligible window deformation. The robust method is slower but more general and more precise. It eliminates false matches in the initialization by using robust estimation of the local affine deformation. The first algorithm attains an accuracy of 0.05 pixels for interest points and 0.06 pixels for random points in the translational case. For the general case, if the deformation is small, the second method gives an accuracy of 0.05 pixels; for large deformation, it gives an accuracy of about 0.06 pixels for interest points and 0.10 pixels for random points. There are very few false matches in all cases, even if there are many in the initialization. Received: 24 July 1997 / Accepted: 4 December 1997
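To see what sub-pixel accuracy means in practice, a common refinement is parabola fitting over a matching-score profile (a generic technique, not the linear neighborhood method the paper proposes):

```python
def subpixel_peak(scores):
    """Refine an integer-pixel best match to sub-pixel precision by
    fitting a parabola through the peak and its two neighbors.
    A generic refinement, illustrative only."""
    i = max(range(len(scores)), key=scores.__getitem__)
    if i == 0 or i == len(scores) - 1:
        return float(i)  # peak at the border: no refinement possible
    l, c, r = scores[i - 1], scores[i], scores[i + 1]
    return i + 0.5 * (l - r) / (l - 2 * c + r)  # parabola vertex offset

# Samples of a correlation-like score curve peaking at x = 3.3.
scores = [-(x - 3.3) ** 2 for x in range(7)]
print(round(subpixel_peak(scores), 6))  # → 3.3
```

For a truly parabolic score profile the recovery is exact; real matching scores deviate from a parabola, which is one source of the 0.05-0.10 pixel error figures quoted above.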

15.
Association Rule Mining algorithms operate on a data matrix (e.g., customers × products) to derive association rules [AIS93b, SA96]. We propose a new paradigm, namely, Ratio Rules, which are quantifiable in that we can measure the "goodness" of a set of discovered rules. We also propose the "guessing error" as a measure of this "goodness": the root-mean-square error of the reconstructed values of the cells of the given matrix, when we pretend that they are unknown. Another contribution is a novel method to guess missing/hidden values from the Ratio Rules that our method derives. For example, if somebody bought $10 of milk and $3 of bread, our rules can "guess" the amount spent on butter. Thus, unlike association rules, Ratio Rules can perform a variety of important tasks such as forecasting, answering "what-if" scenarios, detecting outliers, and visualizing the data. Moreover, we show that we can compute Ratio Rules in a single pass over the data set with small memory requirements (a few small matrices), in contrast to association rule mining methods which require multiple passes and/or large memory. Experiments on several real data sets (e.g., basketball and baseball statistics, biological data) demonstrate that the proposed method (a) leads to rules that make sense; (b) can find large itemsets in binary matrices, even in the presence of noise; and (c) consistently achieves a "guessing error" up to 5 times smaller than that of straightforward column averages. Received: March 15, 1999 / Accepted: November 1, 1999
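The "guessing" step can be sketched for a single rule. Below, `guess_missing` and the milk/bread/butter ratio are hypothetical illustrations; the paper derives its rules from the data matrix itself (via an eigen-analysis) rather than assuming them:

```python
def guess_missing(rule, known):
    """Guess hidden cells from one Ratio Rule, expressed as a direction
    vector over the columns. `known` maps column index -> observed
    value; we fit the scalar multiple of the rule that best matches the
    known cells, then read off the remaining columns. Toy sketch."""
    num = sum(rule[j] * v for j, v in known.items())
    den = sum(rule[j] ** 2 for j in known)
    a = num / den  # least-squares coefficient along the rule
    return {j: a * rule[j] for j in range(len(rule)) if j not in known}

# Hypothetical rule: spending on milk:bread:butter is roughly 10:3:2.
rule = [10.0, 3.0, 2.0]
print(guess_missing(rule, {0: 10.0, 1: 3.0}))  # → {2: 2.0}
```

Comparing such reconstructed cells against the true hidden values over the whole matrix is exactly what the root-mean-square "guessing error" measures.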

16.
Stop word location and identification for adaptive text recognition
Abstract. We propose a new adaptive strategy for text recognition that attempts to derive knowledge about the dominant font on a given page. The strategy uses the linguistic observation that over half of all words in a typical English passage are contained in a small set of fewer than 150 stop words. A small dictionary of such words is compiled from the Brown corpus. An arbitrary text page first goes through layout analysis, which produces word segmentation. A fast procedure is then applied to locate the most likely candidates for those words, using only the widths of the word images. The identity of each word is determined using a word shape classifier. Using the word images together with their identities, character prototypes can be extracted using a previously proposed method. We describe experiments using simulated and real images. In an experiment using 400 real page images, we show that on average, eight distinct characters can be learned from each page, and the method is successful on 90% of all the pages. These can serve as useful seeds to bootstrap font learning. Received October 8, 1999 / Revised March 29, 2000

17.
Binarization of document images with poor contrast, strong noise, complex patterns, and variable modalities in the gray-scale histograms is a challenging problem. A new binarization algorithm has been developed to address this problem for personal cheque images. The main contribution of this approach is optimizing the binarization of a part of the document image that suffers from noise interference, referred to as the Target Sub-Image (TSI), using information easily extracted from another noise-free part of the same image, referred to as the Model Sub-Image (MSI). Simple spatial features extracted from MSI are used as a model for handwriting strokes. This model captures the underlying characteristics of the writing strokes, and is invariant to the handwriting style or content. This model is then utilized to guide the binarization in the TSI. Another contribution is a new technique for the structural analysis of document images, which we call "Wavelet Partial Reconstruction" (WPR). The algorithm was tested on 4,200 cheque images and the results show significant improvement in binarization quality in comparison with other well-established algorithms. Received: October 10, 2001 / Accepted: May 7, 2002 This research was supported in part by NCR and NSERC's industrial postgraduate scholarship No. 239464. A simplified version of this paper has been presented at ICDAR 2001 [3].
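The MSI-to-TSI idea can be sketched with a toy gray-level model. Here `stroke_mean` and the fixed tolerance are illustrative stand-ins for the paper's spatial stroke features and WPR analysis:

```python
def stroke_mean(model, ink_threshold=128):
    """Mean gray level of ink pixels in the clean model sub-image (MSI).
    A toy stand-in for the paper's spatial stroke features."""
    ink = [p for row in model for p in row if p < ink_threshold]
    return sum(ink) / len(ink)

def binarize_with_model(target, mean, tol=40):
    """Binarize the noisy target sub-image (TSI) using the stroke model
    measured on the MSI: pixels near the model stroke mean become ink."""
    return [[1 if abs(p - mean) <= tol else 0 for p in row]
            for row in target]

msi = [[255, 30, 255], [255, 40, 255]]   # clean cheque region, strokes ≈ 35
tsi = [[250, 32, 200], [245, 60, 255]]   # noisy region of the same cheque
m = stroke_mean(msi)
print(binarize_with_model(tsi, m))  # → [[0, 1, 0], [0, 1, 0]]
```

The key point the abstract makes survives even in this toy form: the model is estimated where the image is clean and applied where it is not, so the threshold reflects the writer's strokes rather than the local noise.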

18.
Comparing images using joint histograms
Color histograms are widely used for content-based image retrieval due to their efficiency and robustness. However, a color histogram only records an image's overall color composition, so images with very different appearances can have similar color histograms. This problem is especially critical in large image databases, where many images have similar color histograms. In this paper, we propose an alternative to color histograms called a joint histogram, which incorporates additional information without sacrificing the robustness of color histograms. We create a joint histogram by selecting a set of local pixel features and constructing a multidimensional histogram. Each entry in a joint histogram contains the number of pixels in the image that are described by a particular combination of feature values. We describe a number of different joint histograms, and evaluate their performance for image retrieval on a database with over 210,000 images. On our benchmarks, joint histograms outperform color histograms by an order of magnitude.
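The joint-histogram construction can be sketched with two per-pixel features: a quantized gray level plus a local edge bit, as illustrative stand-ins for the feature sets the paper evaluates:

```python
from collections import Counter

def joint_histogram(image, n_gray=4):
    """Joint histogram over two per-pixel features: a quantized gray
    level and a local 'edge' bit (does the right neighbor differ?).
    Illustrative stand-ins for the paper's feature set."""
    h, counts = Counter(), 0
    for row in image:
        for x, g in enumerate(row[:-1]):
            bucket = g * n_gray // 256
            edge = int(abs(row[x + 1] - g) > 16)
            h[(bucket, edge)] += 1
            counts += 1
    return {k: v / counts for k, v in h.items()}  # normalized

def l1(h1, h2):
    """L1 distance between two (sparse) normalized histograms."""
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in set(h1) | set(h2))

# Both images are half black, half white, so plain color histograms are
# nearly identical -- but the edge feature tells them apart.
blocks = [[0] * 4 + [255] * 4 for _ in range(8)]
stripes = [[255 if x % 2 == 0 else 0 for x in range(8)] for _ in range(8)]
print(l1(joint_histogram(blocks), joint_histogram(stripes)) > 0.5)  # → True
```

Each histogram entry keyed by a feature tuple is exactly the "particular combination of feature values" described above; adding features multiplies the number of bins, not the per-pixel cost.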

19.
In this paper, we propose a novel system that strives to achieve advanced content-based image retrieval through the seamless combination of two complementary approaches: on the one hand, we propose a new color-clustering method to better capture the color properties of the original images; on the other hand, recognizing that image regions extracted from the original images inevitably contain many errors, we make use of the available erroneous, ill-segmented image regions to accomplish object-region-based image retrieval. We also propose an effective image-indexing scheme to facilitate fast and efficient image matching and retrieval. A carefully designed experimental evaluation shows that our proposed image retrieval system surpasses the other methods under comparison in terms of not only quantitative measures, but also image retrieval capabilities.

20.
In this paper, we address the analysis of 3D shape and shape change in non-rigid biological objects imaged via a stereo light microscope. We propose an integrated approach for the reconstruction of 3D structure and the motion analysis for images in which only a few informative features are available. The key components of this framework are: 1) image registration using a correlation-based approach, 2) region-of-interest extraction using motion-based segmentation, and 3) stereo and motion analysis using a cooperative spatial and temporal matching process. We describe these three stages of processing and illustrate the efficacy of the proposed approach using real images of a live frog's ventricle. The reconstructed dynamic 3D structure of the ventricle is demonstrated in our experimental results, and it agrees qualitatively with the observed images of the ventricle.
