首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 11 毫秒
1.
2.
Snakes are active contours that minimize an energy function. We present a new kind of active contours called “Sandwich Snakes”. They are formed by two snakes, one inside and the other outside of the curve that one is looking for. They have the same number of particles, which are connected in one-to-one correspondence. At the minimum the two snakes have the same position. We also present here a multi-scale system, where Sandwich Snakes are adjusted at increasing resolutions, and an interactive tool that permits one to easily specify the initial position for the Sandwich Snakes. Sandwich Snakes exhibit very good perfomance detecting contours with complex shapes, where the traditional methods fail. They are also very robust with respect to noise. Received: 29 January 1999 / Accepted: 20 August 2000  相似文献   

3.
Comparing images using joint histograms   总被引:11,自引:0,他引:11  
Color histograms are widely used for content-based image retrieval due to their efficiency and robustness. However, a color histogram only records an image's overall color composition, so images with very different appearances can have similar color histograms. This problem is especially critical in large image databases, where many images have similar color histograms. In this paper, we propose an alternative to color histograms called a joint histogram, which incorporates additional information without sacrificing the robustness of color histograms. We create a joint histogram by selecting a set of local pixel features and constructing a multidimensional histogram. Each entry in a joint histogram contains the number of pixels in the image that are described by a particular combination of feature values. We describe a number of different joint histograms, and evaluate their performance for image retrieval on a database with over 210,000 images. On our benchmarks, joint histograms outperform color histograms by an order of magnitude.  相似文献   

4.
Fast image retrieval using color-spatial information   总被引:1,自引:0,他引:1  
In this paper, we present an image retrieval system that employs both the color and spatial information of images to facilitate the retrieval process. The basic unit used in our technique is a single-colored cluster, which bounds a homogeneous region of that color in an image. Two clusters from two images are similar if they are of the same color and overlap in the image space. The number of clusters that can be extracted from an image can be very large, and it affects the accuracy of retrieval. We study the effect of the number of clusters on retrieval effectiveness to determine an appropriate value for “optimal' performance. To facilitate efficient retrieval, we also propose a multi-tier indexing mechanism called the Sequenced Multi-Attribute Tree (SMAT). We implemented a two-tier SMAT, where the first layer is used to prune away clusters that are of different colors, while the second layer discriminates clusters of different spatial locality. We conducted an experimental study on an image database consisting of 12,000 images. Our results show the effectiveness of the proposed color-spatial approach, and the efficiency of the proposed indexing mechanism. Received August 1, 1997 / Accepted December 9, 1997  相似文献   

5.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation. This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used as a qualitative estimate of change in the motion direction and correspondence of vertical segment provides precise 3D information about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model of the scene is generated. The robot system is implemented and tested in a structured environment at our research center. Results from the robot navigation in real environments are presented and discussed. Received: 25 September 1996 / Accepted: 20 October 1996  相似文献   

6.
Analyzing scenery images by monotonic tree   总被引:3,自引:0,他引:3  
Content-based image retrieval (CBIR) has been an active research area in the last ten years, and a variety of techniques have been developed. However, retrieving images on the basis of low-level features has proven unsatisfactory, and new techniques are needed to support high-level queries. Research efforts are needed to bridge the gap between high-level semantics and low-level features. In this paper, we present a novel approach to support semantics-based image retrieval. Our approach is based on the monotonic tree, a derivation of the contour tree for use with discrete data. The structural elements of an image are modeled as branches (or subtrees) of the monotonic tree. These structural elements are classified and clustered on the basis of such properties as color, spatial location, harshness and shape. Each cluster corresponds to some semantic feature. This scheme is applied to the analysis and retrieval of scenery images. Comparisons of experimental results of this approach with conventional techniques using low-level features demonstrate the effectiveness of our approach.  相似文献   

7.
A feature-based algorithm for detecting and classifying production effects   总被引:22,自引:0,他引:22  
We describe a new approach to the detection and classification of production effects in video sequences. Our method can detect and classify a variety of effects, including cuts, fades, dissolves, wipes and captions, even in sequences involving significant motion. We detect the appearance of intensity edges that are distant from edges in the previous frame. A global motion computation is used to handle camera or object motion. The algorithm we propose withstands JPEG and MPEG artifacts, even at high compression rates. Experimental evidence demonstrates that our method can detect and classify production effects that are difficult to detect with previous approaches.  相似文献   

8.
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access index methods such as the R-trees and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply the vantage point tree (vp-tree) method. There are two important problems for the vp-tree method that warrant further investigation, the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset and the desired number of nearest neighbors, n. Experiments also show that the searching in the vp-tree is more efficient than that for the -tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are efficient and effective. Finally, we investigate the problem of selecting vantage-point, propose a few alternative methods, and study their impact on the number of distance computation. Received June 9, 1998 / Accepted January 31, 2000  相似文献   

9.
We present several algorithms suitable for analysis of broadcast video. First, we show how wavelet analysis of frames of video can be used to detect transitions between shots in a video stream, thereby dividing the stream into segments. Next we describe how each segment can be inserted into a video database using an indexing scheme that involves a wavelet-based “signature.” Finally, we show that during a subsequent broadcast of a similar or identical video clip, the segment can be found in the database by quickly searching for the relevant signature. The method is robust against noise and typical variations in the video stream, even global changes in brightness that can fool histogram-based techniques. In the paper, we compare experimentally our shot transition mechanism to a color histogram implementation, and also evaluate the effectiveness of our database-searching scheme. Our algorithms are very efficient and run in realtime on a desktop computer. We describe how this technology could be employed to construct a “smart VCR” that was capable of alerting the viewer to the beginning of a specific program or identifying  相似文献   

10.
In this paper, we propose a multi-level abstraction mechanism for capturing the spatial and temporal semantics associated with various objects in an input image or in a sequence of video frames. This abstraction can manifest itself effectively in conceptualizing events and views in multimedia data as perceived by individual users. The objective is to provide an efficient mechanism for handling content-based queries, with the minimum amount of processing performed on raw data during query evaluation. We introduce a multi-level architecture for video data management at different levels of abstraction. The architecture facilitates a multi-level indexing/searching mechanism. At the finest level of granularity, video data can be indexed based on mere appearance of objects and faces. For management of information at higher levels of abstractions, an object-oriented paradigm is proposed which is capable of supporting domain specific views.  相似文献   

11.
Robust and efficient surface reconstruction from contours   总被引:1,自引:0,他引:1  
We propose a new approach for surface recovery from planar sectional contours. The surface is reconstructed based on the so-called “equal importance criterion,” which suggests that every point in the region contributes equally to the reconstruction process. The problem is then formulated in terms of a partial differential equation, and the solution is efficiently calculated from distance transformation. To make the algorithm valid for different application purposes, both the isosurface and the primitive representations of the object surface are derived. The isosurface is constructed by means of a partial differential equation, which can be solved iteratively. The traditional distance interpolating method, which was used by several researchers for surface reconstruction, is an approximate solution of the equation. The primitive representations are approximated by Voronoi diagram transformation of the surface space. Isosurfaces have the advantage that subsequent geometric analysis of the object can be easily carried out while primitive representation is easy to visualize. The proposed technique allows for surface recovery at any desired resolution, thus avoiding the inherent problems of correspondence, tiling, and branching.  相似文献   

12.
13.
The design and implementation of Harbin Institute of Technology—Digital Music Library (HIT-DML) is presented in this paper. Firstly, a novel framework, a music data model, and a query language are proposed as the theoretical foundation of the library. Secondly, music computing algorithms used in the library for feature extracting and matching are described. In addition, indices are introduced for both mining themes of music objects and accelerating content-based information retrieval. Finally, experimental results on the indices and the current development of the library are provided. HIT-DML is distinguished by the following points. First, it is inherently based on database systems, and combines database technologies with multimedia technologies seamlessly. Musical data are structurally stored. Second, it has a solid theoretical foundation, from framework and data model to query language. Last, it can retrieve musical information based on content against different kinds of musical instruments. The indices used, also power the library.  相似文献   

14.
This paper provides an overview of a project aimed at using knowledge-based technology to improve accessibility of the Web for visually impaired users. The focus is on the multi-dimensional components of Web pages (tables and frames); our cognitive studies demonstrate that spatial information is essential in comprehending tabular data, and this aspect has been largely overlooked in the existing literature. Our approach addresses these issues by using explicit representations of the navigational semantics of the documents and using a domain-specific language to query the semantic representation and derive navigation strategies. Navigational knowledge is explicitly generated and associated to the tabular and multi-dimensional HTML structures of documents. This semantic representation provides to the blind user an abstract representation of the layout of the document; the user is then allowed to issue commands from the domain-specific language to access and traverse the document according to its abstract layout. Published online: 6 November 2002  相似文献   

15.
This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well-known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing will be used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the TREC collection shows that the linguistically motivated weighting algorithm outperforms the popular BM25 weighting algorithm. Received: 17 December 1998 / Revised: 31 May 1999  相似文献   

16.
NeTra: A toolbox for navigating large image databases   总被引:17,自引:0,他引:17  
We present here an implementation of NeTra, a prototype image retrieval system that uses color, texture, shape and spatial location information in segmented image regions to search and retrieve similar regions from the database. A distinguishing aspect of this system is its incorporation of a robust automated image segmentation algorithm that allows object- or region-based search. Image segmentation significantly improves the quality of image retrieval when images contain multiple complex objects. Images are segmented into homogeneous regions at the time of ingest into the database, and image attributes that represent each of these regions are computed. In addition to image segmentation, other important components of the system include an efficient color representation, and indexing of color, texture, and shape features for fast search and retrieval. This representation allows the user to compose interesting queries such as “retrieve all images that contain regions that have the color of object A, texture of object B, shape of object C, and lie in the upper of the image”, where the individual objects could be regions belonging to different images. A Java-based web implementation of NeTra is available at http://vivaldi.ece.ucsb.edu/Netra.  相似文献   

17.
We present an efficient and accurate method for retrieving images based on color similarity with a given query image or histogram. The method matches the query against parts of the image using histogram intersection. Efficient searching for the best matching subimage is done by pruning the set of subimages using upper bound estimates. The method is fast, has high precision and recall and also allows queries based on the positions of one or more objects in the database image. Experimental results showing the efficiency of the proposed search method, and high precision and recall of retrieval are presented. Received: 20 January 1997 / Accepted: 5 January 1998  相似文献   

18.
Relevance feedback in image retrieval: A comprehensive review   总被引:22,自引:1,他引:22  
We analyze the nature of the relevance feedback problem in a continuous representation space in the context of content-based image retrieval. Emphasis is put on exploring the uniqueness of the problem and comparing the assumptions, implementations, and merits of various solutions in the literature. An attempt is made to compile a list of critical issues to consider when designing a relevance feedback algorithm. With a comprehensive review as the main portion, this paper also offers some novel solutions and perspectives throughout the discussion. RID="*" ID="*" Work was done while at the University of Illinois.  相似文献   

19.
In this paper, we propose a novel system that strives to achieve advanced content-based image retrieval using seamless combination of two complementary approaches: on the one hand, we propose a new color-clustering method to better capture color properties of the original images; on the other hand, expecting that image regions acquired from the original images inevitably contain many errors, we make use of the available erroneous, ill-segmented image regions to accomplish the object-region-based image retrieval. We also propose an effective image-indexing scheme to facilitate fast and efficient image matching and retrieval. The carefully designed experimental evaluation shows that our proposed image retrieval system surpasses other methods under comparison in terms of not only quantitative measures, but also image retrieval capabilities.  相似文献   

20.
This paper describes the development of an anthropomorphic visual sensor which generates a spatially variant resolution image by using a retina-like structure. This sensor consists of a dove prism for image rotation and two linear CCD sensors with 512 pixel/line resolution and holds approximately 45 kbytes of image data. The retina-like sensor has variable resolution with increasing density towards the center of the visual field and yields a polar-coordinate image directly. The motion analysis of the object in the scene from the optical flow is considerably simplified if the velocity is represented in polar coordinates, compared to the case when the image is represented in cartesian coordinates. A calibration procedure for the proposed retina-like sensor is also presented with experimental data to verify the validity of the system. Development of this sensor holds promise in applications to high-speed tracking systems, such as the eyes of navigation robots, because it has data reduction and polar mapping characteristics.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号