Similar literature
Found 20 similar documents; search took 31 ms
1.
NeTra: A toolbox for navigating large image databases   Total citations: 17, self-citations: 0, citations by others: 17
We present here an implementation of NeTra, a prototype image retrieval system that uses color, texture, shape and spatial location information in segmented image regions to search and retrieve similar regions from the database. A distinguishing aspect of this system is its incorporation of a robust automated image segmentation algorithm that allows object- or region-based search. Image segmentation significantly improves the quality of image retrieval when images contain multiple complex objects. Images are segmented into homogeneous regions at the time of ingest into the database, and image attributes that represent each of these regions are computed. In addition to image segmentation, other important components of the system include an efficient color representation, and indexing of color, texture, and shape features for fast search and retrieval. This representation allows the user to compose interesting queries such as “retrieve all images that contain regions that have the color of object A, texture of object B, shape of object C, and lie in the upper part of the image”, where the individual objects could be regions belonging to different images. A Java-based web implementation of NeTra is available at http://vivaldi.ece.ucsb.edu/Netra.

2.
In order to get useful information from various kinds of information sources, we first apply a searching process with query statements to retrieve candidate data objects (called a hunting process in this paper) and then apply a browsing process to check the properties of each object in detail by visualizing candidates. In traditional information retrieval systems, the hunting process determines the quality of the result, since there are only a few candidates left for the browsing process. In order to retrieve data from widely distributed digital libraries, the browsing process becomes very important, since the properties of data sources are not known in advance. After getting data from various information sources, a user checks the properties of data in detail using the browsing process. The result can be used to improve the hunting process or for selecting more appropriate visualization parameters. Visualizing relationships among data is very important, but becomes too time-consuming if the amount of data in the candidate set is large, for example, over one hundred objects. One of the important problems in handling information retrieval from a digital library is to create efficient and powerful visualization mechanisms for the browsing process. One promising way to solve the visualization problem is to map each candidate data object into a location in three-dimensional (3D) space using a proper distance definition. In this paper, we will introduce the functions and organization of a system having a browsing navigator to achieve an efficient browsing process in 3D information search space. This browsing navigator has the following major functions: (1) selection of features which determine the distance for visualization, in order to generate a uniform distribution of candidate data objects in the resulting space; (2) calculation of the location of the data objects in 2D space using the selected features; (3) construction of 3D browsing space by combining 2D spaces, in order to find the required data objects easily; (4) generation of the oblique views of 3D browsing space and data objects by reducing the overlap of data objects, in order to make navigation easy for the user in 3D space. Examples of this browsing navigator applied to book data are shown. Received: 15 December 1997 / Revised: June 1999

3.
This paper describes a method for recognizing partially occluded objects under different levels of illumination brightness by using eigenspace analysis. In our previous work, we developed the “eigenwindow” method to recognize partially occluded objects in an assembly task, and demonstrated that the method works successfully, with performance sufficient for industrial use, for multiple objects with specularity under constant illumination. In this paper, we modify the eigenwindow method for recognizing objects under different illumination conditions, as is sometimes the case in manufacturing environments, by using additional color information. In the proposed method, a measured color in the RGB color space is transformed into one in the HSV color space. Then, the hue of the measured color, which is invariant to change in illumination brightness and direction, is used for recognizing multiple objects under different illumination conditions. The proposed method was applied to real images of multiple objects under various illumination conditions, and the objects were recognized and localized successfully.
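
The hue-invariance property that the abstract above relies on can be checked with a minimal sketch. It uses Python's standard `colorsys` module, not the authors' implementation; the sample colour and the brightness factors are illustrative assumptions, and the check covers only a uniform change in brightness.

```python
import colorsys
import math


def hue_of(r, g, b):
    """Return the HSV hue (0..1) of an RGB colour with components in 0..1."""
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h


# Hypothetical surface colour, observed under three brightness levels.
base = (0.8, 0.4, 0.2)
hues = [hue_of(*(k * c for c in base)) for k in (1.0, 0.5, 0.25)]

# Uniform scaling of R, G and B changes the value (V) channel
# but leaves the hue unchanged.
hue_is_stable = all(math.isclose(h, hues[0], abs_tol=1e-9) for h in hues)
```

Under this simplified model, a recognizer that compares hue rather than raw RGB values sees the same feature for a bright and a dim view of the same surface.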

4.
Detection, segmentation, and classification of specific objects are the key building blocks of a computer vision system for image analysis. This paper presents a unified model-based approach to these three tasks. It is based on using unsupervised learning to find a set of templates specific to the objects being outlined by the user. The templates are formed by averaging the shapes that belong to a particular cluster, and are used to guide a probabilistic search through the space of possible objects. The main difference from previously reported methods is the use of on-line learning, ideal for highly repetitive tasks. This results in faster and more accurate object detection, as system performance improves with continued use. Further, the information gained through clustering and user feedback is used to classify the objects for problems in which shape is relevant to the classification. The effectiveness of the resulting system is demonstrated in two applications: a medical diagnosis task using cytological images, and a vehicle recognition task. Received: 5 November 2000 / Accepted: 29 June 2001 Correspondence to: K.-M. Lee

5.
We present a novel approach to the robust classification of arbitrary object classes in complex, natural scenes. Starting from a re-appraisal of Marr's ‘primal sketch’, we develop an algorithm that (1) employs local orientations as the fundamental picture primitives, rather than the more usual edge locations, (2) retains and exploits the local spatial arrangement of features of different complexity in an image and (3) is hierarchically arranged so that the level of feature abstraction increases at each processing stage. The resulting simple technique is based on the accumulation of evidence in binary channels, followed by a weighted, non-linear sum of the evidence accumulators. The steps involved in designing a template for recognizing a simple object are explained. The practical application of the algorithm is illustrated, with examples taken from a broad range of object classification problems. We discuss the performance of the algorithm and describe a hardware implementation. First successful attempts to train the algorithm automatically are presented. Finally, we compare our algorithm with other object classification algorithms described in the literature.

6.
In this paper, we present a method called MODEEP (Motion-based Object DEtection and Estimation of Pose) to detect independently moving objects (IMOs) in forward-looking infrared (FLIR) image sequences taken from an airborne, moving platform. Ego-motion effects are removed through a robust multi-scale affine image registration process. Thereafter, areas with residual motion indicate potential object activity. These areas are detected, refined and selected using a Bayesian classifier. The resulting regions are clustered into pairs such that each pair represents one object's front and rear end. Using motion and scene knowledge, we estimate object pose and establish a region of interest (ROI) for each pair. Edge elements within each ROI are used to segment the convex cover containing the IMO. We show detailed results on real, complex, cluttered and noisy sequences. Moreover, we outline the integration of our fast and robust system into a comprehensive automatic target recognition (ATR) and action classification system.

7.
When implementing persistent objects on a relational database, a major performance issue is prefetching data to minimize the number of round-trips to the database. This is especially hard with navigational applications, since future accesses are unpredictable. We propose the use of the context in which an object is loaded as a predictor of future accesses, where a context can be a stored collection of relationships, a query result, or a complex object. When an object O's state is loaded, similar state for other objects in O's context is prefetched. We present a design for maintaining context and for using it to guide prefetch. We give performance measurements of its implementation in Microsoft Repository, showing up to a 70% reduction in running time. We describe several variations of the optimization: selectively applying the technique based on application and database characteristics, using application-supplied performance hints, using concurrent database queries to support asynchronous prefetch, prefetching across relationship paths, and delayed prefetch to save database round-trips. Received May 3, 2000 / Accepted October 26, 2000
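
The context-based prefetch idea in the abstract above can be illustrated with a small sketch. It is not the Microsoft Repository design: `FakeStore` and `ContextLoader` are hypothetical names, the "database" is an in-memory dictionary that only counts round-trips, and a context is assumed to be simply the list of object ids returned together by one query.

```python
class FakeStore:
    """Stand-in for a relational database that counts round-trips."""

    def __init__(self, rows):
        self.rows = rows          # object id -> attribute dict
        self.round_trips = 0

    def fetch(self, ids):
        """Fetch state for several ids in one round-trip."""
        self.round_trips += 1
        return {i: self.rows[i] for i in ids}


class ContextLoader:
    """Loads object state, prefetching for every object in the same context."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.context_of = {}      # object id -> ids that were loaded together

    def register_context(self, ids):
        members = list(ids)
        for i in members:
            self.context_of[i] = members

    def load(self, obj_id):
        if obj_id not in self.cache:
            # Fetch every uncached member of the context, not just obj_id.
            wanted = [i for i in self.context_of.get(obj_id, [obj_id])
                      if i not in self.cache]
            self.cache.update(self.store.fetch(wanted))
        return self.cache[obj_id]


store = FakeStore({i: {"name": f"obj{i}"} for i in range(5)})
loader = ContextLoader(store)
loader.register_context(range(5))        # e.g. the ids returned by one query
names = [loader.load(i)["name"] for i in range(5)]
```

In this toy setup, navigating through all five objects costs one round-trip instead of five, because the first load brings in the state of the whole context.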

8.
Conventional tracking methods encounter difficulties as the number of objects, clutter, and sensors increase, because of the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative that avoids data association. It estimates the number of trips made from one region to another in a scene based on interregion boundary traffic counts accumulated over time. It is not necessary to track an object through a scene to determine when an object crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of pedestrian and vehicular traffic intensities at an intersection over a period of 1 month. We compare the results with those from a multiple-hypothesis tracker and manually counted ground-truth estimates. Received: 30 August 2001 / Accepted: 28 May 2002 Correspondence to: J.E. Boyd

9.
Multimedia applications that are required to manipulate large collections of objects are becoming increasingly common. Moreover, multimedia objects, which are already huge, are getting even bigger as the resolution of output devices improves. As a result, many multimedia storage systems are not likely to be able to keep all of their objects disk-resident. Instead, a majority of the less popular objects have to be off-loaded to tertiary storage to keep costs down. The speed at which objects can be accessed from tertiary storage is thus an important consideration. In this paper, we propose an adaptive data retrieval algorithm that employs a combination of staging and direct access in servicing tertiary storage retrieval requests. At retrieval time, an object that resides in tertiary storage can either be staged to and then played back from disks, or the object can be accessed directly from the tertiary drives. We show that a simplistic policy that adheres strictly to staging or direct access does not exploit the full retrieval capacity of both the tertiary library and the secondary storage. To overcome the problem, we propose a data retrieval algorithm that dynamically chooses between staging and direct access, based on the relative load on the tertiary versus secondary devices. A series of simulation experiments confirms that the algorithm achieves good access times over a wide range of workloads and resource configurations. Moreover, the algorithm is very responsive to changing load conditions.

10.
A bin picking system based on depth from defocus   Total citations: 3, self-citations: 0, citations by others: 3
It is generally accepted that to develop versatile bin-picking systems capable of grasping and manipulation operations, accurate 3-D information is required. To accomplish this goal, we have developed a fast and precise range sensor based on active depth from defocus (DFD). This sensor is used in conjunction with a three-component vision system, which is able to recognize and evaluate the attitude of 3-D objects. The first component performs scene segmentation using an edge-based approach. Since edges are used to detect the object boundaries, a key issue consists of improving the quality of edge detection. The second component attempts to recognize the object placed on the top of the object pile using a model-driven approach in which the segmented surfaces are compared with those stored in the model database. Finally, the attitude of the recognized object is evaluated using an eigenimage approach augmented with range data analysis. The full bin-picking system will be outlined, and a number of experimental results will be examined. Received: 2 December 2000 / Accepted: 9 September 2001 Correspondence to: O. Ghita

11.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter. The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user, smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance. Received: 31 January 2000 / Accepted: 1 May 2001 Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)

12.
We present a new active vision technique called zoom tracking. Zoom tracking is the continuous adjustment of a camera's focal length in order to keep a constant-sized image of an object moving along the camera's optical axis. Two methods for performing zoom tracking are presented: a closed-loop visual feedback algorithm based on optical flow, and use of depth information obtained from an autofocus camera's range sensor. We explore two uses of zoom tracking: recovery of depth information and improving the performance of scale-variant algorithms. We show that the image stability provided by zoom tracking improves the performance of algorithms that are scale variant, such as correlation-based trackers. While zoom tracking cannot totally compensate for an object's motion, due to the effect of perspective distortion, an analysis of this distortion provides a quantitative estimate of the performance of zoom tracking. Zoom tracking can be used to reconstruct a depth map of the tracked object. We show that under normal circumstances this reconstruction is much more accurate than depth from zooming, and works over a greater range than depth from axial motion while providing, in the worst case, only slightly less accurate results. Finally, we show how zoom tracking can also be used in time-to-contact calculations. Received: 15 February 2000 / Accepted: 19 June 2000
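
The relationship behind zoom tracking in the abstract above can be sketched under an idealized pinhole camera model, where an object of size S at depth Z projects to an image of size f * S / Z. This is an assumption made for illustration, not the authors' algorithm: the function names are hypothetical, the object is treated as flat, and the perspective distortion the abstract mentions is ignored.

```python
def image_size(focal_length, object_size, depth):
    """Projected size of a flat object under an ideal pinhole model."""
    return focal_length * object_size / depth


def zoom_for_constant_size(focal_length, old_depth, new_depth):
    """Focal length that keeps the image size constant after a depth change."""
    if old_depth <= 0 or new_depth <= 0:
        raise ValueError("depths must be positive")
    return focal_length * new_depth / old_depth


def depth_from_zoom(old_depth, old_focal_length, new_focal_length):
    """Recover the new depth from the focal-length change made while tracking."""
    return old_depth * new_focal_length / old_focal_length


# Hypothetical numbers: a 0.5 m object moves from 2 m to 4 m away.
f0, size, z0, z1 = 50.0, 0.5, 2.0, 4.0
f1 = zoom_for_constant_size(f0, z0, z1)
before = image_size(f0, size, z0)
after = image_size(f1, size, z1)
recovered_depth = depth_from_zoom(z0, f0, f1)
```

Because the focal length scales with depth in this model, the ratio of focal lengths also gives the ratio of depths, which is the sense in which zoom tracking yields depth information.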

13.
This paper introduces an accurate, efficient, and unified engine dedicated to dynamic animation of d-dimensional deformable objects. The objects are modelled as d-dimensional manifolds defined as functional combinations of a mesh of 3D control points, weighted by parametric blending functions. This model ensures that, at each time step, the object shape conforms to its manifold definition. The object motion is deduced from the dynamic animation of the control points. In fact, control points should be viewed as the degrees of freedom of the continuous object. The chosen dynamic equations (Lagrangian formalism) reflect this generic modelling scheme and yield an exact and computationally efficient linear system.

14.
Converting paper-based engineering drawings into CAD model files is a tedious process. Therefore, automating the conversion of such drawings represents tremendous time and labor savings. We present a complete system which interprets such 2D paper-based engineering drawings, and outputs 3D models that can be displayed as wireframes. The system performs the detection of dimension sets, the extraction of object lines, and the assembly of 3D objects from the extracted object lines. A knowledge-based method is used to remove dimension sets and text from ANSI engineering drawings, a graphics recognition procedure is used to extract complete object lines, and an evidential rule-based method is utilized to identify view relationships. While these methods are the subject of several of our previous papers, this paper focuses on the 3D interpretation of the object. This is accomplished using a technique based on evidential reasoning and a wide range of rules and heuristics. The system is limited to the interpretation of objects composed of planar, spherical, and cylindrical surfaces. Experimental results are presented. Received December 2, 1998 / Revised June 18, 1999

15.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.

16.
Symbolic images are composed of a finite set of symbols that have a semantic meaning. Examples of symbolic images include maps (where the semantic meaning of the symbols is given in the legend), engineering drawings, and floor plans. Two approaches for supporting queries on symbolic-image databases that are based on image content are studied. The classification approach preprocesses all symbolic images and attaches a semantic classification and an associated certainty factor to each object that it finds in the image. The abstraction approach describes each object in the symbolic image by using a vector consisting of the values of some of its features (e.g., shape, genus, etc.). The approaches differ in the way in which responses to queries are computed. In the classification approach, images are retrieved on the basis of whether or not they contain objects that have the same classification as the objects in the query. On the other hand, in the abstraction approach, retrieval is on the basis of similarity of feature vector values of these objects. Methods of integrating these two approaches into a relational multimedia database management system so that symbolic images can be stored and retrieved based on their content are described. Schema definitions and indices that support query specifications involving spatial as well as contextual constraints are presented. Spatial constraints may be based on both locational information (e.g., distance) and relational information (e.g., north of). Different strategies for image retrieval for a number of typical queries using these approaches are described. Estimated costs are derived for these strategies. Results are reported of a comparative study of the two approaches in terms of image insertion time, storage space, retrieval accuracy, and retrieval time. Received June 12, 1998 / Accepted October 13, 1998
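
The difference between the two retrieval approaches in the abstract above can be shown with a toy sketch. Everything in it is an illustrative assumption rather than the authors' schema: the in-memory `DATABASE`, the class labels, the certainty factors, the two-component feature vectors, and the Euclidean distance threshold.

```python
import math

# Hypothetical toy database: image id -> objects found in that image.
# Each object carries a class label, a certainty factor, and a feature vector.
DATABASE = {
    "map1": [("hotel", 0.9, (1.0, 0.2)), ("museum", 0.8, (0.1, 0.9))],
    "map2": [("airport", 0.7, (0.9, 0.3))],
    "map3": [("museum", 0.6, (0.2, 0.8))],
}


def retrieve_by_classification(query_label, min_certainty=0.5):
    """Classification approach: match on the stored semantic label."""
    return sorted(
        image for image, objects in DATABASE.items()
        if any(label == query_label and cf >= min_certainty
               for label, cf, _vec in objects)
    )


def retrieve_by_abstraction(query_vec, max_distance=0.2):
    """Abstraction approach: match on feature-vector similarity."""
    return sorted(
        image for image, objects in DATABASE.items()
        if any(math.dist(query_vec, vec) <= max_distance
               for _label, _cf, vec in objects)
    )


by_label = retrieve_by_classification("hotel")
by_features = retrieve_by_abstraction((1.0, 0.2))
```

With this toy data, the label query returns only the image whose object was classified as a hotel, while the feature query also returns an image containing a similar-looking symbol that was classified differently.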

17.
This paper describes a laser-based computer vision system used for automatic fruit recognition. It is based on an infrared laser range-finder sensor that provides range and reflectance images and is designed to detect spherical objects in non-structured environments. Image analysis algorithms integrate both range and reflectance information to generate four characteristic primitives which give evidence of the existence of spherical objects. The output of this vision system includes 3D position, radius and surface reflectivity of each spherical object. It has been applied to the AGRIBOT orange harvesting robot, where it has obtained good fruit detection rates and few false detections.

18.
Issues in the design of a storage server for video-on-demand   Total citations: 2, self-citations: 0, citations by others: 2
We examine issues related to the design of a storage server for video-on-demand (VOD) applications. The storage medium considered is magnetic disks or arrays of disks. We investigate disk scheduling policies, buffer management policies and I/O bus protocol issues. We derive the number of sessions that can be supported from a single disk or an array of disks and determine the amount of buffering required to support a given number of users. Furthermore, we propose a scheduling mechanism for disk accesses that significantly lowers the buffer-size requirements in the case of disk arrays. The buffer size required under the proposed scheme is independent of the number of disks in the array. This property allows for striping video content over a large number of disks to achieve higher concurrency in access to a particular video object. This enables the server to satisfy hundreds of independent requests to the same video object or to hundreds of different objects while storing only one copy of each video object. The reliability implications of striping content over a large number of disks are addressed and two solutions are proposed. Finally, we examine various policies for dealing with disk thermal calibration and the placement of videos on disks and disk arrays.

19.
A database model for object dynamics   Total citations: 1, self-citations: 0, citations by others: 1
To effectively model complex applications in which constantly changing situations can be represented, a database system must be able to support the runtime specification of structural and behavioral nuances for objects on an individual or group basis. This paper introduces the role mechanism as an extension of object-oriented databases to support unanticipated behavioral oscillations for objects that may attain many types and share a single object identity. A role refers to the ability to represent object dynamics by seamlessly integrating idiosyncratic behavior, possibly in response to external events, with pre-existing object behavior specified at instance creation time. In this manner, the same object can simultaneously be an instance of different classes which symbolize the different roles that this object assumes. The role concept and its underlying linguistic scheme simplify the design requirements of complex applications that need to create and manipulate dynamic objects. Edited by D. McLeod / Received March 1994 / Accepted January 1996

20.
Requirements for choosing off-the-shelf information systems (OISR) differ from requirements for development of new information systems in that they do not necessarily provide complete specifications, thus allowing flexibility in matching an existing IS to the stated needs. We present a framework for OISR conceptual models that consists of four essential elements: business processes, business rules, information objects and required system services. We formalise the definitions of these concepts based on an ontological model. The ontology-based OISR model provides a framework to evaluate modelling languages on how appropriate they are for OISR requirements specifications. The evaluation framework is applied to the Object-Process Methodology, and its results are compared with a similar evaluation of ARIS. This comparison demonstrates the effectiveness of the ontological framework for evaluating modelling tools on how well they can guide selection, implementation and integration of purchased software packages.
