Found 20 similar documents (search time: 31 ms)
1.
NeTra: A toolbox for navigating large image databases (total citations: 17; self-citations: 0; cited by others: 17)
We present here an implementation of NeTra, a prototype image retrieval system that uses color, texture, shape and spatial
location information in segmented image regions to search and retrieve similar regions from the database. A distinguishing
aspect of this system is its incorporation of a robust automated image segmentation algorithm that allows object- or region-based
search. Image segmentation significantly improves the quality of image retrieval when images contain multiple complex objects.
Images are segmented into homogeneous regions at the time of ingest into the database, and image attributes that represent
each of these regions are computed. In addition to image segmentation, other important components of the system include an
efficient color representation, and indexing of color, texture, and shape features for fast search and retrieval. This representation
allows the user to compose interesting queries such as “retrieve all images that contain regions that have the color of object
A, texture of object B, shape of object C, and lie in the upper part of the image”, where the individual objects could be regions
belonging to different images. A Java-based web implementation of NeTra is available at http://vivaldi.ece.ucsb.edu/Netra.
2.
In order to get useful information from various kinds of information sources, we first apply a searching process with query
statements to retrieve candidate data objects (called a hunting process in this paper) and then apply a browsing process to
check the properties of each object in detail by visualizing candidates. In traditional information retrieval systems, the
hunting process determines the quality of the result, since there are only a few candidates left for the browsing process.
In order to retrieve data from widely distributed digital libraries, the browsing process becomes very important, since the
properties of data sources are not known in advance. After getting data from various information sources, a user checks the
properties of data in detail using the browsing process. The result can be used to improve the hunting process or for selecting
more appropriate visualization parameters. Visualizing relationships among data is very important, but will become too
time-consuming if the amount of data in the candidate set is large, for example, over one hundred objects. One of the important
problems in handling information retrieval from a digital library is to create efficient and powerful visualization mechanisms
for the browsing process. One promising way to solve the visualization problem is to map each candidate data object into a
location in three-dimensional (3D) space using a proper distance definition. In this paper, we will introduce the functions
and organization of a system having a browsing navigator to achieve an efficient browsing process in 3D information search
space. This browsing navigator has the following major functions:
(1) Selection of features which determine the distance for visualization, in order to generate a uniform distribution of candidate data objects in the resulting space.
(2) Calculation of the location of the data objects in 2D space using the selected features.
(3) Construction of 3D browsing space by combining 2D spaces, in order to find the required data objects easily.
(4) Generation of the oblique views of 3D browsing space and data objects by reducing the overlap of data objects in order to make navigation easy for the user in 3D space.
Examples of this browsing navigator applied to book data are shown.
Received: 15 December 1997 / Revised: June 1999
3.
This paper describes a method for recognizing partially occluded objects under different levels of illumination brightness
by using the eigenspace analysis. In our previous work, we developed the “eigenwindow” method to recognize the partially occluded
objects in an assembly task, and demonstrated that the method works successfully, with performance sufficient for industrial use,
for multiple objects with specularity under constant illumination. In this paper, we modify the eigenwindow method
for recognizing objects under different illumination conditions, as is sometimes the case in manufacturing environments, by
using additional color information. In the proposed method, a measured color in the RGB color space is transformed into one
in the HSV color space. Then, the hue of the measured color, which is invariant to change in illumination brightness and direction,
is used for recognizing multiple objects under different illumination conditions. The proposed method was applied to real
images of multiple objects under various illumination conditions, and the objects were recognized and localized successfully.
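The brightness invariance of hue mentioned in this abstract can be checked with a small sketch. It uses Python's standard `colorsys` module and models a brightness change as a uniform scaling of the RGB channels, which is a simplifying assumption rather than the paper's illumination model. The sample color and the helper names are arbitrary.

```python
import colorsys

def hue_of(rgb):
    """Return the HSV hue (in 0..1) of an RGB triple with channels in 0..1."""
    h, _s, _v = colorsys.rgb_to_hsv(*rgb)
    return h

def scale_brightness(rgb, k):
    """Model a brightness change as uniform scaling of all channels (assumption)."""
    return tuple(k * c for c in rgb)

color = (0.8, 0.4, 0.2)                  # arbitrary sample color
dimmed = scale_brightness(color, 0.5)    # same surface under half the brightness

hue_bright = hue_of(color)
hue_dim = hue_of(dimmed)
value_bright = colorsys.rgb_to_hsv(*color)[2]
value_dim = colorsys.rgb_to_hsv(*dimmed)[2]
```

Under this model the value channel halves while the hue stays the same up to floating-point error, which is why hue is a plausible feature when only brightness varies.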
4.
Detection, segmentation, and classification of specific objects are the key building blocks of a computer vision system for
image analysis. This paper presents a unified model-based approach to these three tasks. It is based on using unsupervised
learning to find a set of templates specific to the objects being outlined by the user. The templates are formed by averaging
the shapes that belong to a particular cluster, and are used to guide a probabilistic search through the space of possible
objects. The main difference from previously reported methods is the use of on-line learning, ideal for highly repetitive
tasks. This results in faster and more accurate object detection, as system performance improves with continued use. Further,
the information gained through clustering and user feedback is used to classify the objects for problems in which shape is
relevant to the classification. The effectiveness of the resulting system is demonstrated in two applications: a medical diagnosis
task using cytological images, and a vehicle recognition task.
Received: 5 November 2000 / Accepted: 29 June 2001
Correspondence to: K.-M. Lee
5.
We present a novel approach to the robust classification of arbitrary object classes in complex, natural scenes. Starting
from a re-appraisal of Marr's ‘primal sketch’, we develop an algorithm that (1) employs local orientations as the fundamental
picture primitives, rather than the more usual edge locations, (2) retains and exploits the local spatial arrangement of features
of different complexity in an image and (3) is hierarchically arranged so that the level of feature abstraction increases
at each processing stage. The resulting, simple technique is based on the accumulation of evidence in binary channels, followed
by a weighted, non-linear sum of the evidence accumulators. The steps involved in designing a template for recognizing a simple
object are explained. The practical application of the algorithm is illustrated, with examples taken from a broad range of
object classification problems. We discuss the performance of the algorithm and describe a hardware implementation. First
successful attempts to train the algorithm, automatically, are presented. Finally, we compare our algorithm with other object
classification algorithms described in the literature.
6.
In this paper, we present a method called MODEEP (Motion-based Object DEtection and Estimation of Pose) to detect independently
moving objects (IMOs) in forward-looking infrared (FLIR) image sequences taken from an airborne, moving platform. Ego-motion
effects are removed through a robust multi-scale affine image registration process. Thereafter, areas with residual motion
indicate potential object activity. These areas are detected, refined and selected using a Bayesian classifier. The resulting
regions are clustered into pairs such that each pair represents one object's front and rear end. Using motion and scene knowledge,
we estimate object pose and establish a region of interest (ROI) for each pair. Edge elements within each ROI are used to
segment the convex cover containing the IMO. We show detailed results on real, complex, cluttered and noisy sequences. Moreover,
we outline the integration of our fast and robust system into a comprehensive automatic target recognition (ATR) and action
classification system.
7.
Philip A. Bernstein Shankar Pal David Shutt 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(3):177-189
When implementing persistent objects on a relational database, a major performance issue is prefetching data to minimize
the number of round-trips to the database. This is especially hard with navigational applications, since future accesses are
unpredictable. We propose the use of the context in which an object is loaded as a predictor of future accesses, where a context
can be a stored collection of relationships, a query result, or a complex object. When an object O's state is loaded, similar
state for other objects in O's context is prefetched. We present a design for maintaining context and for using it to guide
prefetch. We give performance measurements of its implementation in Microsoft Repository, showing up to a 70% reduction in
running time. We describe several variations of the optimization: selectively applying the technique based on application
and database characteristics, using application-supplied performance hints, using concurrent database queries to support asynchronous
prefetch, prefetching across relationship paths, and delayed prefetch to save database round-trips.
Received May 3, 2000 / Accepted October 26, 2000
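The idea of using the load context as a prefetch predictor can be illustrated with a toy sketch. The classes `FakeDatabase` and `ContextLoader` are hypothetical and are not the Microsoft Repository API; the context here is simply the list of objects returned together by one query.

```python
class FakeDatabase:
    """Stand-in for a relational store that counts round-trips (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows          # object id -> state dict
        self.round_trips = 0

    def fetch_states(self, ids):
        """Fetch the state of several objects in a single round-trip."""
        self.round_trips += 1
        return {i: self.rows[i] for i in ids}


class ContextLoader:
    """Loads object state, prefetching for the whole context on a cache miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.context_of = {}      # object id -> ids that were loaded together

    def load_context(self, ids):
        """Remember that these objects arrived together, e.g. as a query result."""
        members = list(ids)
        for i in members:
            self.context_of[i] = members

    def get_state(self, obj_id):
        if obj_id not in self.cache:
            # Prefetch the same state for every object in obj_id's context.
            context = self.context_of.get(obj_id, [obj_id])
            self.cache.update(self.db.fetch_states(context))
        return self.cache[obj_id]


db = FakeDatabase({i: {"name": f"obj{i}"} for i in range(10)})
loader = ContextLoader(db)
loader.load_context(range(10))
names = [loader.get_state(i)["name"] for i in range(10)]
```

In this sketch, reading the state of all ten objects costs one round-trip instead of ten, because the first miss pulls in the state of every object in the same context.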
8.
Conventional tracking methods encounter difficulties as the number of objects, clutter, and sensors increase, because of
the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative
that avoids data association. It estimates the number of trips made from one region to another in a scene based on interregion
boundary traffic counts accumulated over time. It is not necessary to track an object through a scene to determine when an
object crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of
pedestrian and vehicular traffic intensities at an intersection over a period of 1 month. We compare the results with those
from a multiple-hypothesis tracker and manually counted ground-truth estimates.
Received: 30 August 2001 / Accepted: 28 May 2002
Correspondence to: J.E. Boyd
9.
HweeHwa Pang 《Multimedia Systems》1997,5(6):386-399
Multimedia applications that are required to manipulate large collections of objects are becoming increasingly common. Moreover,
the size of multimedia objects, which is already huge, is getting even bigger as the resolution of output devices improves.
As a result, many multimedia storage systems are not likely to be able to keep all of their objects disk-resident. Instead,
a majority of the less popular objects have to be off-loaded to tertiary storage to keep costs down. The speed at which objects
can be accessed from tertiary storage is thus an important consideration. In this paper, we propose an adaptive data retrieval
algorithm that employs a combination of staging and direct access in servicing tertiary storage retrieval requests. At retrieval
time, an object that resides in tertiary storage can either be staged to and then played back from disks, or the object can
be accessed directly from the tertiary drives. We show that a simplistic policy that adheres strictly to staging or direct
access does not exploit the full retrieval capacity of both the tertiary library and the secondary storage. To overcome the
problem, we propose a data retrieval algorithm that dynamically chooses between staging and direct access, based on the relative
load on the tertiary versus secondary devices. A series of simulation experiments confirms that the algorithm achieves good
access times over a wide range of workloads and resource configurations. Moreover, the algorithm is very responsive to changing
load conditions.
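The dynamic choice between staging and direct access can be sketched as a single decision function. The abstract only states that the choice depends on the relative load of the tertiary and secondary devices. The concrete rule below (stage when the tertiary library is the busier device, otherwise read directly) and the function name `choose_retrieval_mode` are assumptions made for illustration, not the paper's algorithm.

```python
def choose_retrieval_mode(tertiary_load, disk_load):
    """Pick 'stage' or 'direct' from the relative device load (values in 0..1).

    Assumed rule: staging consumes disk bandwidth but releases the tertiary
    drive sooner, so prefer it when the tertiary library is the busier device.
    Ties fall back to direct access.
    """
    return "stage" if tertiary_load > disk_load else "direct"


# (tertiary load, disk load) samples chosen arbitrarily for illustration.
samples = [(0.9, 0.3), (0.2, 0.8), (0.5, 0.5)]
decisions = [choose_retrieval_mode(t, d) for t, d in samples]
```

A fixed policy corresponds to ignoring both arguments and always returning the same mode, which is the simplistic behavior the abstract argues against.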
10.
A bin picking system based on depth from defocus (total citations: 3; self-citations: 0; cited by others: 3)
It is generally accepted that to develop versatile bin-picking systems capable of grasping and manipulation operations, accurate
3-D information is required. To accomplish this goal, we have developed a fast and precise range sensor based on active depth from defocus (DFD). This sensor is used in conjunction with a three-component vision system, which is able to recognize and evaluate the
attitude of 3-D objects. The first component performs scene segmentation using an edge-based approach. Since edges are used
to detect the object boundaries, a key issue consists of improving the quality of edge detection. The second component attempts
to recognize the object placed on the top of the object pile using a model-driven approach in which the segmented surfaces
are compared with those stored in the model database. Finally, the attitude of the recognized object is evaluated using an
eigenimage approach augmented with range data analysis. The full bin-picking system will be outlined, and a number of experimental
results will be examined.
Received: 2 December 2000 / Accepted: 9 September 2001
Correspondence to: O. Ghita
11.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale
multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the
disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter.
The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density
image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant
analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in
real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from
stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user,
smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance.
Received: 31 January 2000 / Accepted: 1 May 2001
Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)
12.
Jeffrey A. Fayman Oded Sudarsky Ehud Rivlin Michael Rudzsky 《Machine Vision and Applications》2001,13(1):25-37
We present a new active vision technique called zoom tracking. Zoom tracking is the continuous adjustment of a camera's focal
length in order to keep a constant-sized image of an object moving along the camera's optical axis. Two methods for performing
zoom tracking are presented: a closed-loop visual feedback algorithm based on optical flow, and use of depth information obtained
from an autofocus camera's range sensor. We explore two uses of zoom tracking: recovery of depth information and improving
the performance of scale-variant algorithms. We show that the image stability provided by zoom tracking improves the performance
of algorithms that are scale variant, such as correlation-based trackers. While zoom tracking cannot totally compensate for
an object's motion, due to the effect of perspective distortion, an analysis of this distortion provides a quantitative estimate
of the performance of zoom tracking. Zoom tracking can be used to reconstruct a depth map of the tracked object. We show that
under normal circumstances this reconstruction is much more accurate than depth from zooming, and works over a greater range
than depth from axial motion while providing, in the worst case, only slightly less accurate results. Finally, we show how
zoom tracking can also be used in time-to-contact calculations.
Received: 15 February 2000 / Accepted: 19 June 2000
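The relation between focal length, depth and image size that zoom tracking relies on can be sketched with an ideal pinhole camera. The pinhole model, the function names and the numbers are assumptions for illustration. The sketch ignores the perspective distortion that the abstract identifies as a limit on the technique.

```python
def image_size(focal_length, object_size, depth):
    """Pinhole projection: image size = focal_length * object_size / depth."""
    return focal_length * object_size / depth

def zoom_for_constant_size(focal_length, old_depth, new_depth):
    """Focal length that keeps the image size unchanged after a depth change."""
    return focal_length * new_depth / old_depth

def depth_from_zoom(initial_depth, initial_focal, current_focal):
    """Recover depth from the focal length that zoom tracking settled on."""
    return initial_depth * current_focal / initial_focal


f0, S, Z0 = 50.0, 2.0, 10.0   # focal length, object size, initial depth (arbitrary units)
Z1 = 15.0                     # object moves away along the optical axis

f1 = zoom_for_constant_size(f0, Z0, Z1)
size_before = image_size(f0, S, Z0)
size_after = image_size(f1, S, Z1)
recovered_depth = depth_from_zoom(Z0, f0, f1)
```

Because the focal length scales in proportion to depth in this model, the focal length chosen by the tracker also encodes the new depth, which is the basis for the depth recovery use mentioned in the abstract.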
13.
This paper introduces an accurate, efficient, and unified engine dedicated to dynamic animation of d-dimensional deformable objects. The objects are modelled as d-dimensional manifolds defined as functional combinations of a mesh of 3D control points, weighted by parametric blending
functions. This model ensures that, at each time step, the object shape conforms to its manifold definitions. The object motion
is deduced from the dynamic animation of the control points. In fact, control points should be viewed as the degrees of freedom of
the continuous object. The chosen dynamic equations (Lagrangian formalism) reflect this generic modelling scheme and yield
an exact and computationally efficient linear system.
14.
Pierre M. Devaux Daniel B. Lysak Rangachar Kasturi 《International Journal on Document Analysis and Recognition》1999,2(2-3):120-131
Converting paper-based engineering drawings into CAD model files is a tedious process. Therefore, automating the conversion
of such drawings represents tremendous time and labor savings. We present a complete system which interprets such 2D paper-based
engineering drawings, and outputs 3D models that can be displayed as wireframes. The system performs the detection of dimension
sets, the extraction of object lines, and the assembly of 3D objects from the extracted object lines. A knowledge-based method
is used to remove dimension sets and text from ANSI engineering drawings, a graphics recognition procedure is used to extract
complete object lines, and an evidential rule-based method is utilized to identify view relationships. While these methods
are the subject of several of our previous papers, this paper focuses on the 3D interpretation of the object. This is accomplished
using a technique based on evidential reasoning and a wide range of rules and heuristics. The system is limited to the interpretation
of objects composed of planar, spherical, and cylindrical surfaces. Experimental results are presented.
Received December 2, 1998 / Revised June 18, 1999
15.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered
to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video
structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are
usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose
automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the
first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and
spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original
image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients
of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit
(quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances.
The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the
key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results
demonstrate the effectiveness and efficiency of our techniques.
16.
Aya Soffer Hanan Samet 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(4):253-274
Symbolic images are composed of a finite set of symbols that have a semantic meaning. Examples of symbolic images include
maps (where the semantic meaning of the symbols is given in the legend), engineering drawings, and floor plans. Two approaches
for supporting queries on symbolic-image databases that are based on image content are studied. The classification approach
preprocesses all symbolic images and attaches a semantic classification and an associated certainty factor to each object
that it finds in the image. The abstraction approach describes each object in the symbolic image by using a vector consisting
of the values of some of its features (e.g., shape, genus, etc.). The approaches differ in the way in which responses to queries
are computed. In the classification approach, images are retrieved on the basis of whether or not they contain objects that
have the same classification as the objects in the query. On the other hand, in the abstraction approach, retrieval is on
the basis of similarity of feature vector values of these objects. Methods of integrating these two approaches into a relational
multimedia database management system so that symbolic images can be stored and retrieved based on their content are described.
Schema definitions and indices that support query specifications involving spatial as well as contextual constraints are presented.
Spatial constraints may be based on both locational information (e.g., distance) and relational information (e.g., north of).
Different strategies for image retrieval for a number of typical queries using these approaches are described. Estimated costs
are derived for these strategies. Results are reported of a comparative study of the two approaches in terms of image insertion
time, storage space, retrieval accuracy, and retrieval time.
Received June 12, 1998 / Accepted October 13, 1998
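The difference between the two query approaches can be sketched on a toy in-memory database. The data layout, the certainty threshold, the use of Euclidean distance and all names below are hypothetical choices for illustration. The paper's schema definitions and indices are not reproduced here.

```python
import math

# Hypothetical database: image id -> objects found in that symbolic image.
# Each object carries a class label with a certainty factor and a feature vector.
IMAGES = {
    "map1": [{"cls": "hotel", "certainty": 0.9, "features": (1.0, 0.0)}],
    "map2": [{"cls": "museum", "certainty": 0.8, "features": (0.9, 0.1)}],
    "map3": [{"cls": "hotel", "certainty": 0.4, "features": (0.0, 1.0)}],
}

def retrieve_by_classification(images, cls, min_certainty):
    """Classification approach: match on the stored class label."""
    return sorted(
        name for name, objs in images.items()
        if any(o["cls"] == cls and o["certainty"] >= min_certainty for o in objs)
    )

def retrieve_by_abstraction(images, query_features, max_distance):
    """Abstraction approach: match on feature-vector similarity."""
    return sorted(
        name for name, objs in images.items()
        if any(math.dist(o["features"], query_features) <= max_distance for o in objs)
    )


by_class = retrieve_by_classification(IMAGES, "hotel", 0.5)
by_features = retrieve_by_abstraction(IMAGES, (1.0, 0.0), 0.2)
```

The two calls return different result sets for the same query symbol: the label-based query finds only `map1`, while the similarity-based query also returns `map2`, whose object has a different label but a nearby feature vector.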
17.
This paper describes a laser-based computer vision system used for automatic fruit recognition. It is based on an infrared
laser range-finder sensor that provides range and reflectance images and is designed to detect spherical objects in non-structured
environments. Image analysis algorithms integrate both range and reflectance information to generate four characteristic primitives
which give evidence of the existence of spherical objects. The output of this vision system includes 3D position, radius and
surface reflectivity of each spherical object. It has been applied to the AGRIBOT orange harvesting robot, where it has obtained
good fruit detection rates with few false detections.
18.
Issues in the design of a storage server for video-on-demand (total citations: 2; self-citations: 0; cited by others: 2)
Antoine N. Mourad 《Multimedia Systems》1996,4(2):70-86
We examine issues related to the design of a storage server for video-on-demand (VOD) applications. The storage medium considered
is magnetic disks or arrays of disks. We investigate disk scheduling policies, buffer management policies and I/O bus protocol issues.
We derive the number of sessions that can be supported from a single disk or an array of disks and determine the amount of buffering
required to support a given number of users. Furthermore, we propose a scheduling mechanism for disk accesses that significantly
lowers the buffer-size requirements in the case of disk arrays. The buffer size required under the proposed scheme is independent
of the number of disks in the array. This property allows for striping video content over a large number of disks to achieve higher
concurrency in access to a particular video object. This enables the server to satisfy hundreds of independent requests to the
same video object or to hundreds of different objects while storing only one copy of each video object. The reliability implications
of striping content over a large number of disks are addressed and two solutions are proposed. Finally, we examine various policies
for dealing with disk thermal calibration and the placement of videos on disks and disk arrays.
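Striping a video object over a disk array can be sketched with a round-robin block placement. Round-robin placement, the block granularity and the function names are assumptions for illustration. The abstract does not specify the layout or the scheduling mechanism used in the paper.

```python
def disk_for_block(block_index, num_disks):
    """Round-robin striping: consecutive blocks go to consecutive disks."""
    return block_index % num_disks

def blocks_per_disk(num_blocks, num_disks):
    """Count how many blocks of one video object land on each disk."""
    counts = [0] * num_disks
    for block in range(num_blocks):
        counts[disk_for_block(block, num_disks)] += 1
    return counts


layout = [disk_for_block(b, 4) for b in range(8)]   # first 8 blocks on 4 disks
counts = blocks_per_disk(10, 4)                     # a 10-block video on 4 disks
```

With this layout, viewers at different playback positions in the same video tend to read from different disks, which is how a single stored copy can serve many concurrent requests.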
19.
A database model for object dynamics (total citations: 1; self-citations: 0; cited by others: 1)
M.P. Papazoglou B.J. Krämer 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(2):73-96
To effectively model complex applications in which constantly changing situations can be represented, a database system must
be able to support the runtime specification of structural and behavioral nuances for objects on an individual or group basis.
This paper introduces the role mechanism as an extension of object-oriented databases to support unanticipated behavioral
oscillations for objects that may attain many types and share a single object identity. A role refers to the ability to represent
object dynamics by seamlessly integrating idiosyncratic behavior, possibly in response to external events, with pre-existing
object behavior specified at instance creation time. In this manner, the same object can simultaneously be an instance of
different classes which symbolize the different roles that this object assumes. The role concept and its underlying linguistic
scheme simplify the design requirements of complex applications that need to create and manipulate dynamic objects.
Edited by D. McLeod / Received March 1994 / Accepted January 1996
20.
Requirements for choosing off-the-shelf information systems (OISR) differ from requirements for development of new information
systems in that they do not necessarily provide complete specifications, thus allowing flexibility in matching an existing
IS to the stated needs. We present a framework for OISR conceptual models that consists of four essential elements: business
processes, business rules, information objects and required system services. We formalise the definitions of these concepts
based on an ontological model. The ontology-based OISR model provides a framework to evaluate modelling languages on how appropriate
they are for OISR requirements specifications. The evaluation framework is applied to the Object-Process Methodology, and
its results are compared with a similar evaluation of ARIS. This comparison demonstrates the effectiveness of the ontological
framework for evaluating modelling tools on how well they can guide selection, implementation and integration of purchased
software packages.