Found 20 similar documents; search took 109 ms
1.
W. Krüger 《Machine Vision and Applications》1999,11(4):203-212
One method to detect obstacles from a vehicle moving on a planar road surface is the analysis of motion-compensated difference
images. In this contribution, a motion compensation algorithm is presented, which computes the required image-warping parameters
from an estimate of the relative motion between camera and ground plane. The proposed algorithm estimates the warping parameters
from displacements at image corners and image edges. It exploits the estimated confidence of the displacements to cope robustly
with outliers. Knowledge about camera calibration, measurements from odometry, and the previous estimate are used for motion
prediction and to stabilize the estimation process when there is not enough information available in the measured image displacements.
The motion compensation algorithm has been integrated with modules for obstacle detection and lane tracking. This system has
been integrated in experimental vehicles and runs in real time with an overall cycle of 12.5 Hz on low-cost standard hardware.
Received: 23 April 1998 / Accepted: 25 August 1999
2.
Abstract. The image sequence in a video taken by a moving camera may suffer from irregular perturbations because of irregularities
in the motion of the person or vehicle carrying the camera. We show how to use information in the image sequence to correct
the effects of these irregularities so that the sequence is smoothed, i.e., is approximately the same as the sequence that
would have been obtained if the motion of the camera had been smooth. Our method is based on the fact that the irregular motion
is almost entirely rotational, and that the rotational image motion can be detected and corrected if a distant object, such
as the horizon, is visible.
Received: 14 February 2001 / Accepted: 11 February 2002
Correspondence to: A. Rosenfeld
3.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered
to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video
structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are
usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose
automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the
first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and
spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original
image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients
of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit
(quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances.
The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the
key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results
demonstrate the effectiveness and efficiency of our techniques.
4.
In this paper, we present a method called MODEEP (Motion-based Object DEtection and Estimation of Pose) to detect independently
moving objects (IMOs) in forward-looking infrared (FLIR) image sequences taken from an airborne, moving platform. Ego-motion
effects are removed through a robust multi-scale affine image registration process. Thereafter, areas with residual motion
indicate potential object activity. These areas are detected, refined and selected using a Bayesian classifier. The resulting
regions are clustered into pairs such that each pair represents one object's front and rear end. Using motion and scene knowledge,
we estimate object pose and establish a region of interest (ROI) for each pair. Edge elements within each ROI are used to
segment the convex cover containing the IMO. We show detailed results on real, complex, cluttered and noisy sequences. Moreover,
we outline the integration of our fast and robust system into a comprehensive automatic target recognition (ATR) and action
classification system.
5.
Image-based animation of facial expressions (total citations: 1; self-citations: 0; citations by others: 1)
Gideon Moiza Ayellet Tal Ilan Shimshoni David Barnett Yael Moses 《The Visual computer》2002,18(7):445-467
We present a novel technique for creating realistic facial animations given a small number of real images and a few parameters
for the in-between images. This scheme can also be used for reconstructing facial movies where the parameters can be automatically
extracted from the images. The in-between images are produced without ever generating a three-dimensional model of the face.
Since facial motion due to expressions is not well defined mathematically, our approach is based on utilizing image patterns
in facial motion. These patterns were revealed by an empirical study which analyzed and compared image motion patterns in
facial expressions. The major contribution of this work is showing how parameterized “ideal” motion templates can generate
facial movies for different people and different expressions, where the parameters are extracted automatically from the image
sequence. To test the quality of the algorithm, image sequences (one of which was taken from a TV news broadcast) were reconstructed,
yielding movies hardly distinguishable from the originals.
Published online: 2 October 2002
Correspondence to: A. Tal
Work has been supported in part by the Israeli Ministry of Industry and Trade, The MOST Consortium
6.
Lixin Fan Liying Fan Chew Lim Tan 《International Journal on Document Analysis and Recognition》2003,5(2-3):88-101
Abstract. For document images corrupted by various kinds of noise, direct binarization images may be severely blurred and degraded.
A common treatment for this problem is to pre-smooth input images using noise-suppressing filters. This article proposes an
image-smoothing method used for prefiltering the document image binarization. Conceptually, we propose that the influence
range of each pixel affecting its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process
to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness
of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In
addition, due to the adaptive nature of the similar pixel definition, the proposed filter output is more robust regarding
different noise levels than existing methods.
Received: October 31, 2001 / Accepted: October 09, 2002
Correspondence to: L. Fan (e-mail: fanlixin@ieee.org)
7.
Motion detection with nonstationary background (total citations: 4; self-citations: 0; citations by others: 4)
Abstract. This paper proposes a new background subtraction method for detecting moving foreground objects from a nonstationary background.
While background subtraction has traditionally worked well for a stationary background, the same cannot be implied for a nonstationary
viewing sensor. To a limited extent, motion compensation for the nonstationary background can be applied. However, in practice,
it is difficult to realize the motion compensation to sufficient pixel accuracy, and the traditional background subtraction
algorithm will fail for a moving scene. The problem is further complicated when the moving target to be detected/tracked is
small, since the pixel error in motion that is compensating the background will subsume the small target. A spatial distribution
of Gaussians (SDG) model is proposed to deal with moving object detection having motion compensation that is only approximately
extracted. The distribution of each background pixel is temporally and spatially modeled. Based on this statistical model,
a pixel in the current frame is then classified as belonging to the foreground or background. For this system to perform under
lighting and environmental changes over an extended period of time, the background distribution must be updated with each
incoming frame. A new background restoration and adaptation algorithm is developed for the nonstationary background. Test
cases involving the detection of small moving objects within a highly textured background and with a pan-tilt tracking system
are demonstrated successfully.
Received: 30 July 2001 / Accepted: 20 April 2002
Correspondence to: Chin-Seng Chau
8.
We present a new approach to the tracking of very non-rigid patterns of motion, such as water flowing down a stream. The
algorithm is based on a “disturbance map”, which is obtained by linearly subtracting the temporal average of the previous
frames from the new frame. Every local motion creates a disturbance having the form of a wave, with a “head” at the present
position of the motion and a historical “tail” that indicates the previous locations of that motion. These disturbances serve
as loci of attraction for “tracking particles” that are scattered throughout the image. The algorithm is very fast and can
be performed in real time. We provide excellent tracking results on various complex sequences, using both stabilized and moving
cameras, showing a busy ant column, waterfalls, rapids and flowing streams, shoppers in a mall, and cars in a traffic intersection.
Received: 24 June 1997 / Accepted: 30 July 1998
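As a rough illustration of the "disturbance map" mentioned in the abstract above, the following minimal sketch subtracts the per-pixel temporal average of earlier frames from the newest frame. It assumes a plain arithmetic mean over a caller-supplied list of previous frames and represents frames as nested lists of numbers; the function name `disturbance_map` and both choices are assumptions made for illustration, since the abstract does not say how the average is maintained.

```python
def disturbance_map(previous_frames, new_frame):
    """Return new_frame minus the per-pixel mean of previous_frames.

    Frames are lists of rows of numbers, all with the same shape.
    The plain arithmetic mean is an assumption; the abstract only says
    "temporal average of the previous frames".
    """
    if not previous_frames:
        raise ValueError("need at least one previous frame")
    n = len(previous_frames)
    result = []
    for r, row in enumerate(new_frame):
        out_row = []
        for c, value in enumerate(row):
            mean = sum(frame[r][c] for frame in previous_frames) / n
            out_row.append(value - mean)
        result.append(out_row)
    return result


history = [[[0, 0], [0, 0]], [[2, 2], [2, 2]]]
current = [[1, 5], [1, 1]]
# Only the pixel that departs from the temporal mean produces a nonzero value.
print(disturbance_map(history, current))
```

Pixels where something moved recently keep a nonzero residue for as long as the older frames remain in the average, which is one way to picture the "tail" the abstract describes.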
9.
Abstract. The purpose of this study is to discuss existing fractal-based algorithms and propose novel improvements of these algorithms
to identify tumors in brain magnetic-resonance (MR) images. Considerable research has been pursued on fractal geometry in various
aspects of image analysis and pattern recognition. Magnetic-resonance images typically have a degree of noise and randomness
associated with the natural random nature of structure. Thus, fractal analysis is appropriate for MR image analysis. For tumor
detection, we describe existing fractal-based techniques and propose three modified algorithms using fractal analysis models.
For each new method, the brain MR images are divided into a number of pieces. The first method involves thresholding the pixel
intensity values; hence, we call the technique piecewise-threshold-box-counting (PTBC) method. For the subsequent methods,
the intensity is treated as the third dimension. We implement the improved piecewise-modified-box-counting (PMBC) and piecewise-triangular-prism-surface-area
(PTPSA) methods, respectively. With the PTBC method, we find the differences in intensity histogram and fractal dimension
between normal and tumor images. Using the PMBC and PTPSA methods, we may detect and locate the tumor in the brain MR images
more accurately. Thus, the novel techniques proposed herein offer satisfactory tumor identification.
Received: 13 October 2001 / Accepted: 28 May 2002
Correspondence to: K.M. Iftekharuddin
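The thresholding-plus-box-counting idea named in the abstract above can be sketched generically as follows. This is a textbook box-counting estimate applied to one thresholded image piece, not the authors' PTBC implementation: the function names, the box sizes, and the least-squares fit of log N(s) against log(1/s) are all assumptions made for illustration.

```python
import math


def box_count(binary, size):
    """Count size-by-size boxes that contain at least one foreground pixel."""
    rows, cols = len(binary), len(binary[0])
    count = 0
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            if any(binary[r][c]
                   for r in range(r0, min(r0 + size, rows))
                   for c in range(c0, min(c0 + size, cols))):
                count += 1
    return count


def box_counting_dimension(image, threshold, sizes=(1, 2, 4, 8)):
    """Estimate the fractal dimension of the pixels with intensity >= threshold."""
    binary = [[1 if v >= threshold else 0 for v in row] for row in image]
    xs, ys = [], []
    for s in sizes:
        n = box_count(binary, s)
        if n > 0:
            xs.append(math.log(1.0 / s))
            ys.append(math.log(n))
    if len(xs) < 2:
        raise ValueError("not enough non-empty scales to fit a slope")
    # Least-squares slope of log N(s) versus log(1/s).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


filled = [[255] * 8 for _ in range(8)]             # a solid 8x8 patch
line = [[255] * 8] + [[0] * 8 for _ in range(7)]   # a single bright row
print(box_counting_dimension(filled, 128))
print(box_counting_dimension(line, 128))
```

A solid patch yields a slope close to 2 and a single straight row a slope close to 1, which is the sanity check usually applied to a box-counting routine before using it on real image pieces.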
10.
Recently, optimization has been used in various ways to interpolate or retarget human body motions obtained by motion-capturing
systems. However, in such cases, the inner structure of a human body has rarely been taken into account, and hence there have
been difficulties in simulating physiological effects such as fatigue or injuries. In this paper, we propose a method to
create/retarget human body motions using a musculoskeletal human body model. Using our method, it is possible to create dynamically
and physiologically feasible motions. Since a muscle model based on Hill's model is included in our system, it is also possible
to retarget the original motion by changing muscular parameters. For example, using the muscle fatigue model, a motion where
a human body gradually gets tired can be simulated. By increasing the maximal force exertable by the muscles, or decreasing
it to zero, training or displacement effects of muscles can also be simulated. Our method can be used for biomechanically
correct inverse kinematics, interpolation of motions, and physiological retargeting of human body motion.
11.
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental
problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers
employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are
obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call
the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition
detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.
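For readers unfamiliar with the Minkowski-type metric that the abstract above contrasts with DPF, here is a minimal sketch. The first function is the standard Minkowski distance. The second, `partial_minkowski`, is only one plausible reading of a "partial" distance, keeping the `m` smallest per-feature differences; the abstract does not define DPF, so that function and its parameter `m` are assumptions, not the authors' formula.

```python
def minkowski(x, y, r=2.0):
    """Standard Minkowski distance of order r between two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def partial_minkowski(x, y, m, r=2.0):
    """Hypothetical partial variant: use only the m smallest feature differences.

    This is an illustrative assumption about what "partial" could mean;
    it is not taken from the abstract.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    diffs = sorted(abs(a - b) for a, b in zip(x, y))
    return sum(d ** r for d in diffs[:m]) ** (1.0 / r)


a = [0.0, 0.0, 0.0]
b = [3.0, 4.0, 100.0]   # the two vectors agree closely except in one feature
print(minkowski(a, b))
print(partial_minkowski(a, b, m=2))
```

In this toy case a single outlying feature dominates the full Minkowski distance, while the partial variant ignores it, which illustrates the kind of failure the abstract attributes to Minkowski-type metrics on objects that are "obviously alike".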
12.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale
multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the
disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter.
The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density
image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant
analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in
real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from
stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user,
smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance.
Received: 31 January 2000 / Accepted: 1 May 2001
Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)
13.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured
environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant
features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes
turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot
and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and
the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing
points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation.
This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational
path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used
as a qualitative estimate of change in the motion direction, and the correspondence of vertical segments provides precise 3D information
about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model
of the scene is generated. The robot system is implemented and tested in a structured environment at our research center.
Results from the robot navigation in real environments are presented and discussed.
Received: 25 September 1996 / Accepted: 20 October 1996
14.
Query by video clip (total citations: 15; self-citations: 0; citations by others: 15)
Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries
that involve a video clip (say, a 10-s video segment). We propose two schemes: (i) retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features
around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained
with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar
to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video. Retrieval is based on matching color and texture features
of the sub-sampled frames. Initial experiments on two video databases (basketball video with approximately 16,000 frames and
a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one
basketball video as query and a different basketball video as the database show the effectiveness of feature representation
and matching schemes.
15.
This paper introduces a new method for the coordination of human motion based on planning and AI techniques. Motions are considered
as black boxes that are activated according to preconditions and produce postconditions in a hybrid, continuous and discrete
world. Each part of the body is an autonomous entity that cooperates with the others as determined by global criteria, such
as occupation rate and distance to a goal (common to all the entities). With this technique, we can easily specify and solve
the motion coordination problem of a juggler that juggles with a dynamic number of balls in real time.
16.
This paper presents a fast and simple method using a timed motion history image (tMHI) for representing motion from the gradients
in successively layered silhouettes. This representation can be used to (a) determine the current pose of the object and (b)
segment and measure the motions induced by the object in a video scene. These segmented regions are not “motion blobs”, but
instead are motion regions that are naturally connected to parts of the moving object. This method may be used as a very general
gesture recognition “toolbox”. We demonstrate the approach with recognition of waving and overhead clapping motions to control
a music synthesis program.
Accepted: 13 August 2001
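The abstract above does not spell out how a timed motion history image is maintained, so the sketch below uses a commonly described update rule as an assumption: pixels covered by the current silhouette are stamped with the current timestamp, and pixels whose stamp is older than a chosen duration are cleared. The function name `update_tmhi` and its parameters are hypothetical.

```python
def update_tmhi(tmhi, silhouette, timestamp, duration):
    """Update a timed motion history image in place and return it.

    tmhi:       rows of float timestamps (0.0 means "no recent motion")
    silhouette: rows of 0/1 flags for the current frame
    Assumed rule: stamp silhouette pixels with `timestamp`; clear any other
    pixel whose stamp is older than `timestamp - duration`.
    """
    for r, row in enumerate(silhouette):
        for c, on in enumerate(row):
            if on:
                tmhi[r][c] = timestamp
            elif tmhi[r][c] < timestamp - duration:
                tmhi[r][c] = 0.0
    return tmhi


history = [[0.0, 1.0, 2.5]]
mask = [[1, 0, 0]]
# First pixel is stamped, second has expired, third is still recent.
print(update_tmhi(history, mask, timestamp=3.0, duration=1.0))
```

Because more recent silhouettes carry larger stamps, the spatial gradient of such an image points along the direction of motion, which is consistent with the abstract's description of motion represented "from the gradients in successively layered silhouettes".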
17.
In this paper, we address the analysis of 3D shape and shape change in non-rigid biological objects imaged via a stereo light
microscope. We propose an integrated approach for the reconstruction of 3D structure and the motion analysis for images in
which only a few informative features are available. The key components of this framework are: 1) image registration using
a correlation-based approach, 2) region-of-interest extraction using motion-based segmentation, and 3) stereo and motion analysis
using a cooperative spatial and temporal matching process. We describe these three stages of processing and illustrate the
efficacy of the proposed approach using real images of a live frog's ventricle. The reconstructed dynamic 3D structure of
the ventricle is demonstrated in our experimental results, and it agrees qualitatively with the observed images of the ventricle.
18.
Fast image retrieval using color-spatial information (total citations: 1; self-citations: 0; citations by others: 1)
Beng Chin Ooi Kian-Lee Tan Tat Seng Chua Wynne Hsu 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(2):115-128
In this paper, we present an image retrieval system that employs both the color and spatial information of images to facilitate
the retrieval process. The basic unit used in our technique is a single-colored cluster, which bounds a homogeneous region of that color in an image. Two clusters from two images are similar if they are of the
same color and overlap in the image space. The number of clusters that can be extracted from an image can be very large, and
it affects the accuracy of retrieval. We study the effect of the number of clusters on retrieval effectiveness to determine
an appropriate value for “optimal” performance. To facilitate efficient retrieval, we also propose a multi-tier indexing
mechanism called the Sequenced Multi-Attribute Tree (SMAT). We implemented a two-tier SMAT, where the first layer is used to prune away clusters that are of different colors,
while the second layer discriminates clusters of different spatial locality. We conducted an experimental study on an image
database consisting of 12,000 images. Our results show the effectiveness of the proposed color-spatial approach, and the efficiency
of the proposed indexing mechanism.
Received August 1, 1997 / Accepted December 9, 1997
19.
A model-based approach to reconstruction of 3D human arm motion from a monocular image sequence taken under orthographic
projection is presented. The reconstruction is divided into two stages. First, a 2D shape model is used to track the arm silhouettes
and second-order curves are used to model the arm based on an iteratively reweighted least square method. As a result, 2D
stick figures are extracted. In the second stage, the stick figures are backprojected into the scene. 3D postures are reconstructed
using the constraints of a 3D kinematic model of the human arm. The motion of the arm is then derived as a transition between
the arm postures. Applications of these results are foreseen in the analysis of human motion patterns.
Received: 26 January 1996 / Accepted: 17 July 1997
20.