首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 109 毫秒
1.
One method to detect obstacles from a vehicle moving on a planar road surface is the analysis of motion-compensated difference images. In this contribution, a motion compensation algorithm is presented, which computes the required image-warping parameters from an estimate of the relative motion between camera and ground plane. The proposed algorithm estimates the warping parameters from displacements at image corners and image edges. It exploits the estimated confidence of the displacements to cope robustly with outliers. Knowledge about camera calibration, measuremts from odometry, and the previous estimate are used for motion prediction and to stabilize the estimation process when there is not enough information available in the measured image displacements. The motion compensation algorithm has been integrated with modules for obstacle detection and lane tracking. This system has been integrated in experimental vehicles and runs in real time with an overall cycle of 12.5 Hz on low-cost standard hardware. Received: 23 April 1998 / Accepted: 25 August 1999  相似文献   

2.
Abstract. The image sequence in a video taken by a moving camera may suffer from irregular perturbations because of irregularities in the motion of the person or vehicle carrying the camera. We show how to use information in the image sequence to correct the effects of these irregularities so that the sequence is smoothed, i.e., is approximately the same as the sequence that would have been obtained if the motion of the camera had been smooth. Our method is based on the fact that the irregular motion is almost entirely rotational, and that the rotational image motion can be detected and corrected if a distant object, such as the horizon, is visible. Received: 14 February 2001 / Accepted: 11 February 2002 Correspondence to: A. Rosenfeld  相似文献   

3.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.  相似文献   

4.
In this paper, we present a method called MODEEP (Motion-based Object DEtection and Estimation of Pose) to detect independently moving objects (IMOs) in forward-looking infrared (FLIR) image sequences taken from an airborne, moving platform. Ego-motion effects are removed through a robust multi-scale affine image registration process. Thereafter, areas with residual motion indicate potential object activity. These areas are detected, refined and selected using a Bayesian classifier. The resulting regions are clustered into pairs such that each pair represents one object's front and rear end. Using motion and scene knowledge, we estimate object pose and establish a region of interest (ROI) for each pair. Edge elements within each ROI are used to segment the convex cover containing the IMO. We show detailed results on real, complex, cluttered and noisy sequences. Moreover, we outline the integration of our fast and robust system into a comprehensive automatic target recognition (ATR) and action classification system.  相似文献   

5.
Image-based animation of facial expressions   总被引:1,自引:0,他引:1  
We present a novel technique for creating realistic facial animations given a small number of real images and a few parameters for the in-between images. This scheme can also be used for reconstructing facial movies where the parameters can be automatically extracted from the images. The in-between images are produced without ever generating a three-dimensional model of the face. Since facial motion due to expressions are not well defined mathematically our approach is based on utilizing image patterns in facial motion. These patterns were revealed by an empirical study which analyzed and compared image motion patterns in facial expressions. The major contribution of this work is showing how parameterized “ideal” motion templates can generate facial movies for different people and different expressions, where the parameters are extracted automatically from the image sequence. To test the quality of the algorithm, image sequences (one of which was taken from a TV news broadcast) were reconstructed, yielding movies hardly distinguishable from the originals. Published online: 2 October 2002 Correspondence to: A. Tal Work has been supported in part by the Israeli Ministry of Industry and Trade, The MOST Consortium  相似文献   

6.
Abstract. For document images corrupted by various kinds of noise, direct binarization images may be severely blurred and degraded. A common treatment for this problem is to pre-smooth input images using noise-suppressing filters. This article proposes an image-smoothing method used for prefiltering the document image binarization. Conceptually, we propose that the influence range of each pixel affecting its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In addition, due to the adaptive nature of the similar pixel definition, the proposed filter output is more robust regarding different noise levels than existing methods. Received: October 31, 2001 / October 09, 2002 Correspondence to:L. Fan (e-mail: fanlixin@ieee.org)  相似文献   

7.
Motion detection with nonstationary background   总被引:4,自引:0,他引:4  
Abstract. This paper proposes a new background subtraction method for detecting moving foreground objects from a nonstationary background. While background subtraction has traditionally worked well for a stationary background, the same cannot be implied for a nonstationary viewing sensor. To a limited extent, motion compensation for the nonstationary background can be applied. However, in practice, it is difficult to realize the motion compensation to sufficient pixel accuracy, and the traditional background subtraction algorithm will fail for a moving scene. The problem is further complicated when the moving target to be detected/tracked is small, since the pixel error in motion that is compensating the background will subsume the small target. A spatial distribution of Gaussians (SDG) model is proposed to deal with moving object detection having motion compensation that is only approximately extracted. The distribution of each background pixel is temporally and spatially modeled. Based on this statistical model, a pixel in the current frame is then classified as belonging to the foreground or background. For this system to perform under lighting and environmental changes over an extended period of time, the background distribution must be updated with each incoming frame. A new background restoration and adaptation algorithm is developed for the nonstationary background. Test cases involving the detection of small moving objects within a highly textured background and with a pan-tilt tracking system are demonstrated successfully. Received: 30 July 2001 / Accepted: 20 April 2002 Correspondence to: Chin-Seng Chau  相似文献   

8.
We present a new approach to the tracking of very non-rigid patterns of motion, such as water flowing down a stream. The algorithm is based on a “disturbance map”, which is obtained by linearly subtracting the temporal average of the previous frames from the new frame. Every local motion creates a disturbance having the form of a wave, with a “head” at the present position of the motion and a historical “tail” that indicates the previous locations of that motion. These disturbances serve as loci of attraction for “tracking particles” that are scattered throughout the image. The algorithm is very fast and can be performed in real time. We provide excellent tracking results on various complex sequences, using both stabilized and moving cameras, showing a busy ant column, waterfalls, rapids and flowing streams, shoppers in a mall, and cars in a traffic intersection. Received: 24 June 1997 / Accepted: 30 July 1998  相似文献   

9.
Abstract. The purpose of this study is to discuss existing fractal-based algorithms and propose novel improvements of these algorithms to identify tumors in brain magnetic-response (MR) images. Considerable research has been pursued on fractal geometry in various aspects of image analysis and pattern recognition. Magnetic-resonance images typically have a degree of noise and randomness associated with the natural random nature of structure. Thus, fractal analysis is appropriate for MR image analysis. For tumor detection, we describe existing fractal-based techniques and propose three modified algorithms using fractal analysis models. For each new method, the brain MR images are divided into a number of pieces. The first method involves thresholding the pixel intensity values; hence, we call the technique piecewise-threshold-box-counting (PTBC) method. For the subsequent methods, the intensity is treated as the third dimension. We implement the improved piecewise-modified-box-counting (PMBC) and piecewise-triangular-prism-surface-area (PTPSA) methods, respectively. With the PTBC method, we find the differences in intensity histogram and fractal dimension between normal and tumor images. Using the PMBC and PTPSA methods, we may detect and locate the tumor in the brain MR images more accurately. Thus, the novel techniques proposed herein offer satisfactory tumor identification. Received: 13 October 2001 / Accepted: 28 May 2002 Correspondence to: K.M. Iftekharuddin  相似文献   

10.
Creating and retargetting motion by the musculoskeletal human body model   总被引:1,自引:1,他引:0  
Recently, optimization has been used in various ways to interpolate or retarget human body motions obtained by motion-capturing systems. However, in such cases, the inner structure of a human body has rarely been taken into account, and hence there have been difficulties in simulating physiological effects such as fatigue or injuries. In this paper, we propose a method to create/retarget human body motions using a musculoskeletal human body model. Using our method, it is possible to create dynamically and physiologically feasible motions. Since a muscle model based on Hill's model is included in our system, it is also possible to retarget the original motion by changing muscular parameters. For example, using the muscle fatigue model, a motion where a human body gradually gets tired can be simulated. By increasing the maximal force exertable by the muscles, or decreasing it to zero, training or displacement effects of muscles can also be simulated. Our method can be used for biomechanically correct inverse kinematics, interpolation of motions, and physiological retargetting of the human body motion.  相似文献   

11.
Discovery of a perceptual distance function for measuring image similarity   总被引:3,自引:0,他引:3  
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.  相似文献   

12.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter. The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user, smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance. Received: 31 January 2000 / Accepted: 1 May 2001 Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)  相似文献   

13.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation. This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used as a qualitative estimate of change in the motion direction and correspondence of vertical segment provides precise 3D information about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model of the scene is generated. The robot system is implemented and tested in a structured environment at our research center. Results from the robot navigation in real environments are presented and discussed. Received: 25 September 1996 / Accepted: 20 October 1996  相似文献   

14.
Query by video clip   总被引:15,自引:0,他引:15  
Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 10-s video segment). We propose two schemes: (i) retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video. Retrieval is based on matching color and texture features of the sub-sampled frames. Initial experiments on two video databases (basketball video with approximately 16,000 frames and a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one basketball video as query and a different basketball video as the database show the effectiveness of feature representation and matching schemes.  相似文献   

15.
This paper introduces a new method for the coordination of human motion based on planning and AI techniques. Motions are considered as black boxes that are activated according to preconditions and produce postconditions in a hybrid, continuous and discrete world. Each part of the body is an autonomous entity that cooperates with the others as determined by global criteria, such as occupation rate and distance to a goal (common to all the entities). With this technique, we can easily specify and solve the motion coordination problem of a juggler that juggles with a dynamic number of balls in real time.  相似文献   

16.
Motion segmentation and pose recognition with motion history gradients   总被引:7,自引:0,他引:7  
This paper presents a fast and simple method using a timed motion history image (tMHI) for representing motion from the gradients in successively layered silhouettes. This representation can be used to (a) determine the current pose of the object and (b) segment and measure the motions induced by the object in a video scene. These segmented regions are not “motion blobs”, but instead are motion regions that are naturally connected to parts of the moving object. This method may be used as a very general gesture recognition “toolbox”. We demonstrate the approach with recognition of waving and overhead clapping motions to control a music synthesis program. Accepted: 13 August 2001  相似文献   

17.
In this paper, we address the analysis of 3D shape and shape change in non-rigid biological objects imaged via a stereo light microscope. We propose an integrated approach for the reconstruction of 3D structure and the motion analysis for images in which only a few informative features are available. The key components of this framework are: 1) image registration using a correlation-based approach, 2) region-of-interest extraction using motion-based segmentation, and 3) stereo and motion analysis using a cooperative spatial and temporal matching process. We describe these three stages of processing and illustrate the efficacy of the proposed approach using real images of a live frog's ventricle. The reconstructed dynamic 3D structure of the ventricle is demonstrated in our experimental results, and it agrees qualitatively with the observed images of the ventricle.  相似文献   

18.
Fast image retrieval using color-spatial information   总被引:1,自引:0,他引:1  
In this paper, we present an image retrieval system that employs both the color and spatial information of images to facilitate the retrieval process. The basic unit used in our technique is a single-colored cluster, which bounds a homogeneous region of that color in an image. Two clusters from two images are similar if they are of the same color and overlap in the image space. The number of clusters that can be extracted from an image can be very large, and it affects the accuracy of retrieval. We study the effect of the number of clusters on retrieval effectiveness to determine an appropriate value for “optimal' performance. To facilitate efficient retrieval, we also propose a multi-tier indexing mechanism called the Sequenced Multi-Attribute Tree (SMAT). We implemented a two-tier SMAT, where the first layer is used to prune away clusters that are of different colors, while the second layer discriminates clusters of different spatial locality. We conducted an experimental study on an image database consisting of 12,000 images. Our results show the effectiveness of the proposed color-spatial approach, and the efficiency of the proposed indexing mechanism. Received August 1, 1997 / Accepted December 9, 1997  相似文献   

19.
A model-based approach to reconstruction of 3D human arm motion from a monocular image sequence taken under orthographic projection is presented. The reconstruction is divided into two stages. First, a 2D shape model is used to track the arm silhouettes and second-order curves are used to model the arm based on an iteratively reweighted least square method. As a result, 2D stick figures are extracted. In the second stage, the stick figures are backprojected into the scene. 3D postures are reconstructed using the constraints of a 3D kinematic model of the human arm. The motion of the arm is then derived as a transition between the arm postures. Applications of these results are foreseen in the analysis of human motion patterns. Received: 26 January 1996 / Accepted: 17 July 1997  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号