Similar literature
 13 similar documents found; search took 15 ms
1.
In this paper, we present a framework for visual object tracking based on clustering trajectories of image key points extracted from an image sequence. The main contribution of our method is that the trajectories are automatically extracted from the image sequence and provided directly to a model-based clustering approach. In most other methodologies, the latter constitutes a difficult part since the resulting feature trajectories have a short duration, as the key points disappear and reappear due to occlusion, illumination, viewpoint changes and noise. We present here a sparse, translation-invariant regression mixture model for clustering trajectories of variable length. The overall scheme is converted into a maximum a posteriori approach, where the Expectation–Maximization (EM) algorithm is used for estimating the model parameters. The proposed method detects the different objects in the input image sequence by assigning each trajectory to a cluster, and simultaneously provides their motion. Numerical results demonstrate the ability of the proposed method to offer more accurate and robust solutions in comparison with other tracking approaches, such as the mean shift tracker, the camshift tracker and the Kalman filter.

2.
3.
In this paper, we propose an algorithm for estimating affine parameters from block motion vectors, in order to extract accurate motion information under the assumption that the underlying motion can be characterized by an affine model. The motion may be caused either by a moving camera or by a moving object. The proposed method first extracts motion vectors from a sequence of images by using size-variable block matching and then processes them by adaptive robust estimation to estimate the affine parameters. Typically, a robust estimator filters out outliers (velocity vectors that do not fit the model) by fitting velocity vectors to a predefined model. To filter out potential outliers, our adaptive robust estimation defines a continuous weight function based on a sigmoid function. During the estimation process, we tune the sigmoid function gradually toward its hard-limit form as the errors between the model and the input data decrease, so that the finally tuned hard-limit form of the weight function effectively separates non-outliers from outliers. Experimental results show that the suggested approach is very effective in estimating affine parameters reliably.
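The sigmoid-weighted robust estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the motion vectors are already given (block matching is omitted), assumes the six-parameter model u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y, and uses a fixed geometric slope schedule where the abstract ties the tuning to the decreasing errors. The names `estimate_affine`, `weighted_fit`, `solve3`, `threshold`, `slope` and `slope_growth` are hypothetical.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / M[i][i]
    return x


def weighted_fit(points, values, weights):
    """Weighted least squares for value = p0 + p1*x + p2*y (normal equations)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for (x, y), val, w in zip(points, values, weights):
        phi = (1.0, x, y)
        for i in range(3):
            b[i] += w * phi[i] * val
            for j in range(3):
                A[i][j] += w * phi[i] * phi[j]
    return solve3(A, b)


def estimate_affine(points, vectors, iterations=10,
                    threshold=1.0, slope=1.0, slope_growth=2.0):
    """Iteratively reweighted affine fit with a sigmoid weight function.

    The weight 1 / (1 + exp(slope * (residual - threshold))) is smooth at
    first; multiplying the slope each iteration pushes it toward a hard
    step at `threshold` (assumed schedule, not taken from the paper).
    """
    weights = [1.0] * len(points)
    for _ in range(iterations):
        pu = weighted_fit(points, [u for u, _ in vectors], weights)
        pv = weighted_fit(points, [v for _, v in vectors], weights)
        residuals = [
            math.hypot(u - (pu[0] + pu[1] * x + pu[2] * y),
                       v - (pv[0] + pv[1] * x + pv[2] * y))
            for (x, y), (u, v) in zip(points, vectors)
        ]
        # Clamp the exponent so math.exp cannot overflow for large slopes.
        weights = [1.0 / (1.0 + math.exp(min(slope * (r - threshold), 50.0)))
                   for r in residuals]
        slope *= slope_growth
    return pu + pv, weights
```

The function returns the six affine parameters and the final per-vector weights; vectors whose final weight is close to zero are the ones treated as outliers. The sketch assumes enough inliers remain that the normal equations stay non-singular.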

4.
This paper deals with motion estimation from image corner correspondences in two cases: the orthogonal corner and the general corner with known space angles. The contribution of the paper is threefold: first, the three-dimensional structure of a corner is recovered easily from its image by introducing a new coordinate system; second, it is shown that one corner correspondence and two point correspondences over two views are sufficient to uniquely determine the motion, i.e., the rotation and translation; third, experiments using both simulated data and real images are conducted and give good results.

5.
The problem we address is the discrimination of non-rigid objects capable of holding our attention in a scene. Motion makes it possible to gradually obtain the shapes of all moving objects. We introduce an algorithm that fuses spots obtained by means of neuronal lateral interaction in accumulative computation.

6.
An integrated approach to extracting depth, efficiently and accurately, from a sequence of images is presented in this paper. The method combines the ability of stereo processing to acquire highly accurate depth measurements with the efficiency of spatial and temporal gradient analysis. As a result of this integration, depth measurements of high quality are obtained at a speed approximately ten times greater than that of stereo processing. Without any a priori information about the locations of the points in the scene, the correspondence problem in stereo processing is computationally expensive. In our approach, we use spatial and temporal gradient (STG) analysis, which has been shown to provide depth with great efficiency but limited accuracy, to guide the matching process of stereo. The camera motion used in the approach can be either lateral or axial. Extensive experiments on real scenes have shown the ability of the integrated approach to acquire depth with a mean error of less than 3%.

7.
We investigate how human action recognition can be improved by considering the spatio-temporal layout of actions. From the literature, we adopt a pipeline consisting of STIP features, a random forest to quantize the features into histograms, and an SVM classifier. Our goal is to detect 48 human actions, ranging from simple actions such as walk to complex actions such as exchange. Our contribution is to improve the performance of this pipeline by exploiting a novel spatio-temporal layout of the 48 actions. Here each STIP feature in the video contributes to the histogram bins not by a unity value, but rather by a weight given by its spatio-temporal probability. We propose 6 configurations of spatio-temporal layout, where the varied parameters are the coordinate system and the modeling of the action and its context. Our model of layout does not change any other parameter of the pipeline, requires no re-learning of the random forest, increases the size of the resulting representation by only a factor of two, and adds a minimal computational cost of only a handful of operations per feature. Extensive experiments show that the layout is distinctive of actions that involve trajectories, (dis)appearance, kinematics, and interactions. The visualization of each action’s layout illustrates that our approach is indeed able to model the spatio-temporal patterns of each action. Each layout is experimentally shown to be optimal for a specific set of actions. Generally, the context has more effect than the choice of coordinate system. The most impressive improvements are achieved for complex actions involving items. For 43 out of 48 human actions, the performance is better or equal when spatio-temporal layout is included. In addition, we show that our method outperforms the state of the art on the IXMAS and UT-Interaction datasets.
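The difference between unity-valued and probability-weighted histogram contributions can be sketched as follows. This is only an illustration under assumptions: the quantized word index of each feature and its spatio-temporal probability are taken as given inputs, and the function names `unity_histogram` and `layout_weighted_histogram` are hypothetical, not from the paper.

```python
def unity_histogram(word_ids, n_bins):
    """Baseline bag-of-features histogram: every feature adds 1 to its bin."""
    hist = [0.0] * n_bins
    for word in word_ids:
        hist[word] += 1.0
    return hist


def layout_weighted_histogram(word_ids, layout_probs, n_bins):
    """Each feature adds its spatio-temporal probability instead of 1.

    `layout_probs[i]` is assumed to be the probability of feature i under
    some layout model; how that model is built is outside this sketch.
    """
    hist = [0.0] * n_bins
    for word, prob in zip(word_ids, layout_probs):
        hist[word] += prob
    return hist
```

For example, with word indices `[0, 2, 2, 1]` and probabilities `[0.5, 0.25, 0.25, 1.0]`, the baseline gives `[1.0, 1.0, 2.0]` and the weighted version gives `[0.5, 1.0, 0.5]`, at the cost of one extra lookup and addition per feature.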

8.
This paper deals with the estimation of motion and structure with an absolute scale factor from stereo image sequences without stereo correspondence. We show that the absolute motion and structure can be determined using only motion correspondences. This property is very useful in two aspects: first, motion correspondence is easier to solve than stereo correspondence because sequences of images can be taken at short time intervals; second, it is not necessary that the rigid scene be included in the intersection of the field of view of the two cameras. It is also shown that the degenerate cases reported in this paper constitute all of the degenerate cases for the scheme and can be easily avoided.

9.
One method to detect obstacles from a vehicle moving on a planar road surface is the analysis of motion-compensated difference images. In this contribution, a motion compensation algorithm is presented, which computes the required image-warping parameters from an estimate of the relative motion between camera and ground plane. The proposed algorithm estimates the warping parameters from displacements at image corners and image edges. It exploits the estimated confidence of the displacements to cope robustly with outliers. Knowledge about camera calibration, measurements from odometry, and the previous estimate are used for motion prediction and to stabilize the estimation process when there is not enough information available in the measured image displacements. The motion compensation algorithm has been integrated with modules for obstacle detection and lane tracking. This system has been integrated in experimental vehicles and runs in real time with an overall cycle of 12.5 Hz on low-cost standard hardware. Received: 23 April 1998 / Accepted: 25 August 1999

10.
Most existing approaches to structure from motion for deformable objects focus on non-incremental solutions utilizing batch-type algorithms. All data is collected before shape and motion reconstruction take place. This methodology is inherently unsuitable for applications that require real-time learning. Ideally, an online system is capable of incrementally learning and building accurate shapes using current measurement data and past reconstructed shapes. Estimation of 3D structure and camera position is done online. Relying only on the measurements available up to that moment remains a challenging problem.

11.
This paper presents a novel method for reconstructing a 3D human body pose from stereo image sequences based on a top-down learning method. However, it is inefficient to build a statistical model using all training data. Therefore, the training data is hierarchically divided into several clusters to reduce the complexity of the learning problem. In the learning stage, the human body model database is hierarchically constructed by classifying the training data into several sub-clusters with silhouette images. The data of each cluster in the bottom level is represented by a linear combination of examples. In the reconstruction stage, the proposed method hierarchically searches a cluster for the best matching silhouette image using a silhouette history image (SHI). Then, the 3D human body pose is reconstructed from a depth image using a linear combination of examples method. By using depth information to reconstruct the 3D human body pose, poses that look similar in silhouette images are estimated as different 3D human body poses. The experimental results demonstrate that the proposed method is efficient and effective for reconstructing 3D human body poses.

12.
In this paper, a new approach to motion analysis from stereo image sequences using a unified temporal and spatial optical flow field (UOFF) is reported. Based on a four-frame rectangular model and the associated six UOFF field quantities, a set of equations is derived from which both position and velocity can be determined. The approach does not require feature extraction and correspondence establishment, which are known to be difficult and for which only partial solutions suitable for simplistic situations have been developed. Furthermore, it is capable of detecting multiple moving objects even when partial occlusion occurs, and is potentially suitable for nonrigid motion analysis. Unlike existing techniques for motion analysis from stereo imagery, the motion recovered by this new approach is for a whole continuous field instead of only for some features. It is a purely optical flow approach. Two experiments are presented to demonstrate the feasibility of the approach.

13.
A model-based approach to the reconstruction of 3D human arm motion from a monocular image sequence taken under orthographic projection is presented. The reconstruction is divided into two stages. First, a 2D shape model is used to track the arm silhouettes, and second-order curves are used to model the arm based on an iteratively reweighted least squares method. As a result, 2D stick figures are extracted. In the second stage, the stick figures are backprojected into the scene. 3D postures are reconstructed using the constraints of a 3D kinematic model of the human arm. The motion of the arm is then derived as a transition between the arm postures. Applications of these results are foreseen in the analysis of human motion patterns. Received: 26 January 1996 / Accepted: 17 July 1997
