Similar literature
 20 similar documents found (search time: 31 ms)
1.
We consider the problem of segmenting multiple rigid-body motions from point correspondences in multiple affine views. We cast this problem as a subspace clustering problem in which point trajectories associated with each motion live in a linear subspace of dimension two, three or four. Our algorithm involves projecting all point trajectories onto a 5-dimensional subspace using the SVD, the PowerFactorization method, or RANSAC, and fitting multiple linear subspaces representing different rigid-body motions to the points in ℝ⁵ using GPCA. Unlike previous work, our approach does not restrict the motion subspaces to be four-dimensional and independent. Instead, it deals gracefully with the full spectrum of possible affine motions: from two-dimensional and partially dependent to four-dimensional and fully independent. Our algorithm can handle the case of missing data, meaning that point tracks do not have to be visible in all images, by using the PowerFactorization method to project the data. In addition, our method can handle outlying trajectories by using RANSAC to perform the projection. We compare our approach to other methods on a database of 167 motion sequences with full motions, independent motions, degenerate motions, partially dependent motions, missing data, outliers, etc. On motion sequences with complete data our method achieves a misclassification error of less than 5% for two motions and 29% for three motions.
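
The SVD-based projection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes complete data (no missing tracks, no outliers) and a trajectory matrix W of size 2F x P whose columns are point trajectories; the function name project_trajectories and the synthetic data are hypothetical and not taken from the paper, and the subsequent GPCA fitting is not shown.

```python
import numpy as np

def project_trajectories(W, d=5):
    """Project the columns of W (2F x P) onto the top-d principal subspace.

    Returns a d x P matrix: the coordinates of each trajectory in the
    basis given by the first d left singular vectors of W.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return s[:d, None] * Vt[:d]  # equals U[:, :d].T @ W

# Synthetic stand-in for real tracks (an assumption for illustration):
# two motions, each spanning a 4-dimensional subspace of R^(2F).
rng = np.random.default_rng(0)
F, P = 10, 20
motion_a = rng.standard_normal((2 * F, 4)) @ rng.standard_normal((4, P))
motion_b = rng.standard_normal((2 * F, 4)) @ rng.standard_normal((4, P))
W = np.hstack([motion_a, motion_b])  # 2F x 2P trajectory matrix

X = project_trajectories(W)          # 5 x 2P points, ready for subspace fitting
print(X.shape)
```

With missing data or outliers, the abstract indicates that PowerFactorization or RANSAC would take the place of the plain SVD used here.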

2.
We address the problem of simultaneous two-view epipolar geometry estimation and motion segmentation from nonstatic scenes. Given a set of noisy image pairs containing matches of n objects, we propose an unconventional, efficient, and robust method, 4D tensor voting, for estimating the unknown n epipolar geometries, and segmenting the static and motion matching pairs into n independent motions. By considering the 4D isotropic and orthogonal joint image space, only two tensor voting passes are needed, and a very high noise-to-signal ratio (up to five) can be tolerated. Epipolar geometries corresponding to multiple rigid motions are extracted in succession. Only two uncalibrated frames are needed, and no simplifying assumption (such as affine camera model or homographic model between images) other than the pin-hole camera model is made. Our novel approach consists of propagating a local geometric smoothness constraint in the 4D joint image space, followed by global consistency enforcement for extracting the fundamental matrices corresponding to independent motions. We have performed extensive experiments to compare our method with some representative algorithms to show that better performance on nonstatic scenes is achieved. Results on challenging data sets are presented.

3.
Generalized principal component analysis (GPCA)   Total citations: 3, self-citations: 0, citations by others: 3
This paper presents an algebro-geometric solution to the problem of segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points. We represent the subspaces with a set of homogeneous polynomials whose degree is the number of subspaces and whose derivatives at a data point give normal vectors to the subspace passing through the point. When the number of subspaces is known, we show that these polynomials can be estimated linearly from data; hence, subspace segmentation is reduced to classifying one point per subspace. We select these points optimally from the data set by minimizing a certain distance function, thus dealing automatically with moderate noise in the data. A basis for the complement of each subspace is then recovered by applying standard PCA to the collection of derivatives (normal vectors). Extensions of GPCA that deal with data in a high-dimensional space and with an unknown number of subspaces are also presented. Our experiments on low-dimensional data show that GPCA outperforms existing algebraic algorithms based on polynomial factorization and provides a good initialization to iterative techniques such as k-subspaces and expectation maximization. We also present applications of GPCA to computer vision problems such as face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views.
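
The idea that a polynomial fitted linearly to the data yields subspace normals through its derivatives can be sketched for the simplest special case: two planes through the origin in ℝ³, noise-free data, and a known number of subspaces. This is an illustrative sketch under those assumptions, not the paper's full algorithm; the helper names (veronese2, fit_vanishing_quadratic, unit_normals, sample_plane) and the thresholding used for labeling are hypothetical.

```python
import numpy as np

def veronese2(X):
    """Degree-2 monomials of 3-D points: [x^2, xy, xz, y^2, yz, z^2]."""
    x, y, z = X[:, 0], X[:, 1], X[:, 2]
    return np.stack([x * x, x * y, x * z, y * y, y * z, z * z], axis=1)

def fit_vanishing_quadratic(X):
    """Coefficients c of a quadratic p with p(x) ~ 0 on every row of X.

    Estimated linearly: c is the right singular vector of the embedded
    data matrix with the smallest singular value.
    """
    _, _, Vt = np.linalg.svd(veronese2(X), full_matrices=False)
    return Vt[-1]

def unit_normals(c, X):
    """Normalized gradient of p at each point: a normal to that point's plane."""
    x, y, z = X[:, 0], X[:, 1], X[:, 2]
    g = np.stack([2 * c[0] * x + c[1] * y + c[2] * z,
                  c[1] * x + 2 * c[3] * y + c[4] * z,
                  c[2] * x + c[4] * y + 2 * c[5] * z], axis=1)
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def sample_plane(normal, n, rng):
    """Random points on the plane through the origin with the given unit normal."""
    P = rng.standard_normal((n, 3))
    return P - np.outer(P @ normal, normal)

rng = np.random.default_rng(0)
b1 = np.array([0.0, 0.0, 1.0])
b2 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
X = np.vstack([sample_plane(b1, 30, rng), sample_plane(b2, 30, rng)])

c = fit_vanishing_quadratic(X)
N = unit_normals(c, X)
# Crude labeling for this toy case only: compare each normal with the normal
# at the first point (the 0.5 threshold assumes well-separated planes).
labels = (np.abs(N @ N[0]) < 0.5).astype(int)
```

Here p(x) = (b1·x)(b2·x), so at a point on the first plane the gradient is proportional to b1. The optimal point selection, PCA over derivatives, and handling of subspaces with different dimensions described in the abstract are beyond this sketch.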

4.
Recently, subspace constraints have been widely exploited in many computer vision problems such as multibody grouping. Under linear projection models, feature points associated with multiple bodies reside in multiple subspaces. Most existing factorization-based algorithms can segment objects undergoing independent motions. However, intersections among the correlated motion subspaces will lead most previous factorization-based algorithms to erroneous segmentation. To overcome this limitation, in this paper, we formulate the problem of multibody grouping as inference of multiple subspaces from a high-dimensional data space. A novel and robust algorithm is proposed to capture the configuration of the multiple subspace structure and to find the segmentation of objects by clustering the feature points into these inferred subspaces, no matter whether they are independent or correlated. In the proposed method, an oriented-frame (OF), which is a multidimensional coordinate frame, is associated with each data point indicating the point's preferred subspace configuration. Based on the similarity between the subspaces, novel mechanisms of subspace evolution and voting are developed. By filtering the outliers due to their structural incompatibility, the subspace configurations will emerge. Compared with most existing factorization-based algorithms that cannot correctly segment correlated motions, such as motions of articulated objects, the proposed method performs robustly in both independent and correlated motion segmentation. A number of controlled and real experiments show the effectiveness of the proposed method. However, the current approach does not deal with transparent motions and motion subspaces of different dimensions.

5.
In this paper we show how to carry out an automatic alignment of a pan-tilt camera platform with its natural coordinate frame, using only images obtained from the cameras during controlled motion of the unit. An active camera in aligned orientation represents the zero position for each axis, and allows axis odometry to be referred to a fixed reference frame; such referral is otherwise only possible using mechanical means, such as end-stops, which cannot take account of the unknown relationship between the camera coordinate frame and its mounting. The algorithms presented involve the calculation of two-view transformations (homographies or epipolar geometry) between pairs of images related by controlled rotation about individual head axes. From these relationships, which can be calculated linearly or optimised iteratively, an invariant line to the motion can be extracted which represents an aligned viewing direction. We present methods for general and degenerate motion (translating or non-translating), and general and degenerate scenes (non-planar and planar, but otherwise unknown), which do not require knowledge of the camera calibration, and are resistant to lens distortion non-linearity. Detailed experimentation in simulation, and in real scenes, demonstrates the speed, accuracy, and robustness of the methods, with the advantages of applicability to a wide range of circumstances and no need to involve calibration objects or complex motions. Accuracy of within half a degree can be achieved with a single motion, and we also show how to improve on this by incorporating images from further motions, using a natural extension of the basic algorithm.

6.
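A real-time algorithm for segmenting moving objects in the H.264 compressed domain, based on the recursive shortest spanning tree algorithm, is proposed. First, the motion vectors extracted from the H.264 encoder are normalized and spatially interpolated to obtain a dense motion vector field. Global motion compensation is then applied to cancel the effect of global motion. Finally, an improved recursive shortest spanning tree (RSST) algorithm clusters the dense motion vectors to segment the moving objects. Experimental results show that the algorithm segments video sequences fairly accurately.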

7.
We present a new approach to motion rearrangement that preserves the syntactic structures of an input motion automatically by learning a context‐free grammar from the motion data. For grammatical analysis, we reduce an input motion into a string of terminal symbols by segmenting the motion into a series of subsequences, and then associating a group of similar subsequences with the same symbol. To obtain the most repetitive and precise set of terminals, we search for an optimal segmentation such that a large number of subsequences can be clustered into groups with little error. Once the input motion has been encoded as a string, a grammar induction algorithm is employed to build up a context‐free grammar so that the grammar can reconstruct the original string accurately as well as generate novel strings sharing their syntactic structures with the original string. Given any new strings from the learned grammar, it is straightforward to synthesize motion sequences by replacing each terminal symbol with its associated motion segment, and stitching every motion segment sequentially. We demonstrate the usefulness and flexibility of our approach by learning grammars from a large diversity of human motions, and reproducing their syntactic structures in new motion sequences.

8.
Motion segmentation and non-rigid structure from motion are two challenging computer vision problems that have attracted considerable research interest. While previous works handle these two problems separately, we present a general motion segmentation framework in this paper for solving these two seemingly different problems in a unified manner. At the heart of our general motion segmentation framework is a model selection mechanism based on finding the minimal basis subspace representation, by seeking the joint sparse representation of the data matrix. However, such a formulation is NP-hard and we solve the convex proxy instead. Unlike in other compressive sensing related works, this convex proxy solution is insufficient for our problem. The convex relaxation artefacts and noise yield multiple subspace representations, making identification of the exact number of motion subspaces challenging. We solve for the right number of subspaces by transforming this problem into a Facility Location problem with global cost and solve the factor graph formulation using max product belief propagation message passing.

9.
10.
Bio-inspired energy models compute motion along the lines suggested by the neurophysiological studies of V1 and MT areas in both monkeys and humans: neural populations extract the structure of motion from local competition among MT-like cells. We describe here a neural structure that works as a dynamic filter above this MT layer for image segmentation and takes advantage of neural population coding in the cortical processing areas. We apply the model to the real-life case of an automatic watch-out system for car-overtaking situations seen from the rear-view mirror. The ego-motion of the host car induces a global motion pattern whereas an overtaking vehicle produces a pattern that contrasts highly with this global ego-motion field. We describe how a simple, competitive, neural processing scheme can take full advantage of this motion structure for segmenting overtaking cars.

11.
We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain two-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns of hand gestures can be extracted and recognized accurately using motion trajectories.

12.
Time-varying imagery is often described in terms of image flow fields (i.e., image motion), which correspond to the perceptive projection of feature motions in three dimensions (3D). In the case of multiple moving objects with smooth surfaces, the image flow possesses an analytic structure that reflects these 3D properties. This paper describes the analytic structure of image flow fields in the image space-time domain, and its use for segmentation and 3D motion computation. First we discuss the local flow structure as embodied in the concept of neighborhood deformation. The local image deformation is effectively represented by a set of 12 basis deformations, each of which is responsible for an independent deformation. This local representation provides us with sufficient information for the recovery of 3D object structure and motion, in the case of relative rigid body motions. We next discuss the global flow structure embodied in the partitioning of the entire image plane into analytic regions separated by boundaries of analyticity, such that each small neighborhood within the analytic region is described in terms of deformation bases. This analysis reveals an effective mechanism for detecting the analytic boundaries of flow fields, thereby segmenting the image into meaningful regions. The notion of consistency which is often used in image segmentation is made explicit by the mathematical notion of analyticity derived from the projection relation of 3D object motion. The concept of flow analyticity is then extended to the temporal domain, suggesting a more robust algorithm for recovering image flow from multiple frames. Finally, we argue that the process of flow segmentation can be understood in the framework of a grouping process. The general concept of coherence or grouping through local support (such as the second-order flows in our case) is discussed.

13.
In this paper, we deal with the problem of synthesizing novel motions of standing-up martial arts such as Kickboxing, Karate, and Taekwondo performed by a pair of human-like characters while reflecting their interactions. Adopting an example-based paradigm, we address three non-trivial issues embedded in this problem: motion modeling, interaction modeling, and motion synthesis. For the first issue, we present a semi-automatic motion labeling scheme based on force-based motion segmentation and learning-based action classification. We also construct a pair of motion transition graphs each of which represents an individual motion stream. For the second issue, we propose a scheme for capturing the interactions between two players. A dynamic Bayesian network is adopted to build a motion transition model on top of the coupled motion transition graph that is constructed from an example motion stream. For the last issue, we provide a scheme for synthesizing a novel sequence of coupled motions, guided by the motion transition model. Although the focus of the present work is on martial arts, we believe that the framework of the proposed approach can be conveyed to other two-player motions as well.

14.
Normalized cuts and image segmentation   Total citations: 60, self-citations: 0, citations by others: 60
We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
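
The generalized eigenvalue formulation can be sketched on a toy graph. The sketch assumes the commonly cited relaxation (D − W) y = λ D y, where W is a symmetric affinity matrix and D its diagonal degree matrix, solved through the equivalent symmetric problem; the function name normalized_cut_bipartition, the Gaussian affinity, and the split at zero are illustrative assumptions rather than details given in the abstract.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way split from the second-smallest generalized eigenvector.

    Solves (D - W) y = lambda * D y via the symmetric matrix
    I - D^{-1/2} W D^{-1/2}, then thresholds y at zero.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]       # map back to the generalized eigenvector
    return y > 0

# Toy data: six 1-D points forming two well-separated groups.
pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-(pts[:, None] - pts[None, :]) ** 2 / 2.0)  # Gaussian affinity
labels = normalized_cut_bipartition(W)
print(labels)
```

Thresholding at zero is only one way to turn the continuous eigenvector into a partition; other splitting points along the eigenvector could be evaluated against the normalized cut criterion instead.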

15.
Fractal geometry is receiving increased attention as a model for natural phenomena. In this paper we first present a new method for estimating the fractal dimension from image surfaces and show that it performs better at describing and segmenting generated fractal sets. Since the fractal dimension alone is not sufficient to characterize natural textures, we define a new class of texture measures based on the concept of lacunarity and use them, together with the fractal dimension, to describe and segment natural texture images.

16.
Motion constraint   Total citations: 3, self-citations: 0, citations by others: 3
In this paper, we propose a hybrid postural control approach taking advantage of data-driven and goal-oriented methods while overcoming their limitations. In particular, we take advantage of the latent space characterizing a given motion database. We introduce a motion constraint operating in the latent space to benefit from its much smaller dimension compared to the joint space. This allows its transparent integration into a Prioritized Inverse Kinematics framework. If its priority is high, the constraint may restrict the solution to lie within the motion database space. We are more interested in the alternate case of an intermediate priority level that channels the postural control through a spatiotemporal pattern representative of the motion database while achieving a broader range of goals. We illustrate this concept with a sparse database of large-range full-body reach motions.

17.
We present a simple and fast geometric method for modeling data by a union of affine subspaces. The method begins by forming a collection of local best-fit affine subspaces, i.e., subspaces approximating the data in local neighborhoods. The correct sizes of the local neighborhoods are determined automatically by the Jones' β₂ numbers (we prove under certain geometric conditions that our method finds the optimal local neighborhoods). The collection of subspaces is further processed by a greedy selection procedure or a spectral method to generate the final model. We discuss applications to tracking-based motion segmentation and clustering of faces under different illuminating conditions. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the suggested algorithms on these problems and also on synthetic hybrid linear data as well as the MNIST handwritten digits data; and we demonstrate how to use our algorithms for fast determination of the number of affine subspaces.

18.
A General Framework for Assembly Planning: The Motion Space Approach   Total citations: 2, self-citations: 0, citations by others: 2
Assembly planning is the problem of finding a sequence of motions to assemble a product from its parts. We present a general framework for finding assembly motions based on the concept of motion space. Assembly motions are parameterized such that each point in motion space represents a mating motion that is independent of the moving part set. For each motion we derive blocking relations that explicitly state which parts collide with other parts; each subassembly (rigid subset of parts) that does not collide with the rest of the assembly can easily be derived from the blocking relations. Motion space is partitioned into an arrangement of cells such that the blocking relations are fixed within each cell. We apply the approach to assembly motions of several useful types, including one-step translations, multistep translations, and infinitesimal rigid motions. Several efficiency improvements are described, as well as methods to include additional assembly constraints into the framework. The resulting algorithms have been implemented and tested extensively on complex assemblies. We conclude by describing some remaining open problems. Received November 15, 1996; revised January 15, 1998.

19.
Many video sequences consist of a locally dynamic background containing moving foreground subjects. In this paper we propose a novel way of re‐displaying these sequences, by giving the user control over a virtual camera frame. Based on video mosaicing, we first compute a static high quality background panorama. After segmenting and removing the foreground subjects from the original video, the remaining elements are merged into a dynamic background panorama, which seamlessly extends the original video footage. We then re‐display this augmented video by warping and cropping the panorama. The virtual camera can have an enlarged field‐of‐view and a controlled camera motion. Our technique is able to process videos with complex camera motions, reconstructing high quality panoramas without parallax artefacts, visible seams or blurring, while retaining repetitive dynamic elements.

20.
We present an interactive method for creating animation sequences of characters based on captured motion data in an exploratory way as in assembling construction toys. The key component of our method is a path browser that can retrieve and visualize paths as diverse as possible connecting a given pair of initial and final motion fragments instantiated in the space. With the aid of our path browser, the user can develop large‐scale assembly of motions gradually through iterations of arranging and putting together motion fragments. For the efficient retrieval of connecting paths, we use a bidirectional search tree that grows from the initial and final configurations simultaneously under the guidance of a mixed strategy for both global exploration and local optimization. The usefulness of our approach is demonstrated through experiments with a variety of motion data including box moving, basketball, and breakdance data. Copyright © 2016 John Wiley & Sons, Ltd.
