Similar Documents
20 similar documents retrieved.
1.
In this paper, we present a new framework for three-dimensional (3D) reconstruction of multiple rigid objects from dynamic scenes. Conventional 3D reconstruction from multiple views applies to static scenes, in which the configuration of objects is fixed while the images are taken. In our framework, we aim to reconstruct the 3D models of multiple objects in a more general setting where the configuration of the objects varies among views. We solve this problem by object-centered decomposition of the dynamic scenes using an unsupervised co-recognition approach. Unlike conventional motion segmentation algorithms that require a small-motion assumption between consecutive views, the co-recognition method provides reliable, accurate correspondences of the same object among unordered and wide-baseline views. To segment each object region, we exploit the sparse 3D points obtained from structure from motion. These points are reliable and serve as automatic seed points for a seeded-segmentation algorithm. Experiments on various challenging real image sequences demonstrate the effectiveness of our approach, especially in the presence of abrupt independent motions of objects.

2.
Camera networks have gained increased importance in recent years. Existing approaches mostly use point correspondences between different camera views to calibrate such systems. However, it is often difficult or even impossible to establish such correspondences. But even without feature point correspondences between different camera views, if the cameras are temporally synchronized then the data from the cameras are strongly linked together by the motion correspondence: all the cameras observe the same motion. The present article therefore develops the necessary theory to use this motion correspondence for general rigid as well as planar rigid motions. Given multiple static affine cameras which observe a rigidly moving object and track feature points located on this object, what can be said about the resulting point trajectories? Are there any useful algebraic constraints hidden in the data? Is a 3D reconstruction of the scene possible even if there are no point correspondences between the different cameras? And if so, how many points are sufficient? Is there an algorithm which guarantees finding the correct solution to this highly non-convex problem? This article addresses these questions and thereby introduces the concept of low-dimensional motion subspaces. The constraints provided by these motion subspaces enable an algorithm which ensures finding the correct solution to this non-convex reconstruction problem. The algorithm is based on multilinear analysis, namely matrix and tensor factorizations. Our new approach can handle extreme configurations, e.g. a camera in a camera network tracking only a single point. Results on synthetic as well as real data sequences act as a proof of concept for the presented insights.

3.
We address the problem of epipolar geometry estimation by formulating it as one of hyperplane inference from a sparse and noisy point set in an 8D space. Given a set of noisy point correspondences in two images of a static scene, even in the presence of moving objects, our method extracts good matches and rejects outliers. The methodology is novel and unconventional, since, unlike most other methods that optimize certain scalar objective functions, our approach involves neither initialization nor iterative search in the parameter space. It is therefore free of the problems of local optima and poor convergence. Further, since no search is involved, there is no need to impose simplifying assumptions on the scene being analyzed to reduce the search complexity. Subject only to the general epipolar constraint, we detect wrong matches by a computation scheme, 8D tensor voting, which is an instance of the more general N-dimensional tensor voting framework. In essence, the input set of matches is first transformed into a sparse 8D point set. Dense 8D tensor kernels are then used to vote for the most salient hyperplane that captures all inliers inherent in the input. With this filtered set of matches, the normalized eight-point algorithm can be used to estimate the fundamental matrix accurately. By making use of efficient data structures and locality, our method is both time and space efficient despite the higher dimensionality. We demonstrate the general usefulness of our method on example image pairs for aerial image analysis, with widely different views, and from nonstatic 3D scenes. Each example contains a considerable number of wrong matches.
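As an illustrative aside, the final step this abstract mentions, the normalized eight-point algorithm applied to the filtered matches, can be sketched in a few lines of numpy. This is a hedged sketch of the standard textbook method, not the paper's implementation; all function names are ours.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: translate to centroid, scale mean distance to sqrt(2)."""
    c = pts.mean(axis=0)
    d = np.sqrt(((pts - c) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
    return np.c_[pts, np.ones(len(pts))] @ T.T, T

def eight_point(x1, x2):
    """Estimate F from >= 8 correspondences x1 <-> x2 (N x 2 arrays)."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence x2' F x1 = 0 gives one row of the linear system A f = 0.
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    # Enforce the rank-2 constraint det(F) = 0.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1  # denormalize
```

On clean correspondences the recovered F satisfies the epipolar constraint to machine precision; in practice it would be run only on the inliers retained by the voting step.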

4.
We address the problem of camera motion and 3D structure reconstruction from line correspondences across multiple views, from initialization to final bundle adjustment. One of the main difficulties when dealing with line features is their algebraic representation. First, we consider the triangulation problem. Using Plücker coordinates to represent the 3D lines, we propose a maximum likelihood algorithm that relies on linearizing the Plücker constraint and on a Plücker correction procedure, which computes the closest Plücker coordinates to a given 6-vector. Second, we consider the bundle adjustment problem, which is essentially a nonlinear optimization over camera motion and 3D line parameters. Previous overparameterizations of 3D lines induce gauge freedoms and/or internal consistency constraints. We propose the orthonormal representation, which allows handy nonlinear optimization of 3D lines using the minimal four parameters with an unconstrained optimization engine. We compare our algorithms to existing ones on simulated and real data. Results show that our triangulation algorithm outperforms standard linear and bias-corrected quasi-linear algorithms, and that bundle adjustment using our orthonormal representation yields results similar to the standard maximum likelihood trifocal tensor algorithm, while being usable for any number of views.
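For readers unfamiliar with the representation, the Plücker coordinates of a line and the bilinear constraint that the abstract's correction procedure enforces can be illustrated in a few lines. This is a sketch with illustrative names; the simple correction shown here merely projects the moment onto the plane orthogonal to the direction, whereas the paper's Plücker correction computes the closest valid 6-vector.

```python
import numpy as np

def plucker_from_points(X, Y):
    """Plücker coordinates (direction d, moment m) of the 3D line through X and Y."""
    d = Y - X            # line direction
    m = np.cross(X, Y)   # moment about the origin; equals cross(X, d)
    return d, m

def plucker_constraint(d, m):
    """A valid 6-vector (d, m) must satisfy the bilinear Plücker constraint d . m = 0."""
    return float(np.dot(d, m))

def simple_plucker_correction(d, m):
    """Make d . m = 0 by projecting m onto the plane orthogonal to d.
    Illustrative only; not the paper's (optimal) Plücker correction."""
    return d, m - (np.dot(d, m) / np.dot(d, d)) * d
```

A 6-vector produced by noisy estimation generally violates the constraint, which is why a correction step is needed before the coordinates describe an actual 3D line.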

5.
Sparse optic flow maps are general enough to obtain useful information about camera motion. Usually, correspondences among features over an image sequence are estimated by radiometric similarity. When the camera moves under known conditions, global geometric constraints can be introduced to obtain a more robust estimation of the optic flow. In this paper, a method is proposed for the computation of a robust sparse optic flow (OF) that integrates the geometric constraints induced by camera motion to verify the correspondences obtained by radiometric-similarity-based techniques. A raw OF map is estimated by matching features by correlation. The verification of the resulting correspondences is formulated as an optimization problem that is implemented on a Hopfield neural network (HNN). Additional constraints imposed in the energy function permit us to achieve subpixel accuracy in the image locations of matched features. Convergence of the HNN is reached in a small enough number of iterations to make the proposed method suitable for real-time processing. It is shown that the proposed method is also suitable for identifying independently moving objects in front of a moving vehicle. Received: 26 December 1995 / Accepted: 20 February 1997

6.
7.
We present a novel optimisation framework for multi-body motion segmentation and 3D reconstruction of a set of point trajectories in the presence of missing data. The proposed solution not only assigns the trajectories to the correct motions but also solves for the 3D multi-body shape and fills the missing entries in the measurement matrix. The solution is based on two fundamental principles: each of the multi-body motions is governed by a set of metric constraints given by the specific camera model, and the shape matrix that describes the multi-body 3D shape is generally sparse. We jointly include these constraints in a single optimisation framework which, starting from an initial segmentation, iteratively enforces them in three stages. First, metric constraints are used to estimate the 3D metric shape and to fill the missing entries according to an orthographic camera model. Then, wrongly segmented trajectories are detected using sparse optimisation of the shape matrix. A final reclassification strategy assigns the detected points to the right motion or discards them as outliers. We provide experiments that show consistent improvements over previous approaches on both synthetic and real data.

8.
A terrain and landform reconstruction method based on UAV image sequences
Taking image sequences captured by an ordinary camera on a UAV platform as input, this paper proposes an automated pipeline for 3D reconstruction of the ground. First, a keyframe selection method based on parallax analysis is proposed, and feature points in the keyframe images are robustly extracted and matched. Second, a weighted RANSAC algorithm is used to estimate the fundamental matrix while recovering the set of correctly matched inliers. Using the pre-calibrated camera intrinsic parameters, the relative motion is then computed and optimized. Finally, for the target points to be reconstructed, a method fusing geometric and homography constraints is proposed to achieve fast and accurate matching, and the 3D shape of the terrain is reconstructed by triangulation. Simulation results show that the algorithm achieves a good degree of automation and robustness on image sequences.
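The RANSAC step in this pipeline follows the usual hypothesize-and-verify pattern: fit a model to a minimal random sample, count inliers, keep the best, refit. Below is a generic sketch of plain (unweighted) RANSAC, not the paper's weighted variant, demonstrated on simple line fitting; all names are illustrative.

```python
import numpy as np

def ransac(data, fit, residual, sample_size, thresh, iters=200, rng=None):
    """Generic RANSAC skeleton: repeatedly fit a model to a minimal random
    sample and keep the model with the largest inlier set, then refit."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(data), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(data), sample_size, replace=False)
        model = fit(data[idx])
        inliers = residual(model, data) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit(data[best_inliers]), best_inliers  # final refit on all inliers
```

For fundamental-matrix estimation, `fit` would be an eight-point (or seven-point) solver and `residual` an epipolar distance; the structure of the loop is unchanged.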

9.
Motion segmentation in moving-camera videos is a very challenging task because of the motion dependence between the camera and the moving objects. Camera motion compensation is recognized as an effective approach. However, existing work depends on prior knowledge of the camera motion and scene structure for model selection, which is not always available in practice. Moreover, the image-plane motion suffers from depth variations, which leads to depth-dependent motion segmentation in 3D scenes. To solve these problems, this paper develops a prior-free dependent motion segmentation algorithm by introducing a modified Helmholtz-Hodge decomposition (HHD) based object-motion-oriented map (OOM). By decomposing the image motion (optical flow) into a curl-free and a divergence-free component, all kinds of camera-induced image motion can be represented by these two components in an invariant way. With the help of OOM, HHD identifies the camera-induced image motion as one segment irrespective of depth variations. To segment object motions from the scene, we deploy a novel spatio-temporally constrained quadtree labeling. Extensive experimental results on benchmarks demonstrate that our method improves on the state of the art by 10-20%, even on challenging scenes with complex backgrounds.

10.
When a rigid scene is imaged by a moving camera, the set of all displacements of all points across multiple frames often resides in a low-dimensional linear subspace. Linear subspace constraints have been used successfully in the past for recovering 3D structure and 3D motion information from multiple frames (e.g., by using the factorization method of Tomasi and Kanade (1992, International Journal of Computer Vision, 9:137-154)). These methods assume that the 2D correspondences have been precomputed. However, correspondence estimation is a fundamental problem in motion analysis. In this paper we show how the multi-frame subspace constraints can be used for constraining the 2D correspondence estimation process itself. We show that the multi-frame subspace constraints are valid not only for affine cameras, but also for a variety of imaging models, scene models, and motion models. The multi-frame subspace constraints are first translated from constraints on correspondences to constraints directly on image measurements (e.g., image brightness quantities). These brightness-based subspace constraints are then used for estimating the correspondences, by requiring that all corresponding points across all video frames reside in the appropriate low-dimensional linear subspace. The multi-frame subspace constraints are geometrically meaningful, and are not violated at depth discontinuities, nor when the camera motion changes abruptly. These constraints can therefore replace heuristic constraints commonly used in optical-flow estimation, such as spatial or temporal smoothness.
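The subspace claim is easy to verify numerically for the affine case: stacking the 2D tracks of P rigid points over F views into a 2F x P measurement matrix yields rank at most 4, and at most 3 after centering each row. A small sketch with synthetic data (dimensions chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 30))      # 30 points of a rigid 3D scene
rows = []
for _ in range(6):                    # 6 affine views
    A = rng.standard_normal((2, 3))   # affine projection part of the view
    t = rng.standard_normal((2, 1))   # per-view 2D translation
    rows.append(A @ X + t)
W = np.vstack(rows)                   # measurement matrix, shape (12, 30)

# Tomasi-Kanade style constraints: rank(W) <= 4, and <= 3 after
# subtracting each row's mean (which removes the translations).
Wc = W - W.mean(axis=1, keepdims=True)
```

Each row of `W` is an affine function of the same 3D coordinates, so all rows live in the span of the three rows of `X` plus the constant vector; centering removes the constant direction.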

11.
Silhouette coherence for camera calibration under circular motion
We present a new approach to camera calibration as a part of a complete and practical system to recover digital copies of sculpture from uncalibrated image sequences taken under turntable motion. In this paper, we introduce the concept of the silhouette coherence of a set of silhouettes generated by a 3D object. We show how the maximization of the silhouette coherence can be exploited to recover the camera poses and focal length. Silhouette coherence can be considered as a generalization of the well-known epipolar tangency constraint for calculating motion from silhouettes or outlines alone. Further, silhouette coherence exploits all the geometric information encoded in the silhouette (not just at epipolar tangency points) and can be used in many practical situations where point correspondences or outer epipolar tangents are unavailable. We present an algorithm for exploiting silhouette coherence to efficiently and reliably estimate camera motion. We use this algorithm to reconstruct very high quality 3D models from uncalibrated circular motion sequences, even when epipolar tangency points are not available or the silhouettes are truncated. The algorithm has been integrated into a practical system and has been tested on more than 50 uncalibrated sequences to produce high quality photo-realistic models. Three illustrative examples are included in this paper. The algorithm is also evaluated quantitatively by comparing it to a state-of-the-art system that exploits only epipolar tangents.

12.
We address the problem of estimating three-dimensional motion and structure from motion with an uncalibrated moving camera. We show that point correspondences between three images, and the fundamental matrices computed from these point correspondences, are sufficient to recover the internal orientation of the camera (its calibration) and the motion parameters, and to compute coherent perspective projection matrices which enable us to reconstruct 3-D structure up to a similarity. In contrast with other methods, no calibration object with a known 3-D shape is needed, and no limitations are put upon the unknown motions to be performed or the parameters to be recovered, as long as they define a projective camera. The theory of the method, which is based on the constraint that the observed points are part of a static scene, thus allowing us to link the intrinsic parameters and the fundamental matrix via the absolute conic, is first detailed. Several algorithms are then presented, and their performance is compared by means of extensive simulations and illustrated by several experiments with real images.

13.
This paper addresses the problem of self-calibration from one unknown motion of an uncalibrated stereo rig. Unlike existing methods for stereo rig self-calibration, which have focused on applying the autocalibration paradigm using both motion and stereo correspondences, our method does not require the recovery of stereo correspondences. It combines purely algebraic constraints with implicit geometric constraints. Assuming that the rotational part of the stereo geometry has two unknown degrees of freedom (i.e., the third dof is roughly known), and that the principal point of each camera is known, we first show that the intrinsic and extrinsic parameters of the stereo rig can be recovered from the motion correspondences only, i.e., the monocular fundamental matrices. We then provide an initialization procedure for the proposed non-linear method. We provide an extensive performance study of the method in the presence of image noise. In addition, we study some of the aspects of the 3D motion that govern the accuracy of the proposed self-calibration method. Experiments conducted on synthetic and real data/images demonstrate the effectiveness and efficiency of the proposed method.

14.
Stereo by intra- and inter-scanline search using dynamic programming
This paper presents a stereo matching algorithm using the dynamic programming technique. The stereo matching problem, that is, obtaining correspondence between right and left images, can be cast as a search problem. When a pair of stereo images is rectified, pairs of corresponding points can be searched for within the same scanlines. We call this search intra-scanline search. This intra-scanline search can be treated as the problem of finding a matching path on a two-dimensional (2D) search plane whose axes are the right and left scanlines. Vertically connected edges in the images provide consistency constraints across the 2D search planes. Inter-scanline search in a three-dimensional (3D) search space, which is a stack of the 2D search planes, is needed to utilize this constraint. Our stereo matching algorithm uses edge-delimited intervals as the elements to be matched, and employs the two searches mentioned above: inter-scanline search for possible correspondences of connected edges in the right and left images, and intra-scanline search for correspondences of edge-delimited intervals on each scanline pair. Dynamic programming is used for both searches, which proceed simultaneously: the former supplies the consistency constraint to the latter while the latter supplies the matching score to the former. An interval-based similarity metric is used to compute the score. The algorithm has been tested with different types of images, including urban aerial images, synthesized images, and block scenes, and its computational requirements are discussed.
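The intra-scanline search described above is a classic sequence-alignment dynamic program. The following is a minimal, hedged sketch of that idea alone, matching raw pixel intensities on one scanline pair with a fixed occlusion penalty; the paper additionally matches edge-delimited intervals and couples scanlines through inter-scanline search, which this sketch omits.

```python
import numpy as np

def scanline_dp(left, right, occ=1.0):
    """Match two rectified scanlines by dynamic programming.
    Moves: match a pixel pair, or skip (occlude) a pixel in either line."""
    n, m = len(left), len(right)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, :] = occ * np.arange(m + 1)          # leading occlusions in right
    cost[:, 0] = occ * np.arange(n + 1)          # leading occlusions in left
    back = np.zeros((n + 1, m + 1), dtype=int)   # 0=match, 1=skip left, 2=skip right
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = (cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),
                 cost[i - 1, j] + occ,
                 cost[i, j - 1] + occ)
            back[i, j] = int(np.argmin(c))
            cost[i, j] = c[back[i, j]]
    # Backtrack to recover the matched pixel pairs (i, j).
    i, j, matches = n, m, []
    while i > 0 and j > 0:
        if back[i, j] == 0:
            matches.append((i - 1, j - 1)); i -= 1; j -= 1
        elif back[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return matches[::-1]
```

On a scanline pair that is simply a shifted copy, the matched pairs recover the shift (the disparity) for the distinctive pixels.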

15.
The majority of visual simultaneous localization and mapping (SLAM) approaches consider feature correspondences as an input to the joint process of estimating the camera pose and the scene structure. In this paper, we propose a new approach for simultaneously obtaining the correspondences, the camera pose, the scene structure, and the illumination changes, all directly using image intensities as observations. Exploitation of all possible image information leads to more accurate estimates and avoids the inherent difficulties of reliably associating features. We also show here that, in this case, structural constraints can be enforced within the procedure as well (instead of a posteriori), namely the cheirality, the rigidity, and those related to the lighting variations. We formulate the visual SLAM problem as a nonlinear image alignment task. The proposed parameters to perform this task are optimally computed by an efficient second-order approximation method for fast processing and avoidance of irrelevant minima. Furthermore, a new solution to the visual SLAM initialization problem is described whereby no assumptions are made about either the scene or the camera motion. Experimental results are provided for a variety of scenes, including urban and outdoor ones, under general camera motion and different types of perturbations.

16.
The classic approach to structure from motion entails a clear separation between motion estimation and structure estimation, and between two-dimensional (2D) and three-dimensional (3D) information. For the recovery of the rigid transformation between different views, only 2D image measurements are used. To have enough information available, most existing techniques are based on the intermediate computation of optical flow which, however, poses a problem at the locations of depth discontinuities. If we knew where depth discontinuities were, we could (using a multitude of approaches based on smoothness constraints) accurately estimate flow values for image patches corresponding to smooth scene patches; but to know the discontinuities requires solving the structure from motion problem first. This paper introduces a novel approach to structure from motion which addresses the processes of smoothing, 3D motion, and structure estimation in a synergistic manner. It provides an algorithm for estimating the transformation between two views obtained by either a calibrated or uncalibrated camera. The results of the estimation are then utilized to perform a reconstruction of the scene from a short sequence of images. The technique is based on constraints on image derivatives which involve the 3D motion and shape of the scene, leading to a geometric and statistical estimation problem. The interaction between 3D motion and shape allows us to estimate the 3D motion while at the same time segmenting the scene. If we use a wrong 3D motion estimate to compute depth, we obtain a distorted version of the depth function. The distortion, however, is such that the worse the motion estimate, the more likely we are to obtain depth estimates that vary locally more than the correct ones.
Since local variability of depth is due either to the existence of a discontinuity or to a wrong 3D motion estimate, being able to differentiate between these two cases provides the correct motion, which yields the least varying estimated depth as well as the image locations of scene discontinuities. We analyze the new constraints, show their relationship to the minimization of the epipolar constraint, and present experimental results using real image sequences that indicate the robustness of the method.

17.
Efficient visibility computation is a prominent requirement when designing automated camera control techniques for dynamic 3D environments; computer games, interactive storytelling, and 3D media applications all need to track 3D entities while ensuring their visibility and delivering a smooth cinematic experience. Addressing this problem requires sampling a large set of potential camera positions and estimating visibility for each of them, which in practice is intractable despite the efficiency of ray-casting techniques on recent platforms. In this work, we introduce a novel GPU rendering technique to efficiently compute occlusions of tracked targets in Toric Space coordinates – a parametric space designed for cinematic camera control. We then rely on this occlusion evaluation to derive an anticipation map predicting occlusions for a continuous set of cameras over a user-defined time window. We finally design a camera motion strategy exploiting this anticipation map to minimize the occlusions of tracked entities over time. The key features of our approach are demonstrated through comparison with traditionally used ray-casting on benchmark scenes, and through integration in multiple game-like 3D scenes with heavy, sparse, and dense occluders.

18.
Five-point and four-point algorithms for the fundamental matrix
The fundamental matrix encodes the basic constraint between two images and plays a crucial role in camera calibration and 3D reconstruction. This paper proves that when the camera motion between the two images is a pure translation, given 5 pairs of corresponding image points of which 4 are projections of coplanar space points (called coplanar correspondences), the fundamental matrix can be determined linearly. Furthermore, if the camera follows a 4-parameter model (the skew factor is zero) rather than the full 5-parameter pinhole model, the 4 coplanar correspondences alone suffice to determine the fundamental matrix linearly. To the best of our knowledge, no similar results have been reported in the literature.

19.
We address the problem of finding the correspondences between two point sets in 3D related by a rigid transformation. Using these correspondences, the motion between the two sets can be computed to perform registration. Our approach is based on an analysis of the rigid motion equations as expressed in the Geometric Algebra framework. This analysis shows that the problem can be cast as one of finding a certain 3D plane in a different space that satisfies certain geometric constraints. To find this plane robustly, the Tensor Voting methodology is used. Unlike other common algorithms for point registration (such as the Iterated Closest Points algorithm), ours does not require initialization, works equally well with small and large transformations, cannot be trapped in “local minima”, and works even in the presence of large amounts of outliers. We also show that the algorithm is easily extended to account for multiple motions and certain non-rigid or elastic transformations.
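Once correspondences are available, the step the abstract takes for granted, computing the rigid motion between the two sets, has a classical closed-form solution via the SVD (the Procrustes/Kabsch method). This is the standard textbook step, not the paper's contribution; a minimal numpy sketch:

```python
import numpy as np

def rigid_fit(P, Q):
    """Closed-form least-squares rigid motion (R, t) with Q ~ R @ P + t,
    given correspondences P[i] <-> Q[i] (N x 3 arrays)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                  # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    # Correct the sign so R is a proper rotation (no reflection).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp
```

With exact correspondences the true rotation and translation are recovered to machine precision; with noisy or partially wrong correspondences this would sit inside an outlier-robust loop such as the one the paper proposes.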

20.
