Similar documents
Found 20 similar documents (search time: 31 ms)
1.
Existing approaches that recover the structure of 3D deformable objects and camera motion parameters from uncalibrated images assume the object's shape can be modelled well by a linear subspace. These methods have proven effective and well suited when the deformations are relatively small, but fail to reconstruct objects with relatively large deformations. This paper describes a novel approach for 3D non-rigid shape reconstruction based on a manifold decision forest technique. The use of this technique can be justified by noting that a specific type of shape variation might be governed by only a small number of parameters, and therefore can be well represented in a low-dimensional manifold. The key contributions of this work are the use of random decision forests for shape manifold learning and a robust metric for calculating the re-projection error. The learned manifold defines constraints imposed on the reconstructed shapes. Due to the nonlinear structure of the learned manifold, this approach is better suited to large and complex object deformations than linear constraints. The robust metric is applied to reduce the effect of measurement outliers on the quality of the reconstruction. In many practical applications outliers cannot be completely removed, so the use of robust techniques is of particular practical interest. The proposed method is validated on 2D point sequences projected from 3D motion capture data for ground-truth comparison, and also on real 2D video sequences. Experiments show that the newly proposed method outperforms previously proposed ones, including in robustness with respect to measurement noise, missing measurements and outliers present in the data.
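The abstract does not specify which robust metric is used; as an illustrative sketch, a Huber penalty on the 2D reprojection residuals is one common choice (the function names and camera parametrization here are assumptions, not the paper's implementation):

```python
import numpy as np

def huber(r, delta=1.0):
    """Huber penalty: quadratic near zero, linear in the tails,
    so large outlier residuals contribute less than under least squares."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def robust_reprojection_error(X3d, x2d, P, delta=1.0):
    """Sum of Huber penalties over 2D reprojection residuals.
    X3d: (N,3) points, x2d: (N,2) observations, P: (3,4) camera matrix."""
    Xh = np.hstack([X3d, np.ones((len(X3d), 1))])   # homogeneous coordinates
    proj = (P @ Xh.T).T
    proj = proj[:, :2] / proj[:, 2:3]                # perspective divide
    residuals = (proj - x2d).ravel()
    return huber(residuals, delta).sum()
```

A single gross outlier thus contributes linearly rather than quadratically, which is what keeps the reconstruction from being dragged toward bad measurements.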

2.
《Real》1995,1(2):127-138
For surface reconstruction using motion, objects are placed on a rotating disc in front of a single camera. For camera calibration, the method by Tsai was implemented, extended (calculation of distorted from undistorted coordinates) and optimized (e.g. with respect to the number of calibration planes and points in each plane) (1). The way the calibration results can be used for this special case of surface reconstruction of objects on a rotating disc is described. Motion vectors calculated from point correspondences are used as input for the calculation of 3-D point positions. In two theorems, new reconstruction formulae are given. Experimentally, accurate depth values could be obtained for sparse object surface points. It is suggested to combine these exact values with "surface drafts" calculated by approaches based on reflectance properties.

3.
The metric reconstruction of a non-rigid object viewed by a generic camera poses new challenges, since current approaches to Structure from Motion assume the rigidity of a shape as an essential condition. In this work, we focus on the estimation of the 3-D Euclidean shape and motion of a non-rigid shape observed by a perspective camera. In this case deformation and perspective effects are difficult to decouple – the parametrization of the 3-D non-rigid body may mistakenly account for the perspective distortion. Our method relies on the fact that it is often reasonable to assume that some of the points on the object's surface deform throughout the sequence while others remain rigid. Thus, relying on the rigidity constraints of a subset of rigid points, we estimate the perspective-to-metric upgrade transformation. First, we use an automatic segmentation algorithm to identify the set of rigid points. These are then used to estimate the internal camera calibration parameters and the overall rigid motion. Finally, we formulate the problem of non-rigid shape and motion estimation as a non-linear optimization in which the objective function to be minimized is the image reprojection error. The prior information that some of the points on the object are rigid can also be added as a constraint to the non-linear minimization scheme in order to avoid ambiguous configurations. We perform experiments on different synthetic and real data sets which show that, even when using a minimal set of rigid points and when varying the intrinsic camera parameters, it is possible to obtain reliable metric information.

4.
《Advanced Robotics》2013,27(10):1057-1072
It is an easy task for the human visual system to gaze continuously at an object moving in three-dimensional (3-D) space. While tracking the object, human vision seems able to comprehend its 3-D shape with binocular vision. We conjecture that, in the human visual system, the function of comprehending the 3-D shape is essential for robust tracking of a moving object. In order to examine this conjecture, we constructed an experimental system of binocular vision for motion tracking. The system is composed of a pair of active pan-tilt cameras and a robot arm. The cameras are for simulating the two eyes of a human while the robot arm is for simulating the motion of the human body below the neck. The two active cameras are controlled so as to fix their gaze at a particular point on an object surface. The shape of the object surface around the point is reconstructed in real-time from the two images taken by the cameras based on the differences in the image brightness. If the two cameras successfully gaze at a single point on the object surface, it is possible to reconstruct the local object shape in real-time. At the same time, the reconstructed shape is used for keeping a fixation point on the object surface for gazing, which enables robust tracking of the object. Thus these two processes, reconstruction of the 3-D shape and maintaining the fixation point, must be mutually connected and form one closed loop. We demonstrate the effectiveness of this framework for visual tracking through several experiments.

5.
Camera view invariant 3-D object retrieval is an important issue in many traditional and emerging applications such as security, surveillance, computer-aided design (CAD), virtual reality, and place recognition. One straightforward method for camera view invariant 3-D object retrieval is to consider all the possible camera views of 3-D objects. However, capturing and maintaining such views require an enormous amount of time and labor. In addition, all camera views should be indexed for reasonable retrieval performance, which requires extra storage space and maintenance overhead. In the case of shape-based 3-D object retrieval, such overhead could be relieved by considering the symmetric shape feature of most objects. In this paper, we propose a new shape-based indexing and matching scheme of real or rendered 3-D objects for camera view invariant object retrieval. In particular, in order to remove redundant camera views to be indexed, we propose a camera view skimming scheme, which includes: i) mirror shape pairing and ii) camera view pruning according to the symmetrical patterns of object shapes. Since our camera view skimming scheme considerably reduces the number of camera views to be indexed, it could relieve the storage requirement and improve the matching speed without sacrificing retrieval accuracy. Through various experiments, we show that our proposed scheme can achieve excellent performance.

6.
Recovering the 3D shape of an object from shading is a challenging problem due to the complexity of modeling light propagation and surface reflections. Photometric Stereo (PS) is broadly considered a suitable approach for high-resolution shape recovery, but its applicability is restricted to a limited set of object surfaces and a controlled lighting setup. In particular, PS models generally treat reflection from objects as purely diffuse, with specularities regarded as a nuisance that breaks down shape reconstruction. This is a serious drawback for implementing PS approaches, since most common materials have prominent specular components. In this paper, we propose a PS model that handles both diffuse and specular components, aimed at shape recovery of generic objects; the approach is independent of the albedo values thanks to the image-ratio formulation used. Notably, we show that by including specularities it is possible to solve the PS problem with a minimal number of three images, using a setup with three calibrated lights and a standard industrial camera. Even if an initial separation of diffuse and specular components is still required for each input image, experimental results on synthetic and real objects demonstrate the feasibility of our approach for shape reconstruction of complex geometries.
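For contrast with the specular-aware model described above, the classic purely-Lambertian three-light PS solution can be sketched as follows (an illustrative baseline, not the paper's image-ratio method; calibrated light directions are assumed known):

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Classic Lambertian photometric stereo for one pixel.
    intensities: (k,) brightness under k calibrated lights,
    lights: (k,3) unit light directions. With the Lambertian model
    I = albedo * (L @ n), solve for b = albedo * n in least squares,
    then split b into albedo (its norm) and normal (its direction)."""
    b, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    albedo = np.linalg.norm(b)
    normal = b / albedo
    return albedo, normal
```

With exactly three non-coplanar lights this reduces to inverting a 3x3 matrix per pixel, which is why three images is the classic minimum for the diffuse case.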

7.
An approach based on fuzzy logic is proposed for matching both articulated and non-articulated objects across multiple non-overlapping fields of view (FoVs) from multiple cameras. We call it the fuzzy logic matching algorithm (FLMA). The approach uses information about object motion, shape and camera topology to match objects across camera views. The motion and shape information of targets is obtained by tracking them using a combination of the ConDensation and CAMShift tracking algorithms. The camera topology information is obtained and used by calculating the projective transformation of each view onto the common ground plane. The algorithm is suitable for tracking non-rigid objects with both linear and non-linear motion. We show videos of tracking objects across multiple cameras based on FLMA. From our experiments, the system is able to correctly match the targets across views with high accuracy.
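The projective transformation of a view onto the common ground plane is a 3x3 homography. As a sketch of how such a mapping could be estimated from ground-plane landmark correspondences (the DLT formulation below is standard, not necessarily the authors' exact procedure):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 projective transformation (homography) mapping
    src -> dst from >= 4 point correspondences, via the DLT method:
    stack two linear constraints per correspondence and take the
    null vector of the system from the SVD."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(rows))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply homography H to a 2D point (homogeneous multiply + divide)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Once each camera's homography to the ground plane is known, object footprints from different views can be compared in the same ground-plane coordinates.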

8.
Most existing approaches to structure from motion for deformable objects focus on non-incremental solutions using batch-type algorithms: all data is collected before shape and motion reconstruction take place. This methodology is inherently unsuitable for applications that require real-time learning, where ideally an online system would incrementally learn and build accurate shapes using current measurement data and past reconstructed shapes. Estimating the 3D structure and camera position online, relying only on the measurements available up to the current moment, remains a challenging problem.

9.
In this paper we present an efficient contour-tracking algorithm which can track the 2D silhouettes of objects in extended image sequences. We demonstrate the ability of the tracker by tracking highly deformable contours (such as walking people) captured by a static camera. We represent the contours (silhouettes) of moving objects using a cubic B-spline. The tracking algorithm is based on tracking in a lower-dimensional shape space (as opposed to tracking in spline space). Tracking in the lower-dimensional space has proved to be fast and efficient. The tracker is also coupled with an automatic motion-model switching algorithm, which makes it robust and reliable when the object of interest undergoes multiple types of motion. The model-based tracking technique provided is capable of tracking rigid and non-rigid object contours with good tracking accuracy.
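The idea of tracking in a shape space rather than in spline space can be sketched as follows: the spline control points are constrained to a low-dimensional subspace Q = W x + Q0 around a template Q0, and only the coordinates x are estimated. The translation-only basis below is a toy assumption for illustration; a real tracker would use a planar-affine or learned basis:

```python
import numpy as np

def project_to_shape_space(Q, W, Q0):
    """Least-squares estimate of shape-space coordinates x for observed
    spline control points Q (flattened (x0,y0,x1,y1,...) vector), given
    basis W and template Q0; returns x and the reconstruction W x + Q0."""
    x, *_ = np.linalg.lstsq(W, Q - Q0, rcond=None)
    return x, W @ x + Q0
```

Because x has far fewer dimensions than Q, both the measurement update and the dynamics model become much cheaper, which is the efficiency the abstract refers to.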

10.
Several non-rigid structure from motion methods have been proposed so far to recover both the motion and the non-rigid structure of an object. However, these monocular algorithms fail to give reliable 3D shape estimates when the overall rigid motion of the sequence is small. Aiming to overcome this limitation, in this paper we propose a novel approach for the 3D Euclidean reconstruction of deformable objects observed by an uncalibrated stereo rig. Using a stereo setup drastically improves the 3D model estimation when the observed 3D shape is mostly deforming without undergoing strong rigid motion. Our approach is based on the following steps. First, the stereo system is automatically calibrated and used to compute metric rigid structures from pairs of views. Afterwards, these 3D shapes are aligned to a reference view using a RANSAC method in order to compute the mean shape of the object and to select the subset of points which have remained rigid throughout the sequence. The selected rigid points are then used to compute frame-wise shape registration and to robustly extract the motion parameters from frame to frame. Finally, all this information is used as the initial estimate of a non-linear optimization which allows us to refine the initial solution and also to recover the non-rigid 3D model. Exhaustive results on synthetic and real data demonstrate the performance of our proposal in estimating motion, non-rigid models and stereo camera parameters, even when there is no rigid motion in the original sequence.

11.
Category-level object recognition, segmentation, and tracking in videos become highly challenging when applied to sequences from a hand-held camera that features extensive motion and zooming. An additional challenge is then to develop a fully automatic video analysis system that works without manual initialization of a tracker or other human intervention, both during training and during recognition, despite background clutter and other distracting objects. Moreover, our working hypothesis states that category-level recognition is possible based only on an erratic, flickering pattern of interest point locations, without extracting additional features. Compositions of these points are then tracked individually by estimating a parametric motion model. Groups of compositions segment a video frame into the various objects that are present and into background clutter. Objects can then be recognized and tracked based on the motion of their compositions and on the shape they form. Finally, the combination of this flow-based representation with an appearance-based one is investigated. Besides evaluating the approach on a challenging video categorization database with significant camera motion and clutter, we also demonstrate that it generalizes to action recognition in a natural way. Electronic Supplementary Material: the online version of this article contains supplementary material, which is available to authorized users. This work was supported in part by the Swiss National Science Foundation under contract no. 200021-107636.

12.
We present an approach to identify noncooperative individuals at a distance from a sequence of images, using 3-D face models. Most biometric features (such as fingerprints, hand shape, iris, or retinal scans) require cooperative subjects in close proximity to the biometric system. We process images acquired with an ultrahigh-resolution video camera, infer the location of the subjects' heads, use this information to crop the region of interest, build a 3-D face model, and use this 3-D model to perform biometric identification. To build the 3-D model, we use an image sequence, as natural head and body motion provides enough viewpoint variation to perform stereomotion for 3-D face reconstruction. We have conducted experiments on 2-D and 3-D databases collected in our laboratory. First, we found that metric 3-D face models can be used for recognition by using a simple scaling method, even though there is no exact scale in the 3-D reconstruction. Second, experiments using a commercial 3-D matching engine suggest the feasibility of the proposed approach for recognition against 3-D galleries at a distance (3, 6, and 9 m). Moreover, we show initial 3-D face modeling results under various factors, including head motion, outdoor lighting conditions, and glasses. The evaluation results suggest that video data alone, at a distance of 3 to 9 meters, can provide a 3-D face shape that supports successful face recognition. The performance of 3-D–3-D recognition with the currently generated models does not quite match that of 2-D–2-D. We attribute this to the quality of the inferred models, which suggests a clear path for future research.

13.
Sparse 3D reconstruction, based on interest-point detection and matching, does not yield a satisfactory 3D surface reconstruction, because it cannot recover a cloud of well-distributed 3D points on the surface of objects or scenes. In this work, we present a new approach that retrieves a 3D point cloud leading to a quality 3D surface model in reasonable time. Our method first uses the structure-from-motion approach to recover a set of 3D points corresponding to matched interest points. We then propose an algorithm, based on match propagation and particle swarm optimization (PSO), that significantly increases the number of matches and gives them a regular distribution. It takes as input the obtained matches, their corresponding 3D points, and the camera parameters. At each step, the match with the best ZNCC value is selected and a set of its neighboring points is defined. The point corresponding to a neighboring point, and its 3D coordinates, are recovered by minimizing a nonlinear cost function with the PSO algorithm, subject to a photo-consistency constraint. Experimental results show the feasibility and efficiency of the proposed approach.
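ZNCC, the matching score used above to rank candidate matches, can be written down directly; a minimal sketch (patch extraction and the PSO loop are omitted):

```python
import numpy as np

def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation between two equal-size
    image patches. Subtracting the means and normalizing makes the score
    invariant to affine intensity changes; 1.0 means a perfect match."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom)
```

This gain-and-bias invariance is what makes ZNCC a robust photo-consistency measure across views with different exposures.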

14.
Camera networks have gained increased importance in recent years. Existing approaches mostly use point correspondences between different camera views to calibrate such systems. However, it is often difficult or even impossible to establish such correspondences. But even without feature point correspondences between different camera views, if the cameras are temporally synchronized then the data from the cameras are strongly linked together by the motion correspondence: all the cameras observe the same motion. The present article therefore develops the necessary theory to use this motion correspondence for general rigid as well as planar rigid motions. Given multiple static affine cameras which observe a rigidly moving object and track feature points located on this object, what can be said about the resulting point trajectories? Are there any useful algebraic constraints hidden in the data? Is a 3D reconstruction of the scene possible even if there are no point correspondences between the different cameras? And if so, how many points are sufficient? Is there an algorithm which guarantees finding the correct solution to this highly non-convex problem? This article addresses these questions and thereby introduces the concept of low-dimensional motion subspaces. The constraints provided by these motion subspaces enable an algorithm which ensures finding the correct solution to this non-convex reconstruction problem. The algorithm is based on multilinear analysis, matrix and tensor factorizations. Our new approach can handle extreme configurations, e.g. a camera in a camera network tracking only one single point. Results on synthetic as well as on real data sequences act as a proof of concept for the presented insights.
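The low-dimensional motion subspaces referred to above arise because, under an affine camera, the image trajectories of rigidly moving points span a low-rank subspace. A toy sketch of recovering such a basis by truncated SVD (the data layout, one row per tracked point, is an assumption for illustration):

```python
import numpy as np

def motion_subspace_basis(W, rank=3):
    """Orthonormal basis of the motion subspace spanned by feature-point
    trajectories. W has one row per tracked point; its columns stack the
    2D image coordinates over all frames. Centering removes the centroid
    trajectory, after which rigid motion under an affine camera confines
    the rows to a subspace of dimension <= rank."""
    Wc = W - W.mean(axis=0)
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    return Vt[:rank]
```

Because the basis depends only on the shared motion, it can be compared across cameras even when no individual point is seen by two cameras, which is the key to the correspondence-free calibration described above.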

15.
Ye Lu  Ze-Nian Li 《Pattern recognition》2008,41(3):1159-1172
A new method of video object extraction is proposed to automatically extract the object of interest from actively acquired videos. Traditional video object extraction techniques often operate under the assumption of homogeneous object motion and extract various parts of the video that are motion consistent as objects. In contrast, the proposed active video object extraction (AVOE) approach assumes that the object of interest is being actively tracked by a non-calibrated camera under general motion and classifies the possible movements of the camera that result in the 2D motion patterns as recovered from the image sequence. Consequently, the AVOE method is able to extract the single object of interest from the active video. We formalize the AVOE process using notions from Gestalt psychology. We define a new Gestalt factor called "shift and hold" and present 2D object extraction algorithms. Moreover, since an active video sequence naturally contains multiple views of the object of interest, we demonstrate that these views can be combined to form a single 3D object regardless of whether the object is static or moving in the video.

16.
We present an algorithm for identifying and tracking independently moving rigid objects from optical flow. Some previous attempts at segmentation via optical flow have focused on finding discontinuities in the flow field. While discontinuities do indicate a change in scene depth, they do not in general signal a boundary between two separate objects. The proposed method uses the fact that each independently moving object has a unique epipolar constraint associated with its motion. Thus motion discontinuities due to self-occlusion can be distinguished from those due to separate objects. The use of epipolar geometry allows for the determination of individual motion parameters for each object as well as the recovery of relative depth for each point on the object. The algorithm assumes an affine camera where perspective effects are limited to changes in overall scale. No camera calibration parameters are required. A Kalman-filter-based approach is used for tracking motion parameters over time.
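The per-object epipolar constraint can be tested numerically: correspondences that belong to the motion encoded by a fundamental matrix F satisfy x2^T F x1 = 0, so residuals sort points into motions. A sketch of the residual test (the pure-translation F below is a toy assumption):

```python
import numpy as np

def epipolar_residuals(F, pts1, pts2):
    """Algebraic epipolar residual |x2^T F x1| for each correspondence;
    points consistent with the motion encoded by F give residuals near 0,
    while points on an independently moving object give large residuals."""
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])    # homogeneous image points, view 1
    x2 = np.hstack([pts2, ones])    # homogeneous image points, view 2
    return np.abs(np.einsum('ij,jk,ik->i', x2, F, x1))
```

In practice one F is fitted per candidate object and points are assigned to whichever motion gives the smallest residual.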

17.
We address the problem of estimating three-dimensional motion and structure from motion with an uncalibrated moving camera. We show that point correspondences between three images, and the fundamental matrices computed from these point correspondences, are sufficient to recover the internal orientation of the camera (its calibration) and the motion parameters, and to compute coherent perspective projection matrices which enable us to reconstruct the 3-D structure up to a similarity. In contrast with other methods, no calibration object with a known 3-D shape is needed, and no limitations are placed upon the unknown motions to be performed or the parameters to be recovered, as long as they define a projective camera. The theory of the method, which is based on the constraint that the observed points are part of a static scene, thus allowing us to link the intrinsic parameters and the fundamental matrix via the absolute conic, is first detailed. Several algorithms are then presented, and their performances compared by means of extensive simulations and illustrated by several experiments with real images.
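The fundamental matrices at the heart of this method are typically estimated linearly from point correspondences. A compact sketch of the classic eight-point algorithm (without the Hartley normalization a robust implementation would add):

```python
import numpy as np

def fundamental_eight_point(pts1, pts2):
    """Linear eight-point estimate of the fundamental matrix F.
    Each correspondence gives one row of the constraint x2^T F x1 = 0;
    F is the null vector of the stacked system, with the rank-2
    constraint enforced afterwards by zeroing the smallest singular value."""
    A = np.array([[u2*u1, u2*v1, u2, v2*u1, v2*v1, v2, u1, v1, 1.0]
                  for (u1, v1), (u2, v2) in zip(pts1, pts2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0                       # enforce det(F) = 0
    return U @ np.diag(S) @ Vt2
```

Estimating F between each image pair is the starting point; the self-calibration step then links these matrices to the intrinsic parameters via the absolute conic, as the abstract describes.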

18.
The Effect of Simplified Camera Models on 3-D Reconstruction: Analysis and Experiments
This paper discusses the effect of simplified camera models on 3-D reconstruction. The main conclusions are as follows. When the camera motion between two images is a pure translation, it is proven theoretically that the spatial points reconstructed under the simplified camera model are related to the actual spatial points by an affine transformation. When the camera motion between two images is a general rigid motion, reconstruction under the simplified model preserves the original object's shape well only under certain conditions. Under the simplified model, the focal length estimated by methods based on the Kruppa equations is not accurate enough for 3-D reconstruction. The experimental results show that simplified models cannot be used blindly in 3-D reconstruction; the camera's intrinsic parameters must be fully calibrated.

19.
Video understanding has attracted significant research attention in recent years, motivated by interest in video surveillance, rich media retrieval and vision-based gesture interfaces. Typical methods focus on analyzing both the appearance and motion of objects in video. However, the apparent motion induced by a moving camera can dominate the observed motion, requiring sophisticated methods for compensating for camera motion without a priori knowledge of scene characteristics. This paper introduces two new methods for global motion compensation that are both significantly faster and more accurate than state-of-the-art approaches. The first employs RANSAC to robustly estimate global scene motion even when the scene contains significant object motion. Unlike typical RANSAC-based motion estimation work, we apply RANSAC not to the motion of tracked features but rather to a number of segments of image projections. The key insight of the second method is to reliably classify salient points into foreground and background, based upon the entropy of a motion inconsistency measure. Extensive experiments on established datasets demonstrate that the second approach is able to remove camera-induced observed motion almost completely while still preserving foreground motion.
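As a sketch of the general RANSAC idea applied to global motion, the toy version below hypothesizes a pure image translation from a single sampled motion vector; the paper's segment-based scheme is more elaborate, so treat this only as an illustration of why object motion ends up as outliers:

```python
import numpy as np

def ransac_global_translation(flow, iters=100, thresh=0.5, rng=None):
    """RANSAC estimate of a global (camera-induced) translation from
    per-point motion vectors `flow` of shape (N, 2). Independently moving
    objects disagree with the dominant translation and become outliers."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_model, best_inliers = None, np.zeros(len(flow), dtype=bool)
    for _ in range(iters):
        candidate = flow[rng.integers(len(flow))]        # 1-sample hypothesis
        inliers = np.linalg.norm(flow - candidate, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            # refit on the consensus set for a less noisy model
            best_model, best_inliers = flow[inliers].mean(axis=0), inliers
    return best_model, best_inliers
```

Subtracting the recovered global translation from every motion vector leaves (approximately) only the foreground object motion.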

20.
We present a novel optimisation framework for the estimation of multi-body motion segmentation and the 3D reconstruction of a set of point trajectories in the presence of missing data. The proposed solution not only assigns the trajectories to the correct motion, but also solves for the 3D location of the multi-body shape and fills the missing entries in the measurement matrix. The solution is based on two fundamental principles: each of the multi-body motions is controlled by a set of metric constraints given by the specific camera model, and the shape matrix that describes the multi-body 3D shape is generally sparse. We jointly include these constraints in a single optimisation framework which, starting from an initial segmentation, iteratively enforces this set of constraints in three stages. First, metric constraints are used to estimate the 3D metric shape and to fill the missing entries according to an orthographic camera model. Then, wrongly segmented trajectories are detected using sparse optimisation of the shape matrix. A final reclassification strategy assigns the detected points to the right motion or discards them as outliers. We provide experiments that show consistent improvements over previous approaches on both synthetic and real data.

