Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
2.
3.
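Conformal geometric algebra and the characterization of motion and shape   Cited by: 2 (self-citations: 0, citations by others: 2)
Applications of conformal geometric algebra to several problems in vision and graphics that rest on the characterization of motion and shape show that it provides unified and effective representations and algorithms. These applications mainly adopt the Grassmann graded representation of geometric objects and the rotor (spinor) and twist representations of rigid body motion. The paper focuses on how the Grassmann graded representation is applied to monocular vision problems and simplifies their solution. By analyzing different representations of rigid body motion, it describes how the rotor and twist representations overcome a drawback of the matrix representation of rigid body motion, namely a parameter space with too many nonlinear constraints, and thereby offer simplified solutions to problems such as pose estimation, shape approximation, and curve splicing.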

4.
5.
Silhouette-based occluded object recognition through curvature scale space   Cited by: 4 (self-citations: 0, citations by others: 4)
A complete and practical system for occluded object recognition has been developed which is very robust with respect to noise and local deformations of shape (due to weak perspective distortion, segmentation errors and non-rigid material) as well as scale, position and orientation changes of the objects. The system has been tested on a wide variety of free-form 3D objects. An industrial application is envisaged where a fixed camera and a light-box are utilized to obtain images. Within the constraints of the system, every rigid 3D object can be modeled by a limited number of classes of 2D contours corresponding to the object's resting positions on the light-box. The contours in each class are related to each other by a 2D similarity transformation. The Curvature Scale Space technique [26, 28] is then used to obtain a novel multi-scale segmentation of the image and the model contours. Object indexing [16, 32, 36] is used to narrow down the search space. An efficient local matching algorithm is utilized to select the best matching models. Received: 5 August 1996 / Accepted: 19 March 1997

6.
In this work a method is presented to track and estimate the pose of articulated objects using the motion of a sparse set of moving features. This is achieved by using a bottom-up generative approach based on the Pictorial Structures representation [1]. However, unlike previous approaches that rely on appearance, our method is entirely dependent on motion. Initial low-level part detection is based on how a region moves as opposed to its appearance. This work is best described as Pictorial Structures using motion. A standard feature tracker is used to automatically extract a sparse set of features. These features typically contain many tracking errors; however, the presented approach is able to overcome both this and their sparsity. The proposed method is applied to two problems: 2D pose estimation of articulated objects walking side-on to the camera and 3D pose estimation of humans walking and jogging at arbitrary orientations to the camera. In each domain quantitative results are reported that improve on the state of the art. The motivation of this work is to illustrate the information present in low-level motion that can be exploited for the task of pose estimation.

7.
The comparison and alignment of two similar objects is a fundamental problem in pattern recognition and computer vision that has been considered using various approaches. In this work, we employ a complex representation for an algebraic curve, and illustrate how the algebraic transformation which relates two Euclidean equivalent curves can be determined using this representation. The idea is based on a complex representation of 2D points expressed in terms of the orthogonal x and y variables, with rotations of the complex numbers described using Euler's identity. We develop a simple formula for integer multiples of the rotation angle of the Euclidean transformation in terms of the real coefficients of implicit polynomial equations that are used to model 2D free-form objects. When there is a translation, it can be determined using some new results on the conic-line factors of implicit polynomial curves. Experimental results are presented for data sets characterised by both noisy and missing data points to illustrate and validate our procedures.
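
The complex representation mentioned in this abstract can be illustrated with a minimal sketch: a 2D point (x, y) is written as z = x + iy, and a Euclidean transformation acts as z ↦ e^{iθ} z + t. The function names and sample values below are assumptions made for illustration. The sketch recovers θ and t from two point correspondences, which is a simplification and not the paper's method (the paper works from the coefficients of implicit polynomial curves).

```python
import cmath
import math


def euclidean_transform(z: complex, theta: float, t: complex) -> complex:
    """Rotate the point z by theta (radians) about the origin, then translate by t."""
    return cmath.exp(1j * theta) * z + t


def recover_transform(z1, z2, w1, w2):
    """Recover (theta, t) from two point pairs z_k -> w_k of a Euclidean map."""
    # The difference of two points is unaffected by the translation,
    # so its change in argument is exactly the rotation angle.
    theta = cmath.phase((w2 - w1) / (z2 - z1))
    t = w1 - cmath.exp(1j * theta) * z1
    return theta, t


# Hypothetical sample data: rotate by 30 degrees, translate by (2, -1).
theta_true, t_true = math.radians(30.0), complex(2.0, -1.0)
z1, z2 = complex(1.0, 0.0), complex(0.0, 2.0)
w1 = euclidean_transform(z1, theta_true, t_true)
w2 = euclidean_transform(z2, theta_true, t_true)
theta_est, t_est = recover_transform(z1, z2, w1, w2)
```

Because the rotation is a single multiplication by a unit complex number, the angle can be read off as a phase, which is the property the abstract exploits through Euler's identity.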

8.
Part II uses the foundations of Part I [35] to define constraint equations for 2D-3D pose estimation of different corresponding entities. Most articles on pose estimation concentrate on specific types of correspondences, mostly between points, and only rarely use line correspondences. The first aim of this part is to extend pose estimation scenarios to correspondences of an extended set of geometric entities. In this context we are interested in relating the following (2D) image and (3D) model types: 2D point/3D point, 2D line/3D point, 2D line/3D line, 2D conic/3D circle, 2D conic/3D sphere. Furthermore, to handle articulated objects, we describe kinematic chains in this context in a similar manner. We ensure that all constraint equations end up in a distance measure in the Euclidean space, which is well posed in the context of noisy data. We also discuss the numerical estimation of the pose. We propose to use linearized twist transformations, which result in well-conditioned and quickly solvable systems of equations. The key idea is not to search for the representation of the Lie group describing the rigid body motion, but for the representation of its generating Lie algebra. This leads to real-time capable algorithms. Bodo Rosenhahn gained his diploma degree in Computer Science in 1999. Since then he has been pursuing his Ph.D. at the Cognitive Systems Group, Institute of Computer Science, Christian-Albrechts University Kiel, Germany. He is working on geometric applications of Clifford algebras in computer vision. Prof. Dr. Gerald Sommer received a diploma degree in physics from the Friedrich-Schiller-Universität Jena, Germany, in 1969, a Ph.D. degree in physics from the same university in 1975, and a habilitation degree in engineering from the Technical University Ilmenau, Germany, in 1988. Since 1993 he has led the research group Cognitive Systems at the Christian-Albrechts-Universität Kiel, Germany. Currently he is also the scientific coordinator of the VISATEC project.
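
The linearized twist idea in the abstract above can be sketched as follows. A twist (w, v) in the Lie algebra generates a rigid motion through the exponential map; for small motions the exponential is approximated by its first-order term, p + w × p + v, which is linear in the unknowns (w, v). This is a minimal sketch using the standard matrix form of the SE(3) exponential, not the paper's geometric-algebra formulation; all function names and sample values are assumptions.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def skew(w):
    """3x3 cross-product matrix of w, so that skew(w) applied to p equals w x p."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, p):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * p[k] for k in range(3)) for i in range(3)]


def combine(ca, a, cb, b, cc, c):
    """Return ca*a + cb*b + cc*c for 3x3 matrices a, b, c."""
    return [[ca * a[i][j] + cb * b[i][j] + cc * c[i][j] for j in range(3)]
            for i in range(3)]


def exp_twist(w, v, p):
    """Apply the exact rigid motion exp(twist) to the point p (closed form)."""
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-12:
        return [p[i] + v[i] for i in range(3)]  # pure translation
    k = skew(w)
    k2 = mat_mul(k, k)
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / theta ** 2
    c = (theta - math.sin(theta)) / theta ** 3
    rot = combine(1.0, IDENTITY, a, k, b, k2)   # Rodrigues rotation formula
    vmat = combine(1.0, IDENTITY, b, k, c, k2)  # maps v to the translation part
    rp, tv = mat_vec(rot, p), mat_vec(vmat, v)
    return [rp[i] + tv[i] for i in range(3)]


def linearized_twist(w, v, p):
    """First-order approximation p + w x p + v, linear in (w, v)."""
    wxp = mat_vec(skew(w), p)
    return [p[i] + wxp[i] + v[i] for i in range(3)]


# Hypothetical small motion: the linearization error is second order in the twist.
w, v, p = (0.0, 0.0, 0.01), (0.02, 0.0, 0.0), (1.0, 2.0, 3.0)
exact = exp_twist(w, v, p)
approx = linearized_twist(w, v, p)
error = max(abs(e - a) for e, a in zip(exact, approx))
```

Since the approximation is linear in (w, v), each correspondence contributes linear equations, and a pose can be refined by solving a linear system and re-linearizing, which is one plausible reading of why the abstract reports fast, well-conditioned systems.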

9.
We present a novel representation and rendering method for free‐viewpoint video of human characters based on multiple input video streams. The basic idea is to approximate the articulated 3D shape of the human body using a subdivision into textured billboards along the skeleton structure. Billboards are clustered into fans such that each skeleton bone contains one billboard per source camera. We call this representation articulated billboards. In the paper we describe a semi‐automatic, data‐driven algorithm to construct and render this representation, which robustly handles even challenging acquisition scenarios characterized by sparse camera positioning, inaccurate camera calibration, low video resolution, or occlusions in the scene. First, for each input view, a 2D pose estimation based on image silhouettes, motion capture data, and temporal video coherence is used to create a segmentation mask for each body part. Then, from the 2D poses and the segmentation, the actual articulated billboard model is constructed by a 3D joint optimization and compensation for camera calibration errors. The rendering method includes a novel way of blending the textural contributions of each billboard and features an adaptive seam correction to eliminate visible discontinuities between adjacent billboard textures. Our articulated billboards not only minimize ghosting artifacts known from conventional billboard rendering, but also alleviate restrictions on the setup and sensitivities to errors of more complex 3D representations and multiview reconstruction techniques. Our results demonstrate the flexibility and the robustness of our approach with high quality free‐viewpoint video generated from broadcast footage of challenging, uncontrolled environments.

10.
A novel method for representing 3D objects that unifies viewer and model centered object representations is presented. A unified 3D frequency-domain representation, called volumetric frequency representation (VFR), encapsulates both the spatial structure of the object and a continuum of its views in the same data structure. The frequency-domain image of an object viewed from any direction can be directly extracted employing an extension of the projection slice theorem, where each Fourier-transformed view is a planar slice of the volumetric frequency representation. The VFR is employed for pose-invariant recognition of complex objects, such as faces. The recognition and pose estimation are based on an efficient matching algorithm in a four-dimensional Fourier space. Experimental examples of pose estimation and recognition of faces in various poses are also presented.
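
The projection slice theorem that this abstract extends can be checked numerically in its simplest discrete form. The sketch below is a 2D analogue, not the paper's 3D volumetric representation: the 1D Fourier transform of an image's projection along one axis equals the corresponding central line of its 2D Fourier transform. The sample image and function names are assumptions.

```python
import cmath


def dft1(signal):
    """Naive 1D discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            for u in range(n)]


def dft2_coeff(image, u, v):
    """Single coefficient F[u][v] of the 2D DFT of image[x][y]."""
    n, m = len(image), len(image[0])
    return sum(image[x][y] * cmath.exp(-2j * cmath.pi * (u * x / n + v * y / m))
               for x in range(n) for y in range(m))


# Hypothetical 4x4 "object"; any values work.
image = [
    [1.0, 2.0, 0.0, 3.0],
    [0.0, 1.0, 4.0, 1.0],
    [2.0, 2.0, 1.0, 0.0],
    [5.0, 0.0, 1.0, 2.0],
]

# Project along y (sum each row), then transform the 1D projection.
projection = [sum(row) for row in image]
projection_spectrum = dft1(projection)

# Central slice v = 0 of the 2D spectrum.
central_slice = [dft2_coeff(image, u, 0) for u in range(len(image))]

max_difference = max(abs(a - b) for a, b in zip(projection_spectrum, central_slice))
```

In the 3D setting described above, the same identity means that the Fourier transform of each 2D view is a planar slice through the origin of the object's 3D spectrum.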

11.
A transitory image sequence is one in which no scene element is visible through the entire sequence. This article deals with some major theoretical and algorithmic issues associated with the task of estimating structure and motion from transitory image sequences. It is shown that integration with a transitory sequence has properties that are very different from those with a nontransitory one. Two representations, world-centered (WC) and camera-centered (CC), behave very differently with a transitory sequence. The asymptotic error rates derived in this article indicate that one representation is significantly superior to the other, depending on whether one needs camera-centered or world-centered estimates. We introduce an efficient “cross-frame” estimation technique for the CC representation. For the WC representation, our analysis indicates that a good technique should be based on camera global pose instead of interframe motions. Rigorous experiments were conducted with real-image sequences taken by a fully calibrated camera system.

12.
《Real》1997,3(6):415-432
Real-time motion capture plays a very important role in various applications, such as 3D interfaces for virtual reality systems, digital puppetry, and real-time character animation. In this paper we address the problem of estimating and recognizing the motion of articulated objects using the optical motion capture technique. In addition, we present an effective method to control the articulated human figure in real time. The heart of this problem is the estimation of 3D motion and posture of an articulated, volumetric object using feature points from a sequence of multiple perspective views. Under some moderate assumptions such as smooth motion and known initial posture, we develop a model-based technique for the recovery of the 3D location and motion of a rigid object using a variation of the Kalman filter. The posture of the 3D volumetric model is updated by the 2D image flow of the feature points for all views. Two novel concepts, the hierarchical Kalman filter (HKF) and the adaptive hierarchical structure (AHS) incorporating the kinematic properties of the articulated object, are proposed to extend our formulation for the rigid object to the articulated one. Our formulation also allows us to avoid two classic problems in 3D tracking: the multi-view correspondence problem and the occlusion problem. By adding more cameras and placing them appropriately, our approach can deal with the motion of the object in a very wide area. Furthermore, multiple objects can be handled by managing multiple AHSs and processing multiple HKFs. We show the validity of our approach using synthetic data acquired simultaneously from multiple virtual cameras in a virtual environment (VE) and real data derived from a moving light display with walking motion. The results confirm that the model-based algorithm works well on the tracking of multiple rigid objects.
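
The abstract above builds on the Kalman filter under a smooth-motion assumption. As a rough illustration of that building block only, here is a minimal sketch of a plain constant-velocity Kalman filter on a single coordinate. It is not the hierarchical Kalman filter or the adaptive hierarchical structure proposed in the paper; the noise parameters, initial covariance, and measurement sequence are assumptions.

```python
def kalman_step(x, p, z, dt=1.0, q=1e-4, r=0.01):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x = [position, velocity], p = 2x2 covariance, z = measured position.
    q is the process noise added to the diagonal, r the measurement variance.
    """
    # Predict with F = [[1, dt], [0, 1]]: covariance becomes F P F^T + Q.
    pos, vel = x[0] + dt * x[1], x[1]
    a, b, c, d = p[0][0], p[0][1], p[1][0], p[1][1]
    pp = [[a + dt * (b + c) + dt * dt * d + q, b + dt * d],
          [c + dt * d, d + q]]
    # Update with H = [1, 0]: only the position is observed.
    innovation = z - pos
    s = pp[0][0] + r
    k0, k1 = pp[0][0] / s, pp[1][0] / s
    x_new = [pos + k0 * innovation, vel + k1 * innovation]
    p_new = [[(1 - k0) * pp[0][0], (1 - k0) * pp[0][1]],
             [pp[1][0] - k1 * pp[0][0], pp[1][1] - k1 * pp[0][1]]]
    return x_new, p_new


# Hypothetical feature moving at 2 units per frame, observed with a small
# alternating measurement error; the filter starts with no knowledge of velocity.
x, p = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
for k in range(1, 21):
    z = 2.0 * k + (0.05 if k % 2 else -0.05)
    x, p = kalman_step(x, p, z)
```

After the loop, `x` holds the filtered position and the inferred velocity, even though velocity was never measured directly.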

13.
Multiple View Geometry of General Algebraic Curves   Cited by: 1 (self-citations: 0, citations by others: 1)
We introduce a number of new results in the context of multi-view geometry from general algebraic curves. We start with the recovery of camera geometry from matching curves. We first show how one can compute, without any knowledge of the camera, the homography induced by a single planar curve. Then we continue with the derivation of the extended Kruppa's equations, which are responsible for describing the epipolar constraint of two projections of a general algebraic curve. As part of the derivation of those constraints we address the issue of dimension analysis and as a result establish the minimal number of algebraic curves required for a solution of the epipolar geometry as a function of their degree and genus. We then establish new results on the reconstruction of general algebraic curves from multiple views. We address three different representations of curves: (i) the regular point representation, in which we show that the reconstruction from two views of a curve of degree d admits two solutions, one of degree d and the other of degree d(d - 1); moreover, using this representation, we address the problem of homography recovery for planar curves; (ii) the dual space representation (tangents), for which we derive a lower bound for the number of views necessary for reconstruction as a function of the curve degree and genus; and (iii) a new representation (to computer vision) based on the set of lines meeting the curve, which does not require any curve fitting in image space and for which we also derive lower bounds for the number of views necessary for reconstruction as a function of curve degree alone.

14.
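Most current 2D keypoint heatmap estimation for hand pose from color images uses convolutional pose machines or hourglass networks, but neither network can both maintain high-resolution representations during learning and fuse multi-scale features. To address this problem, a multi-scale, high-resolution-preserving network is adopted. The network uses a parallel design of high- and low-resolution representations, enhances the features of each resolution by fusing all resolution representations, and has multiple stages that extract high-quality features for 2D heatmap estimation. To obtain the 3D hand pose, a method invariant to the global rotation viewpoint is also used to map the 2D heatmaps to a 3D pose. Experiments on 2D and 3D hand pose estimation were carried out on three public datasets (RHD, STB, Dexter+Object), and the results confirm the effectiveness of the method for hand pose estimation.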

15.
16.
We describe an approach to category-level detection and viewpoint estimation for rigid 3D objects from single 2D images. In contrast to many existing methods, we directly integrate 3D reasoning with an appearance-based voting architecture. Our method relies on a nonparametric representation of a joint distribution of shape and appearance of the object class. Our voting method employs a novel parameterization of joint detection and viewpoint hypothesis space, allowing efficient accumulation of evidence. We combine this with a re-scoring and refinement mechanism, using an ensemble of view-specific support vector machines. We evaluate the performance of our approach in detection and pose estimation of cars on a number of benchmark datasets. Finally we introduce the “Weizmann Cars ViewPoint” (WCVP) dataset, a benchmark for evaluating continuous pose estimation.

17.
We advance new active computer vision algorithms based on the Feature Space Trajectory (FST) representations of objects and a neural network processor for computation of distances in global feature space. Our algorithms classify rigid objects and estimate their pose from intensity images. They also indicate how to automatically reposition the sensor if the class or pose of an object is ambiguous from a given viewpoint, and they incorporate data from multiple object views in the final object classification. An FST in a global eigenfeature space is used to represent 3D distorted views of an object. Assuming that an observed feature vector consists of Gaussian noise added to a point on the FST, we derive a probability density function for the observation conditioned on the class and pose of the object. Bayesian estimation and hypothesis testing theory are then used to derive approximations to the maximum a posteriori probability pose estimate and the minimum probability of error classifier. Confidence measures for the class and pose estimates, derived using Bayes theory, determine when additional observations are required, as well as where the sensor should be positioned to provide the most useful information.

18.
Planning rigid body motions using elastic curves   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper tackles the problem of computing smooth, optimal trajectories on the Euclidean group of motions SE(3). The problem is formulated as an optimal control problem where the cost function to be minimized is equal to the integral of the classical curvature squared. This problem is analogous to the elastic problem from differential geometry, and thus the resulting rigid body motions will trace elastic curves. An application of the Maximum Principle to this optimal control problem shifts the emphasis to the language of symplectic geometry and to the associated Hamiltonian formalism. This results in a system of first-order differential equations that yield coordinate-free necessary conditions for optimality for these curves. From these necessary conditions we identify an integrable case, and this particular set of curves is solved analytically. These analytic solutions provide interpolating curves between an initial given position and orientation and a desired position and orientation that would be useful in motion planning for systems such as robotic manipulators and autonomous-oriented vehicles.

19.
Tracking is a very important research subject in a real-time augmented reality context. The main requirements for trackers are high accuracy and low latency at a reasonable cost. In order to address these issues, a real-time, robust, and efficient 3D model-based tracking algorithm is proposed for a "video see through" monocular vision system. The tracking of objects in the scene amounts to calculating the pose between the camera and the objects. Virtual objects can then be projected into the scene using the pose. In this paper, nonlinear pose estimation is formulated by means of a virtual visual servoing approach. In this context, the derivation of point-to-curve interaction matrices is given for different 3D geometrical primitives including straight lines, circles, cylinders, and spheres. A local moving edges tracker is used in order to provide real-time tracking of points normal to the object contours. Robustness is obtained by integrating an M-estimator into the visual control law via an iteratively reweighted least squares implementation. This approach is then extended to address the 3D model-free augmented reality problem. The method presented in this paper has been validated on several complex image sequences including outdoor environments. Results show the method to be robust to occlusion, changes in illumination, and mistracking.
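
The abstract above obtains robustness from an M-estimator implemented by iteratively reweighted least squares (IRLS). The sketch below shows that mechanism on a deliberately simple problem, a straight-line fit with one gross outlier, rather than inside a visual servoing control law. The Huber weight function, the threshold `delta`, and the data are assumptions chosen for illustration; the paper may use a different M-estimator.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a*x + b; returns (a, b)."""
    s = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = sxx * s - sx * sx
    return (s * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det


def huber_weight(residual, delta):
    """Huber M-estimator weight: 1 for small residuals, delta/|r| for large ones."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r


def irls_line_fit(xs, ys, delta=1.0, iterations=50):
    """Alternate between a weighted fit and recomputing the robust weights."""
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        a, b = weighted_line_fit(xs, ys, ws)
        ws = [huber_weight(y - (a * x + b), delta) for x, y in zip(xs, ys)]
    return a, b


# Hypothetical data on the line y = 2x + 1 with one mistracked point.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[5] = 100.0

a_ls, b_ls = weighted_line_fit(xs, ys, [1.0] * len(xs))  # ordinary least squares
a_rob, b_rob = irls_line_fit(xs, ys)                      # robust estimate
```

The ordinary least-squares intercept is pulled far from 1 by the single outlier, while the reweighted fit stays close to the true line because the outlier's weight shrinks as its residual grows.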

20.
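Objective: Visual odometry (VO) achieves reasonably accurate self-localization with only an ordinary camera and has become a research focus in computer vision and robotics. However, most current research and applications assume a static scene, that is, a scene whose only motion model is the camera motion, and cannot handle multiple motion models. This paper therefore proposes a multi-motion visual odometry method based on split-and-merge motion segmentation, which obtains the motion states of multiple moving objects in the scene in addition to the camera motion. Method: Building on the traditional visual odometry framework, multi-model fitting is introduced to segment the multiple motion models in a dynamic scene, and RANSAC (random sample consensus) is used to estimate the motion parameter instances of the motion models. The camera motion and the motion of each moving object are then transformed into a common coordinate system, yielding the camera's visual odometry result and the pose of each moving object at each time step. Finally, local-window bundle adjustment directly corrects the camera pose and the computed poses of the camera relative to each moving object; the inliers of the camera motion model and the per-time-step motion parameters of the camera relative to the moving objects are used to optimize the trajectories of the motion models. Results: The proposed consecutive-frame motion segmentation method achieves good segmentation results and is robust, with segmentation accuracy on consecutive frames close to 100%, which supports accurate subsequent estimation of each motion model's parameters. The method effectively estimates not only the camera pose but also the poses of salient moving objects in the scene; on every path segment, the mean position error of both camera self-localization and moving-object localization is below 6%. Conclusion: The method simultaneously segments the camera's own motion model and the motion models of differently moving dynamic objects in a dynamic scene, and thereby simultaneously estimates the absolute trajectories of the camera and of each dynamic object, forming a multi-motion visual odometry pipeline.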
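
The role of RANSAC in separating several motion models can be illustrated with a toy sketch. It uses sequential RANSAC (fit the dominant model, remove its inliers, repeat) on 2D point correspondences, with pure translations as the motion model. This is a swapped-in simplification, not the split-and-merge multi-model fitting described above; the model, threshold, data, and function names are all assumptions.

```python
import math
import random


def residual(model, pair):
    """Distance between the predicted and observed destination point."""
    (sx, sy), (dx, dy) = pair
    return math.hypot(sx + model[0] - dx, sy + model[1] - dy)


def ransac_translation(pairs, threshold=0.1, iterations=50, rng=None):
    """Find the translation supported by the most correspondences."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        (sx, sy), (dx, dy) = rng.choice(pairs)  # minimal sample: one pair
        model = (dx - sx, dy - sy)
        inliers = [pair for pair in pairs if residual(model, pair) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def sequential_ransac(pairs, min_inliers=3):
    """Extract motion models one at a time until too few points remain."""
    remaining, models = list(pairs), []
    while len(remaining) >= min_inliers:
        model, inliers = ransac_translation(remaining)
        if len(inliers) < min_inliers:
            break
        models.append((model, len(inliers)))
        remaining = [pair for pair in remaining if pair not in inliers]
    return models


def shifted(points, shift):
    """Build correspondences by translating each point by shift."""
    return [((x, y), (x + shift[0], y + shift[1])) for x, y in points]


# Hypothetical scene: background moves by (1, 0), one object by (0, 2), one mismatch.
background = shifted([(0, 0), (1, 2), (3, 1), (4, 4), (2, 5), (5, 3)], (1, 0))
moving_object = shifted([(10, 10), (11, 12), (12, 9), (13, 11)], (0, 2))
mismatch = [((7, 7), (20, -5))]
models = sequential_ransac(background + moving_object + mismatch)
```

Each entry of `models` pairs an estimated translation with its inlier count; in a visual odometry setting the dominant model would typically correspond to the camera's own motion and the others to independently moving objects.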
