Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The complex EGI: a new representation for 3-D pose determination   Total citations: 1 (self-citations: 0, citations by others: 1)
The complex extended Gaussian image (CEGI), a 3D object representation that can be used to determine the pose of an object, is described. In this representation, the weight associated with each outward surface normal is complex. The normal distance of the surface from the predefined origin is encoded as the phase of the weight, whereas the magnitude of the weight is the visible area of the surface. This approach decouples orientation and translation determination into two distinct least-squares problems. The justification for using such a scheme is twofold: it not only allows the pose of the object to be extracted, but also distinguishes a convex object from a nonconvex object having the same EGI representation. The CEGI scheme has the advantage of not requiring explicit spatial object-model surface correspondence in determining object orientation and translation. Experiments involving synthetic data of two polyhedral and two smooth objects are presented to illustrate the feasibility of this method.
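The encoding described in this abstract can be illustrated with a minimal sketch. It assumes the convention weight = area × exp(i·d), where d is the signed normal distance of a face plane from the origin; the function names `cegi_weights` and `translate` and the unit-cube data are hypothetical and are not taken from the paper. Under that assumption, translating the object leaves every magnitude unchanged and shifts each phase by the dot product of the normal and the translation, which is why translation can be solved separately from orientation.

```python
import cmath


def cegi_weights(faces):
    """Return {unit normal: complex weight} for faces given as (normal, area, point).

    Assumed convention (not confirmed by the source): weight = area * exp(i * d),
    with d = normal . point the signed distance of the face plane from the origin.
    Faces sharing a normal are summed.
    """
    weights = {}
    for normal, area, point in faces:
        d = sum(n * p for n, p in zip(normal, point))
        weights[normal] = weights.get(normal, 0) + area * cmath.exp(1j * d)
    return weights


def translate(faces, t):
    """Shift every face's reference point by the translation vector t."""
    return [(n, a, tuple(pi + ti for pi, ti in zip(p, t))) for n, a, p in faces]


# Hypothetical test object: a unit cube centred on the origin.
cube = [
    ((1, 0, 0), 1.0, (0.5, 0, 0)),
    ((-1, 0, 0), 1.0, (-0.5, 0, 0)),
    ((0, 1, 0), 1.0, (0, 0.5, 0)),
    ((0, -1, 0), 1.0, (0, -0.5, 0)),
    ((0, 0, 1), 1.0, (0, 0, 0.5)),
    ((0, 0, -1), 1.0, (0, 0, -0.5)),
]
t = (0.2, -0.1, 0.3)

before = cegi_weights(cube)
after = cegi_weights(translate(cube, t))

# The phase shift per normal equals normal . t; the magnitudes do not change.
shifts = {n: cmath.phase(after[n] / before[n]) for n in before}
```

Because each phase shift is linear in the translation, a set of such shifts over several normals gives a linear least-squares problem for the translation, provided the shifts stay small enough to avoid phase wrapping.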

2.
Pattern Recognition Letters, 2003, 24(9-10): 1489-1501
In this paper, some interesting tools to handle convex objects are presented. The approach is based on the representation of convex objects by their extended Gaussian image (EGI). A new approach for 3D convex object metamorphosis is explored. Some measures of similarity based on the EGI representation are also examined.

3.
Detecting objects, estimating their pose, and recovering their 3D shape are critical problems in many vision and robotics applications. This paper addresses these needs using a two-stage approach. In the first stage, we propose a new method called Depth-Encoded Hough Voting (DEHV). DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes object depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with a depth map (when one is available in testing). Inspired by the Hough voting scheme introduced in [1], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. Once the depth map is given, a full reconstruction is achieved in a second (3D modelling) stage, where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. Extensive quantitative and qualitative experimental analysis on existing datasets [2], [3], [4] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results. Finally, the quality of 3D modelling in terms of both shape completion and texture completion is evaluated on a 3D modelling dataset containing both indoor and outdoor object categories. We demonstrate that our overall algorithm can obtain convincing 3D shape reconstruction from a single uncalibrated image.

4.
A new representation for parametric curves and surfaces is introduced here. It is in rational form and uses rational Gaussian bases. This representation allows design of 2-D and 3-D shapes, and makes recovery of shapes from noisy image data possible. The standard deviations of Gaussians in a curve or surface control the smoothness of a recovered shape. The control points of a surface in this representation are not required to form a regular grid and a scattered set of control points is sufficient to reconstruct a surface. Examples of shape design, shape recovery, and image segmentation using the proposed representation are given.

5.
Hau-San, Kent K.T., Horace H.S. Pattern Recognition, 2004, 37(12): 2307-2322
Classification of 3D head models based on their shape attributes for subsequent indexing and retrieval is important in many applications, such as hierarchical content-based retrieval of these head models for virtual scene composition and the automatic annotation of these characters in such scenes. While simple feature representations are preferred for more efficient classification operations, these features may not be adequate for distinguishing between subtly different head model classes. In view of this, we propose an optimization approach based on a genetic algorithm (GA), in which the original model representation is transformed in such a way that the classification rate is significantly enhanced while retaining the efficiency and simplicity of the original representation. Specifically, based on the Extended Gaussian Image (EGI) representation for 3D models, which summarizes the surface normal orientation statistics, we consider these orientations as random variables and search for an optimal transformation of these variables by genetic optimization. The resulting transformed distributions for these random variables are then used as the modified classifier inputs. Experiments have shown that the optimized transformation results in a significant improvement in classification results for a large variety of class structures. More importantly, the transformation can be indirectly realized by bin removal and bin count merging in the original histogram, thus retaining the advantage of the original EGI representation.

6.
7.
Human ear recognition in 3D   Total citations: 4 (self-citations: 0, citations by others: 4)

8.
9.
10.
11.
A novel method for representing 3D objects that unifies viewer-centered and model-centered object representations is presented. A unified 3D frequency-domain representation, called volumetric frequency representation (VFR), encapsulates both the spatial structure of the object and a continuum of its views in the same data structure. The frequency-domain image of an object viewed from any direction can be directly extracted by employing an extension of the projection slice theorem, where each Fourier-transformed view is a planar slice of the volumetric frequency representation. The VFR is employed for pose-invariant recognition of complex objects, such as faces. Recognition and pose estimation are based on an efficient matching algorithm in a four-dimensional Fourier space. Experimental examples of pose estimation and recognition of faces in various poses are also presented.
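The projection slice theorem that this abstract relies on can be checked numerically in a lower-dimensional analogue. The sketch below is an illustration only, not the paper's VFR: it uses a small hypothetical 2-D array instead of a 3-D volume and projects along one axis only. It shows that the 1-D DFT of the projection equals the zero-frequency row of the 2-D DFT.

```python
import cmath
import math


def dft1(signal):
    """Plain O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def dft2_row(image, ky):
    """Row ky of the 2-D DFT of a square image indexed as image[y][x]."""
    n = len(image)
    return [
        sum(
            image[y][x] * cmath.exp(-2j * math.pi * (kx * x + ky * y) / n)
            for y in range(n)
            for x in range(n)
        )
        for kx in range(n)
    ]


# Hypothetical 4x4 "object"; the values are arbitrary.
image = [
    [0, 1, 2, 0],
    [3, 5, 1, 0],
    [0, 2, 4, 1],
    [1, 0, 0, 2],
]

# Project along y (sum each column), then transform the 1-D projection.
projection = [sum(row[x] for row in image) for x in range(len(image))]
spectrum_of_projection = dft1(projection)

# The slice of the 2-D spectrum through the origin, perpendicular to the
# projection direction, should match it up to rounding error.
central_slice = dft2_row(image, 0)
max_error = max(abs(a - b) for a, b in zip(spectrum_of_projection, central_slice))
```

In 3-D the same identity relates the 2-D Fourier transform of a view to a planar slice through the origin of the 3-D spectrum, which is the property the abstract describes.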

12.
Recovering three-dimensional (3D) points from image correspondences is an important and fundamental task in computer vision. Traditionally, the task is completed by triangulation, whose accuracy is limited in some applications. In this paper, we present a framework that incorporates surface characteristics such as Gaussian and mean curvatures into 3D point reconstruction to enhance the reconstruction accuracy. A Gaussian and mean curvature estimation scheme suited to the proposed framework is also introduced. Based on this estimation scheme and the proposed framework, 3D point recovery from image correspondences is formulated as an optimization problem with the surface curvatures modeled as soft constraints. To analyze the performance of the proposed 3D reconstruction approach, we generated synthetic data, including points on the surfaces of a plane, a cylinder, and a sphere. The experimental results demonstrated that the proposed framework can indeed improve the accuracy of 3D point reconstruction. Some real-image data were also tested, and the results confirm this point.

13.
Part II uses the foundations of Part I [35] to define constraint equations for 2D-3D pose estimation of different corresponding entities. Most articles on pose estimation concentrate on specific types of correspondences, mostly between points, and only rarely use line correspondences. The first aim of this part is to extend pose estimation scenarios to correspondences of an extended set of geometric entities. In this context we are interested in relating the following (2D) image and (3D) model types: 2D point/3D point, 2D line/3D point, 2D line/3D line, 2D conic/3D circle, 2D conic/3D sphere. Furthermore, to handle articulated objects, we describe kinematic chains in this context in a similar manner. We ensure that all constraint equations end up in a distance measure in the Euclidean space, which is well posed in the context of noisy data. We also discuss the numerical estimation of the pose. We propose to use linearized twist transformations, which result in well-conditioned and quickly solvable systems of equations. The key idea is not to search for the representation of the Lie group describing the rigid body motion, but for the representation of its generating Lie algebra. This leads to real-time capable algorithms. Bodo Rosenhahn gained his diploma degree in Computer Science in 1999. Since then he has been pursuing his Ph.D. at the Cognitive Systems Group, Institute of Computer Science, Christian-Albrechts University Kiel, Germany. He is working on geometric applications of Clifford algebras in computer vision. Prof. Dr. Gerald Sommer received a diploma degree in physics from the Friedrich-Schiller-Universität Jena, Germany, in 1969, a Ph.D. degree in physics from the same university in 1975, and a habilitation degree in engineering from the Technical University Ilmenau, Germany, in 1988. Since 1993 he has led the research group Cognitive Systems at the Christian-Albrechts-Universität Kiel, Germany.
Currently he is also the scientific coordinator of the VISATEC project.
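The "linearized twist transformations" mentioned in the abstract above can be sketched with the standard first-order expansion of the matrix exponential. The notation here (twist $\hat{\xi}$, rotation part $\omega$, translation part $v$, homogeneous point $X$) is assumed for illustration and may differ from the paper's Clifford-algebra formulation.

```latex
% Rigid body motion generated by a twist (assumed matrix notation):
\[
X' = \exp(\hat{\xi})\,X,
\qquad
\hat{\xi} =
\begin{pmatrix}
[\omega]_{\times} & v \\
0 & 0
\end{pmatrix}.
\]
% First-order (linearized) approximation for small motions:
\[
\exp(\hat{\xi}) \approx I + \hat{\xi}
\quad\Longrightarrow\quad
x' \approx x + \omega \times x + v .
\]
```

Under this approximation the transformed point is linear in the six unknowns $(\omega, v)$, so a distance-type constraint that is linear in $x'$ becomes a linear equation in the pose parameters; the full motion would then be recovered by iterating the linear solve, which is one plausible reading of why the resulting systems are fast to solve.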

14.
The recognition in image data of viewed patches of spheres, cylinders, and planes in the 3-D world is discussed as a first step to complex object recognition or complex object location and orientation estimation. Accordingly, an image is partitioned into small square windows, each of which is a view of a piece of a sphere, or of a cylinder, or of a plane. Windows are processed in parallel for recognition of content. New concepts and techniques include approximations of the image within a window by 2-D quadric polynomials where each approximation is constrained by one of the hypotheses that the 3-D surface shape seen is either planar, cylindrical, or spherical; a recognizer based upon these approximations to determine whether the object patch viewed is a piece of a sphere, or a piece of a cylinder, or a piece of a plane; lowpass filtering of the image by the approximation. The shape recognition is computationally simple, and for large windows is approximately Bayesian minimum-probability-of-error recognition. These classifications are useful for many purposes. One such purpose is to enable a following processor to use an appropriate estimator to estimate shape, and orientation and location parameters for the 3-D surface seen within a window.

15.
Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyzed the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.

16.
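This paper describes a system that automatically estimates 3D face pose quickly and accurately, and proposes a method that exploits the reflective symmetry of the face to estimate the pose. The symmetry plane of the 3D face is obtained from the extended Gaussian image and the minimum bounding sphere. The pose is then estimated from the nose-tip vertex found by search, and the estimate is refined within a prescribed range to give an accurate final result. The system was validated on real face data captured with a 3D scanner. Experiments show that the method is accurate and robust and is well suited to practical applications.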

17.
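A survey of 3D face recognition   Total citations: 10 (self-citations: 0, citations by others: 10)
Over the past two decades or more, image-based face recognition has made great progress and achieves good recognition performance in constrained environments, yet it remains strongly affected by variations in illumination, pose, and expression. The underlying reason is that an image is a reduced projection of a three-dimensional object onto a two-dimensional space. Face recognition using an explicit 3D representation of the facial surface has therefore become an active research topic in recent years. This paper analyzes the motivation, concepts, and basic pipeline of 3D face recognition; surveys 3D face recognition algorithms in three categories according to feature type (direct matching in the spatial domain, matching based on local features, and matching based on holistic features); classifies and discusses 2D and 3D bimodal fusion methods; lists some representative 3D face databases; compares some of the methods experimentally and analyzes why they are effective; and summarizes the current strengths and difficulties of 3D face recognition technology and discusses future research trends.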

18.
Visual learning and recognition of 3-d objects from appearance   Total citations: 33 (self-citations: 9, citations by others: 24)
The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image to eigenspace. The object is recognized based on the manifold it lies on. The exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with fewer than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper concludes with a discussion of various issues related to the proposed learning and recognition methodology.

19.
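The Pose Machine (PoseMachine) is a mature 2D human pose estimation method with strong representation power for the complex contextual relationships among human body keypoints. Convolutional neural networks are widely used in computer vision and have excellent image feature extraction ability. Based on the Pose Machine and convolutional neural networks, a hand keypoint estimation method is proposed. The method applies the Pose Machine to the hand keypoint estimation problem and implements each component of the Pose Machine with a convolutional neural network. Tests show that the method achieves prediction performance comparable to current state-of-the-art hand keypoint estimation methods.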

20.
刘长红, 杨扬, 陈勇. 《计算机科学》 (Computer Science), 2010, 37(3): 268-270
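Discriminative 3D human pose estimation methods directly learn the mapping from image observations to poses and therefore need large training sets, but the high computational complexity of Gaussian process regression (GPR) severely limits its use for learning mapping models on such large training sets. An incremental learning method for the mapping model, based on GPR and LWPR, is proposed: GPR is used to learn each local mapping model, and, following the idea of LWPR, existing models are adjusted online, new local models are trained, and poses are estimated. Experiments show that the method greatly reduces the computational cost of Gaussian process regression on large datasets and obtains accurate pose estimates.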
