Similar Literature
20 similar documents found (search time: 62 ms)
1.
In this article we present an approach for localizing planar parts of furniture in depth data from range cameras. It estimates both their six-degree-of-freedom poses and their dimensions. The system has been designed to enable robots to manipulate furniture autonomously. Range cameras are a promising sensor category for this application. Because many of them provide data with considerable noise and distortion, detecting objects with canonical methods for range-data segmentation or feature extraction is difficult. Our approach overcomes these issues by combining concepts from 2D and 3D computer vision and by integrating intensity and range information at multiple steps of the processing chain. It can therefore be employed on range sensors with both low and high signal-to-noise ratios, and in particular on time-of-flight cameras. The concept can be adapted to various object shapes; as a proof of concept, it has been implemented for object parts with approximately elliptical shapes, for which a state-of-the-art ellipse detection method has been enhanced for our application.
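As a rough illustration of the 2D half of such a pipeline (not the authors' enhanced detector), the sketch below fits ellipse candidates to contours in the camera's intensity image with OpenCV. The edge thresholds and minimum contour length are placeholder assumptions, and the back-projection of a fitted ellipse into a 6-DOF pose and dimensions is omitted.

```python
import cv2

def detect_ellipses(intensity_image, min_points=20):
    """Ellipse candidates in an 8-bit intensity image from a range camera.

    Minimal sketch of the 2D stage: edges -> contours -> ellipse fits.
    Assumes OpenCV 4; thresholds are placeholders.
    """
    edges = cv2.Canny(intensity_image, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    ellipses = []
    for c in contours:
        if len(c) >= min_points:               # fitEllipse needs >= 5 points
            ellipses.append(cv2.fitEllipse(c))  # ((cx, cy), (w, h), angle)
    return ellipses
```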

2.
Real-time, robust tracking of 3D objects based on a 3D model with multiple cameras is still an unsolved problem, albeit one relevant to many practical and industrial applications. Major problems are caused by appearance changes of the object. We present a template-based tracking algorithm for piecewise planar objects that is robust against changes in the object's appearance (occlusion, illumination variation, specularities). The version we propose supports multiple cameras. The method minimizes the error between the observed images of the object and the warped images of its planes. We use mutual information as the registration function, combined with an inverse compositional approach to reduce the computational cost, and obtain a near-real-time algorithm. We discuss different hypotheses that can be made for the optimization algorithm.
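A minimal sketch of the registration score such a tracker might use: mutual information computed from a joint intensity histogram of the template and the warped patch. The bin count is an assumed parameter; the inverse compositional update and the multi-camera extension from the paper are not reproduced here.

```python
import numpy as np

def mutual_information(template, warped, bins=32):
    """Mutual information between a template and a warped image patch."""
    joint, _, _ = np.histogram2d(template.ravel(), warped.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint probability mass
    px = pxy.sum(axis=1, keepdims=True)       # marginal over template
    py = pxy.sum(axis=0, keepdims=True)       # marginal over warped patch
    nz = pxy > 0                              # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```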

3.
In this paper, we introduce a method to estimate an object's pose from multiple cameras. We focus on direct estimation of the 3D object pose from 2D image sequences. The Scale-Invariant Feature Transform (SIFT) is used to extract corresponding feature points from adjacent images in each video sequence. We first demonstrate that centralized pose estimation from the collection of corresponding feature points in the 2D images from all cameras can be obtained as the solution of a generalized Sylvester's equation. We then derive a distributed solution to pose estimation from multiple cameras and show that it is equivalent to the centralized solution based on Sylvester's equation. Specifically, we rely on collaboration among the cameras to iteratively refine the independent pose estimate obtained at each camera from Sylvester's equation. The proposed approach uses all of the information available from all cameras to obtain an estimate at each camera, even when the image features are not visible to some of the cameras; the resulting technique is therefore robust to occlusion and to sensor errors from specific camera views. Moreover, the approach requires neither matching feature points across different camera views nor reconstructing 3D points, and its computational complexity grows linearly with the number of cameras. Finally, computer simulation experiments demonstrate the accuracy and speed of our approach.
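The paper's centralized solution reduces to a generalized Sylvester's equation; as a toy illustration only, the snippet below solves a standard Sylvester equation AX + XB = Q with SciPy. The matrices are random stand-ins, not quantities derived from SIFT correspondences.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Toy illustration only: random stand-ins for the matrices the paper
# assembles from feature correspondences.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
X_true = rng.standard_normal((3, 3))
Q = A @ X_true + X_true @ B      # build a consistent right-hand side

X = solve_sylvester(A, B, Q)     # solves A X + X B = Q
assert np.allclose(X, X_true)
```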

4.
In this paper, we present an algorithm that probabilistically estimates object shapes in a dynamic 3D scene using silhouette information derived from multiple geometrically calibrated video camcorders. The scene is represented by a 3D volume, and every object in it is associated with a distinctive label representing its presence at each voxel location. The label links together automatically learned, view-specific appearance models of the respective object, so as to avoid photometric calibration of the cameras. Generative probabilistic sensor models are derived by analyzing the dependencies between the sensor observations and the object labels. Bayesian reasoning is then applied to achieve reconstruction that is robust to real-world challenges such as lighting variations and changing backgrounds. Our main contribution is to explicitly model the visual occlusion process and to show that (1) static objects (such as trees or lamp posts), as parts of the pre-learned background model, can be recovered automatically as a byproduct of the inference; and (2) ambiguities due to inter-occlusion between multiple dynamic objects can be alleviated, drastically improving the final reconstruction quality. Several indoor and outdoor real-world datasets are evaluated to verify our framework.
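A minimal sketch of the Bayesian core under strong simplifying assumptions: a single log-odds occupancy grid updated from one binary silhouette and a 3x4 projection matrix. The per-object labels, learned appearance models, and explicit occlusion reasoning described in the abstract are not modeled, and the update weights are placeholders.

```python
import numpy as np

def update_occupancy(log_odds, voxels_xyz, silhouette, P,
                     l_occ=0.7, l_free=-0.4):
    """Log-odds occupancy update of a voxel grid from one silhouette view."""
    h, w = silhouette.shape
    hom = np.hstack([voxels_xyz, np.ones((len(voxels_xyz), 1))])
    uvw = hom @ P.T
    z = uvw[:, 2]
    front = z > 1e-6                          # only voxels in front of camera
    u = np.zeros(len(z), dtype=int)
    v = np.zeros(len(z), dtype=int)
    u[front] = np.round(uvw[front, 0] / z[front]).astype(int)
    v[front] = np.round(uvw[front, 1] / z[front]).astype(int)
    visible = front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    inside = np.zeros(len(z), dtype=bool)
    inside[visible] = silhouette[v[visible], u[visible]] > 0
    log_odds[visible & inside] += l_occ       # evidence for occupancy
    log_odds[visible & ~inside] += l_free     # silhouette says empty
    return log_odds
```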

5.
Registration of 3D data is a key problem in many applications in computer vision, computer graphics, and robotics. This paper provides a family of minimal solutions for the 3D-to-3D registration problem in which the 3D data are represented as points and planes. Such scenarios occur frequently when a 3D sensor provides 3D points and our goal is to register them to a 3D object represented by a set of planes. To compute the six-degrees-of-freedom transformation between the sensor and the object, we need at least six points on three or more planes. We systematically investigate and develop pose estimation algorithms for several configurations, including all minimal configurations, that arise from the distribution of points on planes, and we identify the degenerate configurations in such registrations. The underlying algebraic equations of many registration problems are the same, and we show that many 2D-to-3D and 3D-to-3D pose estimation/registration algorithms involving points, lines, and planes can be mapped to the proposed framework. We validate our theory in simulations as well as in three real-world applications: registration of a robotic arm with an object using a contact sensor, registration of planar city models with 3D point clouds obtained by multi-view reconstruction, and registration of depth maps generated by a Kinect sensor.
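Not one of the paper's minimal solvers, but a sketch of the underlying least-squares problem: given points with known plane assignments, one Gauss-Newton step for the 6-DOF transform using the small-angle linearization R ≈ I + [ω]×.

```python
import numpy as np

def point_to_plane_step(points, plane_normals, plane_ds):
    """One Gauss-Newton step of point-to-plane registration.

    Point i is assigned to plane i, written n_i . x + d_i = 0.
    Residual: n.(R p + t) + d ~ (n.p + d) + w.(p x n) + n.t.
    """
    A = np.hstack([np.cross(points, plane_normals), plane_normals])
    b = -(np.sum(points * plane_normals, axis=1) + plane_ds)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)   # x = [w, t]
    return x[:3], x[3:]                          # rotation vector, translation
```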

6.
A fundamental problem in autonomous vehicle navigation is the identification of obstacle-free space in cluttered and unstructured environments. Features such as walls, people, furniture, doors, and stairs are potential hazards. The approach taken in this paper is motivated by the recent development of infrared time-of-flight cameras that provide low-resolution depth maps at video frame rates. We propose to exploit the temporal information provided by the high refresh rate of such cameras to overcome their low spatial resolution and high depth uncertainty, and aim to provide robust and accurate estimates of planar surfaces in the environment. These surface estimates are then used in statistical tests that identify obstacles and hazards. Classical 3D spatial RANSAC is extended to 4D spatio-temporal RANSAC by developing spatio-temporal models of planar surfaces that incorporate a linear motion model as well as linear environment features. A 4D vector product is used to generate hypotheses from data sampled randomly across both spatial and temporal variation. The algorithm is posed entirely in the spatio-temporal representation, and there is no need to correlate points or hypotheses between temporal images. The proposed algorithm is computationally fast and robust for the estimation of planar surfaces in general and the ground plane in particular, with potential applications in mobile robotics, autonomous vehicle navigation, and automotive safety systems. The claims of the paper are supported by experimental results obtained from real video data from a time-of-flight range sensor mounted on an automobile navigating an undercover parking lot.
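The sketch below conveys the 4D idea under assumptions that depart from the paper: plane hypotheses come from random minimal samples of spatio-temporal points (x, y, z, t) with the plane offset drifting linearly in time, and hypothesis generation uses an SVD nullspace rather than the paper's 4D vector product. The inlier tolerance is a placeholder.

```python
import numpy as np

def ransac_plane_4d(points_xyzt, iters=200, tol=0.02):
    """RANSAC fit of a moving plane, n.p = d0 + d1*t, to (x, y, z, t) samples."""
    best_inliers = np.zeros(len(points_xyzt), dtype=bool)
    xyz, t = points_xyzt[:, :3], points_xyzt[:, 3]
    for _ in range(iters):
        idx = np.random.choice(len(xyz), 4, replace=False)
        # Solve n.p - d0 - d1*t = 0 for [n, d0, d1] up to scale.
        M = np.hstack([xyz[idx], -np.ones((4, 1)), -t[idx, None]])
        _, _, Vt = np.linalg.svd(M)
        n, d0, d1 = Vt[-1][:3], Vt[-1][3], Vt[-1][4]
        scale = np.linalg.norm(n)
        if scale < 1e-9:
            continue
        resid = np.abs(xyz @ n - d0 - d1 * t) / scale
        inliers = resid < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```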

7.
In this paper, we propose an original evolutionary method for 3D panoramic reconstruction from an uncalibrated stereovision system (USS). The USS is composed of five cameras located on an arc of a circle around the object to be analyzed. The main originality of this work concerns the computation of the 3D information: with our method, 3D coordinates are obtained directly, without any prior estimation of the fundamental matrix. The method operates in two steps. First, points of interest are detected and matched in pairs of images acquired by two consecutive cameras of the USS. Second, using evolutionary algorithms, we jointly compute the transformation matrix between the two images and the respective depths of the points of interest. The accuracy of the proposed method is validated through a comparison with depth values obtained using a traditional method. To perform 3D panoramic object reconstruction, the process is repeated for all pairs of consecutive cameras; the 3D points obtained in the successive steps of the process, corresponding to the different points of interest, are then combined into a set of 3D points all around the analyzed object.
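As a generic stand-in for the evolutionary search (the paper's actual encoding and reprojection-based fitness are not specified in the abstract), the sketch below implements a simple (mu + lambda) evolution strategy over a real-valued parameter vector, with a toy quadratic fitness as the usage example.

```python
import numpy as np

def evolve(fitness, dim, pop=50, elite=10, gens=100, sigma=0.1):
    """Minimal (mu + lambda) evolution strategy minimizing `fitness`."""
    population = np.random.randn(pop, dim)
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in population])
        parents = population[np.argsort(scores)[:elite]]        # keep the best
        children = parents[np.random.randint(elite, size=pop - elite)]
        children = children + sigma * np.random.randn(pop - elite, dim)
        population = np.vstack([parents, children])
    return population[np.argmin([fitness(ind) for ind in population])]

# Toy usage: recover a 3-vector minimizing a quadratic.
target = np.array([1.0, -2.0, 0.5])
best = evolve(lambda x: np.sum((x - target) ** 2), dim=3)
```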

8.
Reliable manipulation of everyday household objects is essential to the success of service robots. To manipulate these objects accurately, robots need to know their full 6-DOF pose, which is challenging due to sensor noise, clutter, and occlusion. In this paper, we present a new approach to estimating the object pose from an observation of just a small patch of the object, by leveraging the fact that many household objects can rest stably on a planar surface only in a small set of poses. In particular, for each stable pose of an object, we slice the object with horizontal planes and extract multiple 2D cross-section contours. Pose estimation is then reduced to finding the stable pose whose contour best matches that of the sensor data, which can be solved efficiently by cross-correlation. Experiments on the manipulation tasks in the DARPA Robotics Challenge validate our approach. In addition, we investigate our method's performance on object recognition tasks arising in the challenge.
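A minimal sketch of the matching step, assuming each cross-section contour has already been resampled as a fixed-length 1D radius profile r(theta): circular cross-correlation via the FFT returns the in-plane rotation that best aligns an observed contour with a stored stable-pose contour, and the peak value can rank candidate poses.

```python
import numpy as np

def best_contour_shift(observed, reference):
    """Circular cross-correlation of two equal-length 1D contour profiles.

    Returns the best shift (in samples, i.e. in-plane rotation) and
    its correlation score for ranking stable-pose candidates.
    """
    f = np.fft.rfft(observed - observed.mean())
    g = np.fft.rfft(reference - reference.mean())
    corr = np.fft.irfft(f * np.conj(g), n=len(observed))
    shift = int(np.argmax(corr))
    return shift, float(corr[shift])
```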

9.
The view-independent visualization of 3D scenes is most often based on rendering accurate 3D models or on image-based rendering techniques. To compute the 3D structure of a scene from a moving vision sensor, or to use image-based rendering approaches, we must be able to estimate the motion of the sensor from the recorded image information with high accuracy, a problem that has been well studied. In this work, we investigate the relationship between camera design and our ability to perform accurate 3D photography by examining the influence of camera design on the estimation of the motion and structure of a scene from video data. By relating the differential structure of the time-varying plenoptic function to different known and new camera designs, we establish a hierarchy of cameras based on the stability and complexity of the computations necessary to estimate structure and motion. At the low end of this hierarchy is the standard planar pinhole camera, for which the structure-from-motion problem is nonlinear and ill-posed. At the high end is a camera we call the full-field-of-view polydioptric camera, for which the motion estimation problem can be solved independently of the depth of the scene, leading to fast and robust algorithms for 3D photography. In between are large-field-of-view multiple-view cameras, which we have built, as well as omnidirectional sensors.

10.
In previous optimization-based methods for reconstructing 3D planar-faced objects from single 2D line drawings, the missing depths of the vertices of the line drawing (and, in some methods, other parameters) are used as the variables of the objective function. A 3D object with planar faces is derived by finding values of these variables that minimize the objective function. These methods work well for simple objects with a small number N of variables; as N grows, however, it becomes very difficult for them to find the expected objects, because with nonlinear objective functions in a space of large dimension N, the search for optimal solutions easily gets trapped in local minima. In this paper, we instead use the parameters of the planes passing through the planar faces of the object as the variables of the objective function. This leads to a set of linear constraints on the planes of the object, resulting in a much lower-dimensional nullspace in which optimization is easier. We prove that the dimension of this nullspace is exactly equal to the minimum number of vertex depths that define the 3D object. Since a practical line drawing is usually not an exact projection of a 3D object, we expand the nullspace to a larger space based on the singular value decomposition of the projection matrix of the line drawing; in this space, robust 3D reconstruction can be achieved. Compared with the two most closely related methods, our method not only reconstructs more complex 3D objects from 2D line drawings but is also computationally more efficient.
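A minimal sketch of the expansion step, assuming `P` is the (noisy) matrix assembled from the linear plane constraints: the k right-singular vectors with the smallest singular values span the enlarged search space in which the reconstruction is then optimized. Choosing k is guided by the nullspace-dimension result stated in the abstract.

```python
import numpy as np

def approximate_nullspace(P, k):
    """Basis of the k smallest right-singular directions of P.

    For an exact line drawing this is the nullspace of the constraint
    matrix; for a noisy one it is the enlarged space in which the
    (lower-dimensional) optimization is carried out.
    """
    _, _, Vt = np.linalg.svd(P)
    return Vt[-k:].T          # columns span the search space
```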

11.
In recent years, the calibration of combined range-sensor and camera systems has been widely studied and applied in environment perception for autonomous vehicles, with plane-feature-based methods widely adopted for their simplicity. However, most existing methods rely on point matching, which is error-prone and not very robust. This paper proposes a method for estimating the relative pose between a range sensor and a camera based on coplanar circles. The method uses a calibration board containing two coplanar circles, from which the pose between the camera and the board, as well as the pose between the range sensor and the board, can be obtained. Furthermore, by moving the board to acquire multiple data sets, computing the coordinates of the centers of the two coplanar circles in the range-sensor and camera frames, and optimizing the reprojection error together with the error between corresponding 3D points, the relative pose between the range sensor and the camera is obtained. The method requires no feature-point matching and exploits projective invariance to recover the poses of the camera and the 3D range sensor. Simulation and real-data experiments show that the method is robust to noise and yields accurate results.
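A minimal sketch of the joint refinement stage, under the assumption that the 3D circle centers have already been recovered in both frames over several board placements; the residual mixes a 3D point error with a reprojection error in the camera (intrinsics `K`), as the abstract describes, but the weighting and parameterization are placeholder assumptions, not the paper's.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def calibrate_extrinsics(centers_lidar, centers_cam, K):
    """Relative pose (R, t) mapping range-sensor points into the camera frame.

    `centers_lidar`, `centers_cam`: Nx3 corresponding circle centers.
    """
    def residuals(x):
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        mapped = centers_lidar @ R.T + x[3:]      # lidar frame -> camera frame
        e3d = (mapped - centers_cam).ravel()      # 3D correspondence error
        uv = (mapped @ K.T)
        uv = uv[:, :2] / uv[:, 2:3]               # projected mapped centers
        uv_ref = centers_cam @ K.T
        uv_ref = uv_ref[:, :2] / uv_ref[:, 2:3]   # projected measured centers
        return np.concatenate([e3d, 0.1 * (uv - uv_ref).ravel()])  # placeholder weight

    sol = least_squares(residuals, np.zeros(6))
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]
```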

12.
We present a method for automatically estimating the motion of an articulated object filmed by two or more fixed cameras. We focus on the case where the quality of the images is poor and only an approximation of a geometric model of the tracked object is available. Our technique applies physical forces to each rigid part of a kinematic 3D model of the tracked object; these forces guide the minimization of the differences between the pose of the 3D model and the pose of the real object in the video images. We use a fast recursive algorithm to solve the dynamical equations of motion of any 3D articulated model. We explain the key parts of our algorithms: how relevant information is extracted from the images, how the forces are created, and how the dynamical equations of motion are solved. A study of what kind of information should be extracted from the images, and of when our algorithms fail, is also presented. Finally, we present results on tracking a person, and we show the application of our method to tracking a hand in image sequences, demonstrating that the kind of information to extract from the images depends on their quality and on the configuration of the cameras.

13.
In this paper, we show that the rotating 1D calibration object used in the literature is in essence equivalent to the familiar 2D planar calibration object. In addition, we show that when the 1D object undergoes a planar motion rather than rotating around a fixed point, the equivalence still holds but the traditional approach fails to handle it. Experiments are carried out to verify the theoretical correctness and numerical robustness of our results.

14.
15.
16.
To measure the 3D shape of large objects, scanning with a moving range sensor is one of the most efficient methods. However, with a moving range sensor the acquired data are distorted by the motion of the sensor during the scanning process. In this paper, we propose a method for recovering correct 3D range data from a moving range sensor by using multiple-view geometry under projective projections in space-time. We assume that the range sensor emits laser beams in raster-scan order and that the beams are observed by two cameras. We first show that range data can be treated as 2D images, and that an extended multiple-view geometry can represent the relationship between the 2D image of the range data and the 2D images of the cameras. We then show that this extended multiple-view geometry can be used to rectify the 3D data obtained by the moving range sensor. The method is implemented and tested on synthetic images and range data, and the stability of the recovered 3D shape is evaluated.

17.
18.
Because the color lens and the depth lens are not at the same position, and because depth images have poor measurement accuracy, low resolution, and no color or texture information, traditional hand-eye calibration methods are not suitable for RGB-D cameras. This paper proposes a hand-eye calibration method between a robot arm and an RGB-D camera that uses a simple, low-cost 3D-printed sphere as the calibration object. The method only needs to measure the 3D position of the calibration object, avoiding the use of orientation information, which is harder to measure and less accurate. Both a closed-form solution and an iterative optimized solution are given. Results over 100 simulation runs show that the calibration accuracy is consistent with the measurement accuracy of the RGB-D camera itself; the closed-form solution does not require time synchronization between the arm and the camera; and the iterative solution improves the calibration accuracy slightly, with stable maximum error and error variance. Finally, hand-eye calibration experiments on a 7-DOF KUKA iiwa arm with a Kinect camera agree with the simulations. In short, the method is simple and reliable, and enables rapidly deployed hand-eye calibration between a robot arm and an RGB-D camera.
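A minimal sketch of the position-only building block, assuming corresponding sphere-center positions are available in two frames: the Kabsch closed-form least-squares rigid transform. The full pipeline in the paper (center extraction from depth data, the iterative refinement) is not shown.

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) with dst ~ R @ src + t (Kabsch).

    `src`: Nx3 sphere-center positions predicted from arm kinematics;
    `dst`: Nx3 centers measured by the RGB-D camera.
    """
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflection
    R = Vt.T @ D @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t
```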

19.
Event-based cameras are a new type of bio-inspired vision sensor that captures scene changes efficiently in real time. Unlike conventional frame-based cameras, an event camera reports only triggered pixel-level brightness changes (called events) and outputs an asynchronous event stream with microsecond resolution. This class of vision sensor has gradually become a research focus in image processing, computer vision, robot perception and state estimation, neuromorphic engineering, and related fields. This paper first describes the basic principles, development history, advantages, and challenges of event cameras; it then introduces three typical event cameras (the DVS (dynamic vision sensor), the ATIS (asynchronous time based image sensor), and the DAVIS (dynamic and active pixel vision sensor)) as well as several newer event cameras; next, it reviews applications of event cameras in feature extraction, depth estimation, optical flow estimation, intensity-image estimation and 3D reconstruction, object recognition and tracking, self-localization and pose estimation, visual odometry and SLAM, multi-sensor fusion, and more; finally, it summarizes the research progress on event cameras and discusses future trends.

20.
Wheel odometry is a common method for high-resolution relative localisation. However, wheel odometry relies on the integrity and accuracy of a kinematic model. In this paper, a new method for relative localisation, 'visiodometry', which does not rely on a kinematic model, is proposed. The system consists of two ground-facing cameras mounted on either side of the robot. From the sequence of images acquired, the relative change in the robot's pose is estimated using a phase-correlation-based method. Results on a plain-coloured carpeted surface show that the method provides a truly odometric sensor input, similar in modality and resolution to wheel odometry. A method to calibrate the visiodometry system using a 1D object is also presented.
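A minimal sketch of the core translation estimate, assuming two grayscale ground-facing frames of equal size as NumPy arrays: phase correlation peaks at the inter-frame shift. Sub-pixel refinement and the composition of the two cameras' shifts into a pose change are omitted.

```python
import numpy as np

def phase_correlation_shift(img_a, img_b):
    """Estimate the (dy, dx) translation between two images by phase correlation."""
    F = np.fft.fft2(img_a)
    G = np.fft.fft2(img_b)
    cross = F * np.conj(G)
    cross /= np.abs(cross) + 1e-12           # normalize to pure phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape                        # map wrap-around peaks to signed shifts
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx
```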
