Similar literature
Found 20 similar articles; search took 45 ms
1.
Three-dimensional (3-D) models of outdoor scenes are widely used for object recognition, navigation, mixed reality, and so on. Because such models are often made manually at high cost, automatic 3-D reconstruction has been widely investigated. In related work, a dense 3-D model is generated by using a stereo method. However, such approaches cannot use several hundred images together for dense depth estimation because it is difficult to accurately calibrate a large number of cameras. In this paper, we propose a dense 3-D reconstruction method that first estimates the extrinsic camera parameters of a hand-held video camera and then reconstructs a dense 3-D model of a scene. In the first process, extrinsic camera parameters are estimated by automatically tracking natural features together with a small number of predefined markers whose 3-D positions are known. Then, several hundred dense depth maps obtained by multi-baseline stereo are combined in a voxel space. As a result, we can accurately acquire a dense 3-D model of the outdoor scene from several hundred input images captured by a hand-held video camera.

2.
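Objective: A growing number of applications depend on accurate and fast observation and analysis of scene depth images, for example robot navigation and the design and modeling of virtual scenes in films and games. Direct depth-measurement devices such as time-of-flight depth cameras can capture scene depth images in real time, but because of hardware limitations the captured depth images have low resolution and cannot meet the needs of practical applications. Obtaining disparity by matching a left-right stereo image pair with a stereo matching algorithm, and deriving a depth image from it, is a classic method in computer vision; however, because of occlusions between the left and right images and because of textureless regions, stereo matching cannot recover correct disparities in those regions, which limits its use in practice. Method: Combining the strengths of direct depth-measurement devices such as time-of-flight depth cameras with those of stereo matching, a new depth image reconstruction method is proposed. First, the depth image captured by the direct depth-measurement device is used to construct adaptive local matching weights that constrain local-window stereo matching between the left and right images, yielding a stereo-matching-based depth image. Next, based on the left-right check principle, the captured depth image and the matched depth image are fused. Then a local weighted filtering algorithm is proposed to further improve the quality of the reconstructed depth image. Results: Experiments show that, in both objective metrics and visual quality, the proposed depth image reconstruction algorithm gives better results than other stereo matching algorithms. In the error-rate comparison, the proposed algorithm improves the depth reconstruction error rate by about 10% relative to traditional stereo matching algorithms. In the peak signal-to-noise ratio (PSNR) experiments, it achieves a gain of about 10 dB. Conclusion: By combining a high-resolution left-right stereo image pair with an initial low-resolution depth image, the proposed method effectively reconstructs high-quality, high-resolution depth images.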
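The abstract above relies on a left-right check but does not spell it out. The following is a minimal sketch of the generic left-right consistency test on integer disparity maps, not the authors' fusion procedure; the function name `left_right_check`, the tolerance `tol`, and the convention that a left pixel at column `x` with disparity `d` maps to column `x - d` in the right image are all assumptions made for illustration.

```python
def left_right_check(disp_left, disp_right, tol=1):
    """Mark left-image pixels whose disparity agrees with the right image.

    disp_left, disp_right: 2-D lists of integer disparities (same size).
    Returns a 2-D list of booleans; False means the pixel failed the check
    or its match falls outside the right image.
    """
    height = len(disp_left)
    width = len(disp_left[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - d  # assumed matching column in the right image
            if 0 <= xr < width:
                # Consistent if both views report (nearly) the same disparity.
                mask[y][x] = abs(d - disp_right[y][xr]) <= tol
    return mask


if __name__ == "__main__":
    left = [[0, 1, 1, 3]]
    right = [[0, 1, 5, 0]]
    print(left_right_check(left, right, tol=0))
```

Pixels that fail such a test are typically the occluded or mismatched ones, which is where a fusion scheme could fall back on the directly measured depth.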

3.
By using mirror reflections of a scene, stereo images can be captured with a single camera (catadioptric stereo). In addition to simplifying data acquisition, single-camera stereo provides both geometric and radiometric advantages over traditional two-camera stereo. In this paper, we discuss the geometry and calibration of catadioptric stereo with two planar mirrors. In particular, we show that the relative orientation of a catadioptric stereo rig is restricted to the class of planar motions, thus reducing the number of external calibration parameters from 6 to 5. Next, we derive the epipolar geometry for catadioptric stereo and show that it has 6 degrees of freedom rather than 7 for traditional stereo. Furthermore, we show how focal length can be recovered from a single catadioptric image solely from a set of stereo correspondences. To test the accuracy of the calibration, we present a comparison to Tsai camera calibration and we measure the quality of Euclidean reconstruction. In addition, we describe a real-time system which demonstrates the viability of stereo with mirrors as an alternative to traditional two-camera stereo.

4.
5.
We present a novel approach to track the position and orientation of a stereo camera using line features in the images. The method combines the strengths of trifocal tensors and Bayesian filtering. The trifocal tensor provides a geometric constraint to lock line features among every three frames. It eliminates the explicit reconstruction of the scene even if the 3-D scene structure is not known. Such a trifocal constraint thus makes the algorithm fast and robust. The twist motion model is applied to further improve its computational efficiency. Another major contribution is that our approach can obtain the 3-D camera motion using as few as 2 line correspondences instead of 13 in the traditional approaches. This makes the approach attractive for realistic applications. The performance of the proposed method has been evaluated using both synthetic and real data with encouraging results. Our algorithm is able to estimate 3-D camera motion accurately in real scenarios, with little drift, from an image sequence longer than 1,000 frames.

6.
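Stereo matching is a classic hard problem in computer vision research; the complexity and accuracy of the matching algorithm directly affect how well a vision system reconstructs external scenes. A new neural-network-based stereo matching method is therefore proposed. Its basic idea is as follows: after epipolar rectification, matching constraints such as uniqueness, compatibility, and similarity are used to build an energy function that reflects the constraint relations among all candidate matches between corresponding epipolar lines; this energy function is mapped onto a two-dimensional Hopfield network and minimized, and the final stable state of the network represents the correspondences between matching points. Applying this procedure to all epipolar lines in the image yields the desired disparity map. Compared with traditional methods, the algorithm has two distinctive features: (1) the matching primitives are ordinary image points, so a dense depth map is obtained directly; (2) the external input of the Hopfield network is no longer a constant but a value reflecting the gray-level similarity of corresponding points. Tests on synthetic and real images verify the effectiveness of the method.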
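To make the idea of matching by Hopfield-style energy minimization concrete, here is a small sketch for a single pair of epipolar lines. It is not the energy function of the paper: only a uniqueness penalty and a gray-level similarity input are modeled, the compatibility constraint is omitted, and the names `match_epipolar_line`, `penalty`, and `bias` are hypothetical. Neuron `v[i][k]` being 1 means left pixel `i` is matched to right pixel `k`.

```python
def match_epipolar_line(left, right, penalty=1.0, bias=0.5, sweeps=10):
    """Match pixels of two scanlines with asynchronous binary neuron updates.

    left, right: lists of gray levels in [0, 255].
    Returns, for each left pixel, the index of its matched right pixel or None.
    """
    n, m = len(left), len(right)
    # External input: gray-level similarity minus a bias (assumed form).
    sim = [[1.0 - abs(l - r) / 255.0 - bias for r in right] for l in left]
    v = [[0] * m for _ in range(n)]
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            for k in range(m):
                # Other active neurons in the same row or column violate uniqueness.
                conflicts = sum(v[i]) + sum(v[r][k] for r in range(n)) - 2 * v[i][k]
                new_state = 1 if sim[i][k] - penalty * conflicts > 0 else 0
                if new_state != v[i][k]:
                    v[i][k] = new_state
                    changed = True
        if not changed:  # stable state reached
            break
    return [row.index(1) if 1 in row else None for row in v]


if __name__ == "__main__":
    print(match_epipolar_line([10, 50, 90], [12, 52, 88]))
```

Because the inhibitory connections are symmetric and there are no self-connections, each asynchronous update cannot increase the network energy, which is why the loop settles into a stable state. The result depends on the update order, so this sketch is only illustrative.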

7.
In this paper, we propose a stereo method specifically designed for image-based rendering. For effective image-based rendering, the interpolated views need only be visually plausible. The implication is that the extracted depths do not need to be correct, as long as the recovered views appear to be correct. Our stereo algorithm relies on over-segmenting the source images. Computing match values over entire segments rather than single pixels provides robustness to noise and intensity bias. Color-based segmentation also helps to more precisely delineate object boundaries, which is important for reducing boundary artifacts in synthesized views. The depths of the segments for each image are computed using loopy belief propagation within a Markov Random Field framework. Neighboring MRFs are used for occlusion reasoning and ensuring that neighboring depth maps are consistent. We tested our stereo algorithm on several stereo pairs from the Middlebury data set, and show rendering results based on two of these data sets. We also show results for video-based rendering.

8.
The recovery of 3-D shape information (depth) using stereo vision analysis is one of the major areas in computer vision and has given rise to a great deal of literature in the recent past. The most widely known stereo vision methods are the passive approaches that use two cameras. Obtaining 3-D information involves identifying the corresponding 2-D points between the left and right images. Most existing methods tackle this matching task from singular points, i.e. by finding points in both image planes with more or less the same neighborhood characteristics. One key problem is that we cannot know a priori whether a point in the first image has a correspondence, either because of surface occlusion or simply because the point has been projected outside the field of view of the second camera. This makes the matching process very difficult and imposes the need for an a posteriori stage to remove false matches. In this paper we are concerned with active stereo vision systems, which offer an alternative to passive stereo vision systems. In our system, a light projector that illuminates the objects to be analyzed with a pyramid-shaped laser beam replaces one of the two cameras. The projections of the laser rays on the objects are detected as spots in the image. In this particular case, only one image needs to be treated, and the stereo matching problem boils down to associating the laser rays with their corresponding real spots in the 2-D image. We have expressed this problem as the minimization of a global function, which we propose to perform using Genetic Algorithms (GAs). We have implemented two different algorithms: in the first, GAs are performed after a deterministic search; in the second, data is partitioned into clusters and GAs are applied independently in each cluster. As our second contribution in this paper, we describe an efficient system calibration method.
Experimental results are presented to illustrate the feasibility of our approach. The proposed method yields high-accuracy 3-D reconstruction even for complex objects. We conclude that GAs can effectively be applied to this matching problem.

9.
An integrated approach to extract depth, efficiently and accurately, from a sequence of images is presented in this paper. The method combines the ability of stereo processing to acquire highly accurate depth measurements with the efficiency of spatial and temporal gradient analysis. As a result of this integration, depth measurements of high quality are obtained at a speed approximately ten times greater than that of stereo processing. Without any a priori information about the locations of the points in the scene, the correspondence problem in stereo processing is computationally expensive. In our approach, we use spatial and temporal gradient (STG) analysis, which has been shown to provide depth with great efficiency but limited accuracy, to guide the matching process of stereo. The camera motion used in the approach can be either lateral or axial. Extensive experiments on real scenes have shown the ability of the integrated approach to acquire depth with a mean error of less than 3%.

10.
11.
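Determining the Absolute Depth of 3-D Object Surfaces by Fusing Multiple Needle Maps   Cited by: 1 (self-citations: 1, citations by others: 0)
A photometric stereo (PS) system readily determines the surface orientation and relative depth of an object, but it cannot determine absolute depth. To determine absolute depth, the proposed algorithm first uses a binocular photometric stereo (BPS) system to obtain a pair of surface orientation maps. It then segments this pair of orientation maps based on a geodesic dome and computes the disparity between corresponding regions. Finally, by imposing multiple constraints and applying appropriate fusion and precise disparity matching, it determines the absolute depth of the 3-D surfaces of objects in the scene. The method is of considerable significance for further research on determining the depth of arbitrary 3-D surfaces and recovering scene structure.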

12.
We propose a 3D environment modelling method using multiple pairs of high-resolution spherical images. Spherical images of a scene are captured using a rotating line scan camera. Reconstruction is based on stereo image pairs with a vertical displacement between camera views. A 3D mesh model for each pair of spherical images is reconstructed by stereo matching. For accurate surface reconstruction, we propose a PDE-based disparity estimation method which produces continuous depth fields with sharp depth discontinuities even in occluded and highly textured regions. A full environment model is constructed by fusion of partial reconstruction from spherical stereo pairs at multiple widely spaced locations. To avoid camera calibration steps for all camera locations, we calculate 3D rigid transforms between capture points using feature matching and register all meshes into a unified coordinate system. Finally a complete 3D model of the environment is generated by selecting the most reliable observations among overlapped surface measurements considering surface visibility, orientation and distance from the camera. We analyse the characteristics and behaviour of errors for spherical stereo imaging. Performance of the proposed algorithm is evaluated against ground-truth from the Middlebury stereo test bed and LIDAR scans. Results are also compared with conventional structure-from-motion algorithms. The final composite model is rendered from a wide range of viewpoints with high quality textures.

13.
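In stereo vision, disparity indirectly reflects the depth of an object, and disparity computation is the basis of depth computation. Most research on disparity computation targets binocular stereo vision, whereas the disparity distribution in bifocal monocular stereo vision differs from binocular disparity: it radiates along the epipolar lines. Addressing the characteristics of bifocal monocular stereo vision, a method for computing monocular stereo disparity is proposed. In the initially computed disparity map, disparity points are classified into points obtained by matching and mismatched points. Using the mean shift algorithm, the disparity of the mismatched points is estimated from the matched points and from image segmentation, which finally yields a dense and accurate disparity map. Experiments show that the method efficiently obtains the disparity map of a scene from a bifocal stereo image pair.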
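For the familiar binocular case that the abstract contrasts itself with, the link between disparity and depth is the standard rectified-stereo relation Z = f·B/d. The sketch below illustrates only that textbook relation, not the bifocal monocular geometry of the paper; the function name and the choice of units (focal length in pixels, baseline in meters) are assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (meters) of a point seen by a rectified binocular rig.

    disparity_px: horizontal disparity in pixels.
    focal_px:     focal length in pixels.
    baseline_m:   distance between the two camera centers in meters.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    print(depth_from_disparity(35.0, 700.0, 0.1))
```

Depth is inversely proportional to disparity, so a fixed disparity error causes a larger depth error for distant points than for near ones.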

14.
Active Appearance-Based Robot Localization Using Stereo Vision   Cited by: 2 (self-citations: 0, citations by others: 2)
A vision-based robot localization system must be robust: able to keep track of the position of the robot at any time even if illumination conditions change and, in the extreme case of a failure, able to efficiently recover the correct position of the robot. With this objective in mind, we enhance the existing appearance-based robot localization framework in two directions by exploiting the use of a stereo camera mounted on a pan-and-tilt device. First, we move from the classical passive appearance-based localization framework to an active one where the robot sometimes executes actions with the sole purpose of gaining information about its location in the environment. Along this line, we introduce an entropy-based criterion for action selection that can be efficiently evaluated in our probabilistic localization system. The execution of the actions selected using this criterion allows the robot to quickly find out its position in case it gets lost. Second, we introduce the use of depth maps obtained with the stereo cameras. The information provided by depth maps is less sensitive to changes of illumination than that provided by plain images. The main drawback of depth maps is that they include missing values: points for which it is not possible to reliably determine depth information. The presence of missing values makes Principal Component Analysis (the standard method used to compress images in the appearance-based framework) infeasible. We describe a novel Expectation-Maximization algorithm to determine the principal components of a data set including missing values and we apply it to depth maps. The experiments we present show that the combination of active localization with the use of depth maps gives an efficient and robust appearance-based robot localization system.
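The abstract does not give the form of its entropy-based criterion. As a generic illustration only, the sketch below picks the action with the lowest expected Shannon entropy of the posterior belief over discrete robot positions. The data layout (each action mapped to a list of observation probabilities paired with posterior beliefs) and the names `entropy` and `select_action` are assumptions, not the paper's formulation.

```python
import math


def entropy(belief):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in belief if p > 0)


def select_action(outcomes):
    """Choose the action whose expected posterior entropy is smallest.

    outcomes: dict mapping an action name to a list of
              (observation_probability, posterior_belief) pairs.
    """
    def expected_entropy(action):
        return sum(p_obs * entropy(post) for p_obs, post in outcomes[action])

    return min(outcomes, key=expected_entropy)


if __name__ == "__main__":
    outcomes = {
        # Staying put is assumed to leave the belief ambiguous.
        "stay": [(1.0, [0.5, 0.5])],
        # Panning is assumed to yield one of two informative observations.
        "pan_left": [(0.5, [0.9, 0.1]), (0.5, [0.1, 0.9])],
    }
    print(select_action(outcomes))
```

An action that is expected to sharpen the belief scores lower and is preferred, which matches the abstract's description of actions executed purely to gain information about location.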

15.
We present a new feature-based algorithm for stereo correspondence. Most previous feature-based methods match sparse features like edge pixels, producing only sparse disparity maps. Our algorithm detects and matches dense features between the left and right images of a stereo pair, producing a semi-dense disparity map. Our dense feature is defined with respect to both images of a stereo pair, and it is computed during the stereo matching process, not in a preprocessing step. In essence, a dense feature is a connected set of pixels in the left image and a corresponding set of pixels in the right image such that the intensity edges on the boundary of these sets are stronger than their matching error (which is the difference in intensities between corresponding boundary pixels). Our algorithm produces accurate semi-dense disparity maps, leaving featureless regions in the scene unmatched. It is robust, requires little parameter tuning, can handle brightness differences between images and nonlinear errors, and is fast (linear complexity).

16.
Our paper introduces a novel approach for controlling stereo camera parameters in interactive 3D environments in a way that specifically addresses the interplay of binocular depth perception and saliency of scene contents. Our proposed Dynamic Attention-Aware Disparity Control (DADC) method produces depth-rich stereo rendering that improves viewer comfort through joint optimization of stereo parameters. While constructing the optimization model, we consider the importance of scene elements, as well as their distance to the camera and the locus of attention on the display. Our method also optimizes the depth effect of a given scene by considering the individual user’s stereoscopic disparity range, and it maintains a comfortable viewing experience by controlling the accommodation/convergence conflict. We validate our method in a formal user study that also reveals its advantages, such as superior quality and practical relevance.

17.
Extracting View-Dependent Depth Maps from a Collection of Images   Cited by: 1 (self-citations: 0, citations by others: 1)
Stereo correspondence algorithms typically produce a single depth map. In addition to the usual problems of occlusions and textureless regions, such algorithms cannot model the variation in scene or object appearance with respect to the viewing position. In this paper, we propose a new representation that overcomes the appearance variation problem associated with an image sequence. Rather than estimating a single depth map, we associate a depth map with each input image (or a subset of them). Our representation is motivated by applications such as view interpolation and depth-based segmentation for model-building or layer extraction. We describe two approaches to extract such a representation from a sequence of images. The first approach, which is more classical, computes the local depth map associated with each chosen reference frame independently. The novelty of this approach lies in its combination of shiftable windows, temporal selection, and graph cut optimization. The second approach simultaneously optimizes a set of self-consistent depth maps at multiple key-frames. Since multiple depth maps are estimated simultaneously, visibility can be modeled explicitly and disparity consistency imposed across the different depth maps. Results, which include a difficult specular scene example, show the effectiveness of our approach.

18.
Constructing a Multivalued Representation for View Synthesis   Cited by: 2 (self-citations: 1, citations by others: 1)
A fundamental problem in computer vision and graphics is that of arbitrary view synthesis for static 3-D scenes, whereby a user-specified viewpoint of the given scene may be created directly from a representation. We propose a novel compact representation for this purpose called the multivalued representation (MVR). Starting with an image sequence captured by a moving camera undergoing either unknown planar translation or orbital motion, an MVR is derived for each preselected reference frame, and may then be used to synthesize arbitrary views of the scene. The representation itself comprises multiple depth and intensity levels in which the k-th level consists of points occluded by exactly k surfaces. To build an MVR with respect to a particular reference frame, dense depth maps are first computed for all the neighboring frames of the reference frame. The depth maps are then combined into a single map, where points are organized by occlusions rather than by coherent affine motions. This grouping facilitates an automatic process to determine the number of levels and helps to reduce the artifacts caused by occlusions in the scene. An iterative multiframe algorithm is presented for dense depth estimation that both handles low-contrast regions and produces piecewise smooth depth maps. Reconstructed views as well as arbitrary flyarounds of real scenes are presented to demonstrate the effectiveness of the approach.

19.
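Generation of Stereo Image Pairs   Cited by: 1 (self-citations: 0, citations by others: 1)
Acquiring a stereo image pair of the same scene is a key problem in binocular stereoscopic imaging. A method is proposed for generating stereo image pairs when the 3-D scene has already been built. Following the principle of binocular stereo vision, the method uses camera objects in 3DS MAX to apply coordinate transformations and perspective projection to the objects in the scene, generating a left-eye view and a right-eye view. Experimental results show that the positional relationship between the two target cameras and the 3-D model, together with the baseline length, are important factors affecting the stereoscopic effect. Changing the position of the target cameras relative to the 3-D model produces stereo image pairs with positive or negative parallax, and the stereoscopic effect is best when the ratio of AB to CO is 0.05.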

20.
Large-Scale 6-DOF SLAM With Stereo-in-Hand   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper, we describe a system that can carry out simultaneous localization and mapping (SLAM) in large indoor and outdoor environments using a stereo pair moving with 6 DOF as the only sensor. Unlike current visual SLAM systems that use either bearing-only monocular information or 3-D stereo information, our system accommodates both monocular and stereo. Textured point features are extracted from the images and stored as 3-D points if seen in both images with sufficient disparity, or stored as inverse depth points otherwise. This allows the system to map both near and far features: the first provide distance and orientation, and the second provide orientation information. Unlike other vision-only SLAM systems, stereo does not suffer from “scale drift” because of unobservability problems, and thus, no other information such as gyroscopes or accelerometers is required in our system. Our SLAM algorithm generates sequences of conditionally independent local maps that can share information related to the camera motion and common features being tracked. The system computes the full map using the novel conditionally independent divide and conquer algorithm, which allows constant time operation most of the time, with linear time updates to compute the full map. To demonstrate the robustness and scalability of our system, we show experimental results in indoor and outdoor urban environments of 210 m and 140 m loop trajectories, with the stereo camera being carried in hand by a person walking at normal walking speeds of 4 to 5 km/h.
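The abstract mentions inverse depth points without defining them. The sketch below shows one common inverse depth parameterization from the monocular SLAM literature, in which a point is described by the camera position where it was first seen, two direction angles, and the inverse of its distance along that direction. Whether the paper uses exactly this angle convention is an assumption, and the function name is hypothetical.

```python
import math


def inverse_depth_to_point(origin, theta, phi, rho):
    """Convert an inverse depth feature to a Euclidean 3-D point.

    origin: (x, y, z) camera position at first observation.
    theta:  azimuth angle of the viewing ray (radians).
    phi:    elevation angle of the viewing ray (radians).
    rho:    inverse depth (1 / distance along the ray), must be > 0.
    """
    ox, oy, oz = origin
    # Unit direction of the ray (assumed convention: z forward, y down).
    m = (
        math.cos(phi) * math.sin(theta),
        -math.sin(phi),
        math.cos(phi) * math.cos(theta),
    )
    return (ox + m[0] / rho, oy + m[1] / rho, oz + m[2] / rho)


if __name__ == "__main__":
    print(inverse_depth_to_point((0.0, 0.0, 0.0), 0.0, 0.0, 0.5))
```

As `rho` approaches zero the point recedes toward infinity while its direction stays well defined, which fits the abstract's remark that far features contribute orientation information only.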
