Similar Literature (20 documents found)
1.
For piecewise planar scene modeling, many challenging issues persist, in particular how to generate a sufficient set of candidate planes and how to assign an optimal plane to each spatial patch. To address these issues, we present a novel multi-view piecewise planar stereo method for complete reconstruction. In our method, reconstruction is formulated as an energy-based plane labeling problem in which photo-consistency and geometric constraints are incorporated into a unified superpixel-level MRF (Markov Random Field) framework. To enhance the efficacy of plane inference and optimization, an effective multi-direction plane sweeping over a restricted search space is carried out to generate sufficient and reliable candidate planes. Experiments show that our method can effectively handle many challenging factors (e.g., slanted surfaces, textureless regions) and achieve satisfactory results.
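To make the plane-sweeping step concrete, below is a minimal Python sketch (not taken from the paper) of the homography induced by a candidate plane, which is what allows each sweep hypothesis to be scored by photo-consistency; the function names and the plane parameterization n^T X = d are assumptions.

```python
import numpy as np

def plane_induced_homography(K1, K2, R, t, n, d):
    """Homography mapping reference-view pixels into a second view,
    induced by the candidate plane n^T X = d (reference camera frame).
    (R, t) is the pose of the second camera relative to the reference."""
    H = K2 @ (R - np.outer(t, n) / d) @ np.linalg.inv(K1)
    return H / H[2, 2]

def warp_pixel(H, u, v):
    """Map a single pixel (u, v) with the homography."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]
```

In a sweep, one such homography is instantiated per candidate plane, reference pixels (or superpixel centers) are warped into the other views, and a matching cost such as NCC is accumulated to rank the hypotheses.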

2.
王伟  高伟  朱海  胡占义 《自动化学报》2017,43(4):674-684
For image-based urban scene reconstruction, traditional pixel-level and region-level reconstruction algorithms usually fail to produce reliable results because of illumination changes, perspective distortion, textureless regions, and other factors. To address this problem, this paper proposes a fast and robust piecewise planar reconstruction algorithm. Based on the structural characteristics of urban scenes and the piecewise planar assumption, the algorithm first extracts a sufficient and reliable set of candidate spatial planes from the initial spatial points using a plane-fitting method based on connected-component detection, and then casts the inference of the complete scene structure as a plane labeling problem solved within an MRF (Markov random field) energy minimization framework. Owing to the reliability of the candidate plane set and of the energy model, which fuses photo-consistency, spatial geometry, and visibility constraints, the complete scene structure can be reconstructed effectively. Experimental results show that the algorithm largely overcomes the poor reliability and incomplete reconstruction of traditional methods while maintaining high computational efficiency.
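As a concrete illustration of the plane-labeling formulation, here is a minimal sketch assuming precomputed per-superpixel unary costs; the paper minimizes the MRF energy (typically with graph cuts), whereas this stand-in uses simple ICM, and all identifiers are hypothetical.

```python
import numpy as np

def icm_plane_labeling(unary, adjacency, smoothness=1.0, n_iters=10):
    """Assign one candidate plane to every superpixel by minimizing a simple
    unary + Potts energy with ICM (a greedy stand-in for graph cuts).
    unary[i, k]: cost of giving superpixel i the candidate plane k
    adjacency:  iterable of (i, j) pairs of neighboring superpixels"""
    n_sp, n_planes = unary.shape
    labels = unary.argmin(axis=1)                    # start from best unary label
    neighbors = [[] for _ in range(n_sp)]
    for i, j in adjacency:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(n_iters):
        for i in range(n_sp):
            cost = unary[i].astype(float)
            for j in neighbors[i]:                   # Potts smoothness with each neighbor
                cost += smoothness * (np.arange(n_planes) != labels[j])
            labels[i] = cost.argmin()
    return labels
```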

3.
Dense reconstruction of multi-plane scenes based on sparse point clouds
缪君  储珺  张桂梅  王璐 《自动化学报》2015,41(4):813-822
Multi-plane scenes are common in everyday environments; however, because object surfaces in such scenes often lack texture or contain repetitive texture, the 3D point clouds reconstructed from multiple views tend to be overly sparse or even contain holes, and the surfaces reconstructed by fitting small patches to these points consequently exhibit planar bumpiness. To address these problems, this paper proposes a piecewise planar scene reconstruction method based on sparse point clouds. First, the J-Linkage multi-model estimation algorithm is improved by replacing random sampling with stratified sampling. This improved method is then used to fit multiple planes to the sparse point cloud and obtain a multi-plane model of the scene. Finally, the multi-plane model is combined with unsupervised image segmentation to extract and reconstruct the planar regions of the scene; the non-planar parts are reconstructed with the CMVS/PMVS (Clustering views for multi-view stereo / patch-based multi-view stereo) algorithms. Experiments on multi-plane model estimation show that the improved J-Linkage algorithm increases estimation accuracy, and 3D reconstruction experiments confirm that the proposed method effectively overcomes the hole and bumpiness problems while reconstructing complete planar regions.
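One plausible reading of the stratified-sampling improvement (the abstract does not spell out the stratification) is to draw the three points of each minimal plane sample from distinct spatial bins rather than uniformly at random, as in this hedged sketch; the grid layout and all parameters are assumptions.

```python
import numpy as np

def stratified_plane_hypotheses(points, n_hyp=200, n_bins=4, seed=0):
    """Generate candidate planes from 3-point minimal samples whose points
    are drawn from distinct spatial bins (a simple stratification) rather
    than uniformly at random. Returns (n_hyp, 4) planes [a, b, c, d] with
    a*x + b*y + c*z + d = 0 and unit normals."""
    rng = np.random.default_rng(seed)
    span = points.max(0) - points.min(0) + 1e-9
    cell = np.clip(((points - points.min(0)) / span * n_bins).astype(int), 0, n_bins - 1)
    bin_id = cell[:, 0] * n_bins + cell[:, 1]        # coarse grid over x and y
    bins = [np.flatnonzero(bin_id == b) for b in np.unique(bin_id)]
    planes = []
    while len(planes) < n_hyp:
        chosen = rng.choice(len(bins), size=3, replace=len(bins) < 3)
        sample = points[[rng.choice(bins[b]) for b in chosen]]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        if np.linalg.norm(normal) < 1e-9:            # skip degenerate samples
            continue
        normal /= np.linalg.norm(normal)
        planes.append(np.append(normal, -normal @ sample[0]))
    return np.array(planes)
```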

4.
王伟  任国恒  陈立勇  张效尉 《自动化学报》2019,45(11):2187-2198
In image-based 3D reconstruction of urban scenes, piecewise planar reconstruction algorithms can overcome textureless regions, illumination changes, and other nuisance factors to quickly recover a complete approximate structure of the scene. Their reliability is often low, however, when the initial spatial points are sparse, the candidate plane set is incomplete, or the image over-segmentation is of poor quality. To address this, based on the structural characteristics of urban scenes, this paper constructs a novel plane reliability measure that fuses a scene structure prior, spatial point visibility, and color similarity, and then infers the scene structure by jointly optimizing image regions and their corresponding planes. Experimental results show that the algorithm can effectively reconstruct the complete scene structure from sparse spatial points, with high overall accuracy and efficiency.

5.
Urban environments possess many regularities that can be efficiently exploited for dense 3D reconstruction from multiple widely separated views. We present an approach that utilizes piecewise planarity and a restricted number of plane orientations to suppress the reconstruction and matching ambiguities that cause failures of standard dense stereo methods. We formulate the 3D reconstruction problem in an MRF framework built on an image pre-segmented into superpixels. Using this representation, we propose novel photometric and superpixel boundary-consistency terms explicitly derived from superpixels and show that they overcome many difficulties of standard pixel-based formulations and favorably handle problematic scenarios containing many repetitive structures and untextured or weakly textured regions. We demonstrate our approach on several wide-baseline scenes, showing superior performance compared to previously proposed methods.

6.
In spite of advanced acquisition technology, consumer cameras remain an attractive means of capturing 3D data. For reconstructing buildings, it is easy to obtain large numbers of photos providing complete, all-around coverage of a building; however, such large photo collections are often unordered and unorganized, with unknown viewpoints. We present a method for reconstructing piecewise planar building models based on a near-linear-time process that sorts such unorganized collections, quickly creating an image graph, an initial pose for each camera, and a piecewise-planar facade model. Our sorting technique first estimates single-view, piecewise planar geometry from each photo, then merges these single-view models in an analysis phase that reasons about the global scene geometry. A key contribution of our technique is to perform this reasoning based on a number of constraints typical of buildings. The sorting process results in a piecewise planar model of the scene, a set of good initial camera poses, and correspondences between photos. This information is useful in itself as an approximate scene model, but it also represents a good initialization for structure-from-motion and multi-view stereo techniques, from which refined models can be derived at greatly reduced computational cost compared to prior techniques.

7.
This paper presents a method for the 3D reconstruction of a piecewise-planar surface from range images, typically laser scans with millions of points. The reconstructed surface is a watertight polygonal mesh that conforms to observations at a given scale in the visible planar parts of the scene and that is plausible in hidden parts. We formulate surface reconstruction as a discrete optimization problem based on detected and hypothesized planes. One of our major contributions, besides a treatment of data anisotropy and novel surface hypotheses, is a regularization of the reconstructed surface with respect to the length of edges and the number of corners. Compared to classical area-based regularization, it better captures surface complexity and is therefore better suited to man-made environments such as buildings. To handle the underlying higher-order potentials, which are problematic for MRF optimizers, we formulate the minimization as a sparse mixed-integer linear programming problem and obtain an approximate solution using a simple relaxation. Experiments show that the method is fast and reaches near-optimal solutions.

8.
Occlusion and lack of visibility in crowded and cluttered scenes make it difficult to track individual people correctly and consistently, particularly from a single view. We present a multi-view approach to solving this problem. In our approach we neither detect nor track objects from any single camera or camera pair; rather, evidence is gathered from all the cameras into a synergistic framework and the detection and tracking results are propagated back to each view. Unlike other multi-view approaches that require fully calibrated views, our approach is purely image-based and uses only 2D constructs. To this end we develop a planar homographic occupancy constraint that fuses foreground likelihood information from multiple views to resolve occlusions and localize people on a reference scene plane. For greater robustness, this process is extended to multiple planes parallel to the reference plane in the framework of plane-to-plane homologies. Our fusion methodology also models scene clutter using the Schmieder and Weathersby clutter measure, which acts as a confidence prior to assign higher fusion weight to views with less clutter. Detection and tracking are performed simultaneously by graph-cut segmentation of tracks in the space-time occupancy likelihood data. Experimental results, with detailed qualitative and quantitative analysis, are demonstrated on challenging crowded multi-view scenes.
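A much-simplified sketch of the single-plane case of the homographic occupancy idea: per-view foreground likelihood maps are warped onto the reference scene plane and fused, with an optional per-view weight standing in for the clutter-based confidence prior. The function and parameter names are assumptions, and the paper's extension to parallel planes via homologies is omitted.

```python
import cv2
import numpy as np

def fuse_foreground_on_plane(fg_maps, homographies, ref_shape, weights=None):
    """Warp per-view foreground likelihood maps onto the reference scene
    plane with the given homographies and fuse them multiplicatively.
    fg_maps: list of float32 images with values in [0, 1]
    homographies: list of 3x3 view -> reference-plane homographies
    weights: optional per-view confidence (e.g., from a clutter measure)."""
    h, w = ref_shape
    fused = np.ones((h, w), np.float32)
    if weights is None:
        weights = [1.0] * len(fg_maps)
    for fg, H, wgt in zip(fg_maps, homographies, weights):
        warped = cv2.warpPerspective(fg, H, (w, h))   # likelihood on the plane
        fused *= warped ** wgt                        # weighted product fusion
    return fused
```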

9.
The article describes a reconstruction pipeline that generates piecewise-planar models of man-made environments using two calibrated views. The 3D space is sampled by a set of virtual cut planes that intersect the baseline of the stereo rig and implicitly define possible pixel correspondences across views. The likelihood of these correspondences being true matches is measured using signal symmetry analysis [1], which makes it possible to obtain profile contours of the 3D scene that become lines whenever the virtual cut planes intersect planar surfaces. The detection and estimation of these line cuts is formulated as a global optimization problem over the symmetry matching cost, and pairs of reconstructed lines are used to generate plane hypotheses that serve as input to PEARL clustering [2]. The PEARL algorithm alternates between a discrete optimization step, which merges planar surface hypotheses and discards detections with poor support, and a continuous optimization step, which refines the plane poses while taking surface slant into account. The pipeline outputs an accurate, semi-dense piecewise-planar reconstruction of the 3D scene. In addition, the input images can be segmented into piecewise-planar regions using a standard labeling formulation that assigns pixels to plane detections. Extensive experiments with both indoor and outdoor stereo pairs show significant improvements over state-of-the-art methods in accuracy and robustness.
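The PEARL stage can be pictured with the following simplified alternation (a sketch, not the authors' implementation): a discrete assignment of points to plane hypotheses with an outlier option, followed by a continuous re-fit of each surviving plane; label costs and the paper's surface-slant refinement are omitted, and all thresholds are assumptions.

```python
import numpy as np

def assign_to_planes(points, planes, inlier_thresh):
    """Point-to-plane assignment; label -1 marks outliers."""
    dists = np.abs(points @ planes[:, :3].T + planes[:, 3])      # (N, K)
    labels = dists.argmin(axis=1)
    labels[dists.min(axis=1) > inlier_thresh] = -1
    return labels

def pearl_like_refinement(points, planes, inlier_thresh=0.02,
                          min_support=20, n_iters=5):
    """Simplified PEARL-style alternation: assign every 3D point to its
    closest plane (or to an outlier label), then re-fit each surviving
    plane to its inliers, discarding hypotheses with too little support.
    planes: (K, 4) array of [a, b, c, d] with unit normals."""
    for _ in range(n_iters):
        labels = assign_to_planes(points, planes, inlier_thresh)
        refined = []
        for k in range(planes.shape[0]):
            inliers = points[labels == k]
            if len(inliers) < min_support:
                continue                                          # drop weak hypotheses
            centroid = inliers.mean(axis=0)
            _, _, vh = np.linalg.svd(inliers - centroid)
            n = vh[-1]                                            # direction of least variance
            refined.append(np.append(n, -n @ centroid))
        if not refined:
            break
        planes = np.array(refined)
    return planes, assign_to_planes(points, planes, inlier_thresh)
```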

10.
We investigate the problem of automatically creating 3D models of man-made environments that we represent as collections of textured planes. A typical approach is to automatically reconstruct a sparse 3D model made of points and to manually indicate their plane membership, as well as the delineation of the planes: this is the piecewise planar segmentation phase. Texture images are then extracted by merging perspectively corrected input images. We propose an automatic approach to the piecewise planar segmentation phase that detects, from a sparse 3D model made of points, the number of planes needed to approximate the scene surface to some extent, together with the parameters of these planes. Our segmentation method is inspired by the robust estimator RANSAC. It generates and scores plane hypotheses by random sampling of the 3D points. Our plane scoring function and our plane comparison function, required to prevent detecting the same plane twice, are designed to detect planes with large or small support. The plane scoring function recovers the plane delineation and quantifies the saliency of the plane hypothesis based on approximate photo-consistency. We finally refine all the 3D model parameters, i.e., the planes and the points on these planes, as well as the camera poses, by minimizing the reprojection error with respect to the measured image points using bundle adjustment. The approach is validated on simulated and real data.
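In the same spirit, the sketch below detects several planes sequentially from a sparse point set and uses a crude plane-comparison test to avoid reporting the same plane twice; unlike the paper, the hypotheses are scored by simple inlier counting rather than approximate photo-consistency, and all thresholds are assumptions.

```python
import numpy as np

def similar_planes(p, q, angle_thresh_deg=5.0, dist_thresh=0.05):
    """Crude duplicate test: near-parallel normals and similar offsets
    (assumes consistently oriented normals)."""
    cos_ang = abs(p[:3] @ q[:3])
    return cos_ang > np.cos(np.radians(angle_thresh_deg)) and abs(p[3] - q[3]) < dist_thresh

def detect_planes(points, n_planes=5, n_trials=500, inlier_thresh=0.02, seed=0):
    """Sequential RANSAC-style multi-plane detection from a sparse cloud:
    each accepted plane must not duplicate an earlier one, and its inliers
    are removed before the next plane is sought."""
    rng = np.random.default_rng(seed)
    remaining, detected = points.copy(), []
    for _ in range(n_planes):
        best, best_support = None, 0
        for _ in range(n_trials):
            s = remaining[rng.choice(len(remaining), 3, replace=False)]
            n = np.cross(s[1] - s[0], s[2] - s[0])
            if np.linalg.norm(n) < 1e-9:
                continue
            n /= np.linalg.norm(n)
            plane = np.append(n, -n @ s[0])
            support = np.sum(np.abs(remaining @ plane[:3] + plane[3]) < inlier_thresh)
            if support > best_support and not any(similar_planes(plane, q) for q in detected):
                best, best_support = plane, support
        if best is None:
            break
        detected.append(best)
        remaining = remaining[np.abs(remaining @ best[:3] + best[3]) >= inlier_thresh]
        if len(remaining) < 3:
            break
    return np.array(detected)
```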

11.
王伟  余淼  胡占义 《自动化学报》2014,40(12):2782-2796
This paper proposes a high-accuracy dense depth-map estimation algorithm based on match diffusion. The algorithm consists of two match-diffusion stages, at the pixel level and at the region level. The first stage diffuses sparse feature matches between views to obtain relatively dense initial depth maps; the second, starting from multiple initial depth maps and under a piecewise smooth scene assumption, uses plane fitting and multi-direction plane sweeping within an energy minimization framework to infer depth in regions with matching ambiguity (e.g., textureless regions). Experiments on standard and real-world datasets show that the algorithm is robust to illumination changes and perspective distortion and can effectively infer depth in textureless regions, yielding high-accuracy dense depth maps.

12.
We propose a parallel computation model, called the cellular matrix model (CMM), to address large-size Euclidean graph matching problems in the plane. The parallel computation takes place by partitioning the plane into a regular grid of cells, each cell being assigned to a single processor. Each processor operates on local data, starting from its cell location and extending its search to the neighborhood cells in a spiral search pattern. In order to deal with large-size problems, memory size and processor count are fixed at O(N), where N is the problem size. A key point is that closest-point search in the plane is then performed in O(1) expected time for uniform or bounded distributions, independently by each processor. We define a generic loop that models the parallel projection between graphs and their matching, as executed by the many cells at a given level of computation granularity. To illustrate its efficacy and versatility, we apply the CMM, on GPU platforms, to two problems in image processing: superpixel segmentation and stereo-matching energy minimization. First, we propose an extended version of the well-known SLIC superpixel segmentation algorithm, which we call the SPASM algorithm, obtained by using a parallel 2D self-organizing map instead of the k-means algorithm. Second, we investigate the idea of distributed variable neighborhood search and propose a parallel search heuristic, called distributed local search (DLS), for global energy minimization in the stereo matching problem. We evaluate the approach against state-of-the-art graph cut and belief propagation algorithms. For each problem, we argue that the parallel GPU implementation provides new competitive quality/time trade-offs, with substantial acceleration factors as the problem size increases.
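The spiral search order over grid cells that each processor follows can be sketched as below; this is only an illustration of the visiting order, not the CMM implementation, and the names are hypothetical.

```python
def spiral_offsets(max_radius):
    """Cell offsets visited by a square spiral around a cell, ring by ring,
    as in a closest-point search that starts locally and expands outward."""
    yield (0, 0)
    for r in range(1, max_radius + 1):
        x, y = -r, -r
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):   # walk the ring
            for _ in range(2 * r):
                yield (x, y)
                x, y = x + dx, y + dy

# Example: search order for a processor anchored at cell (cx, cy).
cx, cy = 10, 7
order = [(cx + dx, cy + dy) for dx, dy in spiral_offsets(2)]
```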

13.
We present a multi-frame narrow-baseline stereo matching algorithm based on extracting and matching edges across multiple frames. Edge matching allows us to focus on the important features from the very beginning and to deal with occlusion boundaries as well as untextured regions. Given the initial sparse matches, we fit overlapping local planes to form a coarse, over-complete representation of the scene. After breaking the reference image of the sequence into superpixels, we perform a Markov random field optimization to assign each superpixel to one of the plane hypotheses. Finally, we refine our continuous depth map estimate using a piecewise-continuous variational optimization. Our approach successfully deals with depth discontinuities, occlusions, and large textureless regions, while also producing detailed and accurate depth maps. We show that our method outperforms competing methods on high-resolution multi-frame stereo benchmarks and is well suited for view interpolation applications.
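The local-plane-fitting step can be illustrated by fitting one disparity plane per superpixel to the sparse matches it contains, as in this sketch; the disparity-plane representation d = a*u + b*v + c and the minimum-support threshold are assumptions.

```python
import numpy as np

def fit_local_plane(us, vs, ds):
    """Least-squares disparity plane d = a*u + b*v + c fitted to sparse
    matches (pixel coordinates and disparities) inside one superpixel."""
    A = np.column_stack([us, vs, np.ones_like(us, dtype=float)])
    coeffs, *_ = np.linalg.lstsq(A, ds, rcond=None)
    return coeffs                      # (a, b, c)

def superpixel_plane_hypotheses(matches, labels, min_points=4):
    """One disparity-plane hypothesis per superpixel that contains enough
    sparse matches. matches: (N, 3) array of (u, v, disparity);
    labels: superpixel id per match."""
    hypotheses = {}
    for sp in np.unique(labels):
        sel = matches[labels == sp]
        if len(sel) >= min_points:     # skip superpixels with too few matches
            hypotheses[sp] = fit_local_plane(sel[:, 0], sel[:, 1], sel[:, 2])
    return hypotheses
```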

14.
Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency on the order of microseconds. However, because the output is composed of a sequence of asynchronous events rather than actual intensity images, traditional vision algorithms cannot be applied, so a paradigm shift is needed. We introduce the problem of event-based multi-view stereo (EMVS) for event cameras and propose a solution to it. Unlike traditional MVS methods, which address the problem of estimating dense 3D structure from a set of known viewpoints, EMVS estimates semi-dense 3D structure from an event camera with a known trajectory. Our EMVS solution elegantly exploits two inherent properties of an event camera: (1) its ability to respond to scene edges, which naturally provide semi-dense geometric information without any pre-processing, and (2) the fact that it provides continuous measurements as the sensor moves. Despite its simplicity (it can be implemented in a few lines of code), our algorithm is able to produce accurate, semi-dense depth maps without requiring any explicit data association or intensity estimation. We successfully validate our method on both synthetic and real data. Our method is computationally very efficient and runs in real time on a CPU.
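The core of EMVS can be pictured as a space-sweep ray-voting scheme; the sketch below is a heavily simplified, unoptimized illustration (one reference view, brute-force loops), not the authors' implementation, and the pose conventions are assumptions.

```python
import numpy as np

def vote_event_rays(events, poses, K, volume_shape, ref_depths):
    """Simplified EMVS-style space sweep: each event defines a viewing ray;
    the ray is intersected with a set of fronto-parallel depth planes of a
    reference view and a vote is cast into a (H, W, D) score volume. Local
    maxima of the volume indicate likely semi-dense 3D structure.
    events: (N, 2) pixel coords; poses: (R, t) world->camera per event;
    the reference camera is assumed to sit at the world origin."""
    H, W, D = volume_shape
    Kinv = np.linalg.inv(K)
    dsi = np.zeros(volume_shape, np.float32)
    for (u, v), (R, t) in zip(events, poses):
        center = -R.T @ t                                   # event-camera center in world
        direction = R.T @ (Kinv @ np.array([u, v, 1.0]))    # ray direction in world
        if abs(direction[2]) < 1e-9:
            continue
        for di, z in enumerate(ref_depths):
            s = (z - center[2]) / direction[2]              # intersect plane Z = z
            if s <= 0:
                continue
            X = center + s * direction
            p = K @ X
            x, y = int(round(p[0] / p[2])), int(round(p[1] / p[2]))
            if 0 <= x < W and 0 <= y < H:
                dsi[y, x, di] += 1.0                        # one vote per ray crossing
    return dsi
```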

15.
Many vision tasks rely upon the identification of sets of corresponding features among different images. This paper presents a method that, given some corresponding features in two stereo images, matches them with features extracted from a second stereo pair captured from a distant viewpoint. The proposed method is based on the assumption that the viewed scene contains two planar surfaces and exploits geometric constraints that are imposed by the existence of these planes to first transfer and then match image features between the two stereo pairs. The resulting scheme handles point and line features in a unified manner and is capable of successfully matching features extracted from stereo pairs that are acquired from considerably different viewpoints. Experimental results are presented, which demonstrate that the performance of the proposed method compares favorably to that of epipolar and tensor-based approaches.
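Once the homography induced by one of the scene planes between two views is known, transferring point and line features through it is straightforward, as in this small sketch (points map with H, lines with its inverse transpose); the function names are illustrative only.

```python
import numpy as np

def transfer_points(H, points):
    """Transfer 2D points from one image to another through the homography
    induced by a scene plane. points: (N, 2) array of pixel coordinates."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])   # homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def transfer_line(H, line):
    """Transfer a 2D line in homogeneous form l = (a, b, c) through the
    same homography: lines map with the inverse transpose of H."""
    return np.linalg.inv(H).T @ line
```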

16.
We present a novel method for recovering the 3D structure and scene flow from calibrated multi-view sequences. We propose a 3D point cloud parametrization of the 3D structure and scene flow that allows us to directly estimate the desired unknowns. A unified global energy functional is proposed to incorporate the information from the available sequences and simultaneously recover both depth and scene flow. The functional enforces multi-view geometric consistency and imposes brightness constancy and piecewise smoothness assumptions directly on the 3D unknowns. It inherently handles the challenges of discontinuities, occlusions, and large displacements. The main contribution of this work is the fusion of a 3D representation and an advanced variational framework that directly uses the available multi-view information. This formulation allows us to advantageously bind the 3D unknowns in time and space. Unlike optical flow and disparity, the proposed method results in a nonlinear mapping between the images' coordinates, thus giving rise to additional challenges in the optimization process. Our experiments on real and synthetic data demonstrate that the proposed method successfully recovers the 3D structure and scene flow despite the complicated nonconvex optimization problem.

17.
Correspondence establishment is a central problem of stereo vision. In earlier work, Aloimonos and Herve (IEEE Trans Pattern Anal Mach Intell 12(5):504–510, 1990) presented an algorithm that could reconstruct a single planar surface without establishing point-to-point correspondences. That work uses images taken under a specific stereo configuration. In this paper, we generalize the algorithm to one for a general stereo configuration of the cameras. We further provide an extension of the algorithm so that not only distant or planar scenes but also multi-surface polyhedral scenes can be reconstructed. Experimental results on a number of real image sets are presented to illustrate the performance of the algorithm.

18.
Rapid advances in image acquisition and storage technology underline the need for real-time algorithms capable of solving large-scale image processing and computer vision problems. The minimum s-t cut problem, a classical combinatorial optimization problem, is a prominent building block in many vision and imaging algorithms such as video segmentation, co-segmentation, stereo vision, multi-view reconstruction, and surface fitting, to name a few. That is why finding a real-time algorithm that optimally solves this problem is of great importance. In this paper, we introduce Hochbaum's pseudoflow (HPF) algorithm, which optimally solves the minimum s-t cut problem, to computer vision. We compare the performance of HPF, in terms of execution times and memory utilization, with three leading published algorithms: (1) Goldberg and Tarjan's push-relabel (PRF); (2) Boykov and Kolmogorov's augmenting paths (BK); and (3) Goldberg's partial augment-relabel. While the common practice in computer vision is to use either the BK or the PRF algorithm for solving the problem, our results demonstrate that, in general, the HPF algorithm is more efficient and utilizes less memory than these three algorithms. This strongly suggests that HPF is a great option for many real-time computer vision problems that require solving the minimum s-t cut problem.
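For readers unfamiliar with the construction, the toy example below shows how a binary labeling energy is encoded as a minimum s-t cut; it uses networkx purely for illustration and does not involve the HPF, BK, or PRF implementations benchmarked in the paper. The unary and smoothness costs are made-up numbers.

```python
import networkx as nx

# Three "pixels" with unary costs (cost of label 0, cost of label 1) and a
# Potts-style smoothness cost between neighbors, encoded as edge capacities.
G = nx.DiGraph()
unary = {"p0": (1.0, 4.0), "p1": (3.0, 2.0), "p2": (5.0, 1.0)}
for pix, (c0, c1) in unary.items():
    G.add_edge("s", pix, capacity=c1)   # paid when pix takes label 1
    G.add_edge(pix, "t", capacity=c0)   # paid when pix takes label 0
for a, b in [("p0", "p1"), ("p1", "p2")]:
    G.add_edge(a, b, capacity=2.0)      # smoothness penalty across the boundary
    G.add_edge(b, a, capacity=2.0)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
labels = {p: (0 if p in source_side else 1) for p in unary}
print(cut_value, labels)                # minimum energy and the optimal labeling
```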

19.
This paper presents a novel video stabilization approach that leverages the multi-plane structure of the video scene to stabilize inter-frame motion. As opposed to previous stabilization procedures that operate in a single plane, our approach primarily deals with multi-plane videos and builds their multi-plane structure so that stabilization can be performed in the respective planes. Hence, a robust plane detection scheme is devised to detect multiple planes by classifying feature trajectories according to the reprojection errors generated by plane-induced homographies. Then, an improved planar stabilization technique is applied, conforming to the compensated homography in each plane. Finally, the multiple stabilized planes are coherently fused by content-preserving image warps to obtain the output stabilized frames. Our approach does not need any stereo reconstruction, yet it is able to produce commendable results owing to its awareness of the multi-plane structure during stabilization. Experimental results demonstrate the effectiveness and efficiency of our approach for robust stabilization of multi-plane videos.

20.
Building facade detection is an important problem in computer vision, with applications in mobile robotics and semantic scene understanding. In particular, mobile platform localization and guidance in urban environments can be enabled with accurate models of the various building facades in a scene. Toward that end, we present a system for detection, segmentation, and parameter estimation of building facades in stereo imagery. The proposed method incorporates multilevel appearance and disparity features in a binary discriminative model, and generates a set of candidate planes by sampling and clustering points from the image with Random Sample Consensus (RANSAC), using local normal estimates derived from Principal Component Analysis (PCA) to inform the planar models. These two models are incorporated into a two-layer Markov Random Field (MRF): an appearance- and disparity-based discriminative classifier at the mid level, and a geometric model that segments the building pixels into facades at the high level. By using object-specific stereo features, our discriminative classifier is able to achieve substantially higher accuracy than standard boosting or modeling with only appearance-based features. Furthermore, the results of our MRF classification indicate a strong improvement in accuracy for the binary building detection problem, and the labeled planar surface models provide a good approximation to the ground-truth planes.
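The PCA-based local normal estimation used to inform the RANSAC plane models can be sketched as follows; the brute-force nearest-neighbor search and the neighborhood size are illustrative assumptions.

```python
import numpy as np

def pca_normals(points, k=10):
    """Estimate a surface normal for every 3D point from the PCA of its k
    nearest neighbors; the eigenvector belonging to the smallest eigenvalue
    of the local covariance is taken as the normal (sign is ambiguous)."""
    normals = np.zeros(points.shape, dtype=float)
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k]]            # brute-force kNN (sketch only)
        cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
        eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]                  # smallest-eigenvalue direction
    return normals
```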

