Similar Literature
Found 20 similar articles.
1.
Objective: In indoor point-cloud scenes, the density and complexity of objects together with heavy occlusion cause incomplete and noisy data, which greatly limits indoor scene reconstruction and makes its accuracy hard to guarantee. To better recover complete scenes from unordered point clouds, an indoor scene reconstruction method based on semantic segmentation is proposed. Method: The raw data are downsampled with a voxel filter, 3D scale-invariant feature transform (3D SIFT) feature points of the scene are computed, and the downsampled result is fused with these feature points to obtain an optimized downsampled scene. Planar features are extracted from the fused, sampled scene with the random sample consensus (RANSAC) algorithm and fed into a PointNet network for training, ensuring that coplanar points share the same local features and yielding each point's confidence for every category in the dataset. On this basis, a projection-based region-growing refinement is proposed that aggregates points belonging to the same object in the semantic segmentation result, producing a finer segmentation. The segmented objects are then classified as inner-environment or outer-environment elements and reconstructed by model matching and plane fitting, respectively. Results: In experiments on the S3DIS (Stanford large-scale 3D indoor space) dataset, the fused sampling algorithm improves the efficiency and quality of the subsequent steps to varying degrees: after sampling, plane extraction runs in only 15% of the pre-sampling time, and the semantic segmentation method outperforms the PointNet network by 2.3% in overall accuracy (OA) and 4.2% in mean intersection over union (mIoU). Conclusion: The proposed method preserves key points while improving computational efficiency, clearly improves segmentation accuracy, and produces high-quality reconstruction results.
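As a rough illustration of the voxel-downsampling and RANSAC plane-extraction stage described above, here is a minimal Python sketch using Open3D. The 3D SIFT keypoint fusion and the PointNet stage are omitted, and "scene.ply" is a placeholder path; treat this as an assumption-laden sketch, not the authors' implementation.

```python
# Sketch: voxel filter, then iterative RANSAC plane extraction (Open3D).
# "scene.ply" is a placeholder input path, not from the paper.
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.ply")      # raw indoor scan
down = pcd.voxel_down_sample(voxel_size=0.03)   # voxel filter (3 cm grid)

# Iteratively peel off the dominant planes with RANSAC.
planes, rest = [], down
for _ in range(4):
    model, inliers = rest.segment_plane(distance_threshold=0.02,
                                        ransac_n=3,
                                        num_iterations=1000)
    planes.append((model, rest.select_by_index(inliers)))
    rest = rest.select_by_index(inliers, invert=True)
    print("plane:", model)  # [a, b, c, d] of ax + by + cz + d = 0
```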

2.
With the development of computer vision technologies, 3D reconstruction has become a research hotspot. At present, 3D reconstruction relies heavily on expensive equipment and has poor real-time performance. In this paper, we address 3D reconstruction of indoor scenes with a large vertical span and propose a novel approach that requires only a Kinect. Firstly, the Kinect sensor captures color and depth images of an indoor scene. Secondly, a combination of the scale-invariant feature transform and the random sample consensus algorithm determines the transformation matrix of adjacent frames, which serves as the initial value for iterative closest point (ICP). Thirdly, ICP establishes the relative coordinate relation between pairs of frames, which form the initial point cloud data. Finally, a top-down image registration of the point cloud data yields the 3D visual reconstruction model of the indoor scene. This approach not only mitigates the sensor's perspective restriction and reconstructs indoor scenes of large vertical span, but also provides a fast reconstruction algorithm for large amounts of point cloud data. The experimental results show that the proposed algorithm achieves better accuracy, better reconstruction quality, and less running time for point cloud registration. In addition, the proposed method has great potential for application to 3D simultaneous localization and mapping.
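A minimal sketch of the pairwise registration step in this pipeline, using Open3D: a coarse initial transform (in the paper obtained from SIFT plus RANSAC, here replaced by an identity guess) is refined with point-to-point ICP. File names are placeholders.

```python
# Sketch: refine a coarse frame-to-frame transform with ICP (Open3D).
import numpy as np
import open3d as o3d

src = o3d.io.read_point_cloud("frame_000.ply")  # placeholder paths
dst = o3d.io.read_point_cloud("frame_001.ply")

init = np.eye(4)  # in the paper this comes from SIFT + RANSAC matching
result = o3d.pipelines.registration.registration_icp(
    src, dst, max_correspondence_distance=0.05, init=init,
    estimation_method=o3d.pipelines.registration
        .TransformationEstimationPointToPoint())
print(result.transformation)  # refined 4x4 pose of src w.r.t. dst
```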

3.
Since indoor scenes change frequently in daily life, for example when furniture is re-arranged, their 3D reconstructions should be flexible and easy to update. We present an automatic 3D scene update algorithm for indoor scenes that captures scene variation with RGBD cameras. We assume an initial scene has been reconstructed in advance, manually or in some other semi-automatic way, before the change, and we automatically update the reconstruction according to newly captured RGBD images of the real scene. The method starts with an automatic segmentation process without manual interaction, which benefits from accurate training labels derived from the initial 3D scene. After segmentation, the objects captured by the RGBD camera are extracted to form a local updated scene. We formulate an optimization problem that compares this local scene to the initial one to locate moved objects. The moved objects are then integrated with the static objects in the initial scene to generate a new 3D scene. We demonstrate the efficiency and robustness of our approach on several real-world scenes.
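As a loose stand-in for the moved-object localization described above (not the paper's segmentation-plus-optimization formulation), one can flag changed regions by nearest-neighbor distance between the new capture and the initial reconstruction; a hedged Open3D sketch with placeholder file names:

```python
# Sketch: flag candidate moved/new objects as points in the new scan
# that lie far from the initial reconstruction (Open3D).
import numpy as np
import open3d as o3d

initial = o3d.io.read_point_cloud("initial_scene.ply")  # placeholder
updated = o3d.io.read_point_cloud("new_capture.ply")    # placeholder

dists = np.asarray(updated.compute_point_cloud_distance(initial))
moved = updated.select_by_index(np.where(dists > 0.05)[0])  # 5 cm gate
print("candidate moved-object points:", len(moved.points))
```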

4.
Jointly learning RGB image features and 3D geometric information from the RGB-D domain benefits indoor scene semantic segmentation, but traditional segmentation methods usually require an accurate depth map as input, which severely limits their applicability. A new indoor scene understanding framework is proposed. A joint learning network model built on semantic-feature and depth-feature extraction networks extracts depth-aware features, and a geometry-guided depth-feature propagation module together with a pyramid feature-fusion module combines the learned depth features, multi-scale spatial information, and semantic features into a more expressive representation, achieving more accurate indoor semantic segmentation. Experimental results show that the joint learning network model achieves mean segmentation accuracies of 69.5% and 68.4% on the NYU-Dv2 and SUN RGBD datasets, respectively, offering better indoor semantic segmentation performance and wider applicability than traditional methods.
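A minimal PyTorch sketch of the general idea of fusing an RGB feature map with a depth feature map before a segmentation head; the layer sizes, class count, and fusion operator are illustrative assumptions, not the paper's architecture.

```python
# Sketch: channel-wise fusion of RGB and depth features for per-pixel
# classification. All dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class RGBDFusion(nn.Module):
    def __init__(self, c_rgb=256, c_depth=256, n_classes=40):
        super().__init__()
        self.fuse = nn.Conv2d(c_rgb + c_depth, 256, kernel_size=1)
        self.head = nn.Conv2d(256, n_classes, kernel_size=1)

    def forward(self, f_rgb, f_depth):
        x = torch.cat([f_rgb, f_depth], dim=1)  # concatenate channels
        return self.head(torch.relu(self.fuse(x)))

logits = RGBDFusion()(torch.randn(1, 256, 60, 80),
                      torch.randn(1, 256, 60, 80))
print(logits.shape)  # (1, 40, 60, 80): per-pixel class scores
```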

5.
Current state-of-the-art image-based scene reconstruction techniques are capable of generating high-fidelity 3D models when used under controlled capture conditions. However, they are often inadequate when used in more challenging environments such as sports scenes with moving cameras. Algorithms must be able to cope with relatively large calibration and segmentation errors as well as input images separated by a wide-baseline and possibly captured at different resolutions. In this paper, we propose a technique which, under these challenging conditions, is able to efficiently compute a high-quality scene representation via graph-cut optimisation of an energy function combining multiple image cues with strong priors. Robustness is achieved by jointly optimising scene segmentation and multiple view reconstruction in a view-dependent manner with respect to each input camera. Joint optimisation prevents propagation of errors from segmentation to reconstruction as is often the case with sequential approaches. View-dependent processing increases tolerance to errors in through-the-lens calibration compared to global approaches. We evaluate our technique in the case of challenging outdoor sports scenes captured with manually operated broadcast cameras as well as several indoor scenes with natural background. A comprehensive experimental evaluation including qualitative and quantitative results demonstrates the accuracy of the technique for high quality segmentation and reconstruction and its suitability for free-viewpoint video under these difficult conditions.
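A hedged sketch of a binary graph-cut segmentation step in the spirit of the energy minimisation above, using the PyMaxflow library (an assumption; the paper's energy combines more cues): unary terms from per-pixel foreground probabilities plus a 4-connected smoothness prior, solved by max-flow/min-cut.

```python
# Sketch: graph-cut segmentation with unary + smoothness terms (PyMaxflow).
import numpy as np
import maxflow

prob_fg = np.random.rand(120, 160)   # placeholder foreground probabilities

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes(prob_fg.shape)
g.add_grid_edges(nodes, 2.0)         # pairwise smoothness weight
eps = 1e-6
g.add_grid_tedges(nodes,
                  -np.log(1 - prob_fg + eps),   # terminal capacities from
                  -np.log(prob_fg + eps))       # the unary likelihoods
g.maxflow()
segmentation = g.get_grid_segments(nodes)  # boolean mask: two cut sides
```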

6.
To speed up the reconstruction of 3D dynamic scenes in an ordinary hardware platform, we propose an efficient framework to reconstruct 3D dynamic objects using a multiscale-contour-based interpolation from multi-view videos. Our framework takes full advantage of spatio-temporal-contour consistency. It exploits the property to interpolate single contours, two neighboring contours which belong to the same model, and two contours which belong to the same view at different times, corresponding to point-, contour-, and model-level interpolations, respectively. The framework formulates the interpolation of two models as point cloud transport rather than non-rigid surface deformation. Our framework speeds up the reconstruction of a dynamic scene while improving the accuracy of point-pairing which is used to perform the interpolation. We obtain a higher frame rate, spatio-temporal-coherence, and a quasi-dense point cloud sequence with color information. Experiments with real data were conducted to test the efficiency of the framework.
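A toy numpy sketch of contour-level interpolation as point transport: given two matched contours (the real point pairing comes from the framework's spatio-temporal matching; here it is assumed given, with equal point counts and consistent ordering), intermediate contours are produced by linearly blending paired points.

```python
# Sketch: interpolate between two matched contours as point transport.
import numpy as np

def interpolate_contours(c0, c1, t):
    """c0, c1: (N, 3) matched contour points; t in [0, 1]."""
    return (1.0 - t) * c0 + t * c1

c0 = np.random.rand(100, 3)   # contour at time k (placeholder data)
c1 = c0 + 0.01                # matched contour at time k+1 (placeholder)
mid = interpolate_contours(c0, c1, 0.5)  # contour halfway in time
```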

7.
We propose a novel approach to robot‐operated active understanding of unknown indoor scenes, based on online RGBD reconstruction with semantic segmentation. In our method, the exploratory robot scanning is both driven by and targeted at the recognition and segmentation of semantic objects in the scene. Our algorithm is built on top of a volumetric depth fusion framework and performs real‐time voxel‐based semantic labeling over the online reconstructed volume. The robot is guided by an online estimated discrete viewing score field (VSF) parameterized over the 3D space of 2D location and azimuth rotation. VSF stores for each grid cell the score of the corresponding view, which measures how much it reduces the uncertainty (entropy) of both geometric reconstruction and semantic labeling. Based on the VSF, we select the next best view (NBV) as the target for each time step. We then jointly optimize the traverse path and camera trajectory between two adjacent NBVs by maximizing the integral viewing score (information gain) along the path and trajectory. Through extensive evaluation, we show that our method achieves efficient and accurate online scene parsing during exploratory scanning.
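A hedged sketch of the NBV selection step from a discrete viewing score field parameterised over (x, y, azimuth): simply take the grid cell with the highest score. Grid sizes and the random field are placeholders; the path optimisation between NBVs is not shown.

```python
# Sketch: pick the next best view as the argmax of the viewing score field.
import numpy as np

vsf = np.random.rand(64, 64, 16)   # placeholder VSF: x, y, azimuth bins
ix, iy, ia = np.unravel_index(np.argmax(vsf), vsf.shape)
azimuth = ia * (2 * np.pi / vsf.shape[2])
print(f"NBV: cell=({ix},{iy}), azimuth={azimuth:.2f} rad")
```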

8.
A 3D model reconstruction workflow with hand-held cameras is developed. The exterior and interior orientation models, combined with state-of-the-art structure from motion and multi-view stereo techniques, are applied to extract a dense point cloud and reconstruct a 3D model from digital images. An overview of the presented 3D model reconstruction methods is given. The whole procedure, including tie point extraction, relative orientation, bundle block adjustment, dense point production and 3D model reconstruction, is reviewed in brief. We focus in particular on the bundle block adjustment procedure; its mathematical and technical details are introduced and discussed. Finally, four image sets collected with hand-held cameras are tested in this paper. The preliminary results have shown that sub-pixel (<1 pixel) accuracy can be achieved with the proposed exterior–interior orientation models and that satisfactory 3D models can be reconstructed from images collected by hand-held cameras. This work can be applied in indoor navigation, crime scene reconstruction, heritage preservation and other applications in geosciences.
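A hedged sketch of the reprojection residual at the heart of bundle adjustment, minimised with scipy's least_squares. The camera model and data are illustrative assumptions (a single pinhole camera without distortion, synthetic observations), far simpler than a full bundle block adjustment.

```python
# Sketch: minimise reprojection error over one pose + sparse structure.
import numpy as np
import cv2
from scipy.optimize import least_squares

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])

def residuals(params, pts2d, n_pts):
    """params = [rvec(3), tvec(3), X_1..X_n]; pts2d: (n, 2) observations."""
    rvec, tvec = params[:3], params[3:6]
    pts3d = params[6:].reshape(n_pts, 3)
    proj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)
    return (proj.reshape(-1, 2) - pts2d).ravel()

pts3d0 = np.random.rand(20, 3) + np.array([0., 0., 5.])   # synthetic scene
obs, _ = cv2.projectPoints(pts3d0, np.zeros(3), np.zeros(3), K, None)
x0 = np.hstack([np.zeros(6), (pts3d0 + 0.01).ravel()])    # perturbed start
sol = least_squares(residuals, x0, args=(obs.reshape(-1, 2), 20))
print("final reprojection cost:", sol.cost)
```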

9.
3D reconstruction is widely used in autonomous driving, robotics, unmanned aerial vehicles, and augmented reality. Disparity estimation is a key step in 3D reconstruction, and with growing datasets and advances in hardware and network models, deep learning disparity estimation models have become widely used with good results. However, these methods are usually applied to objects in outdoor scenes and are rarely used on indoor datasets. This paper reviews deep learning methods for binocular disparity estimation and selects five networks: PSMNet (pyramid stereo matching network), GA-Net (guided aggregation network), LEAStereo (hierarchical neural architecture search for deep stereo matching), DeepPruner (learning efficient stereo matching via differentiable PatchMatch), and BGNet (bilateral grid learning for stereo matching networks). These are applied to a real-world street-scene dataset (KITTI2015) and two indoor datasets (Middlebury2014, Instereo2K). We analyse how each model is built, evaluate the performance of deep learning for disparity estimation on indoor imagery, and compare against the traditional SGM method. Finally, we discuss the open problems and challenges facing deep learning disparity estimation.
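A minimal sketch of the traditional SGM-style baseline the paper compares against, using OpenCV's semi-global block matcher on a rectified grayscale stereo pair; image paths and parameters are placeholders.

```python
# Sketch: semi-global matching baseline with OpenCV's StereoSGBM.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder paths
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)  # rectified pair

sgbm = cv2.StereoSGBM_create(minDisparity=0,
                             numDisparities=128,  # multiple of 16
                             blockSize=5,
                             P1=8 * 5 * 5,        # smoothness penalties
                             P2=32 * 5 * 5)
# compute() returns fixed-point disparities scaled by 16.
disparity = sgbm.compute(left, right).astype("float32") / 16.0
```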

10.
To address the poor accuracy and missing understanding of global structure in fully automatic 3D reconstruction with a traditional monocular camera, a fully automatic indoor 3D layout reconstruction system combining visual-inertial odometry and structure from motion is proposed. First, visual odometry provides a sequence of keyframe images with corresponding spatial poses, and a structure-from-motion algorithm computes accurate camera poses. Then a multi-view stereo algorithm generates a high-quality dense point cloud. Finally, under the Manhattan-world assumption and targeting typical modern indoor scenes, a rule-based bottom-up layout reconstruction method is designed to obtain the final room-outline layout. Experiments were run on scans of the Zhejiang University CAD&CG Laboratory and on synthetic dense point cloud datasets, under Ubuntu 16.04 with PCL 1.9. The results show that the method tolerates 3D point cloud noise well and effectively reconstructs the 3D outline layout of indoor scenes.

11.
Geometric hashing (GH) and partial pose clustering are well-known algorithms for pattern recognition. However, the performance of both these algorithms degrades rapidly with an increase in scene clutter and the measurement uncertainty in the detected features. The primary contribution of this paper is the formulation of a framework that unifies the GH and the partial pose clustering paradigms for pattern recognition in cluttered scenes. The proposed scheme has a better discrimination capability than the GH algorithm, thus improving recognition accuracy. The scheme is incorporated in a Bayesian MLE framework to make it robust to the presence of sensor noise. It is able to handle partial occlusions, is robust to measurement uncertainty in the data features and to the presence of spurious scene features (scene clutter). An efficient hash table representation of 3D features extracted from range images is also proposed. Simulations with real and synthetic 2D/3D objects show that the scheme performs better than the GH algorithm in scenes with a large amount of clutter.
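A toy sketch of the geometric hashing idea itself: model points are encoded in a hash table keyed by coordinates expressed in bases formed from point pairs, so a cluttered scene can later vote for matching (model, basis) entries. This is a bare 2D illustration under assumed quantisation, not the paper's unified Bayesian MLE formulation.

```python
# Sketch: build a geometric-hashing table for 2D model point sets.
import numpy as np
from collections import defaultdict
from itertools import permutations

def basis_coords(points, i, j):
    """Express all points in the similarity frame defined by points i, j."""
    origin, axis = points[i], points[j] - points[i]
    rot = np.array([[axis[0], axis[1]], [-axis[1], axis[0]]])
    return (points - origin) @ rot.T / (axis @ axis)

def build_table(models, q=0.05):
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for c in basis_coords(pts, i, j):
                table[tuple(np.round(c / q))].append((name, i, j))
    return table

table = build_table({"model_A": np.random.rand(5, 2)})
# At recognition time, scene bases index into the same table and the
# (model, basis) pair with the most votes is the hypothesised match.
```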

12.
When constructing a dense 3D model of an indoor static scene from a sequence of RGB-D images, the choice of the 3D representation (e.g. 3D mesh, cloud of points or implicit function) is of crucial importance. In the last few years, the volumetric truncated signed distance function (TSDF) and its extensions have become popular in the community and widely used for the task of dense 3D modelling using RGB-D sensors. However, as this representation is voxel based, it offers few possibilities for manipulating and/or editing the constructed 3D model, which limits its applicability. In particular, the amount of data required to maintain the volumetric TSDF rapidly becomes huge, which limits portability. Moreover, simplifications (such as mesh extraction and surface simplification) significantly reduce the accuracy of the 3D model (especially in the color space), and editing the 3D model is difficult. We propose a novel compact, flexible and accurate 3D surface representation based on parametric surface patches augmented by geometric and color texture images. Simple parametric shapes such as planes are roughly fitted to the input depth images, and the deviations of the 3D measurements to the fitted parametric surfaces are fused into a geometric texture image (called the Bump image). A confidence and color texture image are also built. Our 3D scene representation is accurate yet memory efficient. Moreover, updating or editing the 3D model becomes trivial since it is reduced to manipulating 2D images. Our experimental results demonstrate the advantages of our proposed 3D representation through a concrete indoor scene reconstruction application.
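For reference, a minimal numpy sketch of the volumetric TSDF update this passage contrasts with: each voxel stores a truncated signed distance and a weight, fused across frames by a running weighted average. The geometry here is a toy 1D example along a single ray; real systems project every voxel into each depth image.

```python
# Sketch: weighted-average TSDF fusion along one ray (toy 1D example).
import numpy as np

trunc = 0.05                                 # truncation distance (m)
voxel_depth = np.linspace(0.0, 2.0, 200)     # voxel centres along the ray
tsdf = np.zeros_like(voxel_depth)
weight = np.zeros_like(voxel_depth)

def integrate(surface_depth):
    global tsdf, weight
    sdf = surface_depth - voxel_depth
    new_w = (sdf > -trunc).astype(float)     # skip voxels far behind surface
    sdf = np.clip(sdf, -trunc, trunc) / trunc
    tsdf = (tsdf * weight + sdf * new_w) / np.maximum(weight + new_w, 1e-9)
    weight += new_w

for d in [1.00, 1.01, 0.99]:                 # three noisy depth readings
    integrate(d)
# The zero crossing of `tsdf` estimates the fused surface position.
```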

13.
王伟, 任国恒, 陈立勇, 张效尉. 《自动化学报》, 2019, 45(11): 2187-2198
In image-based 3D reconstruction of urban scenes, piecewise-planar reconstruction algorithms can overcome weak texture, illumination changes, and other difficulties and quickly recover a complete approximate structure of the scene. However, when the initial space points are sparse, the candidate plane set is incomplete, or the image over-segmentation quality is low, their reliability is often poor. To solve this problem, this paper constructs a novel plane-reliability measure that fuses scene-structure priors, space-point visibility, and color similarity according to the structural characteristics of urban scenes, and then infers the scene structure by jointly optimizing image regions and their corresponding planes. Experimental results show that the algorithm can effectively reconstruct a complete scene structure from sparse space points, with high overall accuracy and efficiency.

14.
Reconstructing the World’s Museums
Virtual exploration tools for large indoor environments (e.g. museums) have so far been limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. On the other hand, photorealistic aerial maps would be a useful navigational guide for large indoor environments, but it is impossible to directly acquire photographs covering a large indoor environment from aerial viewpoints. This paper presents a 3D reconstruction and visualization system for automatically producing clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. The key component is a new algorithm called “inverse constructive solid geometry (CSG)” for reconstructing a scene with a CSG representation consisting of volumetric primitives, which imposes powerful regularization constraints. We also propose several novel techniques to adjust the 3D model to make it suitable for rendering the 3D maps from aerial viewpoints. The visualization system enables users to easily browse a large-scale indoor environment from a bird’s-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panorama views, and zoom out again, all with seamless 3D transitions. We demonstrate our system on various museums, including the Metropolitan Museum of Art in New York City—one of the largest art galleries in the world.

15.
Objective: Visual localization aims to use easily acquired RGB images to localize moving objects and estimate their pose. Occlusion and weakly textured regions, which are common in indoor scenes, easily cause erroneous estimates of target keypoints and severely degrade localization accuracy. To address this, we propose a passive-active fusion indoor localization system that combines the advantages of fixed-view and moving-view schemes to accurately localize moving targets in indoor scenes. Method: A plane-prior-based object pose estimation method is proposed: on top of a keypoint-detection monocular localization framework, planar constraints are used for 3-DoF pose optimization, improving the stability of localizing targets moving on indoor planes under a fixed view. A data-fusion localization system based on the unscented Kalman filter fuses the passive localization results from the fixed view with the active localization results from the moving view, improving the reliability of the target's pose estimates. Results: The proposed passive-active fusion indoor visual localization system achieves an average localization accuracy of 2-3 cm on the iGibson simulation dataset, with 99% of errors within 10 cm; in real scenes the average accuracy is 3-4 cm with over 90% of errors within 10 cm, i.e., centimetre-level accuracy. Conclusion: The proposed indoor visual localization system combines the advantages of passive and active localization methods and achieves high-accuracy target localization in indoor scenes at low equipment cost, and under occlusion, target...
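A hedged sketch of fusing two position measurements (a passive fixed-view one and an active moving-view one) with an unscented Kalman filter, using the filterpy library; the constant-velocity state model, noise values, and measurements are illustrative assumptions, not the paper's system.

```python
# Sketch: sequential UKF fusion of two (x, y) position measurements.
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

dt = 0.1

def fx(x, dt):                        # constant-velocity motion model
    return np.array([x[0] + dt * x[2], x[1] + dt * x[3], x[2], x[3]])

def hx(x):                            # both sensors observe (x, y)
    return x[:2]

pts = MerweScaledSigmaPoints(n=4, alpha=0.1, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=4, dim_z=2, dt=dt, fx=fx, hx=hx,
                            points=pts)
ukf.R = np.eye(2) * 0.03 ** 2         # assumed per-sensor noise

for z_passive, z_active in [(np.array([1.00, 2.00]),
                             np.array([1.02, 1.99]))]:
    ukf.predict()
    ukf.update(z_passive)             # fixed-view (passive) measurement
    ukf.update(z_active)              # moving-view (active) measurement
print(ukf.x[:2])                      # fused position estimate
```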

16.
In this paper, a novel approach for creating 3D models of building scenes is presented. The proposed method is fully automated and fast, and accurately reconstructs both outdoor building scenes and indoor scenes with perspective cues in real time, using only one image. It combines the extracted line segments to identify the vanishing points of the image, the orientation, and the different planes depicted in the image, and decides whether the image depicts an indoor or outdoor scene. In addition, the proposed method efficiently eliminates perspective distortion and produces an accurate 3D model of the scene without any user intervention. The main innovation of the method is that it uses only one image for the 3D reconstruction, while other state-of-the-art methods rely on processing multiple images. A website and a database of 100 images were created to demonstrate the efficiency of the proposed method in terms of the time needed for 3D reconstruction, its automation, and 3D model accuracy; it can be used by anyone to easily produce user-generated 3D content: http://3d-test.iti.gr:8080/3d-test/3D_recon/
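A toy numpy sketch of the vanishing-point step: each segment defines a homogeneous image line (the cross product of its endpoints), and the point minimising the algebraic distance to all lines in a bundle is the smallest-singular-vector solution. The segments below are synthetic; real use would first cluster segments by direction.

```python
# Sketch: least-squares vanishing point of a bundle of line segments.
import numpy as np

def vanishing_point(segments):
    """segments: (N, 4) rows of x1, y1, x2, y2 in pixel coordinates."""
    lines = [np.cross([x1, y1, 1.0], [x2, y2, 1.0])   # line = p1 x p2
             for x1, y1, x2, y2 in segments]
    _, _, vt = np.linalg.svd(np.array(lines))
    vp = vt[-1]                       # null-ish vector of the line stack
    return vp[:2] / vp[2]             # back to inhomogeneous coordinates

segs = np.array([[0, 0, 100, 10],
                 [0, 50, 100, 55],
                 [0, 100, 100, 100.01]])
print(vanishing_point(segs))          # near-parallel bundle: distant VP
```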

17.
An automatic indoor 3D reconstruction system based on a 2D laser rangefinder
An automatic indoor 3D reconstruction system based on a 2D laser rangefinder is designed. The hardware consists of a self-designed 3D scanning rig built around a 2D LiDAR and a PC. The software modules of the system are described, and a 3D planar scene merging method combining iterative closest point (ICP) and general polygon clipping (GPC) is proposed. ICP recovers the pose change between different capture positions, so that the 3D scenes acquired at different positions can be transformed into a common coordinate frame; the fragmented-plane problem that arises during scene merging is solved with GPC. Experimental results show that the system is low-cost and accurate, and reliably performs automatic 3D reconstruction of indoor scenes.
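A hedged sketch of merging overlapping coplanar patches, using shapely's polygon union as a stand-in for the GPC library named above; the coordinates are toy values in a plane's local 2D frame.

```python
# Sketch: merge two overlapping coplanar patches into one clean polygon.
from shapely.geometry import Polygon
from shapely.ops import unary_union

patch_a = Polygon([(0, 0), (2, 0), (2, 1), (0, 1)])
patch_b = Polygon([(1, 0), (3, 0), (3, 1), (1, 1)])  # overlaps patch_a

merged = unary_union([patch_a, patch_b])  # fragments fused into one plane
print(merged.area)                        # 3.0
```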

18.
In this paper, we present methods for 3D volumetric reconstruction of visual scenes photographed by multiple calibrated cameras placed at arbitrary viewpoints. Our goal is to generate a 3D model that can be rendered to synthesize new photo-realistic views of the scene. We improve upon existing voxel coloring/space carving approaches by introducing new ways to compute visibility and photo-consistency, as well as model infinitely large scenes. In particular, we describe a visibility approach that uses all possible color information from the photographs during reconstruction, photo-consistency measures that are more robust and/or require less manual intervention, and a volumetric warping method for application of these reconstruction methods to large-scale scenes.
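A toy numpy sketch of the basic photo-consistency test in voxel coloring / space carving: a voxel survives if the colors of the pixels it projects to, gathered from the cameras that see it, have low variance. The threshold and samples are illustrative, simpler than the paper's measures.

```python
# Sketch: keep a voxel only if its projected colors agree across views.
import numpy as np

def photo_consistent(samples, thresh=60.0):
    """samples: (N, 3) RGB samples of one voxel across visible views."""
    return np.mean(np.var(samples, axis=0)) < thresh

agree = np.array([[200, 90, 40], [205, 88, 43], [198, 92, 39]], float)
clash = np.array([[200, 90, 40], [20, 180, 220], [90, 90, 90]], float)
print(photo_consistent(agree), photo_consistent(clash))  # True False
```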

19.
3D video billboard clouds reconstruct and represent a dynamic three-dimensional scene using displacement-mapped billboards. They consist of geometric proxy planes augmented with detailed displacement maps and combine the generality of geometry-based 3D video with the regularization properties of image-based 3D video. 3D video billboards are an image-based representation placed in the disparity space of the acquisition cameras and thus provide a regular sampling of the scene with a uniform error model. We propose a general geometry filtering framework which generates time-coherent models and removes reconstruction and quantization noise as well as calibration errors. This replaces the complex and time-consuming sub-pixel matching process in stereo reconstruction with a bilateral filter. Rendering is performed using a GPU-accelerated algorithm which generates consistent view-dependent geometry and textures for each individual frame. In addition, we present a semi-automatic approach for modeling dynamic three-dimensional scenes with a set of multiple 3D video billboard clouds.
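A minimal sketch of the edge-preserving smoothing idea that the passage says replaces sub-pixel matching, applied to a disparity map with OpenCV's bilateral filter; the input and parameters are placeholders, not the paper's geometry filtering framework.

```python
# Sketch: edge-preserving bilateral smoothing of a disparity map.
import numpy as np
import cv2

disparity = (np.random.rand(240, 320).astype(np.float32) * 64.0)
smoothed = cv2.bilateralFilter(disparity, d=9,
                               sigmaColor=4.0,   # range in disparity values
                               sigmaSpace=5.0)   # spatial neighbourhood
```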

20.
It is still challenging to design a robust and efficient tracking algorithm for complex scenes. We propose a new object tracking algorithm with adaptive appearance learning and occlusion detection in an efficient self-tuning particle filter framework. The appearance of an object is modeled with a set of weighted and ordered submanifolds, which guarantees adaptability under fast illumination or pose changes. To overcome the occlusion problem, we use the reconstruction error of the appearance model to extract the occlusion region via graph cuts, and the tracking result is improved with feedback from occlusion detection. The motion model is also made adaptive to handle abrupt motion. To improve the efficiency of the particle filter, the number of samples is tuned with respect to scene conditions. Experimental results demonstrate that our algorithm achieves great robustness, high accuracy and good efficiency in challenging scenes.
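For reference, a minimal numpy particle filter for a 1D state, showing the predict / weight / resample loop underlying the framework described above; all dynamics and noise parameters are illustrative assumptions, and the appearance model, occlusion handling, and self-tuning are omitted.

```python
# Sketch: bare predict / weight / resample particle filter loop (1D).
import numpy as np

rng = np.random.default_rng(0)
n, x_true = 500, 0.0
particles = rng.normal(0.0, 1.0, n)
weights = np.full(n, 1.0 / n)

for _ in range(20):
    x_true += 0.5                                 # object motion (truth)
    z = x_true + rng.normal(0.0, 0.2)             # noisy observation
    particles += 0.5 + rng.normal(0.0, 0.1, n)    # predict step
    weights *= np.exp(-0.5 * ((z - particles) / 0.2) ** 2)  # likelihood
    weights /= weights.sum()
    idx = rng.choice(n, n, p=weights)             # resample step
    particles, weights = particles[idx], np.full(n, 1.0 / n)

print("estimate:", particles.mean(), "truth:", x_true)
```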
