Similar Documents
 20 similar documents found (search time: 78 ms)
1.

Depth image based rendering (DIBR) is a popular technique for rendering virtual 3D views in stereoscopic and autostereoscopic displays. The quality of DIBR-synthesized images may decrease due to various factors, e.g., imprecise depth maps, poor rendering techniques, or inaccurate camera parameters. The quality of synthesized images is important as it directly affects the overall user experience, so algorithms are needed to estimate the quality of DIBR-synthesized images. The existing 2D image quality assessment metrics are insufficient for 3D view quality estimation because 3D views not only contain color information but also use disparity to achieve a real depth sensation. In this paper, we present a new algorithm for evaluating the quality of DIBR-generated images in the absence of the original references. The human visual system is sensitive to structural information; any degradation in structure or edges affects the visual quality of the image and is easily noticeable to humans. In the proposed metric, we estimate the quality of the synthesized view by capturing the structural and textural distortion in the warped view. The structural and textural information from the input and synthesized images is estimated and used to calculate the image quality. The performance of the proposed quality metric is evaluated on the IRCCyN IVC DIBR images dataset. Experimental evaluations show that the proposed metric outperforms existing 2D and 3D image quality metrics, achieving a high correlation with subjective ratings.
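The structural comparison this abstract describes can be illustrated with a minimal gradient-based similarity score. This is a generic sketch in the spirit of such metrics, not the authors' exact algorithm; the stabilizing constant `c` and the plain finite-difference gradients are assumptions.

```python
import numpy as np

def gradient_magnitude(img):
    """Finite-difference gradient magnitude of a grayscale image."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.sqrt(gx ** 2 + gy ** 2)

def structural_similarity_score(reference, synthesized, c=1e-3):
    """Toy structure-based quality score in [0, 1]: compares the
    gradient (edge) maps of the input view and the synthesized view.
    Identical structure gives 1.0; structural distortion lowers it."""
    g_ref = gradient_magnitude(reference)
    g_syn = gradient_magnitude(synthesized)
    sim = (2 * g_ref * g_syn + c) / (g_ref ** 2 + g_syn ** 2 + c)
    return float(sim.mean())
```

A score of 1.0 indicates structurally identical images; edge degradation in the synthesized view pulls the score down.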


2.
Objective: Depth images are a widely used representation of 3D scene information in stereo vision. The Kinect depth camera captures scene depth in real time, but owing to internal hardware limitations and external interference, the captured depth images suffer from low resolution and inaccurate edges, falling short of practical requirements. We therefore propose a super-resolution reconstruction algorithm for Kinect depth images guided by color-image edges. Method: The depth image is first upsampled to an initial estimate and the edges of this initialized depth image are extracted. Exploiting the similarity between the high-resolution color image and the depth image, a structured-learning-based edge detector then extracts the correct depth edges. Finally, the unreliable regions between the erroneous edges of the initialized depth map and the correct depth edges are located, and an edge-alignment strategy interpolates and fills these regions. Results: Experiments on the NYU2 dataset compare the proposed algorithm with eight recent depth super-resolution methods, validated on both the reconstructed depth images and 3D point clouds. The results show that while increasing depth resolution, the algorithm effectively corrects the edges of the upsampled depth image, aligns depth edges with texture edges, and suppresses the edge blur introduced by upsampling. The 3D point clouds show that it separates foreground from background in the scene more accurately than competing methods, yielding better results in applications such as 3D reconstruction. Conclusion: The algorithm applies generally to Kinect depth super-resolution; by exploiting the similarity between co-registered color and depth images and using texture edges to guide reconstruction, it achieves good results.

3.
Objective: Much prior work on salient object detection focuses on 2D images and does not transfer to RGB-D images. We extract both color and depth features and propose an RGB-D saliency detection method based on feature fusion and S-D probability correction, so that color and depth features complement each other. Method: First, using the four borders of the RGB image as background query nodes, feature-fused Manifold Ranking produces the saliency map of the RGB image. Second, the S-D correction probability is computed from this saliency map and the depth features. Third, the saliency map of the depth image is computed and corrected with the S-D probability. Finally, foreground query nodes are extracted from the corrected map, and feature-fused Manifold Ranking is applied once more to refine it, yielding the final saliency map. Results: On 1,000 images of the RGBD dataset, our saliency results are closer to the manually annotated ground truth than those of six competing methods. Precision-recall curves show higher precision than five of them at equal recall, and processing a single image takes 2.150 s, which also compares favorably. Conclusion: The proposed method detects saliency in RGB-D images with good accuracy.

4.
Depth image-based rendering (DIBR), which renders virtual views from a color image and the corresponding depth map, is one of the key techniques in the 2D-to-3D video conversion process. In this paper, a novel method is proposed to partially solve two puzzles of DIBR, i.e., virtual image generation and hole filling. The method combines two different approaches for synthesizing new views from an existing view and a corresponding depth map. Disoccluded parts of the synthesized image are first classified as either smooth or highly structured. At structured regions, inpainting is used to preserve the background structure. In other regions, an improved directional depth smoothing is used to avoid disocclusion. Thus, more details and straight-line structures in the generated virtual image are preserved. The key contributions include an enhanced adaptive directional filter and a directional hole inpainting algorithm. Experiments show that disocclusion is removed and geometric distortion is reduced efficiently. The proposed method generates more visually satisfactory results.
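The disocclusions that both approaches target arise in the forward-warping step of DIBR itself. A deliberately simplified sketch, assuming grayscale row images and a purely horizontal disparity `baseline * focal / depth` (both parameters hypothetical), shows where the holes come from:

```python
import numpy as np

def dibr_warp(color, depth, baseline=0.05, focal=500.0):
    """Forward-warp a grayscale image to a virtual view using a
    horizontal disparity derived from depth. Pixels left at -1 are
    the disocclusion holes that hole-filling must repair."""
    h, w = color.shape
    virtual = np.full((h, w), -1.0)
    for y in range(h):
        # Visit pixels from far to near so foreground overwrites background.
        for x in np.argsort(-depth[y]):
            d = int(round(baseline * focal / depth[y, x]))
            xv = x - d
            if 0 <= xv < w:
                virtual[y, xv] = color[y, x]
    return virtual
```

Foreground pixels shift farther than background pixels, uncovering background regions that no source pixel maps to; these are exactly the holes the abstract's inpainting and directional smoothing address.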

5.
Large holes are unavoidably generated in depth image based rendering (DIBR) using a single color image and its associated depth map. Such holes are mainly caused by disocclusion, which occurs around the sharp depth discontinuities in the depth map. We propose a divide-and-conquer hole-filling method which refines the background depth pixels around the sharp depth discontinuities to address the disocclusion problem. First, the disocclusion region is detected according to the degree of depth discontinuity, and the target area is marked as a binary mask. Then, the depth pixels located in the target area are modified by a linear interpolation process, whose pixel values decrease from the foreground depth value to the background depth value. Finally, in order to remove isolated depth pixels, median filtering is adopted to refine the depth map. In this way, disocclusion regions in the synthesized view are divided into several small holes after DIBR, which are easily filled by image inpainting. Experimental results demonstrate that the proposed method can effectively improve the quality of the synthesized view both subjectively and objectively.
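The linear-interpolation step on the background depth pixels can be sketched in one dimension as follows; the jump `threshold` and `ramp` width are illustrative choices, not values from the paper:

```python
import numpy as np

def soften_discontinuity(depth_row, threshold=20.0, ramp=4):
    """Replace a sharp foreground-to-background depth jump with a
    linear ramp over `ramp` background pixels, so one large
    disocclusion becomes several small, easily inpainted holes."""
    out = depth_row.astype(np.float64).copy()
    for x in range(len(out) - 1):
        # Foreground (large depth value) dropping to background.
        if out[x] - out[x + 1] > threshold:
            fg = out[x]
            bg = out[min(x + ramp, len(out) - 1)]
            for k in range(1, ramp + 1):
                if x + k < len(out):
                    # Values decrease linearly from fg to bg.
                    out[x + k] = fg + (bg - fg) * k / ramp
    return out
```

Here depth follows the common disparity-like convention where larger values are nearer; the ramp spreads the single sharp discontinuity into gradual steps.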

6.
Depth maps generated by the Kinect may have some pixels lost due to echo attenuation of infrared light and mutual interference between neighboring pixels, which causes pervasive problems when utilizing Kinect cameras as depth sensors. In this work, we propose a 2-step inpainting algorithm to infill the holes. First, a naive Bayesian estimation is conducted as a preliminary inpainting scheme, using the neighboring pixels of the missing ones and the corresponding pixels in the color image as prior knowledge. After that, an optimization is implemented to improve the depth map, where the false edges in mistakenly inpainted regions are detected and then iteratively propelled toward their true positions under a total variation framework. Experimental results show the effectiveness of the proposed algorithm.
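A much simpler neighbour-diffusion scheme than the paper's Bayesian estimation already conveys the basic idea of depth-hole infilling from valid neighbours. The sentinel `hole_value` marking missing pixels is a hypothetical convention:

```python
import numpy as np

def fill_depth_holes(depth, hole_value=0.0, max_sweeps=50):
    """Iteratively replace hole pixels with the mean of their valid
    4-neighbours, sweeping until no holes remain. A simple diffusion
    stand-in for more principled (e.g. Bayesian) estimation."""
    d = depth.astype(np.float64).copy()
    hole = d == hole_value
    for _ in range(max_sweeps):
        ys, xs = np.nonzero(hole)
        if len(ys) == 0:
            break
        filled = []
        for y, x in zip(ys, xs):
            vals = [d[yy, xx]
                    for yy, xx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1))
                    if 0 <= yy < d.shape[0] and 0 <= xx < d.shape[1]
                    and not hole[yy, xx]]
            if vals:
                d[y, x] = sum(vals) / len(vals)
                filled.append((y, x))
        if not filled:
            break
        for y, x in filled:
            hole[y, x] = False
    return d
```

Each sweep fills the hole boundary inward, so even large holes close after a few iterations.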

7.
The wide availability of affordable RGB-D sensors is changing the landscape of indoor scene analysis. Years of research on simultaneous localization and mapping (SLAM) have made it possible to merge multiple RGB-D images into a single point cloud and provide a 3D model of a complete indoor scene. However, these reconstructed models contain only geometric information, without semantic knowledge. Advances in robot autonomy and in the ability to carry out complex tasks in unstructured environments can be greatly enhanced by endowing environment models with semantic knowledge. Towards this goal, we propose a novel approach to generating 3D semantic maps of indoor scenes. Our approach first creates a 3D reconstructed map from an RGB-D image sequence, then jointly infers the semantic object category and structural class for each point of the global map. Twelve object categories (e.g. walls, tables, chairs) and four structural classes (ground, structure, furniture and props) are labeled in the global map, capturing both object and structural information. To obtain semantic information, we compute a semantic segmentation for each RGB-D image and merge the labeling results with a Dense Conditional Random Field. Unlike previous techniques, we use temporal information and higher-order cliques to enforce label consistency across image labeling results. Our experiments demonstrate that temporal information and higher-order cliques are significant for the semantic mapping procedure and improve the precision of the semantic mapping results.

8.
RGB-D cameras such as Microsoft's Kinect capture color images together with per-pixel depth information and are widely used for 3D map building on mobile robots. This paper presents a method for robot self-localization and indoor 3D scene modelling with an RGB-D camera. The camera first acquires consecutive frames of the surroundings. SURF feature points are then extracted and matched between consecutive frames; the robot pose is computed from the feature-point displacements, combined with a nonlinear least-squares optimization that minimizes the bidirectional reprojection error of corresponding points. Finally, keyframe techniques combined with an observation-center method project the observed 3D point clouds into the global map according to the current pose. The method was tested in three different scenes and compared across different feature-point types; over a 5.88 m trajectory the error was only 0.023, allowing an accurate 3D model of the surroundings to be built.

9.
When constructing a dense 3D model of a static indoor scene from a sequence of RGB-D images, the choice of the 3D representation (e.g. 3D mesh, point cloud or implicit function) is of crucial importance. In the last few years, the volumetric truncated signed distance function (TSDF) and its extensions have become popular in the community and are widely used for dense 3D modelling with RGB-D sensors. However, as this representation is voxel based, it offers few possibilities for manipulating and/or editing the constructed 3D model, which limits its applicability. In particular, the amount of data required to maintain the volumetric TSDF rapidly becomes huge, which limits portability. Moreover, simplifications (such as mesh extraction and surface simplification) significantly reduce the accuracy of the 3D model (especially in the color space), and editing the 3D model is difficult. We propose a novel compact, flexible and accurate 3D surface representation based on parametric surface patches augmented with geometric and color texture images. Simple parametric shapes such as planes are roughly fitted to the input depth images, and the deviations of the 3D measurements from the fitted parametric surfaces are fused into a geometric texture image (called the Bump image). A confidence image and a color texture image are also built. Our 3D scene representation is accurate yet memory efficient. Moreover, updating or editing the 3D model becomes trivial, since it reduces to manipulating 2D images. Our experimental results demonstrate the advantages of the proposed 3D representation through a concrete indoor scene reconstruction application.

10.
This paper proposes an algorithm for converting 2D badminton-match video to 3D video. In this kind of video the foreground draws the most attention, and accurately extracting foreground objects from the background is the key to obtaining the depth map. An improved graph-cut algorithm is used to extract the foreground, and a background depth model built from the scene structure yields the background depth map. On this basis, depth values are assigned to foreground objects according to their distance from the camera, producing the foreground depth map. The background and foreground depth maps are then fused into a complete depth map. Finally, depth-image-based rendering (DIBR) synthesizes the stereoscopic image pairs for 3D display. Experimental results show that the generated stereoscopic image pairs deliver a good 3D effect.

11.

Saliency prediction models provide a probabilistic map of relative likelihood of an image or video region to attract the attention of the human visual system. Over the past decade, many computational saliency prediction models have been proposed for 2D images and videos. Considering that the human visual system has evolved in a natural 3D environment, it is only natural to want to design visual attention models for 3D content. Existing monocular saliency models are not able to accurately predict the attentive regions when applied to 3D image/video content, as they do not incorporate depth information. This paper explores stereoscopic video saliency prediction by exploiting both low-level attributes such as brightness, color, texture, orientation, motion, and depth, as well as high-level cues such as face, person, vehicle, animal, text, and horizon. Our model starts with a rough segmentation and quantifies several intuitive observations such as the effects of visual discomfort level, depth abruptness, motion acceleration, elements of surprise, size and compactness of the salient regions, and emphasizing only a few salient objects in a scene. A new fovea-based model of spatial distance between the image regions is adopted for considering local and global feature calculations. To efficiently fuse the conspicuity maps generated by our method to one single saliency map that is highly correlated with the eye-fixation data, a random forest based algorithm is utilized. The performance of the proposed saliency model is evaluated against the results of an eye-tracking experiment, which involved 24 subjects and an in-house database of 61 captured stereoscopic videos. Our stereo video database as well as the eye-tracking data are publicly available along with this paper. Experiment results show that the proposed saliency prediction method achieves competitive performance compared to the state-of-the-art approaches.


12.

In 3D image compression, depth image based rendering (DIBR) is one of the latest techniques, where the center image (the main view, used to synthesize the left and right view images) and the depth image are transmitted to the receiver side. It has been observed in the literature that most existing 3D image watermarking schemes are not resilient to the view synthesis process used in the DIBR technique. In this paper, a 3D image watermarking scheme is proposed which is invariant to the DIBR view synthesis process. In the proposed scheme, 2D dual-tree complex wavelet transform (2D-DT-CWT) coefficients of the centre view are used for watermark embedding, so that the shift invariance and directional properties of the DT-CWT can be exploited to make the scheme robust against the view synthesis process. A comprehensive set of experiments has been carried out to demonstrate the robustness of the proposed scheme over related existing schemes with respect to JPEG compression and view synthesis attacks.


13.
To fully exploit the latent feature information in RGB-D images, a multi-scale convolutional-recursive neural network (Ms-CRNN) algorithm is proposed. The RGB image, grayscale image, depth image and 3D surface-normal map of an RGB-D image are partitioned into blocks at different scales to form multiple channels; each channel is convolved with filters of the corresponding size, and the extracted feature maps, after local contrast normalization and downsampling, are fed into recursive neural network (RNN) layers to obtain more abstract high-level features. The fused multi-scale features are then classified with an SVM classifier. Simulation results on an RGB-D dataset show that by comprehensively exploiting multi-scale RGB-D features, the proposed Ms-CRNN algorithm reaches an object recognition rate of 88.2%, a considerable improvement over previous methods.

14.

Depth-image-based rendering (DIBR) is widely used in 3DTV, free-viewpoint video, and interactive 3D graphics applications. Typically, synthetic images generated by DIBR-based systems incorporate various distortions, particularly geometric distortions induced by object disocclusion. Ensuring the quality of synthetic images is critical to maintaining adequate system service. However, traditional 2D image quality metrics are ineffective for evaluating synthetic images, as they are not sensitive to geometric distortion. In this paper, we propose a novel no-reference image quality assessment method for synthetic images based on convolutional neural networks, introducing local image saliency as prediction weights. Due to the lack of existing training data, we construct a new DIBR synthetic image dataset as part of our contribution. Experiments were conducted on both the public benchmark IRCCyN/IVC DIBR image dataset and our own dataset. Results demonstrate that our proposed metric outperforms traditional 2D image quality metrics and state-of-the-art DIBR-related metrics.


15.
Objective: Depth cameras capture scene depth dynamically in real time, but the captured depth maps have low resolution and are prone to holes. Using a high-resolution color image as guidance is an important route to depth-map super-resolution. Existing methods struggle to resolve inconsistencies between color edges and depth discontinuities, and therefore introduce texture-copying artifacts into the reconstruction. To address this problem, we propose a robust color-guided depth super-resolution algorithm. Method: First, exploiting the structural correlation between color-image edges and depth-image edges, an RGB-D structural similarity measure is proposed to detect the edge discontinuities shared by the color and depth images and to adaptively select the optimal image patch in the neighborhood of each estimated pixel. Next, a proposed directional non-local means weight builds a multilaterally guided depth estimate within the patch region, resolving the structural inconsistency between color edges and depth discontinuities. Finally, the correspondence between the RGB-D structural similarity measure and image smoothness is used to adaptively tune the parameters of the multilateral guidance weights, achieving robust depth super-resolution. Results: Experiments on the synthetic Middlebury dataset, ToF and Kinect datasets, and our own dataset show that, compared with other state-of-the-art methods, our method effectively suppresses texture-copying artifacts. Relative to the second-best algorithm, it lowers the mean absolute deviation by about 63.51%, 39.47% and 7.04% on the Middlebury, ToF and Kinect datasets, respectively. Conclusion: On both synthetic datasets and real-scene depth datasets, the method effectively handles the inconsistency between color edges and depth discontinuities and better preserves the discontinuity of depth edges.
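Color-guided depth refinement of this general family is often built on joint bilateral weights. The sketch below shows a plain joint bilateral filter, not the paper's directional non-local-means weights; the window radius and the two sigmas are arbitrary illustrative values:

```python
import numpy as np

def joint_bilateral_filter(depth, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Smooth `depth` with weights combining spatial distance and
    intensity similarity in the color `guide` image, so that depth
    edges stay aligned with guide edges instead of blurring."""
    h, w = depth.shape
    out = np.zeros((h, w))
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    dpad = np.pad(depth.astype(np.float64), radius, mode='edge')
    gpad = np.pad(guide.astype(np.float64), radius, mode='edge')
    for y in range(h):
        for x in range(w):
            dwin = dpad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            gwin = gpad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range weight: pixels with dissimilar guide intensity count little.
            range_w = np.exp(-((gwin - guide[y, x]) ** 2) / (2 * sigma_r ** 2))
            wgt = spatial * range_w
            out[y, x] = (wgt * dwin).sum() / wgt.sum()
    return out
```

Because the range weight nearly vanishes across guide edges, a depth edge coinciding with a color edge is preserved rather than smeared; the texture-copying problem the paper attacks arises precisely where guide edges and depth discontinuities do not coincide.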

16.

Augmented Reality applications are set to revolutionize the smartphone industry due to the integration of RGB-D sensors into mobile devices. Given the large number of smartphone users, efficient storage and transmission of RGB-D data is of paramount interest to the research community. While there exist Video Coding Standards such as HEVC and H.264/AVC for compression of RGB/texture component, the coding of depth data is still an area of active research. This paper presents a method for coding depth videos, captured from mobile RGB-D sensors, by planar segmentation. The segmentation algorithm is based on Markov Random Field assumptions on depth data and solved using Graph Cuts. While all prior works based on this approach remain restricted to images only and under noise-free conditions, this paper presents an efficient solution to planar segmentation in noisy depth videos. Also presented is a unique method to encode depth based on its segmented planar representation. Experiments on depth captured from a noisy sensor (Microsoft Kinect) shows superior Rate-Distortion performance over the 3D extension of HEVC codec.


17.
Objective: 2D-to-3D conversion can quickly and effectively turn the existing wealth of 2D image resources into stereoscopic images, but existing methods estimate only a single overall depth for a tree, so the generated images cannot convey a tree's three-dimensional structure. We propose a method for constructing stereoscopic tree images with hierarchically refined structure. Method: First, the trunk and crown regions of the 2D tree image are separated using pixel color differences in the Lab color model, and the crown region is segmented further. Then, multiple types of depth templates are built on the depth-gradient hypothesis; combining these templates with the crown region information yields an initial depth map for typical tree objects, and compositions of basic depth-gradient maps provide individualized depth construction for atypical trees. Finally, the tree depth information is adaptively adjusted and optimized for the target scene, and the tree image is composited into the background image to construct the stereoscopic image. Results: Stereoscopic tree images were constructed and composited for five groups of tree and background images. Trees of different shapes all yield depth maps with a clear sense of layering and composite adaptively into stereoscopic backgrounds. The time to construct a tree depth map is proportional to the size of the original tree image, and constructing a stereoscopic tree image and compositing it into the background takes 2 to 4 s. In subjective quality tests of the stereoscopic images, all images scored at least "good", and some reached "excellent". Conclusion: The method fully exploits the morphological structure of trees, works for both typical and atypical trees, and produces high-quality stereoscopic tree images with rich layering and a comfortable stereoscopic viewing experience.

18.
Objective: The extrinsic parameters of an RGB-D camera transform point clouds from the camera coordinate frame to the world frame, with applications in 3D scene reconstruction, 3D measurement, robotics, and object detection. Conventional methods calibrate the extrinsics of the RGB-D color camera with a calibration object (such as a checkerboard) but make no use of depth information, so the calibration procedure is hard to simplify; fully exploiting depth information greatly streamlines extrinsic calibration. Color-image-based methods calibrate the depth sensor only indirectly, yet most RGB-D applications rely on the depth sensor, and a depth-based calibration method can calibrate the pose of the depth sensor directly. Method: The depth map is first converted into a 3D point cloud in the camera frame, and planes in the cloud are detected automatically with MLESAC. Using the constraints between the ground plane and the world coordinate frame, the candidate planes are traversed and filtered until the ground plane is found. From the spatial relation between the ground plane and the camera frame, the camera's extrinsic parameters, i.e., the transformation matrix from camera-frame points to world-frame points, are computed. Results: With checkerboard-based extrinsic calibration as the baseline, on RGB-D video streams captured by a PrimeSense camera the mean roll error is -1.14°, the mean pitch error is 4.57°, and the mean camera-height error is 3.96 cm. Conclusion: By detecting the ground plane automatically, the method accurately estimates the camera extrinsics with a high degree of automation; moreover, the algorithm is highly parallelizable and, after parallel optimization, runs in real time, enabling applications such as automatic robot pose estimation.
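The geometric core of such an approach, recovering camera height and pitch from a fitted ground plane, can be sketched as follows. This uses a least-squares plane fit instead of robust MLESAC, and assumes a hypothetical y-up camera convention:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through Nx3 points: returns a unit normal n
    and offset d such that n . p + d = 0 for points p on the plane."""
    centroid = points.mean(axis=0)
    # Direction of smallest variance = plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    n = vt[-1]
    if n[1] < 0:  # orient the normal 'up' (assumed y-up convention)
        n = -n
    d = -n @ centroid
    return n, d

def camera_pose_from_ground(points):
    """Camera height above the ground plane and pitch angle (degrees),
    given ground-plane points expressed in the camera frame."""
    n, d = fit_plane(points)
    height = abs(d)  # distance from the camera origin to the plane
    pitch = np.degrees(np.arccos(np.clip(n[1], -1.0, 1.0)))
    return height, pitch
```

A robust estimator such as MLESAC or RANSAC would replace `fit_plane` in practice, since real point clouds contain non-ground points and sensor noise.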

19.
Depth image based rendering (DIBR) is a promising technique for extending viewpoints from a monoscopic center image and its associated per-pixel depth map. With its numerous advantages, including low bandwidth cost, 2D-to-3D compatibility and adjustable depth conditions, DIBR has received much attention in the 3D research community. In a DIBR-based broadcasting system, a malicious adversary can illegally distribute both the center view and synthesized virtual views as 2D and 3D content, respectively. To address copyright protection for DIBR 3D images, we propose a blind watermarking algorithm based on scale invariant feature transform (SIFT) features. To make the proposed method robust against synchronization attacks from the DIBR operation, we exploit the parameters of the SIFT features: location, scale and orientation. Because the DIBR operation is a type of translation transform, the proposed method relies on the high similarity between the SIFT parameters extracted from a synthesized virtual view and those from the center view image. To enhance capacity and security, we propose a watermark pattern selection method based on keypoint orientation. In addition, we use the spread spectrum technique for watermark embedding, together with perceptual masking to preserve imperceptibility. Finally, the effectiveness of the presented method was experimentally verified by comparison with previous schemes. The experimental results show that the proposed method is robust against synchronization attacks from the DIBR operation, as well as against signal distortions and typical geometric attacks such as translation and cropping.

20.
RGB-D sensors provide 3D points (depth) together with color information associated with each point, and they suffer from several sources of noise. With some kinds of RGB-D sensors it is possible to pre-process the color image before assigning the color information to the 3D data; with other kinds that is not possible, and the RGB-D data must be processed directly. In this paper, we compare different approaches to noise and artifact reduction: the Gaussian, mean and bilateral filters. These methods are time-consuming when handling 3D data, which can be a problem for many real-time applications. We propose new methods that accelerate the whole process and improve the quality of the color information using entropy. Entropy provides a framework for speeding up the involved methods by skipping data whose entropy value is above or below a given threshold. The experimental results provide a way to balance quality against acceleration; they show that our methods improve both image quality and processing time compared to the original methods.
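The entropy gate described here can be sketched as follows; the histogram bin count, window radius and threshold are illustrative choices, not the authors' settings:

```python
import numpy as np

def patch_entropy(patch, bins=8):
    """Shannon entropy (bits) of the intensity histogram of a patch
    with values in [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_gated_mean(img, radius=1, threshold=0.4):
    """Apply a mean filter only where local entropy exceeds the
    threshold; low-entropy (already flat) regions are skipped,
    which is the source of the speed-up."""
    h, w = img.shape
    out = img.astype(np.float64).copy()
    pad = np.pad(out, radius, mode='edge')
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            if patch_entropy(win) > threshold:
                out[y, x] = win.mean()
    return out
```

Flat regions leave the histogram concentrated in one bin (entropy 0) and are returned untouched, so only textured or noisy neighbourhoods pay the filtering cost.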


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号