Similar Documents
20 similar documents found (search time: 31 ms)
1.
Free navigation of a scene requires warping some reference views to a desired target viewpoint and blending them to synthesize a virtual view. Methods based on Convolutional Neural Networks (ConvNets) can learn the warping and blending tasks jointly. Such methods are often designed for moderate inter-camera baselines, and larger kernels are required for warping as the baseline distance increases. Algorithmic methods can in principle deal with large baselines, but the synthesized view suffers from artifacts near disoccluded pixels. We present a hybrid approach in which the reference views are first algorithmically warped to the target position and then blended via a ConvNet. Preliminary view warping allows the size of the convolutional kernels, and thus the number of learnable parameters, to be reduced. We propose a residual encoder–decoder for image blending with a Siamese encoder to keep the parameter count low. We also contribute a hole-inpainting algorithm to fill the disocclusions in the warped views. Our view synthesis experiments on real multiview sequences show better objective image quality than state-of-the-art methods, owing to fewer artifacts in the synthesized images.
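The algorithmic warping step that the hybrid pipeline relies on can be sketched as a forward warp with a z-buffer. This is a minimal illustration assuming a rectified setup with purely horizontal disparity, not the authors' implementation:

```python
import numpy as np

def forward_warp(ref, disparity):
    """Forward-warp a rectified reference view toward the target viewpoint.

    Each source pixel is shifted horizontally by its disparity; a z-buffer
    keeps the nearest (largest-disparity) pixel when several map to the
    same target location. Unwritten pixels are the disocclusions that the
    blending/inpainting stage must fill.
    """
    h, w = disparity.shape
    out = np.zeros_like(ref)
    zbuf = np.full((h, w), -np.inf)
    for y in range(h):
        for x in range(w):
            xt = int(round(x - disparity[y, x]))  # horizontal shift only
            if 0 <= xt < w and disparity[y, x] > zbuf[y, xt]:
                zbuf[y, xt] = disparity[y, x]
                out[y, xt] = ref[y, x]
    holes = zbuf == -np.inf  # disoccluded pixels
    return out, holes
```

The `holes` mask is exactly the input the blending ConvNet or the inpainting algorithm would consume.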

2.
Depth-image-based rendering techniques for multiview applications have recently been introduced for efficient view generation at arbitrary camera positions. Rate control in an encoder therefore has to consider both texture and depth data. However, because depth and texture data have different structures and play different roles in the rendered views, allocating the available bit budget between them requires careful analysis. Information loss due to texture coding affects the value of pixels in synthesized views, while errors in depth information lead to shifted objects or to unexpected patterns at their boundaries. In this paper, we address the problem of efficient bit allocation between the texture and depth data of multiview sequences. We adopt a rate-distortion framework based on a simplified model of depth and texture images that preserves their main features. Unlike most recent solutions, our method avoids rendering at encoding time for distortion estimation, so the encoding complexity stays low. In addition, our model is independent of the inpainting method used at the decoder for filling holes in the synthetic views. Extensive experiments validate our theoretical results and confirm the efficiency of our rate-allocation strategy.
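The flavor of such a model-based allocation can be sketched with a classical exponential rate-distortion model. The model form and the weights below are illustrative assumptions, not the paper's exact model; the point is that no rendering is needed inside the search:

```python
import numpy as np

def allocate_bits(r_total, a_tex=1.0, a_depth=0.5, step=0.01):
    """Split a bit budget between texture (R_t) and depth (R_d) by grid search.

    Assumes the illustrative model D = a_tex*2**(-2*R_t) + a_depth*2**(-2*R_d),
    where a_tex/a_depth weight how strongly texture and depth coding errors
    affect the synthesized view. Because D is evaluated analytically, the
    encoder never has to render a view to estimate distortion.
    """
    best_split, best_d = (0.0, r_total), np.inf
    for r_t in np.arange(0.0, r_total + 1e-9, step):
        r_d = r_total - r_t
        d = a_tex * 2.0 ** (-2 * r_t) + a_depth * 2.0 ** (-2 * r_d)
        if d < best_d:
            best_split, best_d = (r_t, r_d), d
    return best_split, best_d
```

At the optimum the marginal distortion slopes are equal, so the texture/depth rate gap settles at log2(a_tex/a_depth)/2 bits.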

3.
This paper presents a polygon-soup representation for multiview data. Starting from a sequence of multiview video plus depth (MVD) data, the proposed quad-based representation addresses, in a unified manner, issues such as compactness, compression, and intermediate view synthesis. The representation is extracted from MVD data in two steps. First, a set of 3D quads is extracted via a quadtree decomposition of the depth maps. Second, a selective elimination of the quads is performed to reduce inter-view redundancies and thus provide a compact representation. Moreover, the proposed extraction methodology reduces ghosting artifacts. Finally, an adapted compression technique is proposed that limits coding artifacts. Results on two real sequences show that the proposed representation provides a good trade-off between rendering quality and data compactness.
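The first step, quadtree decomposition of a depth map into quads, can be sketched as follows. The split criterion (depth range within a block) is an illustrative stand-in for the paper's decomposition rule:

```python
import numpy as np

def extract_quads(depth, x0=0, y0=0, size=None, tol=1.0):
    """Recursively split a square depth block into quads until the depth
    variation inside each quad is at most `tol`.

    Returns (x, y, size, mean_depth) tuples: large quads cover smooth
    depth regions, small quads concentrate around depth edges.
    """
    if size is None:
        size = depth.shape[0]
    block = depth[y0:y0 + size, x0:x0 + size]
    if size == 1 or block.max() - block.min() <= tol:
        return [(x0, y0, size, float(block.mean()))]
    half = size // 2
    quads = []
    for dy in (0, half):
        for dx in (0, half):
            quads += extract_quads(depth, x0 + dx, y0 + dy, half, tol)
    return quads
```

The per-quad (position, size, depth) tuples are what the later elimination and compression stages would operate on.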

4.
Depth-image-based rendering is one of the key techniques for view synthesis in three-dimensional television and free-viewpoint television, which provide high-quality, immersive experiences to end viewers. However, artifacts in the rendered images, including holes caused by occlusion/disocclusion and boundary artifacts, can degrade subjective and objective image quality. To handle these problems and improve the quality of rendered images, we present a novel view-spatial-temporal post-refinement method for view synthesis, in which new hole-filling and boundary-artifact removal techniques are proposed. In addition, we propose an optimal reference frame selection algorithm for a better trade-off between computational complexity and rendered image quality. Experimental results show that the proposed method achieves a peak signal-to-noise ratio gain of 0.94 dB on average for multiview video test sequences compared with the benchmark view synthesis reference software. The subjective quality of the rendered images is also improved.

5.
An improved whole-frame error concealment method based on depth-image-based rendering (DIBR) is designed for multiview video with depth. An optimal reference view selection is first proposed, and three refinements are then applied to the DIBR-projected pixels. First, the missing one-to-one pixels are concealed by pixels from another view; illumination differences between views are handled using the motion vector of the projected coordinates and a reverse DIBR procedure. Second, the generation of many-to-one pixels is improved via their depth information. Third, hole pixels are recovered using motion vectors estimated efficiently from a weighted function of the neighboring available motion vectors and their distances to the target hole pixel. Experimental results show that, compared with the state-of-the-art method, the combined system of the four proposed techniques is superior and improves performance by up to 5.53 dB.

6.
In the multiview video plus depth (MVD) format, virtual views are generated from decoded texture videos and the corresponding decoded depth images through depth-image-based rendering (DIBR). 3DV-ATM is a reference model for H.264/AVC-based multiview video coding (MVC) that aims at high coding efficiency for 3D video in the MVD format. Depth images are first downsampled and then coded by 3DV-ATM. However, the sharp object boundaries characteristic of depth images do not match well with the transform-coding nature of H.264/AVC in 3DV-ATM: depth boundaries are often blurred with ringing artifacts in the decoded depth images, which results in noticeable artifacts in synthesized virtual views. This paper presents a low-complexity adaptive depth truncation filter that recovers the sharp object boundaries of the depth images using adaptive block repositioning and expansion to increase the accuracy of depth value refinement. The approach is very efficient, avoids false depth-boundary refinement when block boundaries lie around depth edge regions, and ensures sufficient information within the processing block for depth-layer classification. Experimental results demonstrate that the proposed filter recovers sharp depth edges and removes boundary artifacts in the synthesized views, providing up to 3.25 dB of depth map enhancement and a 3.06% bitrate reduction for the synthesized views.

7.
View synthesis is a crucial process in current 3D video applications. Existing view synthesis techniques may introduce visual artifacts such as corona, pinhole, and ghosting artifacts, which greatly degrade the visual experience. In this paper, we introduce an error-resilient 3D view synthesis approach that effectively removes these artifacts. Specifically, we first detect the regions where foreground and background pixels are mixed, to avoid corona artifacts. Then we resize the images and conduct the projection on the resized images to reduce pinhole artifacts. Finally, an improved view blending algorithm is proposed to reduce ghosting artifacts. Simulation results demonstrate that the proposed method significantly outperforms others in removing view artifacts.

8.
The quality of views synthesized by depth-image-based rendering (DIBR) depends highly on the accuracy of the depth map, especially its alignment with object boundaries in the texture image. In practice, misalignment of sharp depth map edges is the major cause of the annoying artifacts at the disoccluded regions of the synthesized views. Conventional smoothing filters blur the depth map to reduce the disoccluded regions; the drawbacks are degraded 3D perception of the reconstructed 3D videos and destroyed texture in background regions. Conventional edge-preserving filters use the color image to align depth edges with color edges; unfortunately, the characteristics of color edges and depth edges are very different, which causes annoying boundary artifacts in the synthesized virtual views. A recent reliability-based approach uses reliable warping information from other views to fill the holes, but it is not suitable for view synthesis in video-plus-depth DIBR applications. In this paper, a new depth map preprocessing approach is proposed. It uses watershed color segmentation to correct the depth map misalignment, and the depth map object boundaries are then extended to cover the transitional edge regions of the color image. This approach can handle sharp depth map edges lying inside or outside the object boundaries in the 2D sense. The quality of the disoccluded regions of the synthesized views is significantly improved, and unknown depth values can also be estimated. Experimental results show that the proposed method achieves superior view synthesis performance with DIBR, especially when generating large-baseline virtual views.

9.
The multi-view video plus depth (MVD) format is considered the next-generation standard for advanced 3D video systems. MVD consists of multiple color videos with a depth value associated with each texture pixel. Relying on this representation and using depth-image-based rendering techniques, new viewpoints for multi-view video applications can be generated. However, since MVD is captured from different viewing angles with different cameras, significant illumination and color differences can be observed between views. These color mismatches degrade the performance of view rendering algorithms by introducing visible artifacts, leading to reduced view synthesis quality. To cope with this issue, we propose an effective method for correcting color inconsistencies in MVD. First, to avoid occlusion problems and perform the correction as accurately as possible, we consider only the overlapping region when calculating the color mapping function; these common regions are determined using a reliable feature-matching technique. Also, to maintain temporal coherence, the correction is applied over a temporal sliding window. Experimental results show that the proposed method reduces the color differences between views and improves the view rendering process, providing high-quality results.
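The core idea, fitting a color mapping function only from samples in the overlapping region, can be sketched with a per-channel linear (gain/offset) model. The linear model is an illustrative stand-in for the paper's mapping function:

```python
import numpy as np

def fit_color_mapping(src, ref):
    """Fit a per-channel gain/offset by least squares from co-located
    samples in the overlapping region of two views.

    `src` and `ref` are (N, 3) arrays of matched pixel values from the
    source view and the reference view; occluded pixels are excluded
    before this step.
    """
    maps = []
    for c in range(src.shape[1]):
        A = np.stack([src[:, c], np.ones(len(src))], axis=1)
        gain, offset = np.linalg.lstsq(A, ref[:, c], rcond=None)[0]
        maps.append((gain, offset))
    return maps

def correct_view(img, maps):
    """Apply the fitted mapping to an entire source view."""
    out = img.astype(float).copy()
    for c, (g, b) in enumerate(maps):
        out[..., c] = g * out[..., c] + b
    return out
```

In a temporal-sliding-window variant, the (gain, offset) pairs would be re-estimated or smoothed over a window of frames to keep the correction temporally coherent.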

10.
This paper presents new hole-filling methods for generating multiview images using depth-image-based rendering (DIBR). Holes appear in depth images captured by 3D sensors and in the multiview images rendered by DIBR. The holes are often found around the background regions of the images because the background is prone to occlusion by foreground objects. Background-oriented priority and gradient-oriented priority are introduced to determine the hole-filling order after the DIBR process. In addition, to obtain a sample for filling a hole region, we propose fusing depth and color information to form a weighted sum of two patches for the depth (or rendered depth) images, and a new distance measure to find the best-matched patch for the rendered color images. The conventional method produces jagged edges and blurring in the final results, whereas the proposed method minimizes them, which is quite important for high fidelity in stereo imaging. Experimental results show that, by reducing these errors, the proposed methods significantly improve the hole-filling quality of the generated multiview images.
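The fused depth-plus-color patch matching can be sketched as a weighted distance. The SSD terms and the 0.5 weight are illustrative assumptions, not the paper's exact measure:

```python
import numpy as np

def patch_distance(color_a, depth_a, color_b, depth_b, w_depth=0.5):
    """Weighted sum of color and depth patch differences (SSD).

    Fusing the two cues penalizes candidates that look similar in color
    but lie at a different depth (e.g. foreground patches proposed for a
    background hole).
    """
    d_color = float(np.sum((color_a - color_b) ** 2))
    d_depth = float(np.sum((depth_a - depth_b) ** 2))
    return (1.0 - w_depth) * d_color + w_depth * d_depth

def best_match(target_color, target_depth, candidates):
    """Return the index of the candidate (color, depth) patch pair
    closest to the target under the fused distance."""
    scores = [patch_distance(target_color, target_depth, c, d)
              for c, d in candidates]
    return int(np.argmin(scores))
```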

11.
In this paper, a new coding method for multiview depth video is presented. Considering the smooth structure and sharp edges of depth maps, a segmentation-based approach is proposed. This helps preserve the depth contours, thus introducing fewer artifacts in the depth perception of the video. To reduce the cost of partition coding, an approximation of the depth partition is built from the decoded color view segmentation. This approximation is refined by sending complementary information about the relevant differences between the color and depth partitions. For coding the depth content of each region, a decomposition into an orthogonal basis is used in this paper, although similar decompositions could also be employed. Experimental results show that the proposed segmentation-based depth coding method outperforms H.264/AVC and H.264/MVC by more than 2 dB at similar bitrates.

12.
In this paper, we present a method for modeling a complex scene from a small set of input images taken from widely separated viewpoints and then synthesizing novel views. First, we find sparse correspondences across the multiple input images and calibrate these images, which were taken with unknown cameras. One of the input images is then chosen as the reference image for modeling by match propagation: a sparse set of reliably matched pixels in the reference image is initially selected and then propagated to neighboring pixels based on both a clustering-based, lighting-invariant photoconsistency constraint and a data-driven depth smoothness constraint, which are integrated into a pixel matching quality function to efficiently deal with occlusions, lighting changes, and depth discontinuities. Finally, a novel view rendering algorithm is developed to quickly synthesize a novel view, again by match propagation. Experimental results show that the proposed method produces good scene models from a small set of widely separated images and synthesizes novel views of good quality.

13.
Traditional stereo matching algorithms based on global optimization have high computational complexity and poor matching accuracy in occluded regions and regions of disparity discontinuity. A global optimization algorithm based on the Tao stereo matching framework is proposed. First, an efficient local algorithm obtains the initial matching disparities. Then a confidence check is applied to the disparity values: using the reliable pixels and a disparity-plane assumption, a robust low-complexity algorithm corrects the disparities of unreliable pixels. Finally, the belief propagation algorithm is improved so that it adaptively stops message passing at converged nodes, and the corrected initial matches are optimized to improve matching accuracy in weakly textured regions. Experimental results show that the proposed algorithm effectively reduces the overall mismatch rate and improves matching accuracy in disparity-discontinuous and occluded regions, while also lowering the overall complexity and maintaining speed, which gives it practical value.
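The cheap local stage that produces the initial disparities can be sketched as winner-take-all SAD block matching on a rectified pair. This is only an illustration of the first stage; the confidence check and the improved belief propagation are separate stages not shown here:

```python
import numpy as np

def initial_disparity(left, right, max_disp, block=3):
    """Winner-take-all SAD block matching on a rectified stereo pair.

    For each left-image pixel, the disparity with the smallest sum of
    absolute differences over a block x block window wins. Fast but
    unreliable near occlusions and in weakly textured regions, which is
    exactly what the later confidence check and optimization correct.
    """
    h, w = left.shape
    r = block // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r, w - r):
            best_sad, best_d = np.inf, 0
            for d in range(min(max_disp, x - r) + 1):
                sad = np.abs(left[y - r:y + r + 1, x - r:x + r + 1]
                             - right[y - r:y + r + 1, x - d - r:x - d + r + 1]).sum()
                if sad < best_sad:
                    best_sad, best_d = sad, d
            disp[y, x] = best_d
    return disp
```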

14.
In general, excessive colorimetric and geometric errors in multi-view images induce visual fatigue in viewers. Various works have been proposed to reduce these errors, but conventional methods have been limited to stereoscopic images, require cumbersome additional tasks, and often produce unstable results. In this paper, we propose an effective multi-view image refinement algorithm. The proposed algorithm analyzes such errors in multi-view images from sparse correspondences and compensates for them automatically. While conventional works transform every view to compensate for geometric errors, the proposed method transforms only the source views with respect to a reference view, so the approach extends to any number of views. In addition, we employ uniform view intervals to provide consistent depth perception among views, and we correct color inconsistency among views from the correspondences by considering importance and channel properties. Various experimental results show that the proposed algorithm outperforms conventional approaches and generates more visually comfortable multi-view images.

15.
A flexible, MPEG-2-compatible, view-scalable multiview sequence CODEC is proposed. We define a GGOP (Group of GOP) structure as the basic coding unit to efficiently code multiview sequences, and the proposed CODEC provides flexible GGOP structures based on the number of views and the baseline distances among cameras. The encoder generates two bitstreams: a main bitstream, identical to an MPEG-2 mono-sequence bitstream for MPEG-2 compatibility, and an auxiliary bitstream containing the remaining multiview sequences other than the reference sequences. With view scalability, the proposed CODEC provides several viewers with a sense of reality, or a single viewer with motion parallax, whereby changes in the viewer's position change what is seen. The important point is that the number of viewpoints is selectively determined at the receiver according to the display mode: viewers can choose an arbitrary number of views so that only the selected views are decoded and displayed. The proposed multiview sequence CODEC is tested on several multiview sequences to verify its flexibility, compatibility, and view scalability. In addition, we subjectively confirm that the decoded bitstreams with view scalability can be properly displayed on several types of display modes, including 3D monitors.

16.
To effectively fill the common holes in virtual-viewpoint images, a hole-filling method based on reverse mapping is proposed. First, depth-image-based rendering (DIBR) maps the left and right reference views to the virtual viewpoint, and image dilation enlarges the hole regions in the mapped virtual views to eliminate ghosting artifacts. Then the boundaries of the dilated hole regions are extracted and reverse-mapped into the original reference images; according to the relative position of each hole and its boundary, pixels at the corresponding positions in the original images are selected to fill the hole regions in the virtual views. Finally, the hole-filled virtual views mapped from the left and right viewpoints are fused to obtain the final virtual view. Experiments show that the method effectively solves the problem of traditional hole-filling methods filling background regions with foreground pixels, and achieves better visual quality and higher objective peak signal-to-noise ratio (PSNR) values.

17.
Next-Generation HEVC-Based 3D Video Coding Technology
With the HEVC standard finalized, a new generation of HEVC-based multiview-plus-depth coding will be formally released. HEVC-based 3D video coding, an extension of the HEVC standard, mainly targets stereoscopic television and autostereoscopic free-viewpoint video. Starting from the basic structure of this coding scheme, this article comprehensively introduces the key technologies in three areas — video coding, depth map coding, and encoder control for depth maps — including inter-view motion prediction, depth modeling modes, and view synthesis optimization.

18.
In this paper, we propose a key-frame-based bi-directional depth propagation algorithm for semi-automatic 2D-to-3D stereoscopic video conversion. First, key frames are identified in each video shot based on color motion-compensation errors, to prevent high-motion content between any pair of consecutive key frames. Depths for key frames are manually assigned or rendered by popular computer tools, and then bi-directionally propagated to the non-key frames between them. Our depth propagation algorithm features a multi-pass error-correction procedure for each frame, preventing depth artifacts from being propagated further to adjacent frames. The proposed algorithm is advantageous in solving the background occlusion/dis-occlusion problem that degrades the performance of traditional depth propagation algorithms. Experimental results show that our scheme achieves better results than three prior algorithms in terms of the quality of the estimated depth maps (e.g., dis-occluded background and object boundaries) and the synthesized stereo views.

19.
Many alternative transforms have recently been developed for improved compression of images, intra-prediction residuals, and motion-compensated prediction residuals. In this paper, we propose alternative transforms for multiview video coding. We analyze the spatial characteristics of disparity-compensated prediction residuals, and the analysis shows that many regions have 1-D signal characteristics, similar to previous findings for motion-compensated prediction residuals. Signals with such characteristics can be transformed more efficiently with transforms adapted to them, and we propose using 1-D transforms in the compression of disparity-compensated prediction residuals in multiview video coding. To show the achievable compression gains, we modify the reference software (JMVC) of the multiview video coding amendment to H.264/AVC so that each residual block can be transformed either with a 1-D transform or with the conventional 2-D discrete cosine transform. Experimental results show coding gains ranging from about 1% to 15% in Bjontegaard-Delta bitrate savings.
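Why a 1-D transform helps on 1-D residual structure can be shown with a toy DCT comparison. The row-wise 1-D DCT below is a minimal stand-in for the paper's adapted transforms, not the JMVC implementation:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0] /= np.sqrt(2.0)
    return m

def transform_block(block, mode="2d"):
    """Transform a square residual block.

    "1d-rows" applies a 1-D DCT along each row only, which suits blocks
    whose residual energy has horizontal 1-D structure; "2d" is the
    conventional separable 2-D DCT.
    """
    C = dct_matrix(block.shape[0])
    if mode == "1d-rows":
        return block @ C.T  # transform rows only
    return C @ block @ C.T  # separable 2-D transform
```

On a block whose rows are constant (pure 1-D structure), the row-wise transform compacts each row's energy into a single coefficient, which is the compaction gain the paper exploits.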

20.
In Free-Viewpoint TV applications, pre-estimated depth information is available to synthesize intermediate views as well as to assist multi-view video coding. Existing view synthesis prediction schemes generate the virtual view picture only from inter-view pictures. However, there are many types of signal mismatch caused by depth errors, camera heterogeneity, or illumination differences across views, and these mismatches decrease the prediction capability of the virtual view picture. In this paper, we propose an adaptive-learning-based view synthesis prediction algorithm to enhance the prediction capability of the virtual view picture. The algorithm integrates least-square prediction with backward warping to synthesize the virtual view picture, using not only adjacent-view information but also temporally decoded information to adaptively learn the prediction coefficients. Experiments show that the proposed method reduces bitrates by up to 18% relative to the multi-view video coding standard, and by about 11% relative to the conventional view synthesis prediction method.
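The least-square prediction step can be sketched as follows. The plain linear model and the names are illustrative, not the paper's exact formulation; the key property is that the training samples are causal, so the decoder can re-derive the same coefficients without side information:

```python
import numpy as np

def ls_view_prediction(train_context, train_target, context):
    """Adaptively learn linear prediction coefficients by least squares.

    `train_context` / `train_target` are causal samples both encoder and
    decoder already have (e.g. from adjacent views and previously decoded
    frames), so the learned coefficients need not be transmitted.
    `context` holds the samples for the pixels being predicted.
    """
    coef, *_ = np.linalg.lstsq(train_context, train_target, rcond=None)
    return context @ coef, coef
```

Because the fit adapts per region, the same mechanism absorbs illumination differences and mild depth-error mismatches that a fixed warping-only predictor cannot.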


Copyright©北京勤云科技发展有限公司  京ICP备09084417号