期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

No-reference synthetic image quality assessment with convolutional neural network and local image saliency

Wang Xiaochuan Liang Xiaohui Yang Bailin Li Frederick W. B. 《计算可视媒体（英文）》2019,5(2):193-208

Depth-image-based rendering (DIBR) is widely used in 3DTV, free-viewpoint video, and interactive 3D graphics applications. Typically, synthetic images generated by DIBR-based systems incorporate various distortions, particularly geometric distortions induced by object dis-occlusion. Ensuring the quality of synthetic images is critical to maintaining adequate system service. However, traditional 2D image quality metrics are ineffective for evaluating synthetic images as they are not sensitive to geometric distortion. In this paper, we propose a novel no-reference image quality assessment method for synthetic images based on convolutional neural networks, introducing local image saliency as prediction weights. Due to the lack of existing training data, we construct a new DIBR synthetic image dataset as part of our contribution. Experiments were conducted on both the public benchmark IRCCyN/IVC DIBR image dataset and our own dataset. Results demonstrate that our proposed metric outperforms traditional 2D image quality metrics and state-of-the-art DIBR-related metrics.

相似文献

2.

A SIFT features based blind watermarking for DIBR 3D images

Seung-Hun Nam Wook-Hyoung Kim Seung-Min Mun Jong-Uk Hou Sunghee Choi Heung-Kyu Lee 《Multimedia Tools and Applications》2018,77(7):7811-7850

Depth image based rendering (DIBR) is a promising technique for extending viewpoints with a monoscopic center image and its associated per-pixel depth map. With its numerous advantages including low-cost bandwidth, 2D-to-3D compatibility and adjustment of depth condition, DIBR has received much attention in the 3D research community. In the case of a DIBR-based broadcasting system, a malicious adversary can illegally distribute both a center view and synthesized virtual views as 2D and 3D content, respectively. To deal with the issue of copyright protection for DIBR 3D Images, we propose a scale invariant feature transform (SIFT) features based blind watermarking algorithm. To design the proposed method robust against synchronization attacks from DIBR operation, we exploited the parameters of the SIFT features: the location, scale and orientation. Because the DIBR operation is a type of translation transform, the proposed method uses high similarity between the SIFT parameters extracted from a synthesized virtual view and center view images. To enhance the capacity and security, we propose an orientation of keypoints based watermark pattern selection method. In addition, we use the spread spectrum technique for watermark embedding and perceptual masking taking into consideration the imperceptibility. Finally, the effectiveness of the presented method was experimentally verified by comparing with other previous schemes. The experimental results show that the proposed method is robust against synchronization attacks from DIBR operation. Furthermore, the proposed method is robust against signal distortions and typical attacks from geometric distortions such as translation and cropping. 相似文献

3.

An enhanced depth map based rendering method with directional depth filter and image inpainting

Wei Liu Dehua Zhang Mingyue Cui Jianwei Ding 《The Visual computer》2016,32(5):579-589

Depth image-based rendering (DIBR), which is used to render virtual views with a color image and the corresponding depth map, is one of the key techniques in the 2D to 3D video conversion process. In this paper, a novel method is proposed to partially solve two puzzles of DIBR, i.e. visual image generation and hole filling. The method combines two different approaches for synthesizing new views from an existing view and a corresponding depth map. Disoccluded parts of the synthesized image are first classified as either smooth or highly structured. At structured regions, inpainting is used to preserve the background structure. In other regions, an improved directional depth smoothing is used to avoid disocclusion. Thus, more details and straight line structures in the generated virtual image are preserved. The key contributions include an enhanced adaptive directional filter and a directional hole inpainting algorithm. Experiments show that the disocclusion is removed and the geometric distortion is reduced efficiently. The proposed method can generate more visually satisfactory results. 相似文献

4.

Elemental image array generation method by using optimized depth image‐based rendering algorithm for integral imaging display

下载免费PDF全文

Fei Xiao Zhao‐Long Xiong Huan Deng Yan Xing Qiong‐Hua Wang 《Journal of the Society for Information Display》2018,26(7):419-426

We propose an elemental image array (EIA) generation method by using an optimized depth image‐based rendering (DIBR) algorithm. In this method, the EIA is synthesized by the reference and virtual viewpoint elemental images, and the virtual viewpoint elemental images at the given locations are generated by DIBR algorithm. We optimize the existing DIBR algorithm by adaptively repairing the warped depth images in the processing part and extend the generation dimension of the virtual viewpoint elemental images from one dimension to two dimensions. The optimized DIBR algorithm can effectively solve the problem: the low quality of virtual viewpoint elemental images caused by discontinuous depth values and disocclusion regions. We also implement the generations of virtual viewpoint elemental images and EIA in graph processing unit to reduce the time cost. Experimental results show that the proposed method can not only improve the quality of the virtual viewpoint images but also accelerate the generations of the virtual viewpoint elemental images and EIA. 相似文献

5.

基于空间加权的虚拟视点绘制算法

汪辉陈芬焦任直彭宗举郁梅《计算机工程与应用》2016,52(8):174-179

在自由视点视频系统中,如何能在视频终端得到高质量的视频图像已成为基于深度图的绘制（DIBR）技术所研究的主要任务,其中虚拟视点像素插值是该技术中影响绘制质量的一个重要环节。针对虚拟视点绘制标准方案中存在的问题,提出了一种基于空间加权的像素插值算法。它是通过对多个投影像素点的深度值和水平方向绝对距离进行加权操作来实现像素插值的。在插值过程中,该算法考虑了不同区域投影像素点个数对像素插值准确性的影响,从而剔除了部分失真像素点,并且在图像输出前还分别对左、右参考虚拟视点进行了失真检测和矫正。实验结果表明,该算法改善了绘制的主、客观质量,其中,PSNR平均提高0.30 dB,SSIM平均提高0.001 3。因此,该算法可以有效地抑制像素插值过程引入的噪声,提高像素插值的精度。相似文献

6.

Analysis of maximum tolerant depth distortion in view synthesis

Laihua Wang Chunping Hou Sumin Qi Lanlan Jiang 《Multimedia Tools and Applications》2018,77(7):7909-7927

In view synthesis, pixels in an original view are warped into a virtual view with depth-image-based rendering (DIBR). During the procedure of DIBR, distortions in depth map may lead to geometric errors in the synthesized view which will induce quality degradation of synthesized view. Therefore, how to efficiently preserve the fidelity of depth information is extremely important. In this paper, we explore and develop a maximum tolerable depth distortion (MTDD) model to examine the allowable depth distortion which will not introduce any texture distortion for a rendered virtual view and accordingly develop. Experimental results show that a virtual view can be synthesized without introducing any geometric changes if depth distortions follow the MTDD specified thresholds. 相似文献

7.

Depth‐map generation for multi‐view autostereoscopic 3‐D displays based on the SIFT algorithm constrained by boundary

Ying‐Quan Zhang Qiong‐Hua Wang Ai‐Hong Wang Da‐Hai Li 《Journal of the Society for Information Display》2010,18(7):513-518

Abstract— A depth‐map estimation method, which converts two‐dimensional images into three‐dimensional (3‐D) images for multi‐view autostereoscopic 3‐D displays, is presented. The proposed method utilizes the Scale Invariant Feature Transform (SIFT) matching algorithm to create the sparse depth map. The image boundaries are labeled by using the Sobel operator. A dense depth map is obtained by using the Zero‐Mean Normalized Cross‐Correlation (ZNCC) propagation matching method, which is constrained by the labeled boundaries. Finally, by using depth rendering, the parallax images are generated and synthesized into a stereoscopic image for multi‐view autostereoscopic 3‐D displays. Experimental results show that this scheme achieves good performances on both parallax image generation and multi‐view autostereoscopic 3‐D displays. 相似文献

8.

Silhouette‐Aware Warping for Image‐Based Rendering

Gaurav Chaurasia Olga Sorkine George Drettakis 《Computer Graphics Forum》2011,30(4):1223-1232

Image‐based rendering (IBR) techniques allow capture and display of 3D environments using photographs. Modern IBR pipelines reconstruct proxy geometry using multi‐view stereo, reproject the photographs onto the proxy and blend them to create novel views. The success of these methods depends on accurate 3D proxies, which are difficult to obtain for complex objects such as trees and cars. Large number of input images do not improve reconstruction proportionally; surface extraction is challenging even from dense range scans for scenes containing such objects. Our approach does not depend on dense accurate geometric reconstruction; instead we compensate for sparse 3D information by variational image warping. In particular, we formulate silhouette‐aware warps that preserve salient depth discontinuities. This improves the rendering of difficult foreground objects, even when deviating from view interpolation. We use a semi‐automatic step to identify depth discontinuities and extract a sparse set of depth constraints used to guide the warp. Our framework is lightweight and results in good quality IBR for previously challenging environments. 相似文献

9.

Depth image‐based rendering for multiview generation

Young Ju Jeong Yong Ju Jung Du‐sik Park 《Journal of the Society for Information Display》2010,18(4):310-316

Abstract— Techniques for 3‐D display have evolved from stereoscopic 3‐D systems to multiview 3‐D systems, which provide images corresponding to different viewpoints. Currently, new technology is required for application in multiview display systems that use input‐source formats such as 2‐D images to generate virtual‐view images of multiple viewpoints. Due to the changes in viewpoints, occlusion regions of the original image become disoccluded, resulting in problems related to the restoration of output image information that is not contained in the input image. In this paper, a method for generating multiview images through a two‐step process is proposed: (1) depth‐map refinement and (2) disoccluded‐area estimation and restoration. The first step, depth‐map processing, removes depth‐map noise, compensates for mismatches between RGB and depth, and preserves the boundaries and object shapes. The second step, disoccluded‐area estimation and restoration, predicts the disoccluded area by using disparity and restores information about the area by using information about neighboring frames that are most similar to the occlusion area. Finally, multiview rendering generates virtual‐view images by using a directional rendering algorithm with boundary blending. 相似文献

10.

Hiding depth information in compressed 2D image/video using reversible watermarking

Wenyi Wang Jiying Zhao 《Multimedia Tools and Applications》2016,75(8):4285-4303

In this paper, a novel joint coding scheme is proposed for 3D media content including stereo images and multiview-plus-depth (MVD) video for the purpose of depth information hiding. The depth information is an image or image channel which reveals the distance of scene objects’ surfaces from a viewpoint. With the concern of copyright protection, access control and coding efficiency for 3D content, we propose to hide the depth information into the texture image/video by a reversible watermarking algorithm called Quantized DCT Expansion (QDCTE). Considering the crucial importance of depth information for depth-image-based rendering (DIBR), full resolution depth image/video is compressed and embedded into the texture image/video, and it can be extracted without extra quality degradation other than compression itself. The reversibility of the proposed algorithm guarantees that texture image/video quality will not suffer from the watermarking process even if high payload (i.e. depth information) is embedded into the cover image/video. In order to control the size increase of watermarked image/video, the embedding function is carefully selected and the entropy coding process is also customized according to watermarking strength. Huffman and content-adaptive variable-length coding (CAVLC), which are respectively used for JPEG image and H.264 video entropy encoding, are analyzed and customized. After depth information embedding, we propose a new method to update the entropy codeword table with high efficiency and low computational complexity according to watermark embedding strength. By using our proposed coding scheme, the depth information can be hidden into the compressed texture image/video with little bitstream size overhead while the quality degradation of original cover image/video from watermarking can be completely removed at the receiver side. 相似文献

11.

View invariant DIBR-3D image watermarking using DT-CWT

Rana Shuvendu Sur Arijit 《Multimedia Tools and Applications》2019,78(12):16665-16693

In 3D image compression, depth image based rendering (DIBR) is one of the latest techniques where the center image (say the main view, is used to synthesise the left and the right view image) and the depth image are communicated to the receiver side. It has been observed in the literature that most of the existing 3D image watermarking schemes are not resilient to the view synthesis process used in the DIBR technique. In this paper, a 3D image watermarking scheme is proposed which is invariant to the DIBR view synthesis process. In this proposed scheme, 2D-dual-tree complex wavelet transform (2D-DT-CWT) coefficients of centre view are used for watermark embedding such that shift invariance and directional property of the DT-CWT can be exploited to make the scheme robust against view synthesis process. A comprehensive set of experiments has been carried out to justify the robustness of the proposed scheme over the related existing schemes with respect to the JPEG compression and synthesis view attack.

相似文献

12.

An automatic 2D to 3D video conversion approach based on RGB-D images

Pan Baiyu Zhang Liming Yin Hanxiong Lan Jun Cao Feilong 《Multimedia Tools and Applications》2021,80(13):19179-19201

3D movies/videos have become increasingly popular in the market; however, they are usually produced by professionals. This paper presents a new technique for the automatic conversion of 2D to 3D video based on RGB-D sensors, which can be easily conducted by ordinary users. To generate a 3D image, one approach is to combine the original 2D color image and its corresponding depth map together to perform depth image-based rendering (DIBR). An RGB-D sensor is one of the inexpensive ways to capture an image and its corresponding depth map. The quality of the depth map and the DIBR algorithm are crucial to this process. Our approach is twofold. First, the depth maps captured directly by RGB-D sensors are generally of poor quality because there are many regions missing depth information, especially near the edges of objects. This paper proposes a new RGB-D sensor based depth map inpainting method that divides the regions with missing depths into interior holes and border holes. Different schemes are used to inpaint the different types of holes. Second, an improved hole filling approach for DIBR is proposed to synthesize the 3D images by using the corresponding color images and the inpainted depth maps. Extensive experiments were conducted on different evaluation datasets. The results show the effectiveness of our method.

相似文献

13.

基于DIBR和图像融合的任意视点绘制 总被引：1，自引：1，他引：1

下载免费PDF全文

刘占伟安平刘苏醒张兆扬《中国图象图形学报》2007,12(10):1696-1700

虚拟视点生成是3维视频会议等应用领域中的关键技术,为了快速高质量地进行任意视点绘制,提出了一种基于深度图像绘制(DIBR)和图像融合的新视点生成方法,该方法首先对参考图像进行预处理,包括深度图像的边缘滤波和参考图像规正,以减少目标图像中产生的较大空洞和虚假边缘;然后利用3维图像变换生成新视点图像,并用遮挡兼容算法对遮挡进行快速处理;接着再对两幅目标图像进行融合得到新视点图像;最后用插值法填充剩余的较小空洞。实验证明,该新方法能获得令人满意的绘制效果。相似文献

14.

快速3维坐标变换的绘制算法

下载免费PDF全文

汪辉彭宗举焦仁直陈芬郁梅蒋刚毅《中国图象图形学报》2016,21(6):805-814

目的基于深度图的绘制(DIBR)是一种新型的虚拟视点生成技术,在诸多方面得到了广泛的应用。然而,该技术还不能满足实时性的绘制需求。为了在保证绘制质量不下降的前提下,尽可能地提高绘制速度,提出了一种高效的3D-Warping(3维坐标变换)算法。方法主要在以下3个方面进行了改进:1)引入了深度—视差映射表技术,避免了重复地进行视差求取操作。2)对深度平坦的像素块进行基于块的3D-Warping,减少了映射的次数。对深度非平坦像素块中的像素点采取传统的基于像素点的3D-Warping,保证了映射的准确性。3)针对两种不同的3D-Warping方式,分别提出了相应的插值算法。在水平方向上,改进的像素插值算法对紧邻插值和Splatting(散射)插值算法进行了折中,只在映射像素点与待插值像素点很近的情况下才进行紧邻插值,否则进行Splatting插值;在深度方向上,它对Z-Buffer(深度缓存)技术进行了改进,舍弃了与前景物体太远的映射像素点,而对其他映射像素点按深度值进行加权操作。结果实验结果表明,与标准绘制方案的整像素精度相比,绘制时间平均节省了72.05%;与标准绘制方案的半像素精度相比,PSNR平均提高了0.355dB,SSIM平均提高了0.00115。结论改进算法非常适用于水平设置相机系统的DIBR技术中的整像素精度绘制,对包含大量深度平坦区域的视频序列效果明显,不但能够提高绘制的速度,而且可以有效地改善绘制的客观质量。相似文献

15.

Influence of 3‐D cross‐talk on qualified viewing spaces in two‐ and multi‐view autostereoscopic displays

Akimasa Yuuki Shin‐ichi Uehara Kazuki Taira Goro Hamagishi Kuniaki Izumi Toshio Nomura Ken Mashitani Atsushi Miyazawa Takafumi Koike Naoko Watanabe Yuzo Hisatake Tsutomu Horikoshi Shigeki Miyazaki Hiroyasu Ujike 《Journal of the Society for Information Display》2010,18(7):483-493

Abstract— To estimate the qualified viewing spaces for two‐ and multi‐view autostereoscopic displays, the relationship between image quality (image comfort, annoying ghost image, depth perception) and various pairings between 3‐D cross‐talk in the left and right views are studied subjectively using a two‐view autostereoscopic display and test charts for the left and right views with ghost images due to artificial 3‐D cross‐talk. The artificial 3‐D cross‐talk was tuned to simulate the view in the intermediate zone of the viewing spaces. It was shown that the stereoscopic images on a two‐view autostereoscopic display cause discomfort when they are observed by the eye in the intermediate zone between the viewing spaces. This is because the ghost image due to large 3‐D cross‐talk in the intermediate zone elicits different depth perception from the depth induced by the original images for the left and right views, so the observer's depth perception is confused. Image comfort is also shown to be better for multi‐views, especially the width of the viewing space, which is narrower than the interpupillary distance, where the parallax of the cross‐talking image is small. 相似文献

16.

对3D Warping分解的高质量虚拟视点绘制方法

下载免费PDF全文

郭秋红梁秀霞《计算机工程与应用》2019,55(3):184-190

基于深度图像的虚拟视点绘制（DIBR）作为自由视点视频应用中的核心问题,由其获得的任意视频的质量和速度对于自由视点视频的发展至关重要。为解决经典DIBR方法存在的重叠、空洞、伪影,以及由于重采样导致的细小裂纹问题以提高虚拟视点图像质量,提出了一种基于改进的3D Warping过程的虚拟视点绘制方法。该方法从整个3D Warping过程出发,将3D Warping过程分解成两步而无需进行三维建模,并分别在两步中提出了改进的方法。在第一步中,通过在变换矩阵中引入可调的系数对三维参数进行修改;在第二步中,主要针对由于重采样导致的细小的裂纹问题,提出了自适应一投多算法,在兼顾时间复杂度的同时改善了裂纹问题。实验结果表明,无论从主观质量还是客观评价标准来看,该方法都能够显著提高虚拟视点绘制的图像质量。相似文献

17.

Exploiting Repetitions for Image‐Based Rendering of Facades

下载免费PDF全文

Simon Rodriguez Adrien Bousseau Fredo Durand George Drettakis 《Computer Graphics Forum》2018,37(4):119-131

Street‐level imagery is now abundant but does not have sufficient capture density to be usable for Image‐Based Rendering (IBR) of facades. We present a method that exploits repetitive elements in facades ‐ such as windows ‐ to perform data augmentation, in turn improving camera calibration, reconstructed geometry and overall rendering quality for IBR. The main intuition behind our approach is that a few views of several instances of an element provide similar information to many views of a single instance of that element. We first select similar instances of an element from 3–4 views of a facade and transform them into a common coordinate system, creating a “platonic” element. We use this common space to refine the camera calibration of each view of each instance and to reconstruct a 3D mesh of the element with multi‐view stereo, that we regularize to obtain a piecewise‐planar mesh aligned with dominant image contours. Observing the same element under multiple views also allows us to identify reflective areas ‐ such as glass panels ‐ which we use at rendering time to generate plausible reflections using an environment map. Our detailed 3D mesh, augmented set of views, and reflection mask enable image‐based rendering of much higher quality than results obtained using the input images directly. 相似文献

18.

Deep compression of remotely rendered views

《Multimedia, IEEE Transactions on》2006,8(3):444-456

Three-dimensional (3-D) models are information-rich and provide compelling visualization effects. However downloading and viewing 3-D scenes over the network may be excessive. In addition low-end devices typically have insufficient power and/or memory to render the scene interactively in real-time. Alternatively,3-D image warping, an image-based-rendering technique that renders a two-dimensional(2-D) depth view to form new views intended from different viewpoints and/or orientations, may be employed on a limited device. In a networked 3-D environment,the warped views may be further compensated by the graphically rendered views and transmitted to clients at times. Depth views can be considered as a compact model of 3-D scenes enabling the remote rendering of complex 3-D environment on relatively low-end devices. The major overhead of the 3-D image warping environment is the transmission of the depth views of the initial and subsequent references. This paper addresses the issue by presenting an effective remote rendering environment based on the deep compression of depth views utilizing the context statistics structure present in depth views. The warped image quality is also explored by reducing the resolution of the depth map. It is shown that proposed deep compression of the remote rendered view significantly outperforms the JPEG2000 and enables the realtime rendering of remote 3-D scene while the degradation of warped image quality is visually imperceptible for the benchmark scenes. 相似文献

19.

Distance metric between 3D models and 2D images for recognition andclassification

Basri R. Weinshall D. 《IEEE transactions on pattern analysis and machine intelligence》1996,18(4):465-479

Similarity measurements between 3D objects and 2D images are useful for the tasks of object recognition and classification. The authors distinguish between two types of similarity metrics: metrics computed in image-space (image metrics) and metrics computed in transformation-space (transformation metrics). Existing methods typically use image metrics; namely, metrics that measure the difference in the image between the observed image and the nearest view of the object. Example for such a measure is the Euclidean distance between feature points in the image and their corresponding points in the nearest view. (This measure can be computed by solving the exterior orientation calibration problem.) In this paper the authors introduce a different type of metrics: transformation metrics. These metrics penalize for the deformations applied to the object to produce the observed image. In particular, the authors define a transformation metric that optimally penalizes for “affine deformations” under weak-perspective. A closed-form solution, together with the nearest view according to this metric, are derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. It therefore provides an easy-to-use closed-form approximation for the commonly-used least-squares distance between models and images. The authors demonstrate an image understanding application, where the true dimensions of a photographed battery charger are estimated by minimizing the transformation metric 相似文献

20.

Characterizing image quality of autostereoscopic displays by using the maximum luminance at each viewing position

Tsutomu Horikoshi Shinji Kimura 《Journal of the Society for Information Display》2014,22(1):1-8

A metric of the 3D image quality of autostereoscopic displays based on optical measurements is proposed. This metric uses each view's luminance contrast, which is defined as the ratio of maximum luminance at each viewing position to total luminance at that position. Conventional metrics of the autostereoscopic display based on crosstalk, which uses “wanted” and “unwanted” lights. However, in case of the multiple‐views‐type autostereoscopic displays, it is difficult to distinguish exactly which lights are wanted lights and which are unwanted lights. This paper assumes that the wanted light has a maximum luminance at the good stereoscopic viewing position, and the unwanted light also has a maximum luminance at the worst pseudo‐stereoscopic viewing position. By using the maximum luminance that is indexed by view number of the autostereoscopic display, the proposed method enables characterizing stereoscopic viewing conditions without using wanted/unwanted light. A 3D image quality metric called “stereo luminance contrast,” the average of both eyes' contrast, is proposed. The effectiveness of the proposed metric is confirmed by the results of optical measurement analyses of different types of autostereoscopic displays, such as the two‐view, scan‐backlight, multi‐view, and integral. 相似文献