Similar Articles
1.
In image-based rendering with adjustable illumination, the data set contains a large number of pre-captured images taken under different sampling lighting directions. Instead of compressing each pre-captured image individually, we propose a two-level compression method. First, we use a few spherical harmonic (SH) coefficients to represent the plenoptic property of each pixel. The classical discrete-summation method for extracting SH coefficients requires the sampling lighting directions to be uniformly distributed over the whole spherical surface and cannot handle irregularly distributed sampling directions, so we propose a constrained least-squares algorithm for this case. Afterwards, embedded zero-tree wavelet coding is used to remove the spatial redundancy in the SH coefficients. Simulation results show that our approach substantially outperforms JPEG, JPEG2000, MPEG-2, and 4D wavelet compression. We also discuss how users can interactively control the lighting conditions of a scene.
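A minimal numpy sketch of the least-squares idea behind the SH fitting step: for irregularly placed lighting directions, the SH coefficients of a pixel can be recovered by solving a linear system instead of using the uniform-sampling discrete summation. This uses only the first-order real SH basis and plain (unconstrained) least squares; the paper's constrained variant and higher orders are omitted, and all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Irregular unit lighting directions (n x 3), not uniformly spread on the sphere.
dirs = rng.normal(size=(50, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

# First-order real SH basis at each direction: [Y00, Y1-1, Y10, Y11].
c0, c1 = 0.2820948, 0.4886025
B = np.column_stack([np.full(len(dirs), c0),
                     c1 * dirs[:, 1], c1 * dirs[:, 2], c1 * dirs[:, 0]])

# Synthesize one pixel's plenoptic samples from known coefficients plus noise.
true_coeffs = np.array([1.0, 0.3, -0.5, 0.2])
samples = B @ true_coeffs + 0.01 * rng.normal(size=len(dirs))

# Least-squares recovery of the SH coefficients from irregular samples.
coeffs, *_ = np.linalg.lstsq(B, samples, rcond=None)
err = np.abs(coeffs - true_coeffs).max()
```

The recovered `coeffs` closely match `true_coeffs` even though the directions are irregular, which is exactly the case the discrete-summation method cannot handle.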

2.
Depth image based rendering is one of the key techniques for realizing view synthesis for three-dimensional television and free-viewpoint television, which provide high-quality and immersive experiences to end viewers. However, artifacts in the rendered images, including holes caused by occlusion/disocclusion and boundary artifacts, may degrade subjective and objective image quality. To handle these problems and improve the quality of rendered images, we present a novel view-spatial-temporal post-refinement method for view synthesis, in which new hole-filling and boundary-artifact-removal techniques are proposed. In addition, we propose an optimal reference frame selection algorithm for a better trade-off between computational complexity and rendered image quality. Experimental results show that the proposed method achieves a peak signal-to-noise ratio gain of 0.94 dB on average for multiview video test sequences compared with the benchmark view synthesis reference software. The subjective quality of the rendered images is also improved.
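A short sketch of the PSNR metric behind the quoted ~0.94 dB average gain, assuming 8-bit images (the standard definition, not code from the paper):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4), dtype=np.uint8)
noisy = ref.copy()
noisy[0, 0] = 16          # a single distorted pixel
value = psnr(ref, noisy)  # about 36.1 dB
```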

3.
Free navigation of a scene requires warping some reference views to a desired target viewpoint and blending them to synthesize a virtual view. Methods based on Convolutional Neural Networks (ConvNets) can learn the warping and blending tasks jointly. Such methods are often designed for moderate inter-camera baselines, and larger kernels are required for warping as the baseline grows. Algorithmic methods can in principle deal with large baselines, but the synthesized view suffers from artifacts near disoccluded pixels. We present a hybrid approach in which reference views are first algorithmically warped to the target position and then blended by a ConvNet. The preliminary warping allows smaller convolutional kernels and thus fewer learnable parameters. We propose a residual encoder-decoder for image blending with a Siamese encoder to keep the parameter count low, and we contribute a hole-inpainting algorithm to fill the disocclusions in the warped views. View synthesis experiments on real multiview sequences show better objective image quality than state-of-the-art methods, owing to fewer artifacts in the synthesized images.

4.
Light Field (LF) image angular super-resolution aims to synthesize a high-angular-resolution LF image from a low-angular-resolution one, and is drawing increasing attention because of its wide applications. Many learning-based methods have been proposed for this task. However, most existing methods are based on the LF Epipolar Plane Image (EPI) or EPI-volume representation, which underuses the LF image structure: the spatial correlation within an LF view and the angular correlations between neighboring LF views, both of which reflect the LF image structure, are not fully exploited, reducing angular super-resolution quality. To alleviate this problem, this paper introduces an Epipolar Plane Image Volume Stack (EPI-VS) representation for LF angular super-resolution. The EPI-VS is constructed by arranging all LF views in raster order, which facilitates exploiting the spatial correlation within a view and the angular correlations between neighboring views. Based on this representation, we further propose an LF angular super-resolution network. 3D convolutions are applied throughout the network to better accommodate the input EPI-VS data and to allow information propagation across its two spatial dimensions and one directional dimension. Extensive experiments on synthetic and real-world LF scenes demonstrate the effectiveness of the proposed network. We also illustrate its superiority by applying it to the scene depth estimation task.

5.
Multi-view video plus depth (MVD) has been widely used owing to its effectiveness in three-dimensional data representation. With MVD, color videos from only a limited number of real viewpoints are compressed and transmitted along with captured or estimated depth videos. Because the synthesized views are generated from decoded real views, their original reference views exist at neither the transmitter nor the receiver, making it challenging to define an efficient metric for evaluating the quality of synthesized images. We propose a novel reduced-reference quality metric. First, the effects of depth distortion on the quality of synthesized images are analyzed. We then exploit the high correlation between the local depth distortions and local color characteristics of the decoded depth and color images, respectively, to obtain an efficient depth quality metric for each real view. Finally, the objective quality metric of the synthesized views is obtained by combining the depth quality metrics from all decoded real views. Experimental results show that the proposed metric correlates very well with full-reference image and video quality metrics.

6.
A new tensor-transfer-based novel view synthesis (NVS) method is proposed in this paper. As opposed to conventional tensor-transfer methods, which transfer pixels from the real input views to the virtual novel view, our method operates inversely: it transfers a pixel from the novel view image back to the real images. This inverse tensor-transfer approach offers a simple mechanism for associating corresponding image points across multiple views, resulting in geometrically consistent pixel chains across the input images. A colour consistency metric chooses the most likely colour for a pixel in the novel image by analysing the spread of colours in each possible pixel chain. By emphasizing colour consistency rather than depth, our method avoids precomputing a dense depth map (which is essential for most conventional transfer methods), thereby alleviating many common problems of those methods. Experiments on real images give promising results: the synthesized novel view is not only photo-realistic but also geometrically consistent with the other views. Since the method avoids explicit depth-map computation, we further investigate its applicability to the multi-baseline stereo matching (MBS) problem. Using the inverse-transfer idea, we can handle non-ideally configured MBS in a natural and efficient way. The new MBS algorithm has been used for stereo vision navigation.
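A minimal sketch of the colour-consistency selection step: each depth hypothesis for a novel-view pixel yields a "pixel chain" of corresponding colours in the input views, and the chain with the smallest colour spread is taken as the most likely one. The function and data below are illustrative, not the paper's implementation.

```python
import numpy as np

def pick_colour(chains):
    """chains: list of (n_views, 3) arrays of RGB samples, one per hypothesis.
    Returns the index of the most colour-consistent chain and its mean colour."""
    spreads = [np.mean(np.var(c, axis=0)) for c in chains]  # per-chain colour spread
    best = int(np.argmin(spreads))
    return best, np.mean(chains[best], axis=0)

# A consistent chain (same surface point seen in 3 views) vs an inconsistent one.
consistent = np.array([[100., 101., 99.], [102., 100., 100.], [99., 100., 101.]])
inconsistent = np.array([[10., 200., 30.], [240., 20., 90.], [60., 60., 220.]])

best_idx, colour = pick_colour([inconsistent, consistent])  # picks index 1
```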

7.
Vincent Nozick, 《电信纪事》, 2013, 68(11-12): 581-596
This paper presents an image rectification method for an arbitrary number of views with aligned camera centers, and describes how to extend it to perform robust camera calibration. These two techniques can be used in stereoscopic rendering to enhance perceptual comfort, or for depth-from-stereo. We first explain why epipolar geometry is not suited to this problem. We then propose a nonlinear method that includes all the images in the rectification process, and detail how to extract the rectification parameters to provide a quasi-Euclidean camera calibration. Our method requires only point correspondences between the views and can handle images with different resolutions. Tests show that it is robust to noise and to sparse point correspondences among the views.

8.
The quality of views synthesized by Depth Image Based Rendering (DIBR) depends highly on the accuracy of the depth map, especially the alignment of its edges with object boundaries in the texture image. In practice, misalignment of sharp depth-map edges is the major cause of annoying artifacts in the disoccluded regions of the synthesized views. The conventional smoothing-filter approach blurs the depth map to reduce the disoccluded regions, at the cost of degrading the 3D perception of the reconstructed video and destroying texture in background regions. Conventional edge-preserving filters use the color image to align depth edges with color edges; unfortunately, the characteristics of color edges and depth edges differ greatly, which causes annoying boundary artifacts in the synthesized virtual views. A recent reliability-based approach fills the holes with reliable warping information from other views, but it is not suitable for view synthesis in video-plus-depth DIBR applications. In this paper, a new depth-map preprocessing approach is proposed. It uses watershed color segmentation to correct depth-map misalignment, and then extends the depth-map object boundaries to cover the transitional edge regions of the color image. The approach can handle sharp depth-map edges lying either inside or outside the 2D object boundaries. The quality of the disoccluded regions of the synthesized views is significantly improved, and unknown depth values can also be estimated. Experimental results show that the proposed method achieves superior performance for DIBR view synthesis, especially when generating large-baseline virtual views.

9.
This paper addresses the problem of efficient representation of scenes captured by distributed omnidirectional vision sensors. We propose a novel geometric model to describe the correlation between different views of a 3-D scene. We first approximate the camera images by sparse expansions over a dictionary of geometric atoms. Since the most important visual features are likely to be equivalently dominant in images from multiple cameras, we model the correlation between corresponding features in different views by local geometric transforms. For the particular case of omnidirectional images, we define the multiview transforms between corresponding features based on shape and epipolar geometry constraints. We apply this geometric framework in the design of a distributed coding scheme with side information, which builds an efficient representation of the scene without communication between cameras. The Wyner-Ziv encoder partitions the dictionary into cosets of dissimilar atoms with respect to shape and position in the image. The joint decoder then determines pairwise correspondences between atoms in the reference image and atoms in the cosets of the Wyner-Ziv image, in order to identify the most likely atoms to decode under epipolar geometry constraints. Experiments demonstrate that the proposed method leads to reliable estimation of the geometric transforms between views. In particular, the distributed coding scheme offers rate-distortion performance similar to that of joint encoding at low bit rates, and outperforms methods based on independent decoding of the different images.
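A toy sketch of the coset-partitioning idea: atoms are spread across cosets so that atoms sharing a coset are dissimilar (here, far apart in a 1-D position coordinate; the paper's shape terms are omitted). Round-robin assignment over position-sorted atoms is one simple way to achieve this, and is used here purely for illustration.

```python
def partition_into_cosets(positions, n_cosets):
    """Assign atom indices to cosets so same-coset atoms are far apart."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    cosets = [[] for _ in range(n_cosets)]
    for rank, idx in enumerate(order):
        cosets[rank % n_cosets].append(idx)  # round-robin over sorted atoms
    return cosets

positions = [0.1, 0.9, 0.5, 0.3, 0.7, 0.0]  # 1-D atom positions
cosets = partition_into_cosets(positions, 2)
# Every atom lands in exactly one coset, and same-coset neighbours are spread out.
```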

10.
To reduce the coding complexity of depth data while preserving the quality of reconstructed virtual views, a fast depth-map coding algorithm based on the JNDD model and oriented toward virtual view rendering is proposed. The just-noticeable depth difference (JNDD) model is introduced to partition the depth map into vertical edge regions, where rendering distortion is perceptually sensitive, and flat regions, where distortion is hard for the human eye to notice; two corresponding search strategies are designed for macroblock mode selection during encoding. Experimental results show that, compared with the JM coding scheme, the proposed method significantly reduces coding complexity while keeping virtual view quality and coding bitrate essentially unchanged, helping to speed up the depth coding module in 3D video systems.
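A minimal sketch of a JNDD-style partition: depth-map blocks whose horizontal depth change exceeds a just-noticeable threshold are labeled as vertical-edge (rendering-sensitive) regions, the rest as flat regions. The block size and threshold here are illustrative, not the paper's values.

```python
import numpy as np

def classify_blocks(depth, block=4, jndd=8):
    """Label each block of the depth map: True = vertical-edge region, False = flat."""
    grad = np.abs(np.diff(depth.astype(np.int32), axis=1))  # horizontal depth changes
    h, w = depth.shape
    labels = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            g = grad[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
            labels[by, bx] = g.size > 0 and bool(g.max() > jndd)
    return labels

depth = np.zeros((8, 8), dtype=np.uint8)
depth[:, 4:] = 100              # a vertical depth edge at column 4
labels = classify_blocks(depth)  # left blocks contain the edge, right blocks are flat
```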

11.
We propose a novel cross-view gait recognition method based on projection of the gravity center trajectory (GCT). The coefficients of the real 3-D GCT are projected onto different view planes to handle view variation. First, we estimate the real 3-D GCT curve under different views from statistics of limb parameters. We then obtain the view transformation matrix from the projection relation between curve and plane, and use it to estimate the view of a silhouette sequence, thereby normalizing the view variation of the gait features. To improve recognition accuracy, we compute body-part trajectories on the silhouette sequence, using correlation strength as the similarity measure. Finally, a nested matching method combines the two kinds of features into a final matching score. Experimental results on the widely used CASIA-B gait database demonstrate the effectiveness and practicality of the proposed method.

12.
安平, 张兆杨, 刘苏醒, 《电子器件》, 2008, 31(1): 285-289
In stereoscopic display, view synthesis is the key technique for interactivity, i.e., obtaining a look-around capability by freely selecting viewpoints in a three-dimensional (3D) scene. This paper combines view interpolation with view morphing based on image mosaicing and proposes an intermediate view synthesis algorithm. First, the original stereo image pair is equalized; then disparity estimation is performed only on foreground object regions to improve the speed and accuracy of disparity matching; next, reliable regions in the left and right views are identified and transitional intermediate views are generated from them; finally, the intermediate view is synthesized from the transitional views using view interpolation combined with morphing. Experimental results show that the synthesized intermediate views have good image quality and that synthesis speed is clearly improved. The algorithm is applicable to interactive stereoscopic display for real-time 3D video applications and enables real-time rendering of arbitrary viewpoints.
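A toy 1-D sketch of disparity-based view interpolation: a middle view is formed by shifting corresponding left/right pixels by half the disparity and averaging. A single row and a constant disparity are assumed; the paper's foreground-only estimation, reliability analysis, and morphing steps are omitted.

```python
import numpy as np

def interpolate_middle(left, right, disparity):
    """Synthesize the midpoint view of two 1-D rows under a constant disparity."""
    half = disparity // 2
    n = len(left)
    mid = np.zeros(n, dtype=np.float64)
    for x in range(n):
        xl, xr = x + half, x - half  # corresponding pixels in the left/right rows
        vals = []
        if 0 <= xl < n:
            vals.append(float(left[xl]))
        if 0 <= xr < n:
            vals.append(float(right[xr]))
        mid[x] = np.mean(vals) if vals else 0.0  # hole if no view covers x
    return mid

# Right row is the left row shifted by the disparity (ramp intensities).
left = np.arange(16.0)
right = np.arange(16.0) + 4.0
mid = interpolate_middle(left, right, disparity=4)  # lands halfway between the views
```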

13.
A constrained disparity estimation method is proposed that uses a directional regularization technique to efficiently preserve edges for stereo image coding. The proposed method smooths disparity vectors in smooth regions and preserves edges at object boundaries well, without oversmoothing. The differential pulse code modulation (DPCM) technique is applied to disparity-map coding prior to entropy coding, in order to improve overall coding efficiency. The proposed disparity estimation method can also be applied to intermediate view reconstruction. Intermediate views between a left image and a right image provide realism and natural motion parallax to viewers of multiview displays. Intermediate views are synthesized by applying an interpolation or extrapolation technique according to the characteristics of each region, after classifying regions as occluded, normal, or having ambiguous disparities. The experimental results show that the proposed disparity estimation method gives close matches between the left and right images and improves coding efficiency. In addition, we confirm subjectively that the proposed intermediate view reconstruction method produces satisfactory intermediate views from a stereo image pair. This work was supported by the Korea Institute of Science and Technology (KIST) under Grant No. 99HI-054.
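A minimal sketch of the DPCM step applied before entropy coding: each disparity value is predicted by its left neighbour, and only the residual is passed on. Smoothly regularized disparity fields make these residuals small, which is what improves entropy-coding efficiency. Illustrative only, assuming a 1-D row of disparities.

```python
import numpy as np

def dpcm_encode(row):
    """Left-neighbour prediction residuals of a 1-D disparity row."""
    row = np.asarray(row, dtype=np.int32)
    residual = np.empty_like(row)
    residual[0] = row[0]                  # first value sent as-is
    residual[1:] = row[1:] - row[:-1]     # prediction residuals
    return residual

def dpcm_decode(residual):
    """Invert the left-neighbour DPCM by cumulative summation."""
    return np.cumsum(residual)

disparities = np.array([10, 10, 11, 11, 12, 30, 30, 29])
res = dpcm_encode(disparities)    # mostly small values, cheap to entropy-code
recon = dpcm_decode(res)          # exact reconstruction
```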

14.
The protection of 3D content from illegal distribution has attracted considerable attention, and depth-image-based rendering (DIBR) has proved to be a promising technology for 3D image and video display. In this paper, we propose a new digital watermarking scheme for DIBR 3D images based on feature regions and the ridgelet transform (RT). In this scheme, the center view and the depth map are available at the content-provider side. After selecting reference points in the center view, we construct feature regions for watermark embedding. Exploiting the sparse image representation and directional sensitivity of the RT, the watermark bits are embedded into the amplitudes of the ridgelet coefficients of the most energetic direction. The virtual left and right views are generated from the watermarked center view and the associated depth map at the content-consumer side. The watermarked view has good perceptual quality under both objective and subjective image-quality evaluations. The embedded watermark can be detected blindly, with a low bit error rate (BER), from the watermarked center view and from the synthesized left and right views, even when the views are distorted and distributed separately. The experimental results demonstrate that the proposed scheme is robust against various image-processing attacks, as well as against common DIBR processing such as depth-image variation, baseline-distance adjustment, and different rendering conditions. Furthermore, compared with other related state-of-the-art methods, the proposed algorithm shows higher accuracy in watermark extraction.

15.
We propose a novel algorithm for person re-identification across multiple non-overlapping cameras based on a grouping similarity comparison model. We use an image sequence, rather than a single image, as a probe, and divide the sequence into groups by systematic sampling. We then compute similarities between images using full connection within each group and no connections between groups. Taking these similarities as features, we train an AdaBoost classifier to match persons across disjoint views. To enhance the discriminative ability of the Euclidean distance, we propose a novel similarity measure called the Significant Difference Distance (SDD). Extensive experiments on two public datasets show that the proposed person re-identification method achieves better performance than the state of the art.

16.
In general, excessive colorimetric and geometric errors in multi-view images induce visual fatigue. Various works have been proposed to reduce these errors, but conventional methods are available only for stereoscopic images, require cumbersome additional tasks, and often produce unstable results. In this paper, we propose an effective multi-view image refinement algorithm that analyzes such errors from sparse correspondences and compensates for them automatically. Whereas conventional works transform every view to compensate geometric errors, the proposed method transforms only the source views with respect to a reference view, so the approach extends to any number of views. In addition, we enforce uniform view intervals to provide consistent depth perception among views, and we correct color inconsistency from the correspondences by considering importance and channel properties. Experimental results show that the proposed algorithm outperforms conventional approaches and generates more visually comfortable multi-view images.
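A minimal sketch of correspondence-based colour compensation: a per-channel linear (gain, offset) model is fitted by least squares from sparse colour correspondences between a source view and the reference view, then applied to the source. This is a common baseline form of the idea; the paper's importance and channel weighting are omitted, and all data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
ref_samples = rng.uniform(20, 230, size=(40, 3))   # reference-view colours at correspondences
src_samples = 0.9 * ref_samples + 12.0             # same points in a colour-shifted source view

# Fit ref = gain * src + offset per channel by least squares.
gains, offsets = np.empty(3), np.empty(3)
for c in range(3):
    A = np.column_stack([src_samples[:, c], np.ones(len(src_samples))])
    (gains[c], offsets[c]), *_ = np.linalg.lstsq(A, ref_samples[:, c], rcond=None)

corrected = src_samples * gains + offsets          # source colours mapped to the reference
err = np.abs(corrected - ref_samples).max()        # near zero for this exact linear shift
```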

17.
陈宝华, 邓磊, 陈志祥, 段岳圻, 周杰, 《电子学报》, 2017, 45(6): 1294-1300
Traditional scene-matching localization methods easily fail for low-altitude UAVs because low-altitude aerial images have a small field of view and differ greatly in viewing angle from (geo-referenced) satellite images. This paper proposes a UAV visual localization method based on real-time dense 3D reconstruction, which localizes the UAV by matching a dense 3D point cloud against satellite imagery. Camera poses are first estimated rapidly from the image sequence; a dense 3D point cloud is then generated using a multi-depth-map collaborative denoising and optimization algorithm; next, by changing the observation viewpoint, a virtual view close to the satellite image's viewing angle is rendered from the dense point cloud; finally, the virtual view is matched against the satellite image to obtain the UAV's geographic coordinates. Because the dense point cloud aggregates information from multiple images, covers a large area, and allows the viewpoint to be changed, it effectively overcomes the two problems above. Experiments demonstrate the effectiveness of the method.

18.
Multiview super-resolution image reconstruction (SRIR) is often cast as a resampling problem: non-redundant data from multiple images are merged on a finer grid while the effect of the camera point spread function (PSF) is inverted. One main problem with multiview methods is that resampling from nonuniform samples (provided by multiple images) and inverting the PSF are highly nonlinear and ill-posed problems. Nonlinearity and ill-posedness are typically overcome by linearization and regularization, often through an iterative optimization process, which essentially trades away the very high-frequency information we want to recover. We propose a different point of view for multiview SRIR that resembles single-image methods, which extrapolate the spectrum of one image selected as reference from among all views. However, the proposed method relies on information provided by all the other views, rather than on prior constraints as in single-image methods, which may not be an accurate source of information. This is made possible by deriving explicit closed-form expressions that relate the local high-frequency information we aim to recover in the reference high-resolution image to the local low-frequency information in the sequence of views. The locality of these expressions, due to wavelet-based modeling, reduces the problem to an exact, linear, well-posed set of equations that is solved algebraically without regularization or interpolation. Results and comparisons with recently published state-of-the-art methods show the superiority of the proposed solution.

19.
20.
Multiplex fluorescence in situ hybridization (M-FISH) is a recently developed technology that enables multi-color chromosome karyotyping for molecular cytogenetic analysis. Each M-FISH image set consists of a number of aligned images of the same chromosome specimen captured at different optical wavelengths. This paper presents embedded M-FISH image coding (EMIC), in which the foreground objects (chromosomes) and the background are coded separately. We first apply critically sampled integer wavelet transforms to both the foreground and the background. We then use object-based bit-plane coding to compress each object and generate separate embedded bitstreams that allow continuous lossy-to-lossless compression of the foreground and the background. For efficient arithmetic coding of the bit planes, we propose a method for designing an optimal context model that specifically exploits the statistical characteristics of M-FISH images in the wavelet domain. Our experiments show that EMIC achieves nearly twice the compression of Lempel-Ziv-Welch coding. EMIC also performs much better than JPEG-LS and JPEG-2000 for lossless coding, and its lossy performance is significantly better than coding each M-FISH image with JPEG-2000.
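A minimal sketch of a reversible integer wavelet step of the kind used before bit-plane coding: the S-transform, the simplest integer wavelet, which maps integer pairs to an integer average (low-pass) and difference (high-pass) and inverts exactly, enabling lossless reconstruction. This is a stand-in for the paper's critically sampled integer wavelet transforms, not their actual filter.

```python
import numpy as np

def s_transform(x):
    """One level of the integer S-transform on a 1-D even-length signal."""
    a, b = x[0::2].astype(np.int64), x[1::2].astype(np.int64)
    s = (a + b) >> 1          # integer average (low-pass band)
    d = a - b                 # difference (high-pass band)
    return s, d

def inverse_s_transform(s, d):
    """Exact inverse: recovers the original integers losslessly."""
    a = s + ((d + 1) >> 1)    # even samples, using ceil(d/2)
    b = a - d                 # odd samples
    out = np.empty(len(s) * 2, dtype=np.int64)
    out[0::2], out[1::2] = a, b
    return out

x = np.array([12, 7, 200, 199, 0, 255, 13, 13])
s, d = s_transform(x)
recon = inverse_s_transform(s, d)  # equals x exactly (lossless)
```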


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号