Similar literature
1.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations where successive frames are registered using global motion models, we map the uncovered regions in the successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic, which is needed for synthetic transfiguration. Experimental results are provided to demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.

2.
Video object extraction is a key technology in content-based video coding. A novel video object extraction algorithm based on two-dimensional (2-D) mesh-based motion analysis is proposed in this paper. First, a 2-D mesh fitted to the original frame is obtained via a feature detection algorithm. Then, higher-order statistics motion analysis is applied to the 2-D mesh representation to obtain an initial motion detection mask. After post-processing, the final segmentation mask is quickly obtained, and hence the video object is effectively extracted. Experimental results show that the proposed algorithm combines the merits of mesh-based and pixel-based segmentation algorithms, and thereby achieves satisfactory subjective and objective performance while dramatically increasing the segmentation speed.

3.
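We propose a spatio-temporal video object segmentation algorithm that combines 2-D mesh-based motion analysis with an automatic spatial segmentation strategy based on improved morphological filtering. The algorithm first applies higher-order statistics to the 2-D mesh representation of the video frames for motion analysis, quickly obtaining the foreground object region; post-processing then yields a motion detection mask for the foreground object. Next, an improved watershed segmentation strategy, which combines an alternating sequential filtering by reconstruction algorithm with an adaptive threshold decision algorithm, obtains precise edges of the foreground object. Finally, a region-based spatio-temporal fusion algorithm combines the temporal and spatial segmentation results to extract a video object with fine edges. Experimental results show that the algorithm combines the advantages of several algorithms and achieves satisfactory subjective and objective segmentation results.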

4.
This paper integrates fully automatic video object segmentation and tracking including detection and assignment of uncovered regions in a 2-D mesh-based framework. Particular contributions of this work are (i) a novel video object segmentation method that is posed as a constrained maximum contrast path search problem along the edges of a 2-D triangular mesh, and (ii) a 2-D mesh-based uncovered region detection method along the object boundary as well as within the object. At the first frame, an optimal number of feature points are selected as nodes of a 2-D content-based mesh. These points are classified as moving (foreground) and stationary nodes based on multi-frame node motion analysis, yielding a coarse estimate of the foreground object boundary. Color differences across triangles near the coarse boundary are employed for a maximum contrast path search along the edges of the 2-D mesh to refine the boundary of the video object. Next, we propagate the refined boundary to the subsequent frame by using motion vectors of the node points to form the coarse boundary at the next frame. We detect occluded regions by using motion-compensated frame differences and range filtered edge maps. The boundaries of detected uncovered regions are then refined by using the search procedure. These regions are either appended to the foreground object or tracked as new objects. The segmentation procedure is re-initialized when unreliable motion vectors exceed a certain number. The proposed scheme is demonstrated on several video sequences.

5.
In this paper, we propose a new bi-directional 2-D mesh representation of video objects, which utilizes forward and backward reference frames (keyframes). This framework extends the previous uni-directional mesh representation to enable efficient rendering, editing, and superresolution of video objects in the presence of occlusion by allowing bi-directional texture mapping as in MPEG B-frames. The video object of interest is tracked between two successive keyframes (which can be automatically or interactively selected) both in forward and backward directions. Keyframes provide the texture of the video object, whereas its motion is modeled by forward and backward 2-D meshes. In addition, we employ “validity maps”, associated with each 2-D mesh, which allow selective texture mapping from the keyframes. Experimental results for efficient video object editing and object-based video resolution enhancement in the presence of self-occlusion are presented to demonstrate the effectiveness of the proposed representation.

6.
We propose and evaluate a number of novel improvements to the mesh-based coding scheme for 3-D brain magnetic resonance images. These include: 1) elimination of the clinically irrelevant background, leading to meshing of only the brain part of the image; 2) content-based (adaptive) mesh generation using spatial edges and optical flow between two consecutive slices; 3) a simple solution for the aperture problem at the edges, where an accurate estimation of motion vectors is not possible; and 4) context-based entropy coding of the residues after motion compensation using affine transformations. We address only lossless coding of the images, and compare the performance of uniform and adaptive mesh-based schemes. The bit rates achieved (about 2 bits per voxel) by these schemes are comparable to those of the state-of-the-art three-dimensional (3-D) wavelet-based schemes. The mesh-based schemes have also been shown to be effective for the compression of 3-D brain computed tomography data. Adaptive mesh-based schemes perform marginally better than the uniform mesh-based methods, at the expense of increased complexity.

7.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction. The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences (the two-dimensional (2-D) motion field) between the frames and the segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bit-rate criterion using the 3-D motion parameters. The resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to the lossless encoding of the resultant depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.

8.
Occlusion-adaptive, content-based mesh design and forward tracking
Two-dimensional (2-D) mesh-based motion compensation preserves neighboring relations (through connectivity of the mesh) and allows warping transformations between pairs of frames; thus, it effectively eliminates the blocking artifacts that are common in motion compensation by block matching. However, available 2-D mesh models, whether uniform or non-uniform, enforce connectivity everywhere within a frame, which is clearly not suitable across occlusion boundaries. To address this, we propose an occlusion-adaptive forward-tracking mesh model, where connectivity of the mesh elements (patches) across covered and uncovered region boundaries is broken. This is achieved by allowing no node points within the background to be covered (BTBC) and refining the mesh structure within the model failure (MF) region(s) at each frame. The proposed content-based mesh structure enables better rendition of the motion (compared to a uniform or a hierarchical mesh), while tracking is necessary to avoid transmission of all node locations at each frame. Experimental results show successful motion compensation and tracking.
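
As an illustration of the warping that mesh-based motion compensation relies on, the following is a minimal sketch (not the authors' implementation) of mapping a point through the affine transformation defined by one triangular patch. It assumes the patch is given by its three node positions in the reference frame and in the current frame; the function names `barycentric` and `warp_point` are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1


def warp_point(p, src_tri, dst_tri):
    """Map p from the source patch to the destination patch.

    Keeping the barycentric weights fixed while moving the three nodes is
    equivalent to applying the unique affine map that sends each source
    node to its destination node.
    """
    w = barycentric(p, *src_tri)
    x = sum(wi * q[0] for wi, q in zip(w, dst_tri))
    y = sum(wi * q[1] for wi, q in zip(w, dst_tri))
    return x, y


# Hypothetical patch: node positions before and after motion.
src = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
dst = ((1.0, 1.0), (5.0, 2.0), (0.0, 6.0))
print(warp_point((1.0, 1.0), src, dst))
```

Because neighboring patches share nodes, the per-patch maps agree along shared edges, which is why a connected mesh avoids the blocking artifacts of independent block matching.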

9.
10.
The block-matching algorithm is the most popular motion compensation technique in video coding. However, it cannot provide acceptable quality at very low bit rates. In this paper, a new mesh-based motion compensation method is proposed to attack the problem. First, a regular non-uniform mesh, which has a regular structure with variable patch size, is presented. The patch size is varied according to the motion activity of a video sequence. Next, a weighted interpolation block matching is developed to improve the estimation accuracy of the displacements of grid points. It utilizes the motion correlation between a grid point and its associated patches. Finally, based on the new mesh and motion estimation scheme, an efficient motion compensation algorithm is developed. When compared to conventional motion compensation techniques, the proposed method improves performance significantly with lower computational complexity and fewer overhead information bits.

11.
Very low bit-rate coding requires new paradigms that go well beyond pixel- and frame-based video representations. We introduce a novel content-based video representation using tridimensional entities: textured object models and pose estimates. The multiproperty object models carry stochastic information about the shape and texture of each object present in the scene. The pose estimates define the position and orientation of the objects for each frame. This representation is compact. It provides alternative means for handling video by manipulating and compositing three-dimensional (3-D) entities. We call this representation tridimensional video compositing, or 3DVC for short. We present the 3DVC framework and describe the methods used to incrementally construct the object models and the pose estimates from unregistered noisy depth and texture measurements. We also describe a method for video frame reconstruction based on 3-D scene assembly, and discuss potential applications of 3DVC to video coding and content-based handling. 3DVC assumes that the objects in the scene are rigid and segmented. By assuming segmentation, we do not address the difficult questions of nonrigid segmentation and multiple object segmentation. In our experiments, segmentation is obtained via depth thresholding. It is important to note that 3DVC is independent of the segmentation technique adopted. Experimental results with synthetic and real video sequences, where compression ratios in the range of 1:150-1:2700 are achieved, demonstrate the applicability of the proposed representation to very low bit-rate coding.

12.
We propose a new framework for highly scalable video compression, using a lifting-based invertible motion adaptive transform (LIMAT). We use motion-compensated lifting steps to implement the temporal wavelet transform, which preserves invertibility, regardless of the motion model. By contrast, the invertibility requirement has restricted previous approaches to either block-based or global motion compensation. We show that the proposed framework effectively applies the temporal wavelet transform along a set of motion trajectories. An implementation demonstrates high coding gain from a finely embedded, scalable compressed bit-stream. Results also demonstrate the effectiveness of temporal wavelet kernels other than the simple Haar, and the benefits of complex motion modeling, using a deformable triangular mesh. These advances are either incompatible or difficult to achieve with previously proposed strategies for scalable video compression. Video sequences reconstructed at reduced frame-rates, from subsets of the compressed bit-stream, demonstrate the visually pleasing properties expected from low-pass filtering along the motion trajectories. The paper also describes a compact representation for the motion parameters, having motion overhead comparable to that of motion-compensated predictive coders. Our experimental results compare favorably to others reported in the literature; however, our principal objective is to motivate a new framework for highly scalable video compression.
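
The claim that lifting preserves invertibility regardless of the motion model can be illustrated with a toy sketch. This is not the LIMAT implementation: frames are 1-D integer lists, the "motion compensation" is a cyclic shift, and the names `shift`, `forward`, and `inverse` are hypothetical. The point is only that the inverse undoes each lifting step by recomputing the same prediction and update terms, so any warping operator could be substituted for `shift`.

```python
def shift(frame, d):
    """Toy motion compensation: cyclic shift of a 1-D frame by d samples."""
    n = len(frame)
    d %= n
    return frame[n - d:] + frame[:n - d]


def forward(even, odd, d):
    """One Haar-like temporal lifting step with integer arithmetic."""
    # Predict: high-pass frame = odd frame minus the warped even frame.
    h = [o - p for o, p in zip(odd, shift(even, d))]
    # Update: low-pass frame = even frame plus half the back-warped residual.
    l = [e + u // 2 for e, u in zip(even, shift(h, -d))]
    return l, h


def inverse(l, h, d):
    """Undo the lifting steps in reverse order, reusing the same operators."""
    even = [a - u // 2 for a, u in zip(l, shift(h, -d))]
    odd = [b + p for b, p in zip(h, shift(even, d))]
    return even, odd


even, odd = [10, 20, 30, 40], [12, 11, 19, 33]
l, h = forward(even, odd, d=1)
assert inverse(l, h, d=1) == (even, odd)  # perfect reconstruction
```

Integer arithmetic is used here so that reconstruction is exact; when the odd frame is exactly the shifted even frame, the high-pass frame is all zeros and the low-pass frame equals the even frame.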

13.
This paper proposes a mesh-based representation method for the disparity map of stereo images. The proposed method is designed to concentrate mainly on applications of view interpolation and stereo image compression. To obtain high image quality in the view interpolation and compression of stereo images, we formulate the view-interpolation error and prediction error. In the formulation, the view-interpolation and prediction errors depend not only on the accuracy of the disparity map, but also on the gradient of the stereo images. The proposed representation method for the disparity map is based on a triangular mesh structure, which minimizes the formulated interpolation and prediction errors. The experimental results show that the proposed method yields higher quality view-interpolated images and also has better performance in stereo image compression than the conventional methods.

14.
In any practical application of 2-D-to-3-D conversion that involves storage and transmission, representation efficiency has an indisputable importance that is not reflected in the attention the topic has received. In order to address this problem, a novel algorithm, which yields efficient 3-D representations in the rate-distortion sense, is proposed. The algorithm utilizes two views of a scene to build a mesh-based representation incrementally, by adding new vertices while minimizing a distortion measure. The experimental results indicate that, in scenes that can be approximated by planes, the proposed algorithm is superior to the dense depth map and, in some practical situations, to the block motion vector-based representations in the rate-distortion sense.

15.
We propose a new framework in wavelet video coding to improve the compression rate by exploiting the spatiotemporal regularity of the data. A sequence of images creates a spatiotemporal volume. This volume is said to be regular along the directions in which the pixels vary the least, and hence the entropy is the lowest. The wavelet decomposition of regularized data results in fewer significant coefficients, thus yielding a higher compression rate. The directions of regularity of an image sequence depend on both its motion content and spatial structure. We propose the representation of these directions by a 3-D vector field, which we refer to as the spatiotemporal regularity flow (SPREF). SPREF uses splines to approximate the directions of regularity. The compactness of the spline representation results in a low storage overhead for SPREF, which is a desired property in compression applications. Once SPREF directions are known, they can be converted into actual paths along which the data is regular. Directional decomposition of the data along these paths can be further improved by using a special class of wavelet basis called the 3-D orthonormal bandelet basis. SPREF-based video compression not only removes the temporal redundancy, but also compensates for the spatial redundancy. Our experiments on several standard video sequences demonstrate that the proposed method results in higher compression rates compared to standard wavelet-based compression.

16.
This paper proposes a low-power VLSI architecture for motion tracking that can be used in online video applications such as MPEG and VRML. The proposed architecture uses a hierarchical adaptive structured mesh (HASM) concept that generates a content-based video representation. The developed architecture shows a significant reduction in power consumption that is inherent in the HASM concept. The proposed architecture consists of two units: a motion estimation unit and a motion compensation unit. The motion estimation (ME) architecture generates a progressive mesh code that represents a mesh topology and its motion vectors. ME reduces the power consumption since it (1) implements a successive splitting strategy to generate the mesh topology, which allows pipelined implementation of the processing elements; (2) approximates the mesh node motion vectors using the three-step search algorithm; and (3) uses parallel units that reduce the power consumption at a fixed throughput. The motion compensation (MC) architecture processes a reference frame, mesh nodes, and motion vectors to predict a video frame, using affine transformations to warp the texture of the different mesh patches. MC reduces the power consumption since it (1) uses a multiplication-free algorithm for the affine transformation and (2) uses parallel threads, each of which implements a pipelined chain of scalable affine units to compute the affine transformation of each patch. The architecture has been prototyped using a top-down low-power design methodology. Its performance has been analyzed in terms of video construction quality, power, and delay.
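
The three-step search mentioned in this abstract can be sketched in software as follows. This is a generic textbook version under stated assumptions, not the paper's hardware design: frames are 2-D lists of integers, the matching cost is the sum of absolute differences (SAD) over an n x n block around a position, and the names `sad` and `three_step_search` as well as the toy frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref
    displaced by (dx, dy); None if the displaced block leaves the frame."""
    h, w = len(ref), len(ref[0])
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
        return None
    return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
               for j in range(n) for i in range(n))


def three_step_search(cur, ref, bx, by, n, step=4):
    """Test the 3 x 3 neighbourhood at spacing 4, 2, 1, re-centring each time.

    Assumes the zero-displacement block lies inside the reference frame.
    """
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre stays fixed while this step's candidates are tested
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cost = sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                if cost is not None and cost < best_cost:
                    best, best_cost = (cx + dx, cy + dy), cost
        step //= 2
    return best, best_cost


# Toy frames: cur is ref displaced by (dx, dy) = (4, -4).
W = H = 24
ref = [[(7 * x + 13 * y) % 31 for x in range(W)] for y in range(H)]
cur = [[ref[(y - 4) % H][(x + 4) % W] for x in range(W)] for y in range(H)]
motion, cost = three_step_search(cur, ref, bx=8, by=8, n=4)
print(motion, cost)
```

The search evaluates at most 25 candidates instead of the 225 of an exhaustive search over a range of ±7, which is the source of its low cost. It is a heuristic: on frames whose matching cost is not well behaved it can settle on a local minimum rather than the true displacement.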

17.
This paper proposes a method for progressive lossy-to-lossless compression of four-dimensional (4-D) medical images (sequences of volumetric images over time) by using a combination of three-dimensional (3-D) integer wavelet transform (IWT) and 3-D motion compensation. A 3-D extension of the set-partitioning in hierarchical trees (SPIHT) algorithm is employed for coding the wavelet coefficients. To effectively exploit the redundancy between consecutive 3-D images, the concepts of key and residual frames from video coding are used. A fast 3-D cube matching algorithm is employed for motion estimation. The key and the residual volumes are then coded using 3-D IWT and the modified 3-D SPIHT. The experimental results presented in this paper show that our proposed compression scheme achieves better lossy and lossless compression performance on 4-D medical images when compared with JPEG-2000 and volumetric compression based on 3-D SPIHT.

18.
19.
Aerial video surveillance and exploitation
There is growing interest in performing aerial surveillance using video cameras. Compared to traditional framing cameras, video cameras provide the capability to observe ongoing activity within a scene and to automatically control the camera to track the activity. However, the high data rates and relatively small field of view of video cameras present new technical challenges that must be overcome before such cameras can be widely used. In this paper, we present a framework and details of the key components for real-time, automatic exploitation of aerial video for surveillance applications. The framework involves separating an aerial video into the natural components corresponding to the scene. Three major components of the scene are the static background geometry, moving objects, and appearance of the static and dynamic components of the scene. In order to delineate videos into these scene components, we have developed real-time image-processing techniques for 2-D/3-D frame-to-frame alignment, change detection, camera control, and tracking of independently moving objects in cluttered scenes. The geo-location of video and tracked objects is estimated by registration of the video to controlled reference imagery, elevation maps, and site models. Finally, static, dynamic, and reprojected mosaics may be constructed for compression, enhanced visualization, and mapping applications.

20.
Rate-distortion optimization for video compression
The rate-distortion efficiency of video compression schemes is based on a sophisticated interaction between various motion representation possibilities, waveform coding of differences, and waveform coding of various refreshed regions. Hence, a key problem in high-compression video coding is the operational control of the encoder. This problem is compounded by the widely varying content and motion found in typical video sequences, necessitating the selection between different representation possibilities with varying rate-distortion efficiency. This article addresses the problem of video encoder optimization and discusses its consequences on the compression architecture of the overall coding system. Based on the well-known hybrid video coding structure, Lagrangian optimization techniques are presented that try to answer the question: what part of the video signal should be coded using what method and parameter settings?
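
The Lagrangian formulation referred to here is commonly written as choosing, for each coding unit, the option that minimizes J = D + λR, where D is the distortion, R the rate in bits, and λ the Lagrange multiplier. A minimal sketch of such a mode decision follows; the mode names and the distortion and rate figures are invented for illustration and do not come from the article.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    name: str
    distortion: float  # e.g. sum of squared differences for the block
    rate: float        # bits needed to code the block in this mode


def best_mode(modes, lam):
    """Return the mode minimizing the Lagrangian cost J = D + lam * R."""
    return min(modes, key=lambda m: m.distortion + lam * m.rate)


# Hypothetical candidates for one block.
candidates = [
    Mode("skip", 900.0, 1.0),
    Mode("inter", 300.0, 40.0),
    Mode("intra", 100.0, 200.0),
]

for lam in (0.0, 5.0, 100.0):
    print(lam, best_mode(candidates, lam).name)
```

A small λ favors low distortion at any bit cost, while a large λ favors cheap modes; sweeping λ traces out the operating points available to the encoder.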
