Similar Articles
20 similar articles found.
1.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction. The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences (the two-dimensional (2-D) motion field) between the frames and the segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bit-rate criterion using the 3-D motion parameters, so the resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to lossless encoding of the resulting depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.
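The last step (predictive coding of the depth field followed by Lempel-Ziv encoding of the prediction errors) can be sketched as follows; the left-neighbor predictor and zlib's DEFLATE (an LZ77 variant) are stand-ins for the unspecified predictor and LZ coder:

```python
import zlib
import numpy as np

def encode_depth_field(depth):
    """Losslessly encode a depth field: row-wise predictive coding
    (each sample predicted by its left neighbor) followed by a
    Lempel-Ziv compressor (zlib's DEFLATE stands in here)."""
    depth = np.asarray(depth, dtype=np.int16)
    pred_err = depth.copy()
    pred_err[:, 1:] = depth[:, 1:] - depth[:, :-1]  # horizontal prediction errors
    return zlib.compress(pred_err.tobytes())

def decode_depth_field(payload, shape):
    """Invert the predictor: cumulative sum along rows restores depth."""
    err = np.frombuffer(zlib.decompress(payload), dtype=np.int16).reshape(shape)
    return np.cumsum(err, axis=1, dtype=np.int16)
```

For a smooth depth field the prediction errors are small and highly repetitive, which is exactly where an LZ coder pays off.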

2.
3.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations, where successive frames are registered using global motion models, we map the uncovered regions in successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct the replacement mosaic required for synthetic transfiguration. Experimental results demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.
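The local affine models mentioned above can be illustrated with a minimal sketch: each triangular mesh patch defines a 6-parameter affine map determined by its three node correspondences. Function names are illustrative, not from the paper:

```python
import numpy as np

def affine_from_patch(src_nodes, dst_nodes):
    """Solve the 6-parameter local affine model of one triangular mesh
    patch from its three node correspondences (3x2 arrays) -- the local
    alternative to a single global motion model."""
    A = np.hstack([np.asarray(src_nodes, float), np.ones((3, 1))])
    return np.linalg.solve(A, np.asarray(dst_nodes, float))  # 3x2 parameter matrix

def warp_points(params, pts):
    """Apply the affine map to an (N, 2) array of points."""
    pts = np.asarray(pts, float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ params
```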

4.
5.
In this paper, we propose a new bi-directional 2-D mesh representation of video objects, which utilizes forward and backward reference frames (keyframes). This framework extends the previous uni-directional mesh representation to enable efficient rendering, editing, and superresolution of video objects in the presence of occlusion by allowing bi-directional texture mapping as in MPEG B-frames. The video object of interest is tracked between two successive keyframes (which can be automatically or interactively selected) both in forward and backward directions. Keyframes provide the texture of the video object, whereas its motion is modeled by forward and backward 2-D meshes. In addition, we employ "validity maps", associated with each 2-D mesh, which allow selective texture mapping from the keyframes. Experimental results for efficient video object editing and object-based video resolution enhancement in the presence of self-occlusion are presented to demonstrate the effectiveness of the proposed representation.
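The role of the validity maps can be sketched as a per-pixel blending rule over the two motion-compensated predictions; averaging where both keyframes are valid is an assumed policy, not necessarily the paper's exact weighting:

```python
import numpy as np

def bidirectional_predict(fwd, bwd, valid_fwd, valid_bwd):
    """Blend texture mapped from forward and backward keyframes.
    `fwd`/`bwd` are per-pixel predictions; the boolean validity maps
    mark where each keyframe's texture mapping is reliable (e.g., not
    occluded). Both valid -> average; only one valid -> use it alone."""
    fwd = np.asarray(fwd, float)
    bwd = np.asarray(bwd, float)
    both = valid_fwd & valid_bwd
    out = np.zeros_like(fwd)
    out[both] = 0.5 * (fwd[both] + bwd[both])
    out[valid_fwd & ~valid_bwd] = fwd[valid_fwd & ~valid_bwd]
    out[~valid_fwd & valid_bwd] = bwd[~valid_fwd & valid_bwd]
    return out
```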

6.
To enable content-based functionalities in video coding, a decomposition of the scene into physical objects is required. Such objects are normally not characterised by homogeneous colour, intensity, or optical flow, so conventional techniques based on these low-level features cannot perform the desired segmentation. The authors address segmentation and tracking of moving objects and present a new video object plane (VOP) segmentation algorithm that extracts semantically meaningful objects. A morphological motion filter detects physical objects by identifying areas that are moving differently from the background. A new filter criterion is introduced that measures the deviation of the estimated local motion from the synthesised global motion. A two-dimensional binary model is derived for the object of interest and tracked throughout the sequence by a Hausdorff object tracker. To accommodate rotations and changes in shape, the model is updated every frame by a two-stage method that accounts for rigid and non-rigid moving parts of the object. The binary model then guides the actual VOP extraction, whereby a novel boundary post-processor ensures high boundary accuracy. Experimental results demonstrate the performance of the proposed algorithm.
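The filter criterion (deviation of the local motion from the synthesised global motion) reduces to a per-pixel vector difference whose magnitude flags candidate object pixels; the threshold and function name below are illustrative:

```python
import numpy as np

def motion_deviation_mask(local_flow, global_flow, thresh=1.0):
    """Per-pixel magnitude of the deviation between the estimated local
    motion field and the synthesised global (background) motion, as in
    the filter criterion described above. Thresholding the deviation
    flags areas moving differently from the background; `thresh` is an
    assumed tuning parameter. Flow fields are (H, W, 2) arrays."""
    dev = np.linalg.norm(np.asarray(local_flow, float) -
                         np.asarray(global_flow, float), axis=-1)
    return dev, dev > thresh
```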

7.
An approach to model-based dynamic object verification and identification from video is proposed. From image sequences containing the moving object, we compute its motion trajectory and then estimate its three-dimensional (3-D) pose at each time step. Pose estimation is formulated as a search problem, with the search space constrained by the motion trajectory information of the moving object and assumptions about the scene structure. A generalized Hausdorff metric, which is more robust to noise and allows a confidence interpretation, is suggested for the matching procedure used for pose estimation as well as for the identification and verification problem. The pose evolution curves are used to assist in the acceptance or rejection of an object hypothesis. The models are acquired from real image sequences of the objects; edge maps are extracted and used for matching. Results are presented for both infrared and optical sequences containing moving objects involved in complex motions.
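A common form of generalized Hausdorff metric is the rank-order (partial) directed distance, which replaces the maximum with a quantile and so tolerates a fraction of noisy edge points; the sketch below assumes that variant, with an illustrative confidence fraction:

```python
import numpy as np

def partial_hausdorff(A, B, frac=0.8):
    """Directed rank-order Hausdorff distance between point sets A and B
    (each (N, 2)): instead of the maximum over points of A of the
    distance to the nearest point of B, take the `frac` quantile, which
    ignores a 1-frac fraction of outliers. `frac` plays the role of the
    confidence parameter mentioned above (an assumed interpretation)."""
    A = np.asarray(A, float)
    B = np.asarray(B, float)
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)).min(axis=1)
    k = max(int(frac * len(d)) - 1, 0)
    return np.partition(d, k)[k]  # k-th ranked nearest-neighbour distance
```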

8.
Three-dimensional (3-D) scene reconstruction from broadcast video is a challenging problem with many potential applications, such as 3-D TV, free-view TV, augmented reality, or the three-dimensionalization of two-dimensional (2-D) media archives. In this paper, a flexible and effective system capable of efficiently reconstructing 3-D scenes from broadcast video is proposed, under the assumption that there is relative motion between the camera and the scene/objects. The system requires no a priori information or input other than the video sequence itself, and it is capable of estimating the internal and external camera parameters, performing a 3-D motion-based segmentation, and computing a dense depth field. The system also serves as a showcase for some novel approaches to the moving object segmentation and the sparse and dense reconstruction problems. According to simulations on both synthetic and real data, the system achieves promising performance for typical TV content, indicating that it is a significant step towards the 3-D reconstruction of scenes from broadcast video.

9.
This paper integrates fully automatic video object segmentation and tracking, including detection and assignment of uncovered regions, in a 2-D mesh-based framework. Particular contributions of this work are (i) a novel video object segmentation method that is posed as a constrained maximum contrast path search problem along the edges of a 2-D triangular mesh, and (ii) a 2-D mesh-based uncovered region detection method along the object boundary as well as within the object. At the first frame, an optimal number of feature points are selected as nodes of a 2-D content-based mesh. These points are classified as moving (foreground) and stationary nodes based on multi-frame node motion analysis, yielding a coarse estimate of the foreground object boundary. Color differences across triangles near the coarse boundary are employed for a maximum contrast path search along the edges of the 2-D mesh to refine the boundary of the video object. Next, we propagate the refined boundary to the subsequent frame by using motion vectors of the node points to form the coarse boundary at the next frame. We detect occluded regions by using motion-compensated frame differences and range filtered edge maps. The boundaries of detected uncovered regions are then refined by using the search procedure. These regions are either appended to the foreground object or tracked as new objects. The segmentation procedure is re-initialized when unreliable motion vectors exceed a certain number. The proposed scheme is demonstrated on several video sequences.

10.
A rate-distortion framework is used to define a very low bit-rate coding scheme based on quadtree segmentation and optimized selection of motion estimators. This technique achieves maximum reconstructed image quality under the constraint of a target bit rate for coding the vector field and segmentation information. First, a complete scheme is proposed for hybrid two-dimensional (2-D) and three-dimensional (3-D) motion estimation and compensation. The quadtree object segmentation is optimized for hybrid motion estimation in the rate-distortion sense: the scheme adapts both the depth of the quadtree and the motion estimation technique used for each leaf of the tree. A more sophisticated technique, adapted to the requirements of a very low bit-rate coder, is also proposed, which additionally considers the transmission of the prediction error corresponding to the particular choice of motion estimator. Based on these coding schemes, two versions of a very low bit-rate image sequence coder are developed. Experimental results illustrating the performance of the proposed techniques in very low bit-rate image sequence coding applications are presented and evaluated.
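Rate-constrained selection among motion estimators for a quadtree leaf is conventionally done by minimizing the Lagrangian cost J = D + λR, trading reconstruction quality against the bits spent on vectors and segmentation. A minimal sketch, with candidate names purely illustrative:

```python
def select_estimator(candidates, lam):
    """Rate-distortion optimized choice of motion estimator for one
    quadtree leaf. Each candidate is (name, distortion, rate_bits);
    the winner minimizes the Lagrangian cost J = D + lam * R. Large
    `lam` favours cheap estimators, small `lam` favours accurate ones."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])
```

Sweeping λ traces out the operational rate-distortion curve of the coder, which is how a target bit rate is met.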

11.
This paper first provides an overview of two-dimensional (2-D) and three-dimensional (3-D) mesh models for digital video processing. It then introduces 2-D mesh-based modeling of video objects as a compact representation of motion and shape for interactive synthetic/natural video manipulation, compression, and indexing. The 2-D mesh representation and the compression of mesh geometry and motion have been included in the visual tools of the upcoming MPEG-4 standard. Functionalities enabled by 2-D mesh-based visual-object representation include animation of still texture maps, transfiguration of video overlays, video morphing, and shape- and motion-based retrieval of video objects.

12.
There are a large number of applications requiring the compression of video at very low bit rates (VLBR). Such applications include wireless video conferencing, video over the Internet, multimedia database retrieval, and remote sensing and monitoring. Recently, the MPEG-4 standardization effort has been a motivating factor in finding a solution to this challenging problem. The existing approaches can generally be grouped into block-based, model-based, and object-oriented. Block-based approaches follow the traditional strategy of decomposing the image sequence into blocks, model-based approaches rely on complex 3-D models of the specific objects to be encoded, and object-oriented approaches rely on analyzing the scene into differently moving objects. All three approaches exhibit potential problems. Block-based approaches tend to generate artifacts at block boundaries and to limit the minimum achievable bit rate due to the fixed analysis structure of the scene. Model-based codecs are limited by the complex 3-D models of the objects to be encoded. Object-oriented codecs, on the other hand, can generate significant overhead due to the scene analysis that needs to be transmitted, which in turn can be the limiting factor in achieving the target bit rates. In this paper, we propose a hybrid object-oriented codec in which the correlations among the three information fields, i.e., the motion, segmentation, and intensity fields, are exploited both spatially and temporally. In the proposed method, additional intelligence is given to the decoder, resulting in a reduction of the required bandwidth. The residual information is analyzed into three categories, i.e., occlusion, model failures, and global refinement, and is encoded and transmitted across the channel with other side information. Experimental results demonstrate the effectiveness of the proposed approach.

13.
14.
Compression of captured video frames is crucial for saving power in wireless capsule endoscopy (WCE). A low-complexity encoder is desired to limit the power consumption required for compressing WCE video, and the distributed video coding (DVC) technique is well suited to designing one. In this technique, frames captured in the RGB colour space are converted into the YCbCr colour space. Existing DVC techniques proposed for WCE video compression process and encode both the luma (Y) and chroma (CbCr) components of the Wyner–Ziv (WZ) frames. In WCE video, consecutive frames exhibit strong similarity in texture and colour properties. The proposed work uses these properties to present a method for processing and encoding only the luma component of a WZ frame. The chroma components of the WZ frame are predicted at the decoder by an encoder–decoder-based deep chroma prediction model that matches the luma and texture information of the keyframe and the WZ frame. The proposed method reduces the computation required for encoding and transmitting the WZ chroma components. The results show that the proposed DVC with a deep chroma prediction model outperforms motion JPEG and existing DVC systems for WCE at reduced encoder complexity.
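Since only the luma component of a WZ frame is encoded, the encoder-side colour conversion effectively reduces to extracting Y; the sketch below assumes the standard BT.601 luma weights (the paper does not state which conversion it uses):

```python
import numpy as np

def luma_only(rgb):
    """Extract the luma plane from an (H, W, 3) RGB frame using BT.601
    weights. In the scheme above this is the only plane the WZ encoder
    processes; Cb/Cr are left to the decoder-side chroma predictor."""
    rgb = np.asarray(rgb, float)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```

This cuts the per-frame encoder workload from three planes to one, which is where the complexity saving at the capsule comes from.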

15.
Image-based rendering has been successfully used to display 3-D objects in many applications. A well-known example is the object movie: an image-based 3-D object composed of a collection of 2-D images taken from many different viewpoints of a 3-D object. In order to integrate image-based 3-D objects into a chosen scene (e.g., a panorama), one has to meet a hard challenge: efficiently and effectively removing the background from the foreground object. This problem is referred to as multiview image (MVI) segmentation. Another task requiring MVI segmentation is image-based 3-D reconstruction from multiview images. In this paper, we propose a new method for segmenting MVIs which integrates several useful algorithms, including the well-known graph-cut image segmentation and volumetric graph cuts. The main idea is to incorporate a shape prior into the image segmentation process. The shape prior introduced into every image of the MVI is extracted from the 3-D model reconstructed by the volumetric graph cuts algorithm; a constraint obtained from the discrete medial axis is adopted to improve the reconstruction. The proposed MVI segmentation process requires only a small amount of user intervention, namely selecting a subset of acceptable segmentations of the MVI after the initial segmentation process. According to our experiments, the proposed method provides not only good MVI segmentation but also 3-D reconstructed models acceptable for certain less-demanding applications.

16.
Extracting accurate foreground objects from a scene is an essential step in many video applications. Traditional background subtraction algorithms can generate coarse estimates, but generating high-quality masks requires professional software with significant human intervention, e.g., providing trimaps or labeling key frames. We propose an automatic foreground extraction method for applications where a static but imperfect background is available. Examples include filming and surveillance, where the background can be captured before the objects enter the scene or after they leave it. Our proposed method is very robust and produces significantly better estimates than state-of-the-art background subtraction, video segmentation, and alpha matting methods. The key innovation of our method is a novel information fusion technique. The fusion framework allows us to integrate the individual strengths of alpha matting, background subtraction, and image denoising to produce an overall better estimate. Such integration is particularly important when handling complex scenes with an imperfect background. We show how the framework is developed and how the individual components are built. Extensive experiments and ablation studies are conducted to evaluate the proposed method.
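The coarse background-subtraction estimate that such a fusion framework starts from can be sketched as a thresholded colour distance to the static background; the threshold is an assumed tuning parameter, and a real system would refine this mask with matting and denoising as described above:

```python
import numpy as np

def coarse_foreground(frame, background, thresh=25.0):
    """Baseline background subtraction: per-pixel Euclidean colour
    distance between the frame and the static (possibly imperfect)
    background, thresholded into a binary foreground mask. Both inputs
    are (H, W, 3) arrays; `thresh` is an assumed tuning parameter."""
    diff = np.linalg.norm(np.asarray(frame, float) -
                          np.asarray(background, float), axis=-1)
    return diff > thresh
```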

17.
Segmentation of moving objects in image sequence: A review
Segmentation of objects in image sequences is very important in many aspects of multimedia applications. In second-generation image/video coding, images are segmented into objects to achieve efficient compression by coding the contour and texture separately. As the purpose is to achieve high compression performance, the segmented objects may not be semantically meaningful to human observers. More recent applications, such as content-based image/video retrieval and image/video composition, require that the segmented objects be semantically meaningful. Indeed, the recent multimedia standard MPEG-4 specifies that a video is composed of meaningful video objects. Although many segmentation techniques have been proposed in the literature, fully automatic segmentation tools for general applications are currently not achievable. This paper provides a review of this important and challenging area of segmentation of moving objects. We describe common approaches, including temporal segmentation, spatial segmentation, and combined temporal-spatial segmentation. As an example, a complete segmentation scheme, which is an informative part of MPEG-4, is summarized.

18.
Transcoding between distributed video coding (DVC) and conventional video coding offers an effective route to low-power video communication between mobile terminal devices. Taking DVC-to-HEVC transcoding as the subject, this work exploits information available at the DVC decoder to reduce the complexity of the highly complex coding unit (CU) partitioning process in High Efficiency Video Coding (HEVC). Three kinds of feature information related to CU partitioning (texture complexity, motion vectors, and prediction residuals) are extracted at the DVC decoder. At the HEVC encoder, a fast CU partitioning model is built on the naive Bayes principle; once the model is generated, the partitioning of the current CU can be decided quickly from the input features, avoiding a large number of rate-distortion (RD) cost computations. Experimental results show that, with only a slight increase in coding bit rate, the scheme substantially shortens HEVC encoding time, by 58.26% on average, with almost no impact on video quality.
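The fast CU-partition decision can be sketched as a Gaussian naive Bayes classifier over the three decoder-side features (texture complexity, motion-vector magnitude, prediction residual) with a binary split / no-split output; the feature scaling and training data below are illustrative, not from the paper:

```python
import math

class NaiveBayesCUSplit:
    """Minimal Gaussian naive Bayes sketch of the fast CU-partition
    decision: per-class feature means/variances are estimated from
    training samples, and predict() replaces the exhaustive RD search
    with a maximum-a-posteriori split (1) / no-split (0) decision."""

    def fit(self, X, y):
        self.stats = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            mus = [sum(col) / len(rows) for col in zip(*rows)]
            vars_ = [max(sum((v - m) ** 2 for v in col) / len(rows), 1e-6)
                     for col, m in zip(zip(*rows), mus)]  # variance floor avoids /0
            self.stats[cls] = (len(rows) / len(X), mus, vars_)
        return self

    def predict(self, x):
        def log_post(cls):
            prior, mus, vars_ = self.stats[cls]
            return math.log(prior) + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, mus, vars_))
        return 1 if log_post(1) > log_post(0) else 0
```

Each prediction costs three Gaussian evaluations per class, which is how the RD cost computations are skipped.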

19.
20.
This paper briefly reports recent progress by the Moving Picture Experts Group on the draft MPEG-4 standard for multimedia communication. Its audio coding covers speech and music (both natural and synthetic) at bit rates from 2 to 64 kb/s. Video coding covers very low bit rates of 5-64 kb/s as well as higher rates from 64 kb/s to 2 Mb/s. Video coding can encode each object in a picture separately into different bit-stream layers and can manipulate the scale, position, etc., of objects, providing content-based interactive functionality. Beyond the core encoder, each frame of the input video sequence is divided into several arbitrarily shaped "video object planes", which are encoded into separate "video object layers". In addition, "sprite" coding can encode the background of a picture and each foreground object separately into video sequences for transmission, which can improve picture quality.
