Similar literature
1.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations, where successive frames are registered using global motion models, we map the uncovered regions in successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method for computing this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic, which is required for synthetic transfiguration. Experimental results are provided to demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.
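The local affine model of a single mesh patch can be illustrated with a short example. This is a minimal sketch, assuming each patch is a triangle whose three vertex correspondences are already known; the function names `affine_from_triangle` and `apply_affine` are hypothetical and are not taken from the paper.

```python
def affine_from_triangle(src, dst):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three vertex pairs.

    src, dst: sequences of three (x, y) tuples (a triangle and its image).
    Returns the six parameters (a, b, c, d, e, f).
    """
    (x0, y0), (x1, y1), (x2, y2) = src
    # Determinant of [[x0, y0, 1], [x1, y1, 1], [x2, y2, 1]],
    # i.e. twice the signed area of the source triangle.
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if abs(det) < 1e-12:
        raise ValueError("degenerate source triangle")

    def solve(v0, v1, v2):
        # Cramer's rule for p*x + q*y + r = v at the three vertices.
        p = (v0 * (y1 - y2) - y0 * (v1 - v2) + (v1 * y2 - v2 * y1)) / det
        q = (x0 * (v1 - v2) - v0 * (x1 - x2) + (x1 * v2 - x2 * v1)) / det
        r = (x0 * (y1 * v2 - y2 * v1) - y0 * (x1 * v2 - x2 * v1)
             + v0 * (x1 * y2 - x2 * y1)) / det
        return p, q, r

    a, b, c = solve(*(pt[0] for pt in dst))  # fit the x' coordinates
    d, e, f = solve(*(pt[1] for pt in dst))  # fit the y' coordinates
    return a, b, c, d, e, f


def apply_affine(params, point):
    """Map one (x, y) point with the six affine parameters."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

A point inside an uncovered region could then be mapped with the parameters of a neighboring patch, which is the role the abstract assigns to the local affine models; how the neighboring patch is chosen is not specified in the abstract and is left out here.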

2.
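This paper studies a fast stitching method for high-definition UAV aerial imagery, comprising three steps: route planning, extraction of specific video frames, and stitching of the frame images. To obtain a reasonably accurate ground resolution, a flight route is planned for the experimental scene. To address the large volume of redundant frames in UAV imagery, an extraction algorithm based on the target scene and the UAV flight-state parameters is proposed. To reduce the time spent in registration, the SURF operator is applied only within the overlapping region. Experiments show that the method essentially meets real-time requirements and that the stitched images have good visual quality.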

3.
A real-time algorithm for affine-structure-based video compression for facial images is presented. The face undergoing motion is segmented and triangulated to yield a set of control points. The set of control points generated by triangulation is tracked across a few frames using an intensity-based correlation technique. For accurate motion and structure estimation, a Kalman-filter-based algorithm is used to track features on the facial image. The structure information of the control points is transmitted only during the bootstrapping stage. After that, only the motion information is transmitted to the decoder. This reduces the number of motion parameters associated with control points in each frame. The local motion of the eyes and lips is captured using local 2-D affine transformations. For real-time implementation, a quad-tree-based search technique is adopted to solve local correlation. Any remaining reconstruction error is accounted for using predictive encoding. Results on real image sequences demonstrate the applicability of the method.
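To make the Kalman-filter tracking step concrete, here is a minimal sketch of a constant-velocity Kalman filter for one coordinate of one feature point. The state model, the noise values `q` and `r`, and the class name `ConstantVelocityKalman1D` are assumptions chosen for illustration; the abstract does not describe the filter actually used.

```python
class ConstantVelocityKalman1D:
    """Track one coordinate of a feature with state [position, velocity]."""

    def __init__(self, pos=0.0, vel=0.0, p0=100.0, q=1e-3, r=1.0):
        self.pos, self.vel = pos, vel
        # 2x2 state covariance, stored element by element.
        self.p00, self.p01, self.p10, self.p11 = p0, 0.0, 0.0, p0
        self.q = q  # process noise added to both diagonal terms (assumed)
        self.r = r  # measurement noise variance (assumed)

    def predict(self, dt=1.0):
        # State transition F = [[1, dt], [0, 1]]; covariance P = F P F^T + Q.
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z):
        # Only the position is measured: H = [1, 0].
        s = self.p00 + self.r                  # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s    # Kalman gain
        innovation = z - self.pos
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # Covariance update P = (I - K H) P.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def track(measurements, dt=1.0):
    """Run the filter over a list of measured positions; return (pos, vel) pairs."""
    kf = ConstantVelocityKalman1D()
    estimates = []
    for z in measurements:
        kf.predict(dt)
        kf.update(z)
        estimates.append((kf.pos, kf.vel))
    return estimates
```

A 2-D feature point would use one such filter per coordinate, or a single filter with a four-element state; either choice is an illustration rather than the method reported in the abstract.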

4.
5.
Very low bit-rate coding requires new paradigms that go well beyond pixel- and frame-based video representations. We introduce a novel content-based video representation using tridimensional entities: textured object models and pose estimates. The multiproperty object models carry stochastic information about the shape and texture of each object present in the scene. The pose estimates define the position and orientation of the objects for each frame. This representation is compact. It provides alternative means for handling video by manipulating and compositing three-dimensional (3-D) entities. We call this representation tridimensional video compositing, or 3DVC for short. We present the 3DVC framework and describe the methods used to incrementally construct the object models and the pose estimates from unregistered noisy depth and texture measurements. We also describe a method for video frame reconstruction based on 3-D scene assembly, and discuss potential applications of 3DVC to video coding and content-based handling. 3DVC assumes that the objects in the scene are rigid and segmented. By assuming segmentation, we do not address the difficult questions of nonrigid segmentation and multiple object segmentation. In our experiments, segmentation is obtained via depth thresholding. It is important to note that 3DVC is independent of the segmentation technique adopted. Experimental results with synthetic and real video sequences, where compression ratios in the range of 1:150-1:2700 are achieved, demonstrate the applicability of the proposed representation to very low bit-rate coding.

6.
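When video is downlinked and received in real time, the corresponding stitched video image must also be generated in real time. To this end, the received video is sampled to obtain a sequence of video images with a certain amount of overlap; an improved feature-region-based feature extraction and matching method registers the overlapping images quickly; and a fade-in/fade-out blending algorithm removes the seams, yielding a seamless wide-field-of-view mosaic. Engineering application shows that the method can automatically stitch video images (25 frames/s, frame format 768×576) and meets the system's real-time stitching requirement.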
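Fade-in/fade-out blending is commonly implemented as a linear cross-fade over the overlap. The following is a minimal sketch under that assumption, for two already registered grayscale images stored as lists of rows that overlap by a known number of columns; the function names and the exact weight ramp are hypothetical and are not taken from the paper.

```python
def feather_blend_row(left_row, right_row, overlap):
    """Stitch two rows whose last/first `overlap` pixels show the same scene."""
    if not 0 < overlap <= min(len(left_row), len(right_row)):
        raise ValueError("overlap must be positive and fit inside both rows")
    start = len(left_row) - overlap
    out = list(left_row[:start])            # pixels seen only by the left image
    for i in range(overlap):
        # Weight of the right image rises linearly across the overlap,
        # so the left image fades out while the right image fades in.
        w = (i + 1) / (overlap + 1)
        out.append((1.0 - w) * left_row[start + i] + w * right_row[i])
    out.extend(right_row[overlap:])         # pixels seen only by the right image
    return out


def feather_blend(left, right, overlap):
    """Apply the row-wise cross-fade to two images of equal height."""
    if len(left) != len(right):
        raise ValueError("images must have the same number of rows")
    return [feather_blend_row(l, r, overlap) for l, r in zip(left, right)]
```

Because the weights change gradually, a brightness difference between the two frames is spread over the whole overlap instead of appearing as a visible seam.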

7.
This article presents a rigorous, high-precision model for geometric orthorectification of declassified intelligence satellite photography (DISP) imagery for the generation of a seamless, full-coverage mosaic of the Greenland ice sheet. This model integrates the bundle adjustment method and satellite orbital parameters, solving for interior orientation (including lens distortion) and exterior orientation parameters simultaneously. In addition, the techniques of adaptive filtering, bright-strip removal, radiometric balancing, and mosaic postprocessing are discussed. Two full-coverage mosaics of Greenland were created, one using 24 DISP images from eight orbits of the ARGON 9034A Mission of May 1962 and one using 36 images from 14 orbits of the 9058A/59A Mission of October 1963. The average planimetric accuracy (relative to the synthetic aperture radar (SAR) mosaic) is about 168 m from statistical measurements of 182 points in topographically flat areas and 186 m from statistical measurements of 201 points in mountainous areas. The two mosaic products have been delivered to the U.S. National Snow and Ice Data Center (NSIDC) for use by the research community.

8.
索文凯, 胡文刚, 张炎, 张彪. 《激光技术》 (Laser Technology), 2019, 43(5): 691-696
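To make full use of the three-dimensional spatial information in consecutive visual images and to solve the positioning problem during autonomous UAV landing, a positioning method based on the image space of the same object imaged at different times is proposed, building on the positioning principles of the dense 3-D point cloud method and the optical flow method. Using theoretical derivation and annotated diagrams, the motion of individual pixels and of the whole image is solved, so that the changes in shape and magnitude between consecutive frames are decomposed into the relative spatial motion of the UAV and the reference object. Combined with the known motion parameters of the reference object, the flight position and attitude of the UAV are derived, completing the study of a UAV spatial positioning method based on optical visual images. The results show that this study offers a useful reference for vision systems that must achieve spatial positioning on their own during UAV landing and recovery.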

9.
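Image stitching quality evaluation method   Cited by: 3 (self-citations: 0, citations by others: 3)
Building on the principles of existing image quality assessment methods, a new stitching quality evaluation method based on image edge information is proposed. Tailored to the characteristics of stitching results, the method first extracts edges from the image under evaluation. It then uses the edge contours of the images before and after stitching, combines pixel error information with structural information, and evaluates the stitched image according to the relationship between statistics such as the mean and variance and the main factors that degrade stitching quality (stitching misalignment and abrupt brightness changes). The results agree more closely with subjective human visual judgments of stitching quality, and reflect fairly accurately both the true quality of the stitched image and the performance of the stitching algorithm used.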

10.
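Motion-compensated frame interpolation is currently the main approach to frame rate up-conversion. To reduce blocking artifacts in interpolated frames and to lower the computational load for real-time high-definition video applications, this paper proposes a multi-level block-matching motion estimation algorithm for video frame rate up-conversion based on 3-D recursive search (3-D RS). The algorithm combines 3-D RS with bidirectional motion estimation: it first performs three-level coarse-to-fine motion estimation on adjacent frames of the sequence, then smooths the motion vector field with a simplified median filter, and finally obtains the interpolated frame by linear interpolation compensation. Experimental results show that, compared with existing motion-compensated frame interpolation algorithms, the proposed algorithm improves both the subjective and objective quality of interpolated frames, has low complexity, and is highly practical.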
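The last two stages (smoothing the motion vector field, then linear interpolation) can be sketched as follows. This is a minimal illustration that assumes a component-wise 3×3 median as the "simplified median filter" and a plain weighted average of two motion-aligned blocks as the interpolation step; the abstract does not specify either, and the function names are hypothetical.

```python
from statistics import median


def smooth_motion_field(field):
    """Component-wise 3x3 median of a 2-D grid of (dx, dy) block motion vectors."""
    rows, cols = len(field), len(field[0])
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the neighbours that exist; the window is clipped at borders,
            # so an even-sized window yields the mean of its two middle values.
            window = [field[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append((median(v[0] for v in window),
                            median(v[1] for v in window)))
        smoothed.append(out_row)
    return smoothed


def interpolate_block(prev_block, next_block, alpha=0.5):
    """Linearly blend two motion-aligned blocks.

    alpha is the temporal position of the new frame between the previous
    frame (alpha = 0) and the next frame (alpha = 1).
    """
    return [[(1.0 - alpha) * p + alpha * n for p, n in zip(prev_row, next_row)]
            for prev_row, next_row in zip(prev_block, next_block)]
```

The median step removes isolated outlier vectors, which are a typical source of blocking artifacts in the interpolated frame.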

11.
This paper presents an image-based method for virtual bronchoscope with photo-realistic rendering. The technique is based on recovering bidirectional reflectance distribution function (BRDF) parameters in an environment where the choice of viewing positions, directions, and illumination conditions is restricted. Video images of bronchoscopy examinations are combined with patient-specific three-dimensional (3-D) computed tomography data through two-dimensional (2-D)/3-D registration, and shading model parameters are then recovered by exploiting the restricted lighting configurations imposed by the bronchoscope. With the proposed technique, the recovered BRDF is used to predict the expected shading intensity, allowing a texture map independent of lighting conditions to be extracted from each video frame. To correct for disocclusion artefacts, statistical texture synthesis is used to recreate the missing areas. New views not present in the original bronchoscopy video are rendered by evaluating the BRDF with different viewing and illumination parameters. This allows free navigation of the acquired 3-D model with enhanced photo-realism. To assess the practical value of the proposed technique, a detailed visual scoring that involves both real and rendered bronchoscope images is conducted.

12.
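This work studies and implements the splicing of two real-time MPEG-2 PS streams output by a video capture card. During splicing, continuous playback of the video stream is achieved by selecting the splice points correctly and adjusting system parameters such as PTS and DTS. An immediate forward search over video frames avoids the buffer underflow that the waiting-based search of traditional methods can cause, and exploiting the correlation between successive vbv delay values resolves buffer overflow. Together these measures avoid the mosaic artifacts, corrupted pictures, black frames, and flicker that often appear at splice points.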
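The PTS/DTS adjustment at a splice point can be illustrated with a small example. This is a minimal sketch that relies only on the fact that MPEG-2 PTS and DTS are 33-bit counters of a 90 kHz clock; the frame dictionaries, the `restamp` function, and the rule of placing the first incoming frame one frame duration after the last outgoing one are assumptions for illustration, and vbv delay handling is not modeled.

```python
PTS_MODULUS = 1 << 33       # PTS/DTS are 33-bit values and wrap around
TICKS_PER_SECOND = 90_000   # PTS/DTS are expressed in 90 kHz clock ticks


def restamp(frames, last_pts_out, frame_duration):
    """Shift the timestamps of the incoming stream to follow the outgoing one.

    frames: list of {"pts": int, "dts": int} in decode order (hypothetical layout).
    last_pts_out: PTS of the last presented frame of the outgoing stream.
    frame_duration: ticks per frame, e.g. 3600 for 25 frames/s.
    Assumes the timestamps do not wrap inside the incoming segment itself.
    """
    first_pts_in = min(f["pts"] for f in frames)
    target = (last_pts_out + frame_duration) % PTS_MODULUS
    offset = (target - first_pts_in) % PTS_MODULUS
    return [{"pts": (f["pts"] + offset) % PTS_MODULUS,
             "dts": (f["dts"] + offset) % PTS_MODULUS}
            for f in frames]
```

Applying the same offset to both PTS and DTS preserves the decode-to-present delay of every frame, so only the position of the incoming stream on the time axis changes.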

13.
New methods for dynamic mosaicking   Cited by: 3 (self-citations: 0, citations by others: 3)
This paper presents a new technique for the creation of a sequence of mosaic images from an original video shot. A mosaic image represents, on a single image, the scene background seen over the whole sequence, and its creation requires the estimation of the warping parameters and the use of a blending technique. The warping parameters permit one to represent each original image in the mosaic reference. An estimation method based on a direct comparison between the current original image and the previously calculated mosaic is proposed. A new analytic minimization criterion is also designed to optimize the determination of the blending coefficient used for the update of the mosaic image with a new original image. This criterion is based on constraints related to the temporal variations of the background, the temporal delay, and the resolution of the created mosaic images, and its minimization can be performed analytically. Finally, the proposed method is applied to the creation of new video sequences in which the camera point of view, the camera focal length, or the image size is modified. This approach has been tested and validated on real video sequences with large camera motion.

14.
This paper presents a fast technique for fine estimation of two-dimensional (2-D) parameters, based on a parabolic interpolation of the same ambiguity function samples, and aimed at block-oriented estimation of the spatial shift between pairs of images in video sequences. Expressions for the bias and variance of the position error and the prediction error are derived. The method is tested using a synthetically generated autocorrelation function, varying the directionality and the eccentricity factor, in order to compare the performance of the proposed 2-D estimator with that of two separate one-dimensional (1-D) estimators. The method has also been applied in vision systems, showing encouraging results for estimating the parameters of sophisticated global motion models from real images.
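For reference, the baseline of two separate 1-D estimators can be sketched with the standard three-point parabolic fit. This is a minimal illustration of that baseline only, not of the 2-D estimator proposed in the paper; the function names and the list-of-rows surface layout are assumptions.

```python
def parabolic_peak_offset(y_prev, y_peak, y_next):
    """Sub-sample offset of the vertex of the parabola through three samples.

    The samples are taken at positions -1, 0 and +1; the returned offset is
    relative to the middle sample.
    """
    denom = y_prev - 2.0 * y_peak + y_next
    if denom == 0:
        return 0.0  # the three samples are collinear: no refinement possible
    return 0.5 * (y_prev - y_next) / denom


def refine_peak_separable(surface):
    """Refine the integer argmax of a 2-D surface with two 1-D parabolic fits."""
    rows, cols = len(surface), len(surface[0])
    r, c = max(((i, j) for i in range(rows) for j in range(cols)),
               key=lambda ij: surface[ij[0]][ij[1]])
    dr = dc = 0.0
    if 0 < r < rows - 1:   # vertical fit needs a neighbour above and below
        dr = parabolic_peak_offset(surface[r - 1][c], surface[r][c], surface[r + 1][c])
    if 0 < c < cols - 1:   # horizontal fit needs a neighbour on each side
        dc = parabolic_peak_offset(surface[r][c - 1], surface[r][c], surface[r][c + 1])
    return r + dr, c + dc
```

The separable fit is exact when the surface is a sum of a quadratic in each axis; for a tilted, elongated peak it ignores the cross term, which is the situation the abstract probes by varying directionality and eccentricity.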

15.
This paper integrates fully automatic video object segmentation and tracking, including detection and assignment of uncovered regions, in a 2-D mesh-based framework. Particular contributions of this work are (i) a novel video object segmentation method that is posed as a constrained maximum contrast path search problem along the edges of a 2-D triangular mesh, and (ii) a 2-D mesh-based uncovered region detection method along the object boundary as well as within the object. At the first frame, an optimal number of feature points are selected as nodes of a 2-D content-based mesh. These points are classified as moving (foreground) and stationary nodes based on multi-frame node motion analysis, yielding a coarse estimate of the foreground object boundary. Color differences across triangles near the coarse boundary are employed for a maximum contrast path search along the edges of the 2-D mesh to refine the boundary of the video object. Next, we propagate the refined boundary to the subsequent frame by using motion vectors of the node points to form the coarse boundary at the next frame. We detect occluded regions by using motion-compensated frame differences and range-filtered edge maps. The boundaries of detected uncovered regions are then refined by using the search procedure. These regions are either appended to the foreground object or tracked as new objects. The segmentation procedure is re-initialized when the number of unreliable motion vectors exceeds a threshold. The proposed scheme is demonstrated on several video sequences.

16.
Immersive projection technology has become very popular as a virtual reality display system. A 2.5-D video avatar method was proposed and developed. The 2.5-D video avatar was created using a depth map generated by a stereo camera, and it was superimposed on the shared virtual world in real time. A 2.5-D video avatar was also transmitted between two immersive projection displays, computer augmented booth for image navigation (CABIN) and COSMOS, which were connected by a high-bandwidth ATM network. In addition, we experimentally evaluated the accuracy of pointing when using the 2.5-D video avatar.

17.
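To study the strain of rock under compression in real time, conveniently and quickly, digital real-time holography was used. A 15 cm × 15 cm × 1.5 cm rock specimen was mounted on a three-dimensional loading frame as the object of study. In the optical path of the digital real-time holography setup, a CMOS sensor recorded the interference fringes showing the stress field distribution and its changes while the rock deformed under compression in different conditions. The recorded video file was then processed and restored using the MATLAB programming language, yielding the real-time changes of the rock under compression at room temperature, under varying temperature, and with a pre-drilled hole. The results show that digital real-time holography is a high-precision, non-destructive, full-field measurement method. It can conveniently and quickly capture the real-time changes of rock under different loading conditions, and can also capture the stress-strain behavior of other materials similar to rock.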

18.
This paper presents an integrated method to identify an object pattern from an image and track its movement over a sequence of images. The sequence of images comes from a single perspective video source, which captures data from a precalibrated scene. This information is used to reconstruct the scene in three dimensions (3-D) within a virtual environment where a user can interact with and manipulate the system. The steps that are performed include the following: i) Identify an object pattern from a two-dimensional perspective video source. The user outlines the region of interest (ROI) in the initial frame; the procedure builds a refined mask of the dominant object within the ROI using the morphological watershed algorithm. ii) The object pattern is tracked between frames using object matching within the mask provided by the previous and next frames, computing the motion parameters. iii) The identified object pattern is matched with a library of shapes to identify a corresponding 3-D object. iv) A virtual environment is created to reconstruct the scene in 3-D using the 3-D object and the motion parameters. This method can be applied to real-life application problems, such as traffic management and material flow congestion analysis.

19.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction. The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences (the two-dimensional (2-D) motion field) between the frames and the segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bit-rate criterion using the 3-D motion parameters. The resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to the lossless encoding of the resultant depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.

20.
Four-dimensional (4-D) imaging to capture the three-dimensional (3-D) structure and motion of the heart in real time is an emerging trend. We present here our method of interactive multiplanar reformatting (MPR), i.e., the ability to visualize any chosen anatomical cross section of 4-D cardiac images and to change its orientation smoothly while maintaining the original heart motion. Continuous animation to show the time-varying 3-D geometry of the heart and smooth dynamic manipulation of the reformatted planes, as well as large image size (100-300 MB), make MPR challenging. Our solution exploits the hardware-accelerated 3-D texture mapping capability of high-end commercial PC graphics boards. Customization of volume subdivision and caching concepts to periodic cardiac data allows us to use this hardware effectively and efficiently. We are able to visualize and smoothly interact with real-time 3-D ultrasound cardiac images at the desired frame rate (25 Hz). The developed methods are applicable to MPR of one or more 3-D and 4-D medical images, including 4-D cardiac images collected in a gated fashion.
