Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Ye Lu  Ze-Nian Li 《Pattern recognition》2008,41(3):1159-1172
A new method of video object extraction is proposed to automatically extract the object of interest from actively acquired videos. Traditional video object extraction techniques often operate under the assumption of homogeneous object motion and extract various parts of the video that are motion consistent as objects. In contrast, the proposed active video object extraction (AVOE) approach assumes that the object of interest is being actively tracked by a non-calibrated camera under general motion and classifies the possible movements of the camera that result in the 2D motion patterns as recovered from the image sequence. Consequently, the AVOE method is able to extract the single object of interest from the active video. We formalize the AVOE process using notions from Gestalt psychology. We define a new Gestalt factor called “shift and hold” and present 2D object extraction algorithms. Moreover, since an active video sequence naturally contains multiple views of the object of interest, we demonstrate that these views can be combined to form a single 3D object regardless of whether the object is static or moving in the video.

2.
3D video [IEEE Multimedia (1997) 18] is the ultimate image media recording dynamic visual events in the real world as is; it records time varying 3D object shape with high fidelity surface properties (i.e., color and texture). Its applications cover wide varieties of personal and social human activities: entertainment (e.g., 3D game and 3D TV), education (e.g., 3D animal picture books), sports (e.g., sport performance analysis), medicine (e.g., 3D surgery monitoring), culture (e.g., 3D archive of traditional dances), and so on. In this paper, we propose: (1) a PC cluster system for real-time reconstruction of dynamic 3D object action from multi-view video images, (2) a deformable 3D mesh model for reconstructing the accurate dynamic 3D object shape, and (3) an algorithm of rendering natural-looking texture on the 3D object surface from the multi-view video images. Experimental results with quantitative performance evaluations demonstrate the effectiveness of these methods in generating high fidelity 3D video from multi-view video images.

3.
We present an algorithm for acquiring the 3D surface geometry and motion of a dynamic piecewise-rigid object using a single depth video camera. The algorithm identifies and tracks the rigid components in each frame, while accumulating the geometric information acquired over time, possibly from different viewpoints. The algorithm also reconstructs the dynamic skeleton of the object, thus can be used for markerless motion capture. The acquired model can then be animated to novel poses. We show the results of the algorithm applied to synthetic and real depth video.

4.
Natural motion synthesis of virtual humans has been studied extensively; however, motion control of virtual characters that actively respond to complex dynamic environments is still a challenging task in computer animation. Creating realistic human motions for character animation in a dynamically varying environment in movies, television and video games is labor- and cost-intensive, animator-driven work. To solve this problem, we propose a novel approach to motion synthesis that applies optimal path planning to direct motion synthesis, generating realistic character motions in response to a complex dynamic environment. In our framework, SIPP (Safe Interval Path Planning) search is implemented to plan a globally optimal path in complex dynamic environments. Three types of control anchors for motion synthesis are defined for the first time and extracted along the planned path: turning anchors, height anchors and time anchors. Directed by these control anchors, highly interactive motions of the virtual character are synthesized by a motion field, which produces a wide variety of natural motions and has high control agility to handle complex dynamic environments. Experimental results show that our framework is capable of synthesizing motions of virtual humans that adapt naturally to complex dynamic environments, guaranteeing both the optimal path and realistic motion simultaneously.
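
SIPP plans over "safe intervals", the time spans during which a location is free of moving obstacles. The sketch below illustrates only that building block, not the planner or the paper's implementation: given the time intervals during which obstacles occupy one cell, it returns the complementary safe intervals. The function name, the interval representation, and the finite planning horizon are assumptions made for illustration.

```python
def safe_intervals(collision_intervals, horizon):
    """Return the time intervals in [0, horizon] not covered by any collision interval.

    collision_intervals: iterable of (start, end) times during which a cell is occupied.
    horizon: end of the planning window (hypothetical parameter).
    """
    safe = []
    free_from = 0.0  # earliest time not yet known to be blocked
    for lo, hi in sorted(collision_intervals):
        if lo > free_from:
            # Gap between the blocked time seen so far and the next obstacle.
            safe.append((free_from, min(lo, horizon)))
        free_from = max(free_from, hi)
        if free_from >= horizon:
            break
    if free_from < horizon:
        safe.append((free_from, horizon))
    return safe


# Overlapping obstacle passes (2-4 and 3-5) merge into one blocked span.
print(safe_intervals([(2, 4), (3, 5), (8, 9)], horizon=10))
```

A planner built on this would store one search state per (cell, safe interval) pair rather than per (cell, time step), which is what keeps the search space small.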

5.
4D Video Textures (4DVT) introduce a novel representation for rendering video-realistic interactive character animation from a database of 4D actor performance captured in a multiple camera studio. 4D performance capture reconstructs dynamic shape and appearance over time but is limited to free-viewpoint video replay of the same motion. Interactive animation from 4D performance capture has so far been limited to surface shape only. 4DVT is the final piece in the puzzle enabling video-realistic interactive animation through two contributions: a layered view-dependent texture map representation which supports efficient storage, transmission and rendering from multiple view video capture; and a rendering approach that combines multiple 4DVT sequences in a parametric motion space, maintaining video quality rendering of dynamic surface appearance whilst allowing high-level interactive control of character motion and viewpoint. 4DVT is demonstrated for multiple characters and evaluated both quantitatively and through a user-study which confirms that the visual quality of captured video is maintained. The 4DVT representation achieves >90% reduction in size and halves the rendering cost.

6.
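A 3D motion capture method is proposed for flexible objects with point-like features. First, two calibrated high-speed cameras record video of the flexible object's motion, and the images are stereo-rectified. Next, the DoG (Difference of Gaussians) algorithm is used to locate the point features and extract the feature-point extrema. Then, matching pairs are searched for within a window of limited range to match the feature points of the left and right images. After that, 3D reconstruction is performed by triangulation. Finally, a search strategy is used to match points across the time sequence, achieving 3D motion capture of the dynamic flexible object, and the spatial coordinates, velocity, and acceleration are computed. Experimental results show that, compared with a method that captures flexible moving objects by matching feature points with the SIFT algorithm, the proposed method is more accurate.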
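
The triangulation and kinematics steps can be illustrated with a minimal sketch. It assumes an ideal rectified pinhole stereo pair, so depth follows from disparity as Z = f·B/d, and it uses central finite differences for velocity and acceleration. The parameter names (`f`, `baseline`, `cx`, `cy`) and the differencing scheme are assumptions for illustration; the abstract does not specify them.

```python
def triangulate_rectified(xl, xr, y, f, baseline, cx, cy):
    """Recover (X, Y, Z) of a point matched in a rectified stereo pair.

    xl, xr: column of the point in the left and right image (pixels).
    y: shared row after rectification (pixels).
    f: focal length in pixels; baseline: camera separation in metres.
    cx, cy: principal point (pixels).
    """
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f * baseline / disparity
    return ((xl - cx) * z / f, (y - cy) * z / f, z)


def finite_differences(track, dt):
    """Central-difference velocity and acceleration at the interior samples of a 3D track."""
    velocity, acceleration = [], []
    for i in range(1, len(track) - 1):
        prev, cur, nxt = track[i - 1], track[i], track[i + 1]
        velocity.append(tuple((n - p) / (2 * dt) for p, n in zip(prev, nxt)))
        acceleration.append(tuple((n - 2 * c + p) / dt ** 2 for p, c, n in zip(prev, cur, nxt)))
    return velocity, acceleration


# A point 100 px right of the principal point with 20 px disparity.
point = triangulate_rectified(420.0, 400.0, 240.0, f=800.0, baseline=0.1, cx=320.0, cy=240.0)
print(point)
```

With high-speed cameras `dt` is small, so noise in the reconstructed positions is amplified by the division by `dt ** 2`; smoothing the track before differencing is a common precaution.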

7.
《Real》1997,3(6):415-432
Real-time motion capture plays a very important role in various applications, such as 3D interfaces for virtual reality systems, digital puppetry, and real-time character animation. In this paper we tackle the problem of estimating and recognizing the motion of articulated objects using the optical motion capture technique. In addition, we present an effective method to control the articulated human figure in real time. The heart of this problem is the estimation of 3D motion and posture of an articulated, volumetric object using feature points from a sequence of multiple perspective views. Under some moderate assumptions such as smooth motion and known initial posture, we develop a model-based technique for the recovery of the 3D location and motion of a rigid object using a variation of the Kalman filter. The posture of the 3D volumetric model is updated by the 2D image flow of the feature points for all views. Two novel concepts, the hierarchical Kalman filter (HKF) and the adaptive hierarchical structure (AHS) incorporating the kinematic properties of the articulated object, are proposed to extend our formulation for the rigid object to the articulated one. Our formulation also allows us to avoid two classic problems in 3D tracking: the multi-view correspondence problem and the occlusion problem. By adding more cameras and placing them appropriately, our approach can deal with the motion of the object in a very wide area. Furthermore, multiple objects can be handled by managing multiple AHSs and processing multiple HKFs. We show the validity of our approach using synthetic data acquired simultaneously from multiple virtual cameras in a virtual environment (VE) and real data derived from a moving light display with walking motion. The results confirm that the model-based algorithm works well on the tracking of multiple rigid objects.
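
For readers unfamiliar with the underlying filter, here is a minimal sketch of one predict/update cycle of a standard linear Kalman filter. It is not the paper's hierarchical variant: it tracks a single coordinate with a constant-velocity model, and the noise levels `q` and `r` are arbitrary illustrative values.

```python
import numpy as np


def kalman_step(x, P, z, dt=1.0, q=1e-3, r=1e-1):
    """One predict/update cycle for state x = [position, velocity] given a position measurement z.

    x: state estimate, shape (2,); P: state covariance, shape (2, 2); z: measurement, shape (1,).
    q, r: process and measurement noise variances (illustrative assumptions).
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity motion model
    H = np.array([[1.0, 0.0]])             # only the position is observed
    Q = q * np.eye(2)
    R = np.array([[r]])

    # Predict the state and its uncertainty one step ahead.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Correct the prediction with the measurement.
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new


# Track a point moving at 2 units per frame, starting from an uninformed estimate.
x, P = np.zeros(2), 10.0 * np.eye(2)
for k in range(1, 51):
    x, P = kalman_step(x, P, np.array([2.0 * k]))
print(x)  # position and velocity estimates after 50 frames
```

The smooth-motion assumption mentioned in the abstract corresponds here to the constant-velocity matrix `F`; a hierarchical formulation would run such filters per body segment and couple them through the kinematic chain.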

8.
While there are various commercial-strength editing tools available today for still images, object-based manipulation of real-world video footage is still a challenging problem. In this system paper, we present a framework for interactive video editing. Our focus is on footage from a single, conventional video camera. By relying on spatio-temporal editing techniques operating on the video cube, we do not need to recover 3D scene geometry. Our framework is capable of removing and inserting objects, object motion editing, non-rigid object deformations, keyframe interpolation, as well as emulating camera motion. We demonstrate how movie shots with moderate complexity can be persuasively modified during post-processing.

9.
Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometries to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow to create plausible editing and mixing of raw video footage using only sparse structure points (SSP) directly recovered from the raw sequences. First, we utilize user-scribbles to structure the point representations obtained using structure-from-motion on the input videos. The resultant structure points, even when noisy and sparse, are then used to enable various video edits in 3D, including view perturbation, keyframe animation, object duplication and transfer across videos, etc. Specifically, we describe how to synthesize object images from new views adopting a novel image-based rendering technique using the SSPs as proxy for the missing 3D scene information. We propose a structure-preserving image warping on multiple input frames adaptively selected from object video, followed by a spatio-temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate plausible video sequence mimicking real-world interactions. We demonstrate our system on a variety of input videos to produce complex edits, which are otherwise difficult to achieve.

10.
We present a technique for coupling simulated fluid phenomena that interact with real dynamic scenes captured as a binocular video sequence. We first process the binocular video sequence to obtain a complete 3D reconstruction of the scene, including velocity information. We use stereo for the visible parts of 3D geometry and surface completion to fill the missing regions. We then perform fluid simulation within a 3D domain that contains the object, enabling one-way coupling from the video to the fluid. In order to maintain temporal consistency of the reconstructed scene and the animated fluid across frames, we develop a geometry tracking algorithm that combines optic flow and depth information with a novel technique for “velocity completion”. The velocity completion technique uses local rigidity constraints to hypothesize a motion field for the entire 3D shape, which is then used to propagate and filter the reconstructed shape over time. This approach not only generates smoothly varying geometry across time, but also simultaneously provides the necessary boundary conditions for one-way coupling between the dynamic geometry and the simulated fluid. Finally, we employ a GPU based scheme for rendering the synthetic fluid in the real video, taking refraction and scene texture into account.

11.
A video object extraction algorithm based on spatio-temporal information in dynamic scenes (cited 2 times: 0 self-citations, 2 citations by others)
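In practical applications, many video sequences have moving backgrounds, which complicates the extraction of video objects from them. To address this, a video object extraction algorithm for dynamic scenes based on motion estimation and a graph pyramid is proposed. The algorithm first uses phase correlation to obtain motion vectors; because this avoids the influence of illumination changes in the video sequence, efficiency and robustness are improved. Global motion estimation is then performed with a parametric model to obtain the final motion mask. Next, the graph pyramid algorithm spatially segments the image region inside the current mask, and finally the semantic video object is extracted. Compared with existing algorithms, when extracting moving objects from video streams with dynamic scenes, this algorithm effectively avoids precise background compensation, so it not only saves computation but also extracts semantic objects with higher accuracy. Experiments show that the algorithm performs well in segmenting both rigid and non-rigid moving objects in dynamic scenes.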
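
Phase correlation, the first step named above, can be sketched in a few lines. The version below is a generic textbook form, not the paper's implementation: it assumes two equally sized grayscale frames related by an integer circular shift and returns that shift. The function name is hypothetical. Because only the phase of the cross-power spectrum is kept, a global brightness gain does not move the correlation peak, which is the illumination robustness the abstract refers to.

```python
import numpy as np


def phase_correlation_shift(ref, cur):
    """Estimate the integer (dy, dx) circular shift that maps ref onto cur."""
    F_ref = np.fft.fft2(ref)
    F_cur = np.fft.fft2(cur)
    cross = F_cur * np.conj(F_ref)
    # Normalise to unit magnitude so that only phase (i.e. displacement) remains.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    dy, dx = (int(p) if p <= n // 2 else int(p) - n for p, n in zip(peak, corr.shape))
    return dy, dx


rng = np.random.default_rng(0)
frame = rng.random((64, 64))
# Shift the frame and change its brightness; the estimated shift is unaffected.
moved = 1.7 * np.roll(frame, (3, -5), axis=(0, 1)) + 10.0
print(phase_correlation_shift(frame, moved))
```

Real frames are not circularly shifted, so in practice the inputs are usually windowed (for example with a Hann window) and the method is applied per block to obtain a motion vector field.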

12.
Steering and navigation are important components of character animation systems to enable them to autonomously move in their environment. In this work, we propose a synthetic vision model that uses visual features to steer agents through dynamic environments. Our agents perceive optical flow resulting from their relative motion with the objects of the environment. The optical flow is then segmented and processed to extract visual features such as the focus of expansion and time-to-collision. Then, we establish the relations between these visual features and the agent motion, and use them to design a set of control functions which allow characters to perform object-dependent tasks, such as following, avoiding and reaching. Control functions are then combined to let characters perform more complex navigation tasks in dynamic environments, such as reaching a goal while avoiding multiple obstacles. Agent's motion is achieved by local minimization of these functions. We demonstrate the efficiency of our approach through a number of scenarios. Our work sets the basis for building a character animation system which imitates human sensorimotor actions. It opens new perspectives to achieve realistic simulation of human characters taking into account perceptual factors, such as the lighting conditions of the environment.
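
The link between optical flow, the focus of expansion (FOE) and time-to-collision can be made concrete with a short sketch. It assumes the simplest case, pure translation toward a static surface, where an image point at distance r from the FOE moving radially outward at speed v has time-to-collision r / v. The function name and the choice of frames as the time unit are assumptions; the paper's own feature extraction is not reproduced here.

```python
import math


def time_to_collision(point, flow, foe):
    """Time-to-collision in frames for one image point.

    point: (x, y) pixel position; flow: (u, v) optical flow in pixels per frame;
    foe: (x, y) focus of expansion. Returns math.inf when the point is not expanding.
    """
    rx, ry = point[0] - foe[0], point[1] - foe[1]
    distance = math.hypot(rx, ry)
    if distance == 0:
        return math.inf  # flow vanishes at the FOE, so it carries no timing information
    # Component of the flow pointing away from the FOE.
    radial_speed = (flow[0] * rx + flow[1] * ry) / distance
    if radial_speed <= 0:
        return math.inf  # contracting or tangential flow: no approach
    return distance / radial_speed


# 10 px from the FOE, moving outward at 2 px/frame.
print(time_to_collision((110.0, 100.0), (2.0, 0.0), (100.0, 100.0)))  # 5.0
```

A steering controller could, for instance, penalise headings whose minimum time-to-collision over the visible obstacle pixels falls below a threshold.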

13.
In this paper, we show how to estimate, accurately and efficiently, the 3D motion of a rigid object and time-varying lighting in a dynamic scene. This is achieved in an inverse compositional tracking framework with a novel warping function that involves a 2D → 3D → 2D transformation. This also allows us to extend traditional two frame inverse compositional tracking to a sequence of frames, leading to even higher computational savings. We prove the theoretical convergence of this method and show that it leads to significant reduction in computational burden. Experimental analysis on multiple video sequences shows impressive speed-up over existing methods while retaining a high level of accuracy.

14.
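In recent years, fire accidents have occurred frequently, seriously affecting the ecological environment and the social economy; video surveillance systems play a very important role in both fire prevention and environmental monitoring. To address the shortcomings of traditional video flame detection methods, which require hand-crafted flame features and suffer from low recognition rates and high false detection rates, a flame detection algorithm based on feature detection, multi-object tracking, and deep learning is proposed. Dynamic targets in the video are extracted with a Gaussian mixture model motion detection method and then filtered with a color model combining HSI and RGB to obtain suspected flame targets. The extracted targets are tracked with a multi-object tracking algorithm, and targets that persist stably are finally classified by a CaffeNet model to obtain the flame detection result. Experiments show that the algorithm detects video flames accurately and recognizes flames effectively, reaching an average recognition accuracy of 98.79% on a flame video dataset, and that it can meet the requirements of real-time fire detection.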

15.
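The temporal correlation between video frames expressed by global motion (camera motion) conveys the high-level semantic information of a video sequence better than other video features. To obtain the global motion of a video effectively and quickly, a new global motion estimation algorithm based on singular value decomposition (SVD) is proposed, building on a study of video motion estimation methods. The method first obtains the local motion field by block matching and estimates the global motion parameters using the singular value decomposition of a matrix; it then applies morphological motion filtering to obtain a coarse mask image of the foreground moving objects, and finally combines this mask image with edge information to segment the moving objects. Experiments show that the proposed algorithm can segment moving objects in video sequences that exhibit global motion.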
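
One common way to estimate global motion parameters from a block motion field with an SVD is a least-squares fit solved through the pseudo-inverse. The sketch below takes that route under stated assumptions: the abstract does not name the motion model, so a six-parameter affine model is assumed here, and the function name is hypothetical. Blocks whose vectors fit the global model poorly are candidates for foreground.

```python
import numpy as np


def fit_affine_global_motion(points, vectors):
    """Least-squares affine fit of block motion vectors, solved with an explicit SVD.

    points: (N, 2) block centres (x, y); vectors: (N, 2) block motion vectors (vx, vy).
    Returns params of shape (3, 2), with vx = a*x + b*y + c in column 0 and vy in column 1,
    and the per-block residual magnitude.
    """
    points = np.asarray(points, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    design = np.hstack([points, np.ones((len(points), 1))])  # rows are [x, y, 1]

    # Pseudo-inverse via SVD, discarding numerically zero singular values.
    U, s, Vt = np.linalg.svd(design, full_matrices=False)
    tol = max(design.shape) * np.finfo(float).eps * s.max()
    s_inv = np.zeros_like(s)
    s_inv[s > tol] = 1.0 / s[s > tol]
    params = Vt.T @ (s_inv[:, None] * (U.T @ vectors))

    residual = np.linalg.norm(design @ params - vectors, axis=1)
    return params, residual


# An 8x8 grid of blocks under a pan of (2, -1) px plus a slight zoom, with one moving block.
centres = np.array([(x, y) for y in range(8) for x in range(8)], dtype=float)
field = centres * 0.01 + np.array([2.0, -1.0])
field[27] += (10.0, 10.0)  # this block follows an independently moving object
params, residual = fit_affine_global_motion(centres, field)
print(int(np.argmax(residual)))  # index of the block that disagrees most with the camera motion
```

Thresholding `residual` gives a rough foreground map comparable in role to the coarse mask described above; a plain least-squares fit is sensitive to many outliers, so a robust refit after removing high-residual blocks is a natural extension.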

16.
17.
18.
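To address the poor robustness of current graph-cut methods for video object segmentation in scenes with complex environments, camera movement, and unstable illumination, a video object segmentation algorithm combining optical flow and graph cuts is proposed. The main idea is to analyze the motion information of the foreground object to obtain prior knowledge of the foreground region in a single frame, thereby improving the segmentation result. The paper first collects the motion information in the video through the optical flow field and extracts a prior region for the foreground object; it then builds a graph-cut model that combines the foreground and background prior regions to segment the foreground object. Finally, to improve robustness across different scenes, the paper improves the traditional geodesic saliency model and, based on the intrinsic temporal smoothness of video, proposes an optimization mechanism for a dynamic location model based on a Gaussian mixture model. Experimental results on two standard datasets show that, compared with other current video object segmentation algorithms, the proposed algorithm lowers the segmentation error rate and effectively improves robustness in a variety of scenes.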

19.
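Video moving object segmentation is a fundamental problem in computer vision and video processing. In dynamic scenes where the camera undergoes global motion, accurately segmenting moving objects remains a difficult and actively studied problem. This paper proposes a video moving object segmentation algorithm for dynamic scenes based on global motion compensation and kernel density detection. First, a match-weighted global motion estimation and compensation algorithm is proposed to eliminate the influence of background motion on moving object segmentation in dynamic scenes. Second, a non-parametric kernel density estimation method is used to estimate, for each pixel, the probability densities of belonging to the foreground and to the background; the moving object segmentation result is obtained by comparing the foreground and background probabilities and applying morphological processing. Experimental results show that the method is simple to implement and effectively improves the accuracy of moving object segmentation in dynamic scenes.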
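
The per-pixel decision described above can be sketched as follows. This is a simplified illustration, not the paper's method: it uses a one-dimensional Gaussian kernel on grayscale intensity, a fixed bandwidth, and hand-supplied sample sets, all of which are assumptions. After motion compensation, a pixel is labelled foreground when its intensity is more probable under the foreground samples than under the background samples.

```python
import math


def gaussian_kde(value, samples, bandwidth):
    """Kernel density estimate at `value` from 1D samples using a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples)


def is_foreground(value, background_samples, foreground_samples, bandwidth=10.0):
    """Label a pixel by comparing its density under the foreground and background models."""
    p_fg = gaussian_kde(value, foreground_samples, bandwidth)
    p_bg = gaussian_kde(value, background_samples, bandwidth)
    return p_fg > p_bg


# Recent intensities at one motion-compensated pixel location (illustrative values).
background = [100, 102, 98, 101]
foreground = [200, 205, 195]
print(is_foreground(199, background, foreground))  # True
print(is_foreground(101, background, foreground))  # False
```

The resulting binary map is typically noisy at object borders, which is why a morphological clean-up (opening and closing) follows the comparison step.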

20.
In this paper, we present a theory for combining the effects of motion, illumination, 3D structure, albedo, and camera parameters in a sequence of images obtained by a perspective camera. We show that the set of all Lambertian reflectance functions of a moving object, at any position, illuminated by arbitrarily distant light sources, lies "close" to a bilinear subspace consisting of nine illumination variables and six motion variables. This result implies that, given an arbitrary video sequence, it is possible to recover the 3D structure, motion, and illumination conditions simultaneously using the bilinear subspace formulation. The derivation builds upon existing work on linear subspace representations of reflectance by generalizing it to moving objects. Lighting can change slowly or suddenly, locally or globally, and can originate from a combination of point and extended sources. We experimentally compare the results of our theory with ground truth data and also provide results on real data by using video sequences of a 3D face and the entire human body with various combinations of motion and illumination directions. We also show results of our theory in estimating 3D motion and illumination model parameters from a video sequence.
