期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Capturing Intention‐based Full‐Frame Video Stabilization

Bing‐Yu Chen Ken‐Yi Lee Wei‐Ting Huang Jong‐Shan Lin 《Computer Graphics Forum》2008,27(7):1805-1814

Annoying shaky motion is one of the significant problems in home videos, since hand shake is an unavoidable effect when capturing by using a hand‐held camcorder. Video stabilization is an important technique to solve this problem, but the stabilized videos resulting from some current methods usually have decreased resolution and are still not so stable. In this paper, we propose a robust and practical method of full‐frame video stabilization while considering user's capturing intention to remove not only the high frequency shaky motions but also the low frequency unexpected movements. To guess the user's capturing intention, we first consider the regions of interest in the video to estimate which regions or objects the user wants to capture, and then use a polyline to estimate a new stable camcorder motion path while avoiding the user's interested regions or objects being cut out. Then, we fill the dynamic and static missing areas caused by frame alignment from other frames to keep the same resolution and quality as the original video. Furthermore, we smooth the discontinuous regions by using a three‐dimensional Poisson‐based method. After the above automatic operations, a full‐frame stabilized video can be achieved and the important regions and objects can also be preserved. 相似文献

2.

Multiplane Video Stabilization

Zhong‐Qiang Wang Lei Zhang Hua Huang 《Computer Graphics Forum》2013,32(7):265-273

This paper presents a novel video stabilization approach by leveraging the multiple planes structure of video scene to stabilize inter‐frame motion. As opposed to previous stabilization procedure operating in a single plane, our approach primarily deals with multiplane videos and builds their multiple planes structure for performing stabilization in respective planes. Hence, a robust plane detection scheme is devised to detect multiple planes by classifying feature trajectories according to reprojection errors generated by plane induced homographies. Then, an improved planar stabilization technique is applied by conforming to the compensated homography in each plane. Finally, multiple stabilized planes are coherently fused by content‐preserving image warps to obtain the output stabilized frames. Our approach does not need any stereo reconstruction, yet is able to produce commendable results due to awareness of multiple planes structure in the stabilization. Experimental results demonstrate the effectiveness and efficiency of our approach to robust stabilization on multiplane videos. 相似文献

3.

三焦点张量重投影视频稳像算法

下载免费PDF全文

王敬东薛重飞魏雪迎刘云霄《中国图象图形学报》2017,22(7):935-945

目的目前,特征点轨迹稳像算法无法兼顾轨迹长度、鲁棒性及轨迹利用率,因此容易造成该类算法的视频稳像结果扭曲失真或者局部不稳。针对此问题,提出基于三焦点张量重投影的特征点轨迹稳像算法。方法利用三焦点张量构建长虚拟轨迹,通过平滑虚拟轨迹定义稳定视图,然后利用三焦点张量将实特征点重投影到稳定视图,以此实现实特征点轨迹的平滑,最后利用网格变形生成稳定帧。结果对大量不同类型的视频进行稳像效果测试,并且与典型的特征点轨迹稳像算法以及商业软件进行稳像效果对比,其中包括基于轨迹增长的稳像算法、基于对极几何点转移的稳像算法以及商业软件Warp Stabilizer。本文算法的轨迹长度要求低、轨迹利用率高以及鲁棒性好,对于92%剧烈抖动的视频,稳像效果优于基于轨迹增长的稳像算法;对于93%缺乏长轨迹的视频以及71.4%存在滚动快门失真的视频,稳像效果优于Warp Stabilizer;而与基于对极几何点转移的稳像算法相比,退化情况更少,可避免摄像机阶段性静止、摄像机纯旋转等情况带来的算法失效问题。结论本文算法对摄像机运动模式和场景深度限制少,不仅适宜处理缺少视差、场景结构非平面、滚动快门失真等常见的视频稳像问题,而且在摄像机摇头、运动模糊、剧烈抖动等长轨迹缺乏的情况下,依然能取得较好的稳像效果,但该算法的时间性能还有所不足。相似文献

4.

Pixel-wise video stabilization

Zhongqiang Wang Hua Huang 《Multimedia Tools and Applications》2016,75(23):15939-15954

In this paper, we present a novel video stabilization method with a pixel-wise motion model. In order to avoid distortion introduced by traditional feature points based motion models, we focus on constructing a more accurate model to capture the motion in videos. By taking advantage of dense optical flow, we can obtain the dense motion field between adjacent frames and set up a pixel-wise motion model which is accurate enough. Our method first estimates dense motion field between adjacent frames. A PatchMatch based dense motion field estimation algorithm is proposed. This algorithm is specially designed for similar video frames rather than arbitrary images to reach higher speed and better performance. Then, a simple and fast smoothing algorithm is performed to make the jittered motion stabilized. After that, we warp input frames using a weighted average algorithm to construct the output frames. Some pixels in output frames may be still empty after the warping step, so in the last step, these empty pixels are filled using a patch based image completion algorithm. We test our method on many challenging videos and demonstrate the accuracy of our model and the effectiveness of our method. 相似文献

5.

Lazy texture selection based on active learning

Tian Xia Qing Wu Chun Chen Yizhou Yu 《The Visual computer》2010,26(3):157-169

Interactive selection of desired textures and textured objects from a video is a challenging problem in video editing. In this paper, we present a scalable framework that accurately selects textured objects with only moderate user interaction. Our method applies the active learning methodology, and the user only needs to label minimal initial training data and subsequent query data. An active learning algorithm uses these labeled data to obtain an initial classifier and iteratively improves it until its performance becomes satisfactory. A revised graph-cut algorithm based on the trained classifier has also been developed to improve the spatial coherence of selected texture regions. We show that our system is responsive even with videos of a large number of frames, and it frees the user from extensive labeling work. A variety of operations, such as color editing, compositing, and texture cloning, can be then applied to the selected textures to achieve interesting editing effects. 相似文献

6.

Real‐time facial expression transfer with single video camera

Shuang Liu Xiaosong Yang Zhao Wang Zhidong Xiao Jianjun Zhang 《Computer Animation and Virtual Worlds》2016,27(3-4):301-310

Facial expression transfer has been actively researched in the past few years. Existing methods either suffer from depth ambiguity or require special hardware. We present a novel marker‐less, real‐time facial transfer method that requires only a single video camera. We develop a robust model, which is adaptive to user‐specific facial data. It computes expression variances in real time and rapidly transfers them onto a target character either from images or videos. Our method can be applied to videos without prior camera calibration and focal adjustment. It enables realistic online facial expression editing and performance transferring in many scenarios such as video conference, news broadcasting, lip‐syncing for song performances and so on. With low computational cost and hardware requirement, our method tracks a single user at an average of 38fps and runs smoothly even in web browsers. Copyright © 2016 John Wiley & Sons, Ltd. 相似文献

7.

Full-frame video stabilization with motion inpainting 总被引：1，自引：0，他引：1

Matsushita Y Ofek E Ge W Tang X Shum HY 《IEEE transactions on pattern analysis and machine intelligence》2006,28(7):1150-1163

Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing smaller size stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighboring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos. 相似文献

8.

Video Shadow Removal Using Spatio‐temporal Illumination Transfer

下载免费PDF全文

Ling Zhang Yao Zhu Bin Liao Chunxia Xiao 《Computer Graphics Forum》2017,36(7):125-134

Shadow removal for videos is an important and challenging vision task. In this paper, we present a novel shadow removal approach for videos captured by free moving cameras using illumination transfer optimization. We first detect the shadows of the input video using interactive fast video matting. Then, based on the shadow detection results, we decompose the input video into overlapped 2D patches, and find the coherent correspondences between the shadow and non‐shadow patches via discrete optimization technique built on the patch similarity metric. We finally remove the shadows of the input video sequences using an optimized illumination transfer method, which reasonably recovers the illumination information of the shadow regions and produces spatio‐temporal shadow‐free videos. We also process the shadow boundaries to make the transition between shadow and non‐shadow regions smooth. Compared with previous works, our method can handle videos captured by free moving cameras and achieve better shadow removal results. We validate the effectiveness of the proposed algorithm via a variety of experiments. 相似文献

9.

Deep Video‐Based Performance Cloning

K. Aberman M. Shi J. Liao D. Lischinski B. Chen D. Cohen‐Or 《Computer Graphics Forum》2019,38(2):219-233

We present a new video‐based performance cloning technique. After training a deep generative network using a reference video capturing the appearance and dynamics of a target actor, we are able to generate videos where this actor reenacts other performances. All of the training data and the driving performances are provided as ordinary video segments, without motion capture or depth information. Our generative model is realized as a deep neural network with two branches, both of which train the same space‐time conditional generator, using shared weights. One branch, responsible for learning to generate the appearance of the target actor in various poses, uses paired training data, self‐generated from the reference video. The second branch uses unpaired data to improve generation of temporally coherent video renditions of unseen pose sequences. Through data augmentation, our network is able to synthesize images of the target actor in poses never captured by the reference video. We demonstrate a variety of promising results, where our method is able to generate temporally coherent videos, for challenging scenarios where the reference and driving videos consist of very different dance performances. 相似文献

10.

即时全变差优化的低延时视频稳像方法

下载免费PDF全文

刘天张磊黄华《中国图象图形学报》2018,23(2):293-302

目的传统的视频稳像方法为了获得理想的稳像效果,一般耗费较多的计算时间,且存在较长的延时。针对此问题,提出一种即时全变差优化的低延时视频稳像方法。方法首先利用特征点检测和匹配计算帧间单应变换,得到抖动视频的运动路径;然后通过即时全变差优化方法对抖动路径进行平滑优化,获得稳定的运动路径;最后通过运动补偿,生成稳定的视频。结果对公共视频数据集中的抖动视频进行稳像效果测试,并与当前稳像效果较好的几种稳像算法和商业软件进行效果和时间对比。在时间方面,统计了不同方法的每帧平均消耗时间和处理延迟帧数,不同于后期处理方法需要得到大部分视频帧才能够进行计算,本文算法能够在只有一帧延时的情况下获得最终的稳像结果,相比于MeshFlow方法有15%左右的速度提升;在稳像效果方面,计算了不同方法稳像后的视频扭曲率和裁剪率,并邀请非专业用户进行了稳定程度的主观判断,本文算法的实验结果并不输于目前被公认较好的3种后期稳像方法,优于Kalman滤波方法。结论本文所提稳像方法能够兼顾速度和有效性,相对于传统方法,更适合低延时要求的应用场景。相似文献

11.

Creating video art with evolutionary algorithms

Teresa Chambel Luís Correia Jnatas Manzolli Gonalo Dias Miguel Nuno A.C. Henriques Nuno Correia 《Computers & Graphics》2007,31(6):837-847

The boundaries of art are subjective, but the impetus for art is often associated with creativity, regarded with wonder and admiration along human history. Most interesting activities and their products are a result of creativity. The main goal of our approach is to explore new creative ways of editing and producing videos, using evolutionary algorithms. A creative evolutionary system makes use of evolutionary computation operators and properties and is designed to aid our own creative processes, and to generate results to problems that traditionally required creative people to solve. Our system is able to generate new videos or to help a user in doing so. New video sequences are combined and selected, based on their characteristics represented as video annotations, either by defining criteria or by interactively performing selections in the evolving population of video clips, in forms that can reflect editing styles. With evolving video, the clips can be explored through emergent narratives and aesthetics in ways that may reveal or inspire creativity in digital art. 相似文献

12.

Video recommendation over multiple information sources

Xiaojian Zhao Jin Yuan Meng Wang Guangda Li Richang Hong Zhoujun Li Tat-Seng Chua 《Multimedia Systems》2013,19(1):3-15

Video recommendation is an important tool to help people access interesting videos. In this paper, we propose a universal scheme to integrate rich information for personalized video recommendation. Our approach regards video recommendation as a ranking task. First, it generates multiple ranking lists by exploring different information sources. In particular, one novel source user’s relationship strength is inferred through the online social network and applied to recommend videos. Second, based on multiple ranking lists, a multi-task rank aggregation approach is proposed to integrate these ranking lists to generate a final result for video recommendation. It is shown that our scheme is flexible that can easily incorporate other methods by adding their generated ranking lists into our multi-task rank aggregation approach. We conduct experiments on a large dataset with 76 users and more than 11,000 videos. The experimental results demonstrate the feasibility and effectiveness of our approach. 相似文献

13.

基于弹幕情感分析和主题模型的视频推荐算法

朱思淼魏世伟魏思恒余敦辉《计算机应用》2021,41(10):2813-2819

针对互联网上大量自制视频缺少用户评分、推荐准确率不高的问题,提出一种融合弹幕情感分析和主题模型的视频推荐算法（VRDSA）。首先,对视频的弹幕评论进行情感分析,得到视频的情感向量,之后基于情感向量计算视频之间的情感相似度;同时,基于视频的标签建立主题模型来得到视频标签的主题分布,并使用主题分布计算视频之间的主题相似度;接着,对视频的情感相似度和主题相似度进行融合得到视频间的综合相似度;然后,结合视频间的综合相似度和用户的历史记录得到用户对视频的偏好度;同时通过视频的点赞量、弹幕量、收藏数等用户互动指标对视频的大众认可度进行量化,并结合用户历史记录计算出视频的综合认可度;最后,基于用户对视频的偏好度和视频的综合认可度预测用户对视频的认可度,并生成个性化推荐列表来完成视频的推荐。实验结果表明,与融合协同过滤和主题模型的弹幕视频推荐算法（DRCFT）以及嵌入LDA主题模型的协同过滤算法（ULR-itemCF）相比,所提算法推荐的准确率平均提高了17.1%,召回率平均提高了22.9%,F值平均提高了22.2%。所提算法对弹幕进行情感分析,并融合主题模型,以此来完成对视频的推荐,并且充分挖掘了弹幕数据的情感性,使得推荐结果更加准确。相似文献

14.

Controlling Motion Blur in Synthetic Long Time Exposures

M. Lancelle P. Dogan M. Gross 《Computer Graphics Forum》2019,38(2):393-403

In a photo, motion blur can be used as an artistic style to convey motion and to direct attention. In panning or tracking shots, a moving object of interest is followed by the camera during a relatively long exposure. The goal is to get a blurred background while keeping the object sharp. Unfortunately, it can be difficult to impossible to precisely follow the object. Often, many attempts or specialized physical setups are needed. This paper presents a novel approach to create such images. For capturing, the user is only required to take a casually recorded hand‐held video that roughly follows the object. Our algorithm then produces a single image which simulates a stabilized long time exposure. This is achieved by first warping all frames such that the object of interest is aligned to a reference frame. Then, optical flow based frame interpolation is used to reduce ghosting artifacts from temporal undersampling. Finally, the frames are averaged to create the result. As our method avoids segmentation and requires little to no user interaction, even challenging sequences can be processed successfully. In addition, artistic control is available in a number of ways. The effect can also be applied to create videos with an exaggerated motion blur. Results are compared with previous methods and ground truth simulations. The effectiveness of our method is demonstrated by applying it to hundreds of datasets. The most interesting results are shown in the paper and in the supplemental material. 相似文献

15.

Interactive Videos: Plausible Video Editing using Sparse Structure Points

下载免费PDF全文

Chia‐Sheng Chang Hung‐Kuo Chu Niloy J. Mitra 《Computer Graphics Forum》2016,35(2):489-500

Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometries to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow to create plausible editing and mixing of raw video footage using only sparse structure points (SSP) directly recovered from the raw sequences. First, we utilize user‐scribbles to structure the point representations obtained using structure‐from‐motion on the input videos. The resultant structure points, even when noisy and sparse, are then used to enable various video edits in 3D, including view perturbation, keyframe animation, object duplication and transfer across videos, etc. Specifically, we describe how to synthesize object images from new views adopting a novel image‐based rendering technique using the SSPs as proxy for the missing 3D scene information. We propose a structure‐preserving image warping on multiple input frames adaptively selected from object video, followed by a spatio‐temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate plausible video sequence mimicking real‐world interactions. We demonstrate our system on a variety of input videos to produce complex edits, which are otherwise difficult to achieve. 相似文献

16.

A PatchMatch‐based Approach for Matte Propagation in Videos

Marcos H. Backes Manuel M. Oliveira 《Computer Graphics Forum》2019,38(7):651-662

Despite considerable advances in natural image matting over the last decades, video matting still remains a difficult problem. The main challenges faced by existing methods are the large amount of user input required, and temporal inconsistencies in mattes between pairs of adjacent frames. We present a temporally‐coherent matte‐propagation method for videos based on PatchMatch and edge‐aware filtering. Given an input video and trimaps for a few frames, including the first and last, our approach generates alpha mattes for all frames of the video sequence. We also present a user scribble‐based interface for video matting that takes advantage of the efficiency of our method to interactively refine the matte results. We demonstrate the effectiveness of our approach by using it to generate temporally‐coherent mattes for several natural video sequences. We perform quantitative comparisons against the state‐of‐the‐art sparse‐input video matting techniques and show that our method produces significantly better results according to three different metrics. We also perform qualitative comparisons against the state‐of‐the‐art dense‐input video matting techniques and show that our approach produces similar quality results while requiring only about 7% of the amount of user input required by such techniques. These results show that our method is both effective and user‐friendly, outperforming state‐of‐the‐art solutions. 相似文献

17.

Usage derived recommendations for a video digital library

《Journal of Network and Computer Applications》2007,30(3):1059-1083

We describe a minimalist methodology to develop usage-based recommender systems for multimedia digital libraries. A prototype recommender system based on this strategy was implemented for the Open Video Project, a digital library of videos that are freely available for download. Sequential patterns of video retrievals are extracted from the project's web download logs and analyzed to generate a network of video relationships. A spreading activation algorithm locates video recommendations by searching for associative paths connecting query-related videos. We evaluate the performance of the resulting system relative to an item-based collaborative filtering technique operating on user profiles extracted from the same log data. 相似文献

18.

A web-based tool for fast instance-level labeling of videos and the creation of spatiotemporal media fragments

Anastasia Ioannidou Evlampios Apostolidis Chrysa Collyda Vasileios Mezaris 《Multimedia Tools and Applications》2017,76(2):1735-1774

相似文献

19.

A Generalized Cubemap for Encoding 360° VR Videos using Polynomial Approximation

J. Y. Xiao J.T. Tang X. Y. Zhang 《Computer Graphics Forum》2019,38(7):359-367

360^° VR videos provide users with an immersive visual experience. To encode 360^° VR videos, spherical pixels must be mapped onto a two‐dimensional domain to take advantage of the existing video encoding and storage standards. In VR industry, standard cubemap projection is the most widely used projection method for encoding 360^° VR videos. However, it exhibits pixel density variation at different regions due to projection distortion. We present a generalized algorithm to improve the efficiency of cubemap projection using polynomial approximation. In our algorithm, standard cubemap projection can be regarded as a special form with 1st‐order polynomial. Our experiments show that the generalized cubemap projection can significantly reduce the projection distortion using higher order polynomials. As a result, pixel distribution can be well balanced in the resulting 360^° VR videos. We use PSNR, S‐PSNR and CPP‐PSNR to evaluate the visual quality and the experimental results demonstrate promising performance improvement against standard cubemap projection and Google's equi‐angular cubemap. 相似文献

20.

Creating Fluid Animation from a Single Image using Video Database

Makoto Okabe Ken Anjyor Rikio Onai 《Computer Graphics Forum》2011,30(7):1973-1982

We present a method for synthesizing fluid animation from a single image, using a fluid video database. The user inputs a target painting or photograph of a fluid scene along with its alpha matte that extracts the fluid region of interest in the scene. Our approach allows the user to generate a fluid animation from the input image and to enter a few additional commands about fluid orientation or speed. Employing the database of fluid examples, the core algorithm in our method then automatically assigns fluid videos for each part of the target image. Our method can therefore deal with various paintings and photographs of a river, waterfall, fire, and smoke. The resulting animations demonstrate that our method is more powerful and efficient than our prior work. 相似文献