Similar Documents
20 similar documents found (search time: 15 ms)
1.
To improve the accuracy of key frame extraction and the quality of video summaries, a key frame extraction method for video summarization in the HEVC compressed domain is proposed. First, the video sequence is encoded and decoded, and during decoding the number of luma prediction modes of HEVC intra-coded PU blocks is counted. These mode counts are then assembled into a mode feature vector, which serves as the texture feature of the video frame for key frame extraction. Finally, an adaptive clustering algorithm incorporating the Iterative Self-Organizing Data Analysis technique (ISODATA) clusters the mode feature vectors; the frame corresponding to the central vector of each cluster is taken as a candidate key frame, and the candidates are filtered again by similarity to remove redundant frames, yielding the final key frames. Extensive experiments on the Open Video Project dataset show that the method achieves a precision of 79.9%, a recall of 93.6%, and an F-score of 86.2%, effectively improving the quality of video summaries.
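As a rough illustration of the pipeline described above, the sketch below builds a per-frame histogram of intra PU luma prediction modes and clusters the resulting feature vectors. Plain k-means with deterministic farthest-point initialization stands in for the paper's ISODATA-based adaptive clustering, and the mode lists passed in are hypothetical decoder output, not real HEVC bitstream data.

```python
import numpy as np

N_MODES = 35  # HEVC defines 35 luma intra prediction modes (0..34)

def mode_feature(pu_modes):
    """Normalized histogram of one frame's intra PU luma prediction modes,
    used as a texture descriptor for that frame."""
    hist = np.bincount(np.asarray(pu_modes), minlength=N_MODES).astype(float)
    return hist / max(hist.sum(), 1.0)

def pick_key_frames(features, k=2, iters=20):
    """Cluster the mode features and return the index of the frame closest
    to each cluster centroid (k-means stands in for ISODATA here)."""
    X = np.asarray(features, dtype=float)
    centers = [X[0]]
    while len(centers) < k:  # deterministic farthest-point initialization
        d = np.min([((X - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    keys = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            d = ((X[members] - centers[j]) ** 2).sum(-1)
            keys.append(int(members[np.argmin(d)]))
    return sorted(keys)
```

The paper's extra similarity-based filtering of candidates is omitted for brevity.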

2.
A key frame extraction method based on graph centers and automatic thresholding (cited: 4, self: 0, others: 4)
With the rapid development of the Internet, there is an urgent need to manage ever-growing volumes of video information quickly and at low cost. Key frames can greatly reduce the amount of data needed for video indexing, while also providing an organizational framework for video query and retrieval. Exploiting the structural similarity between graphs and video, this paper converts key frame selection into the P-center problem on a graph, and uses the differing activity levels between shots to select thresholds automatically. Experimental results show that the extracted key frame sets represent the main content of shots well, with low computational complexity.

3.
A video summary generation method based on mutual information is proposed. The method first performs shot detection using mutual information, then extracts candidate key frames of each shot by clustering the detected shot frames. Shot key frames are selected from the candidates by comparing the mutual information between adjacent frames, and finally the shot key frames are arranged in temporal order to form the video summary. Experiments show that this key frame extraction algorithm is effective and that the resulting summaries reflect the content of the original video well.
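The mutual-information measure underlying the shot detection above can be sketched as follows. The histogram bin count and the drop-ratio threshold are illustrative assumptions, not values from the paper: a sharp drop in MI between adjacent frames suggests a shot boundary.

```python
import numpy as np

def mutual_information(frame_a, frame_b, bins=16):
    """Mutual information between the intensity distributions of two
    8-bit grayscale frames, via their joint histogram."""
    a = np.asarray(frame_a).ravel()
    b = np.asarray(frame_b).ravel()
    joint, _, _ = np.histogram2d(a, b, bins=bins, range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # avoid log(0); zero-probability cells contribute nothing
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def shot_boundaries(frames, drop_ratio=0.5):
    """Flag a boundary wherever adjacent-frame MI falls below drop_ratio
    times the mean MI of the whole sequence."""
    mis = [mutual_information(frames[i], frames[i + 1])
           for i in range(len(frames) - 1)]
    mean = np.mean(mis)
    return [i + 1 for i, mi in enumerate(mis) if mi < drop_ratio * mean]
```

Two identical frames share maximal MI, while two statistically unrelated frames (e.g. a horizontal and a vertical gradient) share almost none, which is what makes the drop detectable.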

4.
Video representation through key frames has been addressed frequently as an efficient way of preserving the whole temporal information of a sequence with a considerably smaller amount of data. Such a compact video representation is suitable for video browsing in environments with limited storage or transmission bandwidth. In this case, controllability of the total number of key frames (i.e., the key frame rate) according to the storage or bandwidth capacity is an important requirement for a key frame selection method. In this paper, we present a sequential key frame selection method for when the number of key frames is given as a constraint. It first selects the pre-determined number of initial key frames and time-intervals. Then it iteratively adjusts the positions of the key frames and time-intervals, reducing the distortion step by step. Experimental results demonstrate the improved performance of our algorithm over existing approaches.

5.
达婷, 李芝棠. 《通信学报》 (Journal on Communications), 2014, 35(Z1): 6-30
Targeting the temporal characteristics of video frames, a video steganalysis method is proposed that builds a weighted undirected graph from inter-frame correlation. First, the gray-level co-occurrence matrix of the luma component of each frame under test is computed, and the resulting 8-dimensional features form that frame's feature vector. Frames are then taken as nodes, with the Euclidean distance between the feature vectors of adjacent frames as edge weights, constructing a weighted undirected graph that expresses inter-frame correlation. Whether secret information is embedded in the video is judged from the change in inter-frame correlation that embedding causes. Experimental results show that the weighted-undirected-graph method distinguishes stego video from original video quickly and accurately, with a high correct-detection rate.
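A minimal sketch of the co-occurrence texture features above. The pixel offsets and the particular four statistics per offset (two offsets times four statistics, giving an 8-D descriptor) are assumptions for illustration; the Euclidean distances between such vectors would then serve as the edge weights of the graph.

```python
import numpy as np

def glcm(gray, levels=8, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    g = (np.asarray(gray).astype(int) * levels) // 256  # quantize to `levels`
    a = g[:g.shape[0] - dy, :g.shape[1] - dx].ravel()
    b = g[dy:, dx:].ravel()
    m = np.zeros((levels, levels))
    np.add.at(m, (a, b), 1)  # count co-occurring level pairs
    return m / m.sum()

def glcm_features(gray, levels=8):
    """Four co-occurrence statistics for each of two offsets, giving an
    8-dimensional texture descriptor per frame."""
    feats = []
    for dx, dy in [(1, 0), (0, 1)]:
        p = glcm(gray, levels, dx, dy)
        i, j = np.indices(p.shape)
        feats.append(((i - j) ** 2 * p).sum())               # contrast
        feats.append((p ** 2).sum())                         # energy
        feats.append((p / (1.0 + np.abs(i - j))).sum())      # homogeneity
        feats.append(-(p[p > 0] * np.log2(p[p > 0])).sum())  # entropy
    return np.array(feats)
```

A flat frame yields zero contrast and maximal energy, while a textured frame (e.g. a checkerboard) yields high contrast, so the descriptor separates smooth from busy frames as intended.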

6.
Video summarization is a method to reduce redundancy and generate a succinct representation of video data. One mechanism for generating video summaries is to extract key frames that represent the most important content of the video. In this paper, a new technique for key frame extraction is presented. The scheme uses an aggregation mechanism to combine visual features extracted from the correlation of RGB color channels, the color histogram, and moments of inertia. An adaptive formula is then used to combine the results of the current iteration with those of the previous one; this generates a smooth output function and also reduces redundancy. The results are compared with other techniques on objective criteria, and the experiments show that the proposed technique generates summaries closer to those created by humans.
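The "adaptive formula" combining each iteration's result with the previous one can be read as exponential smoothing. The sketch below shows that idea on a generic aggregated-score sequence; the weight alpha is an assumed parameter, not the paper's formula.

```python
def smooth_scores(scores, alpha=0.6):
    """Blend each iteration's aggregated feature score with the previous
    smoothed value (exponential smoothing), producing a smooth output
    function with less frame-to-frame jitter."""
    out, prev = [], None
    for s in scores:
        prev = s if prev is None else alpha * s + (1 - alpha) * prev
        out.append(prev)
    return out
```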

7.

Video compression is one of the pre-processing steps in video streaming. When moving objects are captured with moving cameras, a large amount of redundant data is recorded along with the dynamic changes. In this paper, this change is identified using various geometric transformations. To register all these dynamic relations with minimal storage, a tensor representation is used. The similarity between frames is measured using canonical correlation analysis (CCA), and key frames are identified by comparing the canonical auto-correlation score of a candidate key frame with the CCA scores of the other frames. The coded video is represented as a tensor consisting of an intra-coded key frame, a vector of P-frame identifiers, the transformation of each variable-sized block, and an information fusion with three levels of abstraction (measurements, characteristics, and decisions) that combines these factors into a single entity. Each dimension can have a variable size, which allows all characteristics to be stored without losing information. The proposed compression method is applied to underwater videos, which contain substantial redundancy because both the camera and the underwater species are in motion, and is compared with H.264, H.265 and some recent compression methods. Peak signal-to-noise ratio and compression ratio at various bit rates are used to evaluate performance; the results show that the proposed method achieves a high compression ratio with comparatively little loss.


8.
冀中, 樊帅飞. 《电子学报》 (Acta Electronica Sinica), 2017, 45(5): 1035-1043
Video summarization has attracted wide attention as a way to perceive video content quickly. Existing graph-based video summarization methods treat video frames as vertices and represent the relationship between two vertices with an edge, but this cannot capture the complex relationships among video frames well. To overcome this drawback, this paper proposes a static video summarization method based on hypergraph ranking (Hyper-Graph Ranking based Video Summarization, HGRVS). HGRVS first builds a video hypergraph model in which any number of intrinsically related video frames are connected by a single hyperedge; it then proposes a hypergraph-ranking-based frame classification algorithm that groups frames by content; finally, a static video summary is generated by solving a proposed optimization function. Extensive subjective and objective experiments on the Open Video Project and YouTube datasets verify the good performance of the proposed HGRVS algorithm.

9.
Key frames are a finite subset of the frames of a video; a video's key frame sequence summarizes its content reasonably well, reducing the burden that massive video data places on production and daily life. This paper discusses the application of the Jensen distance based on Tsallis entropy (JTD) to video key frame extraction. Using the obtained dissimilarity distance JTD, sub-shot boundaries are first detected; one frame is then extracted from each sub-shot as its representative frame, finally yielding the key frame sequence of the video.
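A sketch of the Jensen-Tsallis distance (JTD) used above, computed here between normalized frame histograms: the Tsallis entropy of the mixture of two distributions minus the mean of their individual entropies. The entropic index q is an assumed parameter.

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1)."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def jtd(p, r, q=2.0):
    """Jensen-Tsallis distance between two normalized histograms: the
    entropy of their mixture minus the mean of their individual entropies."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    r = np.asarray(r, dtype=float) / np.sum(r)
    return tsallis_entropy((p + r) / 2.0, q) \
        - 0.5 * (tsallis_entropy(p, q) + tsallis_entropy(r, q))
```

JTD is zero for identical histograms and grows as the histograms' supports separate, which is what makes it usable as a sub-shot boundary signal.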

10.

Digital video is widely proffered as visual evidence in areas such as politics, criminal litigation, journalism, and military intelligence. Multi-camera smartphones with megapixels of resolution are common hand-held devices, which has made video recording very easy. At the same time, the variety of applications available on smartphones has made this indispensable source of information vulnerable to deliberate manipulation, so content authentication of video evidence becomes essential. Copy-move (copy-paste) forgery is a consequential manipulation that changes the basic understanding of a scene, and the removal or addition of frames in a video clip can also be managed by advanced smartphone apps. In surveillance, the camera and background are static, which makes forgery easy and imperceptible; accurate video forgery detection is therefore crucial. This paper proposes an efficient method, VFDHSOG, based on histograms of the second-order gradient (HSOG) to locate 'suspicious' frames and then localize the copy-move forgery within a frame. A 'suspicious' frame is located by computing correlation coefficients of the HSOG feature after obtaining a binary image of the frame. Performance is evaluated on the benchmark datasets of the Surrey University Library for Forensic Analysis (SULFA), the Video Tampering Dataset (VTD), and the SYSU-OBJFORGED dataset. SULFA contains video files of different qualities (e.g., q10, q20), representing high compression; VTD provides both inter- and intra-frame forgery; the SYSU dataset covers attacks such as scaling and rotation. An overall accuracy of 92.26% is achieved, with the capability to identify attacks such as scale-up/down and rotation.


11.
Video summarization can facilitate rapid browsing and efficient video indexing in many applications. A good summary should maintain the semantic interestingness and diversity of the original video. While many previous methods extracted key frames based on low-level features, this study proposes Memorability-Entropy-based video summarization. The proposed method focuses on creating semantically interesting summaries based on image memorability. Further, image entropy is introduced to maintain the diversity of the summary. In the proposed framework, perceptual hashing-based mutual information (MI) is used for shot segmentation. Then, we use a large annotated image memorability dataset to fine-tune Hybrid-AlexNet. We predict the memorability score by using the fine-tuned deep network and calculate the entropy value of the images. The frame with the maximum memorability score and entropy value in each shot is selected to constitute the video summary. Finally, our method is evaluated on a benchmark dataset, which comes with five human-created summaries. When evaluating our method, we find it generates high-quality results, comparable to human-created summaries and conventional methods.

12.
The huge amount of video data on the internet requires efficient video browsing and retrieval strategies. One viable solution is to provide summaries of videos in the form of key frames. Video summarization using visual attention modeling has recently been used: visually salient frames are extracted as key frames on the basis of theories of human attention modeling. Visual attention modeling schemes have proved effective in video summarization, but the high computational costs they incur limit their applicability in practical scenarios. In this context, this paper proposes an efficient key frame extraction method based on a visual attention model. The computational cost is reduced by using temporal-gradient-based dynamic visual saliency detection instead of traditional optical flow methods, and an effective method employing the discrete cosine transform is used for static visual saliency. The static and dynamic visual attention measures are fused by a non-linear weighted fusion method. The experimental results indicate that the proposed method is not only efficient, but also yields high-quality video summaries.

13.
To address the shortcomings of existing video watermarking algorithms in resisting geometric attacks, a blind video watermarking algorithm based on key frames and the wavelet transform is proposed. Key frames are extracted according to the Euclidean distance of inter-frame differences and divided into blocks; drawing on the characteristics of the human visual system, the blocks with the highest luminance sensitivity and texture sensitivity are selected from different blocks of the key frames to form a 3-D volume, on which a 3-D wavelet transform is performed. The scrambled watermark image is embedded into the low-frequency region of the wavelet coefficients with varying embedding strengths, and blind extraction of the watermark is achieved with a threshold. Results show that the algorithm increases watermark embedding capacity and offers good imperceptibility and robustness against attacks targeting video watermarks.

14.
Key frame extraction for video shots based on compressed sensing and the EMD distance (cited: 2, self: 2, others: 0)
潘磊, 束鑫, 程科. 《电视技术》 (Video Engineering), 2015, 39(17): 5-8
Key frame extraction is a core problem in video content analysis and retrieval. A key frame extraction method based on compressed sensing and the earth mover's distance (EMD) is proposed. First, a sparse matrix satisfying the restricted isometry property is constructed to project high-dimensional frame features into a low-dimensional space; sub-shot segmentation is then completed by computing the adjusted cosine similarity between the low-dimensional frame features. Within each sub-shot, the EMD is used to measure the difference between each frame and the sub-shot center, and the frame with the smallest difference is selected as that sub-shot's key frame. Experimental results show that the extracted key frames describe the video content accurately.
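Two of the measures used above can be sketched as follows: a Gaussian random projection (which satisfies the restricted isometry property with high probability, standing in for the paper's sparse matrix) for dimensionality reduction, and the adjusted (mean-centered) cosine similarity. The matrix shapes and the seed are illustrative.

```python
import numpy as np

def random_projection(X, d, seed=0):
    """Project n high-dimensional frame features (rows of X) to d dimensions
    with a Gaussian random matrix; such matrices satisfy the restricted
    isometry property with high probability."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((X.shape[1], d)) / np.sqrt(d)
    return X @ phi

def adjusted_cosine(u, v):
    """Cosine similarity after mean-centering each vector, so a constant
    offset between two frames' features does not inflate their similarity."""
    u = np.asarray(u, dtype=float) - np.mean(u)
    v = np.asarray(v, dtype=float) - np.mean(v)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0
```

Mean-centering makes the similarity insensitive to uniform brightness shifts: a feature vector and the same vector plus a constant score as identical.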

15.
Automatic temporal segmentation and visual summary generation methods that require minimal user interaction are key requirements in video information management systems. Clustering presents an ideal method for achieving these goals, as it allows direct integration of multiple information sources. This paper proposes a clustering-based framework to achieve these tasks automatically and with a minimum of user-defined parameters. The use of multiple frame-difference features and short-time techniques is presented for efficient detection of cut-type shot boundaries. Generic temporal filtering methods are used to process the signals used in shot boundary detection, resulting in better suppression of false alarms. Clustering is also extended to the key frame extraction problem: color-based shot representations are provided by average and intersection histograms, which are then used in a clustering scheme to identify reference key frames within each shot. The technique achieves good compaction with a minimum number of visually non-redundant key frames.

16.
An adaptive key frame extraction algorithm based on dynamic programming (cited: 2, self: 2, others: 0)
A new key frame extraction algorithm for content-based video retrieval systems is proposed, modeling key frame extraction as a global optimization problem solvable by dynamic programming. A binary frame-difference matrix is first built to express the similarity between frames in a low-dimensional feature space; the dynamic programming algorithm then partitions the frame-difference matrix to extract the key frames. The algorithm has low computational complexity, adapts to video content, preserves the temporal order of the key frames, and allows the number of key frames to be adjusted conveniently as needed.
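A minimal sketch of the dynamic-programming formulation: partition the frame sequence into k contiguous segments minimizing total within-segment scatter, then take each segment's medoid as its key frame. The low-dimensional features and the scatter cost are simplifying assumptions here; the paper itself partitions a binary frame-difference matrix.

```python
import numpy as np

def dp_key_frames(features, k):
    """Split the frame sequence into k contiguous segments minimizing total
    within-segment scatter via dynamic programming, then return each
    segment's medoid frame index as a key frame (temporal order preserved)."""
    X = np.asarray(features, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = len(X)
    # cost[i, j]: scatter (squared deviation from the mean) of frames i..j
    cost = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            seg = X[i:j + 1]
            cost[i, j] = ((seg - seg.mean(axis=0)) ** 2).sum()
    INF = float("inf")
    dp = np.full((k + 1, n + 1), INF)
    dp[0, 0] = 0.0
    back = np.zeros((k + 1, n + 1), dtype=int)
    for m in range(1, k + 1):
        for j in range(1, n + 1):
            for i in range(m - 1, j):
                c = dp[m - 1, i] + cost[i, j - 1]
                if c < dp[m, j]:
                    dp[m, j], back[m, j] = c, i
    # walk back through the optimal boundaries, then pick each segment's medoid
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[m, j]
        bounds.append((i, j))
        j = i
    keys = []
    for i, j in reversed(bounds):
        seg = X[i:j]
        d = ((seg - seg.mean(axis=0)) ** 2).sum(axis=-1)
        keys.append(i + int(np.argmin(d)))
    return keys
```

The number of key frames is simply the parameter k, which is exactly the adjustability the abstract claims for the DP formulation.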

17.
With the fast evolution of digital video, research and development of new technologies are greatly needed to lower the cost of video archiving, cataloging and indexing, and to improve the efficiency and accessibility of stored video sequences. A number of methods to meet these requirements have been researched and proposed. As one of the most important research topics, video abstraction enables quick browsing of large video databases and efficient content access and representation. In this paper, a video abstraction algorithm based on a visual attention model and online clustering is proposed. First, shot boundaries are detected and key frames are extracted in each shot so that consecutive key frames within a shot are equally spaced. Second, a spatial saliency map indicating the saliency value of each region of the image is generated from each key frame, and regions of interest (ROIs) are extracted according to the saliency map. Third, the key frames and their corresponding saliency maps are passed through a filter, and several thresholds are used to discard key frames containing little information. Finally, the key frames are clustered using an online clustering method based on the features in the ROIs. Experimental results demonstrate the performance and effectiveness of the proposed video abstraction algorithm.

18.
This paper addresses the problem of side information extraction for distributed coding of videos captured by a camera moving in a 3-D static environment. Examples of targeted applications are augmented reality, remote-controlled robots operating in hazardous environments, or remote exploration by drones. It explores the benefits of the structure-from-motion paradigm for distributed coding of this type of video content. Two interpolation methods constrained by the scene geometry, based either on block matching along epipolar lines or on 3-D mesh fitting, are first developed. These techniques are based on a robust algorithm for sub-pel matching of feature points, which leads to semi-dense correspondences between key frames. However, their rate-distortion (RD) performances are limited by misalignments between the side information and the actual Wyner-Ziv (WZ) frames due to the assumption of linear motion between key frames. To cope with this problem, two feature point tracking techniques are introduced, which recover the camera parameters of the WZ frames. A first technique, in which the frames remain encoded separately, performs tracking at the decoder and leads to significant RD performance gains. A second technique further improves the RD performances by allowing a limited tracking at the encoder. As an additional benefit, statistics on tracks allow the encoder to adapt the key frame frequency to the video motion content.

19.
Key frame extraction based on visual attention model (cited: 2, self: 0, others: 2)
Key frame extraction is an important technique for video summarization, browsing, searching and understanding. In this paper, we propose a novel approach that extracts the most attractive key frames using a saliency-based visual attention model, bridging the gap between semantic interpretation of the video and low-level features. First, dynamic and static conspicuity maps are constructed from motion, color and texture features. Then, by introducing a suppression factor and a motion priority scheme, the conspicuity maps are fused into a saliency map that includes only true attention regions, from which an attention curve is produced. Finally, after a time-constrained clustering algorithm groups frames with similar content, the frames with the maximum saliency value are selected as key frames. Experimental results demonstrate the effectiveness of our approach for video summarization in retrieving meaningful key frames.

20.
Research on content-based video analysis techniques (cited: 2, self: 0, others: 2)
Video data analysis and processing is a key technique for realizing content-based video retrieval; it directly affects the precision of video feature matching and retrieval. Starting from the structure and characteristics of video data, this paper summarizes the general process of content-based video processing: video segmentation, key frame selection, static and dynamic feature extraction, and video clustering. Key techniques in content-based video analysis are examined in depth, the advantages and disadvantages of various methods are analyzed, some new methods are introduced, and finally several problems worth further study are raised.
