Similar Articles
20 similar articles found.
1.
林碧兰, 郑宝玉, 钱程. 《信号处理》, 2015, 31(2): 201-207.
Many application scenarios call for video encoders of low complexity, and the emerging techniques of distributed video coding and compressed sensing suit these scenarios well, which has given rise to a new video coding scheme: distributed compressive video coding. In existing distributed compressive video coding schemes, video frames are encoded independently at the encoder and decoded jointly at the decoder; specifically, key frames are decoded independently, while non-key frames are decoded with the help of side information generated from the key frames, which ignores the correlation among the non-key frames themselves. This paper proposes a new distributed video coding scheme that divides the non-key frames into primary and secondary non-key frames: primary non-key frames are decoded using the side information generated from the key frames, whereas secondary non-key frames first predict their measurements from the adjacent primary non-key frames and are then decoded with the help of the key-frame side information. Experimental results show that, under the proposed framework, the reconstruction quality of the non-key frames improves by 2 dB to 4 dB.
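As an illustration of the measurement-prediction idea described above, here is a minimal NumPy sketch (not the authors' code): a secondary non-key frame predicts its compressed-sensing measurements by averaging those of the two adjacent primary non-key frames, leaving only a low-energy residual to decode with side information. The averaging predictor, the sensing matrix `phi`, and the toy frame sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 64 * 64        # length of a vectorized (toy-sized) frame
m = n // 4         # 25% measurement rate
phi = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian sensing matrix

def cs_measure(frame):
    """Standard compressed-sensing acquisition y = Phi * x."""
    return phi @ frame.ravel()

# Toy frames: the secondary non-key frame is correlated with its neighbours.
primary_a, primary_b = rng.random(n), rng.random(n)
secondary = 0.5 * (primary_a + primary_b) + 0.01 * rng.random(n)

y = cs_measure(secondary)
# Hypothetical predictor: average the measurements of the adjacent primary
# non-key frames (by linearity of Phi this equals measuring the averaged frame).
y_pred = 0.5 * (cs_measure(primary_a) + cs_measure(primary_b))
residual = y - y_pred   # only this (much smaller) signal needs side-info decoding
print("residual / signal energy:", np.linalg.norm(residual) / np.linalg.norm(y))
```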

2.
Key frame extraction based on visual attention model (cited 2 times: 0 self-citations, 2 external)
Key frame extraction is an important technique in video summarization, browsing, searching and understanding. In this paper, we propose a novel approach to extract the most attractive key frames by using a saliency-based visual attention model that bridges the gap between semantic interpretation of the video and low-level features. First, dynamic and static conspicuity maps are constructed based on motion, color and texture features. Then, by introducing suppression factor and motion priority schemes, the conspicuity maps are fused into a saliency map that includes only true attention regions, from which an attention curve is produced. Finally, after a time-constrained clustering algorithm groups frames with similar content, the frames with maximum saliency value are selected as key frames. Experimental results demonstrate the effectiveness of our approach for video summarization by retrieving meaningful key frames.
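A minimal sketch of the selection stage (not the paper's implementation): per-frame static and dynamic conspicuity maps are fused into one saliency value per frame using a hypothetical peak-over-mean suppression factor and a motion-priority weight, and the frame with maximum saliency in each pre-computed time-constrained cluster becomes a key frame.

```python
import numpy as np

def suppression_weight(cmap):
    # Suppression factor (hypothetical form): maps with a clear peak over the
    # mean are treated as informative and receive a higher fusion weight.
    return (cmap.max() - cmap.mean()) / (cmap.max() + 1e-8)

def fuse(static_maps, dynamic_maps, motion_priority=2.0):
    """Fuse per-frame static/dynamic conspicuity maps into one saliency value
    per frame (the 'attention curve'); motion is given priority."""
    curve = []
    for s, d in zip(static_maps, dynamic_maps):
        ws, wd = suppression_weight(s), motion_priority * suppression_weight(d)
        fused = (ws * s + wd * d) / (ws + wd + 1e-8)
        curve.append(fused.mean())
    return np.asarray(curve)

def key_frames(curve, labels):
    # One key frame per time-constrained cluster: the frame with max saliency.
    return [int(np.flatnonzero(labels == c)[np.argmax(curve[labels == c])])
            for c in np.unique(labels)]

rng = np.random.default_rng(1)
stat = [rng.random((48, 64)) for _ in range(30)]
dyn = [rng.random((48, 64)) for _ in range(30)]
labels = np.repeat([0, 1, 2], 10)   # stand-in for the time-constrained clustering
print(key_frames(fuse(stat, dyn), labels))
```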

3.
To improve the accuracy of key frame extraction and the quality of video summaries, a key frame extraction method for video summarization in the HEVC compressed domain is proposed. First, the video sequence is encoded and decoded, and during decoding the number of luma prediction modes of HEVC intra-coded PU blocks is counted. Then, the counted mode numbers are assembled into a mode feature vector, which serves as the texture feature of a video frame for key frame extraction. Finally, an adaptive clustering algorithm incorporating the Iterative Self-Organizing Data Analysis (ISODATA) algorithm clusters the mode feature vectors; within each cluster, the frame corresponding to the central vector is selected as a candidate key frame, and the candidates are further screened by similarity to remove redundant frames, yielding the final key frames. Extensive experiments on the Open Video Project dataset show that the method extracts key frames with a precision of 79.9%, a recall of 93.6%, and an F-score of 86.2%, effectively improving the quality of video summaries.
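The mode-feature idea can be sketched as follows; since scikit-learn ships no ISODATA implementation, k-means stands in for the paper's adaptive clustering, and random mode statistics replace real decoder output.

```python
import numpy as np
from sklearn.cluster import KMeans

N_MODES = 35  # HEVC intra luma prediction modes (planar, DC, 33 angular)

def mode_histogram(pu_modes):
    """Per-frame feature: normalized count of each intra prediction mode,
    taken from the decoder's PU statistics."""
    h = np.bincount(pu_modes, minlength=N_MODES).astype(float)
    return h / max(h.sum(), 1.0)

rng = np.random.default_rng(2)
frames = [mode_histogram(rng.integers(0, N_MODES, size=200)) for _ in range(60)]
X = np.vstack(frames)

# Stand-in clustering (the paper fuses ISODATA into an adaptive scheme);
# the frame closest to each centroid is taken as a candidate key frame.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
for c in range(4):
    idx = np.flatnonzero(km.labels_ == c)
    d = np.linalg.norm(X[idx] - km.cluster_centers_[c], axis=1)
    print("cluster", c, "candidate key frame:", int(idx[np.argmin(d)]))
```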

4.
5.
By jointly exploiting the color, shape, and texture features of images, content-based retrieval of video key frames is realized. Key frame selection and feature matching are studied first; then, starting from the bottom layer of the hierarchical structure of video processing, a sequence of consecutive frame images is constructed, the key frames of each shot are selected with a time-adaptive detection method, and a key frame image database is built. Experimental results demonstrate the good performance of the method.
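A minimal OpenCV sketch of the color part of such content-based key frame retrieval (shape and texture features omitted); the bin counts and the correlation metric are illustrative choices, not the paper's.

```python
import numpy as np
import cv2

def hsv_hist(img_bgr, bins=(8, 4, 4)):
    """Quantized HSV color histogram used as the retrieval feature."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1, 2], None, list(bins),
                     [0, 180, 0, 256, 0, 256])
    return cv2.normalize(h, h).flatten()

def retrieve(query, database, top_k=3):
    # Rank key frames by histogram correlation (one of several cv2 metrics).
    scores = [(cv2.compareHist(query, f, cv2.HISTCMP_CORREL), i)
              for i, f in enumerate(database)]
    return sorted(scores, reverse=True)[:top_k]

rng = np.random.default_rng(3)
db = [hsv_hist(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
      for _ in range(10)]
print(retrieve(db[0], db))   # the query's own key frame ranks first
```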

6.
This article presents a coding method for the lossless compression of color video. In the proposed method, a four-dimensional matrix Walsh transform (4D-M-Walsh-T) is used for color video coding. The whole n frames of a color video sequence are divided into '3D-blocks' spanning the image width (row component) and image height (column component) within a frame, and the adjacency (depth component) of the n frames (Y, U or V) of the video sequence. Similar to the 2D Walsh transform, the 4D-M-Walsh-T consists of 4D sub-matrices, each of size n. The method can fully exploit correlations for lossless coding and reduce the redundancy of color video, such as between adjacent pixels within one frame or across different frames at the same position. Experimental results show that the proposed method achieves a higher lossless compression ratio (CR) for color video sequences.
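A runnable sketch of the transform core, under the assumption of a separable Walsh-Hadamard kernel applied along all four axes (the paper's exact sub-matrix layout may differ); integer arithmetic makes the round trip exact, which is what lossless coding requires.

```python
import numpy as np
from scipy.linalg import hadamard

def walsh_4d(block):
    """Separable 4D Walsh-Hadamard transform: apply the n x n Hadamard matrix
    along each of the four axes (rows, columns, colour plane, frame depth).
    Assumes all four axes have the same power-of-two length n."""
    n = block.shape[0]
    H = hadamard(n)
    out = block.astype(np.int64)
    for axis in range(4):
        out = np.tensordot(H, out, axes=([1], [axis]))
        out = np.moveaxis(out, 0, axis)
    return out

def inverse_walsh_4d(coeff):
    n = coeff.shape[0]
    # H @ H = n * I, so a second pass over all four axes scales by n**4.
    return walsh_4d(coeff) // (n ** 4)

rng = np.random.default_rng(4)
block = rng.integers(0, 256, size=(4, 4, 4, 4))
assert np.array_equal(inverse_walsh_4d(walsh_4d(block)), block)
print("4D Walsh round trip is exact (lossless).")
```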

7.
Video summarization can facilitate rapid browsing and efficient video indexing in many applications. A good summary should maintain the semantic interestingness and diversity of the original video. While many previous methods extract key frames based on low-level features, this study proposes memorability-entropy-based video summarization. The proposed method focuses on creating semantically interesting summaries based on image memorability, and image entropy is introduced to maintain the diversity of the summary. In the proposed framework, perceptual-hashing-based mutual information (MI) is used for shot segmentation. Then, a large annotated image memorability dataset is used to fine-tune Hybrid-AlexNet; the fine-tuned network predicts a memorability score, and the entropy value of each image is calculated. The frame with the maximum memorability score and entropy value in each shot is selected to constitute the video summary. Finally, our method is evaluated on a benchmark dataset that comes with five human-created summaries, and it generates high-quality results comparable to human-created summaries and conventional methods.
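A small sketch of the per-shot selection rule: entropy comes straight from the grey-level histogram, while the memorability scores below are random stand-ins for the fine-tuned Hybrid-AlexNet predictions; combining the two criteria by product is an assumption.

```python
import numpy as np

def image_entropy(gray):
    """Shannon entropy of the grey-level histogram, used to keep the summary
    diverse (high-entropy frames carry more visual variation)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-(p * np.log2(p)).sum())

def pick_key_frame(shot_frames, memorability_scores):
    # Per shot: take the frame maximizing a (hypothetical) combination of the
    # predicted memorability score and the entropy value.
    scores = [m * image_entropy(f)
              for f, m in zip(shot_frames, memorability_scores)]
    return int(np.argmax(scores))

rng = np.random.default_rng(5)
shot = [rng.integers(0, 256, (48, 64)) for _ in range(8)]
mem = rng.random(8)    # stand-in for the fine-tuned deep network's output
print(pick_key_frame(shot, mem))
```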

8.
9.
The huge amount of video data on the Internet requires efficient video browsing and retrieval strategies. One viable solution is to provide summaries of the videos in the form of key frames. Video summarization using visual attention modeling has been used of late: the visually salient frames are extracted as key frames on the basis of theories of human attention modeling. Visual attention modeling schemes have proved effective for video summarization, but the high computational costs incurred by these techniques limit their applicability in practical scenarios. In this context, this paper proposes an efficient key frame extraction method based on a visual attention model. The computational cost is reduced by using temporal-gradient-based dynamic visual saliency detection instead of traditional optical flow methods; for static visual saliency, an effective method employing the discrete cosine transform is used. The static and dynamic visual attention measures are fused by a non-linear weighted fusion method. The experimental results indicate that the proposed method is not only efficient but also yields high-quality video summaries.
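The two saliency measures can be sketched cheaply: an image-signature style DCT saliency (a known low-cost method, used here as a stand-in for the paper's exact formulation) for the static part, and a plain temporal gradient for the dynamic part, fused with an assumed non-linear weighting.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def static_saliency(gray):
    """Image-signature style static saliency: keep only the sign of the DCT
    spectrum, invert, square, and smooth."""
    sig = np.sign(dctn(gray.astype(float), norm='ortho'))
    rec = idctn(sig, norm='ortho')
    return gaussian_filter(rec ** 2, sigma=3)

def dynamic_saliency(prev_gray, gray):
    # Temporal gradient: a cheap substitute for optical flow, as in the paper.
    return np.abs(gray.astype(float) - prev_gray.astype(float))

def fused_saliency(prev_gray, gray, alpha=0.6):
    # Hypothetical non-linear weighted fusion of the two normalized measures.
    s, d = static_saliency(gray), dynamic_saliency(prev_gray, gray)
    s, d = s / (s.max() + 1e-8), d / (d.max() + 1e-8)
    return alpha * d + (1 - alpha) * s + alpha * (1 - alpha) * s * d

rng = np.random.default_rng(6)
f0, f1 = rng.integers(0, 256, (48, 64)), rng.integers(0, 256, (48, 64))
print(fused_saliency(f0, f1).mean())
```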

10.
Video summarization based on keyframe extraction is an effective means of rapidly accessing video content. Traditional summary generation requires high video resolution, which poses a problem because most existing studies offer no targeted solution for videos that are subject to privacy protection. We propose a novel keyframe extraction algorithm for video data in the visual shielding domain, named visual shielding compressed sensing coding and double-layer affinity propagation (VSCS-DAP). VSCS-DAP involves three main steps. First, the video is compressed by compressed sensing, which provides a visual shielding effect (protecting the privacy of monitored figures) while significantly reducing the data volume. Then, pyramid histogram of oriented gradients (PHOG) features are extracted from the compressed video and clustered by a first affinity propagation (AP) step to obtain first-stage summaries. Finally, PHOG and Hist features are extracted from the first-stage keyframes and fused, and the fused PHOG-Hist features are clustered by a second AP step to obtain the final output summaries. Experimental results on two common video datasets show that our method offers low redundancy, few missed frames, low computational complexity, strong real-time performance, and robustness to vision-shielded video.
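A sketch of the first clustering stage with scikit-learn's AffinityPropagation; the PHOG feature is reduced to a single-level gradient-orientation histogram and the frames are random, so this only illustrates the pipeline shape, not VSCS-DAP itself.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def hog_like_feature(gray, bins=9):
    """Very reduced PHOG stand-in: a single-level, magnitude-weighted
    histogram of gradient orientations (the paper uses a full pyramid)."""
    gy, gx = np.gradient(gray.astype(float))
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)

rng = np.random.default_rng(7)
frames = [rng.integers(0, 256, (48, 64)) for _ in range(40)]
X = np.vstack([hog_like_feature(f) for f in frames])

# First-layer AP over all frames; exemplars act as first-stage summaries.
ap = AffinityPropagation(damping=0.9, random_state=0).fit(X)
print("first-stage key frame indices:", ap.cluster_centers_indices_)
# A second AP pass over fused features of these frames would follow (omitted).
```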

11.
12.

Video compression is one of the preprocessing steps in video streaming. When capturing moving objects with moving cameras, a large amount of redundant data is recorded along with the dynamic changes. In this paper, these changes are identified using various geometric transformations. To register all these dynamic relations with minimal storage, a tensor representation is used. The similarity between frames is measured using canonical correlation analysis (CCA), and key frames are identified by comparing the canonical auto-correlation score of a candidate key frame with the CCA scores of the other frames. In this method, the coded video is represented by a tensor consisting of an intra-coded key frame, a vector of P-frame identifiers, the transformation of each variable-sized block, and an information fusion with three levels of abstraction (measurements, characteristics, and decisions) that combines all these factors into a single entity. Each dimension can have a variable size, which makes it possible to store all characteristics without losing information. The proposed compression method is applied to underwater videos, which contain high redundancy because both the camera and the underwater species are in motion. The method is compared with H.264, H.265 and some recent compression methods, using metrics such as peak signal-to-noise ratio and compression ratio at various bit rates. The results show that the proposed method achieves a high compression ratio with comparatively little loss.
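A minimal sketch of the CCA-based similarity scoring with scikit-learn (the tensor coding and transformation registration are beyond a few lines): each frame is a matrix of per-block features, and a drop in the first canonical correlation with the next frame flags a key-frame candidate. Block counts, feature sizes, and the mean-based threshold are assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_score(X, Y, n_components=1):
    """First canonical correlation between two frames, each represented as a
    matrix of per-block features (rows = blocks); higher means more similar."""
    cca = CCA(n_components=n_components).fit(X, Y)
    U, V = cca.transform(X, Y)
    return float(np.corrcoef(U[:, 0], V[:, 0])[0, 1])

rng = np.random.default_rng(8)
frame_feats = [rng.random((32, 6)) for _ in range(10)]  # 32 blocks x 6 features
frame_feats[5] = frame_feats[4] + 0.01 * rng.random((32, 6))  # near-duplicate

scores = [cca_score(frame_feats[i], frame_feats[i + 1]) for i in range(9)]
# Frames whose correlation with their neighbour falls below the running level
# are key-frame candidates (the mean threshold here is illustrative).
print([i + 1 for i, s in enumerate(scores) if s < np.mean(scores)])
```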

13.
Video object tracking using adaptive Kalman filter (cited 1 time: 0 self-citations, 1 external)
In this paper, a new method for tracking a moving object in video is proposed. During initialization, a moving object selected by the user is segmented and the dominant color is extracted from the segmented target. In the tracking step, a motion model is first constructed to set the system model of the adaptive Kalman filter. Then, the dominant color of the moving object in HSI color space is used as the feature to detect the moving object in consecutive video frames. The detection result is fed back as the measurement of the adaptive Kalman filter, whose estimation parameters are adjusted adaptively according to the occlusion ratio. The proposed method tracks the moving object robustly across consecutive frames under several kinds of real-world complexities, such as total or partial disappearance due to occlusion by other objects, fast object motion, changing lighting, changes in the direction and orientation of the moving object, and sudden changes in its velocity. The proposed method is an efficient video object tracking algorithm.
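A compact OpenCV sketch of the tracking loop: a constant-velocity Kalman filter whose measurement noise is inflated with the occlusion ratio, so that a heavily occluded colour detection pulls the state estimate less. The exact adaptation law in the paper may differ; this inflation rule is an assumption.

```python
import numpy as np
import cv2

# Constant-velocity Kalman filter for a 2D object centre (state: x, y, vx, vy).
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
kf.processNoiseCov = 1e-3 * np.eye(4, dtype=np.float32)
kf.errorCovPost = np.eye(4, dtype=np.float32)

def track_step(measured_xy, occlusion_ratio):
    """One predict/correct cycle; occlusion inflates the measurement noise."""
    kf.measurementNoiseCov = (np.float32(1e-2 + occlusion_ratio)
                              * np.eye(2, dtype=np.float32))
    kf.predict()
    est = kf.correct(np.float32(measured_xy).reshape(2, 1))
    return est[:2].ravel()

for t in range(5):
    print(track_step([10 + 2 * t, 20 + t], occlusion_ratio=0.1 * t))
```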

14.
An association analysis algorithm for abandoned objects and persons in intelligent video surveillance (cited 1 time: 0 self-citations, 1 external)
This paper studies an algorithm for associating abandoned objects with persons in intelligent video surveillance. When an object is left behind in the monitored scene, the abandoned object is detected, an alarm or prompt is issued, and a key frame of the abandonment is extracted; when the abandoned object is removed, the removal is detected and a removal key frame is extracted. The color features of the persons in the abandonment and removal key frames are compared by similarity computation to decide whether the person who removed the object is the same as the one who left it, and different responses such as alarms or prompts are issued accordingly. Experiments show that the proposed algorithm achieves a high accuracy rate.
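The person-matching step can be sketched with a colour-histogram comparison; the Bhattacharyya distance and the 0.3 threshold below are illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np
import cv2

def person_color_feature(person_bgr):
    """Hue-saturation histogram of a person region cropped from a key frame."""
    hsv = cv2.cvtColor(person_bgr, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    return cv2.normalize(h, h).flatten()

def same_person(leaver_roi, remover_roi, threshold=0.3):
    # Bhattacharyya distance: 0 = identical distributions, 1 = disjoint.
    d = cv2.compareHist(person_color_feature(leaver_roi),
                        person_color_feature(remover_roi),
                        cv2.HISTCMP_BHATTACHARYYA)
    return d < threshold, d

rng = np.random.default_rng(12)
a = rng.integers(0, 256, (128, 48, 3), dtype=np.uint8)  # abandonment key frame ROI
b = a + rng.integers(0, 10, a.shape, dtype=np.uint8)    # similarly dressed person
print(same_person(a, b))
```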

15.
Key frames are a finite subset of the frames of a video, and a video's key frame sequence can reasonably summarize the video, reducing the burden that oversized video data places on production and daily life. This paper discusses the application of the Jensen distance based on Tsallis entropy (JTD) to video key frame extraction. Using the obtained JTD dissimilarity values, sub-shot boundaries are first detected; one frame is then extracted from each sub-shot as its representative frame, finally yielding the key frame sequence of the video.
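The JTD distance itself is compact enough to state in code; here is a sketch assuming the symmetric two-way form with equal weights (the entropic index q = 1.5 is an arbitrary choice for illustration).

```python
import numpy as np

def tsallis_entropy(p, q=1.5):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1) for a normalized
    histogram p; q -> 1 recovers Shannon entropy."""
    p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def jtd(p, r, q=1.5, w=(0.5, 0.5)):
    # Jensen-Tsallis divergence between two frame histograms: entropy of the
    # mixture minus the mixture of entropies (non-negative, 0 iff p == r).
    m = w[0] * p + w[1] * r
    return tsallis_entropy(m, q) - (w[0] * tsallis_entropy(p, q)
                                    + w[1] * tsallis_entropy(r, q))

def hist(gray):
    h, _ = np.histogram(gray, bins=64, range=(0, 256))
    return h / h.sum()

rng = np.random.default_rng(9)
a, b = rng.integers(0, 256, (48, 64)), rng.integers(128, 256, (48, 64))
print(jtd(hist(a), hist(a)))   # ~0 for identical frames
print(jtd(hist(a), hist(b)))   # larger across a content change (shot boundary)
```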

16.
Recently, two-stream-network-based video action recognition has remained a popular research topic in computer vision. However, most current two-stream-based methods suffer from two redundancy issues: inter-frame redundancy and intra-frame redundancy. To solve these problems, a Spatial-Temporal Saliency Action Mask Attention network (STSAMANet) is built for action recognition. First, this paper introduces a key-frame mechanism to eliminate inter-frame redundancy; this mechanism computes, for each video sequence, the key frames with the greatest difference between frames. Then, Mask R-CNN detection is introduced to build a saliency attention layer that eliminates intra-frame redundancy by focusing on the salient human body and objects for each action class. We experiment on two public video action datasets, the UCF101 dataset and the Penn Action dataset, to verify the effectiveness of our method in action recognition.
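A sketch of one plausible reading of the key-frame mechanism (the abstract does not spell out the exact scoring): frames are ranked by their difference from the previous frame and the top k are kept, removing inter-frame redundancy before the two-stream network sees them.

```python
import numpy as np

def key_frame_indices(frames, k=3):
    """Score each frame by its mean absolute difference from the previous one
    and keep the k largest, so the selected frames maximize inter-frame
    difference (assumed interpretation of the key-frame mechanism)."""
    diffs = [np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean()
             for i in range(1, len(frames))]
    order = np.argsort(diffs)[::-1][:k]
    return sorted(int(i) + 1 for i in order)

rng = np.random.default_rng(10)
video = [rng.integers(0, 256, (48, 64)) for _ in range(20)]
print(key_frame_indices(video))
```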

17.
Joint object tracking and pose estimation is an important issue in Augmented Reality (AR), interactive systems, and robotic systems. Many studies rely on object detection methods that focus only on the reliability of the features. Other methods combine object detection with frame-by-frame tracking, exploiting the temporal redundancy in the video. However, in some mixed methods, the interval between consecutive detection frames is too short to take full advantage of the frame-by-frame tracking, or there is no appropriate switching mechanism between detection and tracking. In this paper, an iterative optimization tracking method is proposed to reduce the deviation of the tracking points and prolong the interval, thereby speeding up the pose estimation process. Moreover, an adaptive detection interval algorithm is developed that switches between detection and frame-by-frame tracking automatically according to the quality of the frames, improving accuracy in difficult tracking environments. Experimental results on the benchmark dataset show that the proposed algorithms, as an independent component, can be combined with inter-frame tracking methods for optimization.
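A toy sketch of an adaptive detection-interval controller consistent with the abstract: the detector fires when frame quality drops or the current interval expires, and clean stretches lengthen the interval so frame-by-frame tracking does more of the work. The thresholds and the growth rule are assumptions.

```python
def schedule(quality_scores, base_interval=10, q_thresh=0.4, max_interval=30):
    """Return 'detect' or 'track' per frame. Assumed policy: re-detect when the
    per-frame quality score is poor or the interval since the last detection
    expires; good-quality detections progressively lengthen the interval."""
    plan, interval, since = [], base_interval, base_interval  # detect at frame 0
    for q in quality_scores:
        if since >= interval or q < q_thresh:
            plan.append("detect")
            since = 0
            interval = (min(interval + 2, max_interval)
                        if q >= q_thresh else base_interval)
        else:
            plan.append("track")
            since += 1
    return plan

quality = [0.9] * 12 + [0.3, 0.35] + [0.8] * 6
print(list(enumerate(schedule(quality))))
```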

18.
A video copy detection method based on the color and texture features of key frames is proposed. First, key frames are extracted from the video with a sub-segment method; each key frame is then divided into three sub-blocks, a 3D quantized color histogram is extracted from each sub-block, and color features are matched by histogram intersection. Texture features are then extracted from the key frames of the retrieved candidate videos, characterized by the angular second moment and entropy of the gray-level co-occurrence matrix; texture matching further filters out irrelevant videos. Experimental results show that the method is effective, robust, and applicable to many types of video.
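Both matching stages are easy to sketch: histogram intersection for the colour step, and the angular second moment plus entropy of a grey-level co-occurrence matrix (via scikit-image) for the texture verification. Bin counts and quantization levels are illustrative.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def hist_intersection(h1, h2):
    """Histogram intersection of two normalized 3D quantized colour histograms."""
    return np.minimum(h1, h2).sum()

def glcm_features(gray, levels=16):
    """Texture verification: angular second moment (ASM) and entropy of the
    grey-level co-occurrence matrix, used to filter false colour matches."""
    g = (gray / (256 // levels)).astype(np.uint8)
    glcm = graycomatrix(g, distances=[1], angles=[0], levels=levels, normed=True)
    p = glcm[:, :, 0, 0]
    asm = float(graycoprops(glcm, 'ASM')[0, 0])
    entropy = float(-(p[p > 0] * np.log2(p[p > 0])).sum())
    return asm, entropy

rng = np.random.default_rng(11)
f1, f2 = rng.integers(0, 256, (48, 64)), rng.integers(0, 256, (48, 64))
h1, _ = np.histogramdd(rng.integers(0, 4, (100, 3)), bins=(4, 4, 4))
h2, _ = np.histogramdd(rng.integers(0, 4, (100, 3)), bins=(4, 4, 4))
print(hist_intersection(h1 / h1.sum(), h2 / h2.sum()))
print(glcm_features(f1), glcm_features(f2))
```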

19.
Video summarization has gained increased popularity in emerging multimedia communication applications; however, very limited work has addressed the transmission of video summary frames. In this paper, we propose a cross-layer optimization framework for delivering video summaries over wireless networks. Within a rate-distortion theoretical framework, the source coding, allowable retransmissions, and adaptive modulation and coding are jointly optimized, reflecting the joint selection of parameters at the physical, data link, and application layers. The goal is to achieve the best video quality and content coverage of the received summary frames while meeting the delay constraint. The problem is solved using Lagrangian relaxation and dynamic programming. Experimental results indicate the effectiveness and efficiency of the proposed optimization framework, especially when the delay budget imposed by upper-layer applications is small, where a distortion gain of more than 10% can be achieved.
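To make the optimization concrete, here is a toy sketch of the Lagrangian relaxation loop over hypothetical per-frame operating points (distortion/delay pairs standing in for joint source-rate, retransmission, and modulation-coding choices); the bisection on the multiplier and the operating-point table are illustrative, and the paper additionally uses dynamic programming.

```python
# Toy per-frame operating points: (distortion, delay) for several parameter
# combinations at the application / data link / physical layers.
options = [[(8.0, 1), (5.0, 2), (3.0, 4)],   # frame 0
           [(9.0, 1), (6.0, 2), (2.5, 5)],   # frame 1
           [(7.0, 1), (4.0, 3), (2.0, 6)]]   # frame 2
DELAY_BUDGET = 8

def lagrangian_pick(lmbda):
    # The relaxation decouples the joint problem: each frame independently
    # minimizes distortion + lambda * delay.
    picks = [min(opts, key=lambda o: o[0] + lmbda * o[1]) for opts in options]
    return picks, sum(p[1] for p in picks)

# Bisection on the multiplier until the delay budget is met (sketch of the
# Lagrangian-relaxation outer loop).
lo, hi = 0.0, 10.0
for _ in range(40):
    mid = (lo + hi) / 2
    _, delay = lagrangian_pick(mid)
    lo, hi = (lo, mid) if delay <= DELAY_BUDGET else (mid, hi)
picks, delay = lagrangian_pick(hi)
print("total delay:", delay, "total distortion:", sum(p[0] for p in picks))
```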

20.
Porno video recognition is important for Internet content monitoring. In this paper, a novel porno video recognition method that fuses audio and video cues is proposed. Firstly, global color and texture...
