Similar Literature
20 similar articles retrieved; search time: 31 ms
1.
Many video fingerprints have been proposed to handle the transformation problems that arise when original content is copied and redistributed. However, most of them do not take flipping and rotation transformations into account. In this paper, we propose a novel video fingerprint based on region binary patterns, aiming to realize robust and fast video copy detection against video transformations including rotation and flipping. We extract two complementary region binary patterns from several rings in keyframes. These two kinds of binary patterns are converted into a new type of pattern for the proposed video fingerprint, which is robust against rotation and flipping. The experimental results demonstrate that the proposed video fingerprint is effective for video copy detection, particularly in the case of rotation and flipping. Furthermore, our experiments show that the proposed method offers high storage efficiency and low computational complexity, making it suitable for a practical video copy detection system.
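The rotation/flip robustness of ring-based features can be illustrated with a minimal sketch: statistics computed over concentric rings around the frame centre are unchanged when the frame is rotated or mirrored, because each ring maps onto itself. This is a simplified stand-in, not the paper's actual two complementary region binary patterns; the function name and the mean-threshold rule are our assumptions.

```python
import numpy as np

def ring_binary_pattern(frame, n_rings=8):
    """Hypothetical sketch: one bit per concentric ring, set when the
    ring's mean intensity exceeds the frame's global mean. Ring masks
    are symmetric under rotation and flipping about the centre."""
    h, w = frame.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
    r_max = r.max()
    global_mean = frame.mean()
    bits = []
    for i in range(n_rings):
        lo = i * r_max / n_rings
        hi = (i + 1) * r_max / n_rings
        # include the outer boundary in the last ring
        mask = (r >= lo) & (r <= hi) if i == n_rings - 1 else (r >= lo) & (r < hi)
        bits.append(1 if frame[mask].mean() > global_mean else 0)
    return bits
```

Because a 90-degree rotation or a horizontal flip of a square frame only permutes pixels within each ring, the extracted bit pattern is identical for the transformed copy.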

2.
With the exponential growth of video and other multimedia data, efficient and fast video retrieval algorithms are receiving increasing attention. Traditional image features such as color histograms and the scale-invariant feature transform (SIFT) cannot achieve satisfactory retrieval speed and detection accuracy in video copy detection, so this paper proposes a multi-feature-fusion video retrieval method. The method uses the spatiotemporal features of consecutive frames in a sliding-window-based temporal alignment algorithm to narrow the search range and improve retrieval speed. It extracts gray-sequence features, color correlogram features, and SIFT local features from keyframes, and then combines the strengths of global and local features to improve detection accuracy. Experimental results show that the method achieves good video retrieval accuracy.

3.
In order to evaluate different video fusion algorithms in temporal stability and consistency as well as in spatial information transfer, a novel objective video fusion quality metric is proposed in this paper based on the structural similarity (SSIM) index and the perception characteristics of the human visual system (HVS). Firstly, for each frame, two sub-indices, i.e., the spatial fusion quality index and the temporal fusion quality index, are defined using weighted local SSIM indices. Secondly, for the current frame, an individual-frame fusion quality measure is obtained by integrating the above two sub-indices. Lastly, the proposed global video fusion metric is constructed as the weighted average of all the individual-frame fusion quality measures. In addition, according to the perception characteristics of the HVS, local and global spatial-temporal information, such as local variance, pixel movement, global contrast, and background motion, is employed to define the weights in the proposed metric. Several sets of experimental results demonstrate that the proposed metric evaluates different video fusion algorithms accurately, and the evaluation results coincide well with subjective results.
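The metric's top-level structure (per-frame combination of the two sub-indices, then a weighted average over frames) can be sketched as below. The blend factor `alpha` and the uniform default frame weights are our assumptions; the paper derives its weights from HVS characteristics such as local variance and motion.

```python
import numpy as np

def global_fusion_quality(spatial_q, temporal_q, frame_weights=None, alpha=0.5):
    """Sketch: per-frame measure = alpha * spatial + (1 - alpha) * temporal;
    global metric = weighted mean of the per-frame measures."""
    spatial_q = np.asarray(spatial_q, dtype=float)
    temporal_q = np.asarray(temporal_q, dtype=float)
    per_frame = alpha * spatial_q + (1.0 - alpha) * temporal_q
    if frame_weights is None:
        frame_weights = np.ones_like(per_frame)  # assumed uniform weighting
    frame_weights = np.asarray(frame_weights, dtype=float)
    return float(np.sum(frame_weights * per_frame) / np.sum(frame_weights))
```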

4.
A compressed domain video saliency detection algorithm, which employs global and local spatiotemporal (GLST) features, is proposed in this work. We first conduct partial decoding of a compressed video bitstream to obtain motion vectors and DCT coefficients, from which GLST features are extracted. More specifically, we extract the spatial features of rarity, compactness, and center prior from DC coefficients by investigating the global color distribution in a frame. We also extract the spatial feature of texture contrast from AC coefficients to identify regions whose local textures are distinct from those of neighboring regions. Moreover, we use the temporal features of motion intensity and motion contrast to detect visually important motions. Then, we generate spatial and temporal saliency maps, respectively, by linearly combining the spatial features and the temporal features. Finally, we fuse the two saliency maps into a spatiotemporal saliency map adaptively by comparing the robustness of the spatial features with that of the temporal features. Experimental results demonstrate that the proposed algorithm provides excellent saliency detection performance while requiring low complexity, and thus performs detection in real time.
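The final adaptive fusion step can be sketched as follows. As a stand-in for the paper's robustness comparison, this hypothetical version weights each map by how concentrated (peaky) its normalised distribution is, so a flat, uninformative map contributes little.

```python
import numpy as np

def fuse_saliency(spatial_map, temporal_map):
    """Hypothetical adaptive fusion: weight each map by the variance
    of its normalised distribution (a peakiness proxy, our assumption),
    then form the weighted combination."""
    def peakiness(m):
        m = m / (m.sum() + 1e-12)  # normalise to a distribution
        return m.var()             # flat maps -> variance near zero
    ws = peakiness(spatial_map)
    wt = peakiness(temporal_map)
    total = ws + wt + 1e-12
    return (ws * spatial_map + wt * temporal_map) / total
```

With a uniform (uninformative) temporal map, the fused result falls back to the spatial map almost exactly.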

5.
张博  景晓军  孙松林  张少乐 《电子学报》2011,39(9):2130-2134
For reconstructing the orientation field of incomplete fingerprints, this paper proposes an orientation-field fusion algorithm based on the mutual information criterion that exploits global orientation information and local gradient information. Drawing on the competitiveness of the local gradient information together with the complementarity and redundancy of the global orientation information, the algorithm first defines a dispersion measure from the local fingerprint gradients and fuses the gradient information adaptively; it then applies the mutual information criterion with the global fingerprint information to re-estimate and correct the orientation of the incomplete regions. Based on the fusion result, subsequent fingerprint...

6.
Saliency detection is widely used to pick out relevant parts of a scene as visual attention regions for various image/video applications. Since video is increasingly being captured, moved, and stored in compressed form, there is a need for detecting video saliency directly in the compressed domain. In this study, a compressed video saliency detection algorithm is proposed based on discrete cosine transform (DCT) coefficients and motion information within a visual window. Firstly, DCT coefficients and motion information are extracted from the H.264 video bitstream without full decoding. Due to a high quantization parameter setting in the encoder, skip/intra is easily chosen as the best prediction mode, resulting in a large number of blocks with zero motion vectors and no residual in the bitstream. To address this, the motion vectors of skip/intra coded blocks are calculated by interpolating from their surroundings. In addition, a visual window is constructed to enhance the contrast of features and to avoid being affected by the encoder. Secondly, after the spatial and temporal saliency maps are generated from the normalized entropy, a motion importance factor is imposed to refine the temporal saliency map. Finally, a variance-like fusion method is proposed to dynamically combine these maps to yield the final video saliency map. Experimental results show that the proposed approach significantly outperforms other state-of-the-art video saliency detection models.
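The interpolation step for skip/intra blocks can be sketched on a block-level motion vector field. The 4-neighbour averaging rule here is an assumed detail; the paper only states that missing vectors are interpolated from their surroundings.

```python
import numpy as np

def fill_missing_mvs(mv_field, missing_mask):
    """Sketch: fill the motion vector of each skip/intra (missing)
    block with the average of its valid 4-neighbour vectors.
    mv_field: (H, W, 2) array of block motion vectors.
    missing_mask: (H, W) boolean array marking skip/intra blocks."""
    h, w, _ = mv_field.shape
    out = mv_field.astype(float).copy()
    for y in range(h):
        for x in range(w):
            if missing_mask[y, x]:
                neigh = [out[ny, nx]
                         for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                         if 0 <= ny < h and 0 <= nx < w and not missing_mask[ny, nx]]
                if neigh:  # leave the block untouched if fully surrounded
                    out[y, x] = np.mean(neigh, axis=0)
    return out
```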

7.
For a variety of applications such as video surveillance and event annotation, the spatial–temporal boundaries between video objects are required for annotating visual content with high-level semantics. In this paper, we define spatial–temporal sampling as a unified process of extracting video objects and computing their spatial–temporal boundaries using a learnt video object model. We first provide a computational approach for learning an optimal key-object codebook sequence from a set of training video clips to characterize the semantics of the detected video objects. Then, dynamic programming with the learnt codebook sequence is used to locate the video objects with spatial–temporal boundaries in a test video clip. To verify the performance of the proposed method, a human action detection and recognition system is constructed. Experimental results show that the proposed method gives good performance on several publicly available datasets in terms of detection accuracy and recognition rate.

8.
A New Moving Object Segmentation Method Fusing Temporal and Spatial Information (cited by 3)
A method fusing temporal and spatial information is proposed for segmenting moving objects from video sequences. Rather than incorporating spatial information only after temporal segmentation is complete, the method progressively fuses temporal and spatial information during segmentation through region binding. The main characteristic of region binding is that it represents the segmented objects distributively and characterizes their features. The method first obtains many small regions through early segmentation, then binds some of these small regions into binding kernels, and finally binds the remaining regions to adjacent kernels via strong or weak rules, thereby achieving segmentation of the target regions. Experimental results demonstrate the good performance of the method.

9.
A double optimal projection method, involving projections for intra-cluster and inter-cluster dimensionality reduction, is proposed for video fingerprinting. The video is initially modeled as a graph with frames as its vertices in a high-dimensional space. A similarity measure that computes the weights of the edges is then proposed. Subsequently, the video frames are partitioned into different clusters based on the graph model. Double optimal projection is used to explore the optimal mapping points in a low-dimensional space to reduce the video dimensions. Statistics and geometrical fingerprints are generated to determine whether a query video is copied from one of the videos in the database. During matching, the video is first roughly matched using the statistics fingerprint; further matching is then performed in the corresponding group using geometrical fingerprints. Experimental results show the good performance of the proposed video fingerprinting method in terms of robustness and discrimination.

10.
This paper proposes an efficient error concealment method for the reconstruction of pixels that are lost in video communication. The proposed method combines exemplar-based image inpainting for patch reconstruction and spatial interpolation for pixel reconstruction, using an adaptive threshold based on local complexity. Regions with regular structures are reconstructed by exemplar-based image inpainting. For complex regions with irregular structures, a single pixel is reconstructed using the proposed spatial interpolation method, which adaptively selects directional interpolation or neighbor interpolation based on gradient information. Simulation results show that the proposed hybrid method achieves significantly improved subjective quality compared with previous spatial error concealment and image inpainting methods. The proposed method also gives substantial improvements in PSNR compared with the previous methods.
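The adaptive choice between directional and neighbor interpolation can be sketched for a single lost pixel with four known neighbours. The threshold value and the specific gradient comparison are our assumptions, not the paper's exact rule.

```python
def conceal_pixel(up, down, left, right, grad_threshold=10.0):
    """Sketch: interpolate along the direction with the smaller
    intensity change (edge-preserving), or average all four
    neighbours when neither direction clearly dominates."""
    gv = abs(up - down)      # vertical intensity change
    gh = abs(left - right)   # horizontal intensity change
    if abs(gv - gh) < grad_threshold:
        # no dominant edge direction: neighbor interpolation
        return (up + down + left + right) / 4.0
    if gv < gh:
        # strong horizontal edge: interpolate vertically along it
        return (up + down) / 2.0
    return (left + right) / 2.0
```

For example, a pixel sitting on a horizontal edge (identical values above and below, very different values left and right) is reconstructed from its vertical neighbours only, preserving the edge.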

11.
Videos captured by stationary cameras are widely used in video surveillance and video conferencing. This kind of video often has a static or gradually changing background. By analyzing the properties of static-background videos, this work presents a novel approach to detect double MPEG-4 compression based on local motion vector field analysis in static-background videos. For a given suspicious video, the local motion vector field is used to segment the background regions in each frame. According to the segmentation of the background and the motion strength of the foreground, a modified prediction residual sequence is calculated, which retains robust fingerprints of double compression. After post-processing, the detection and GOP estimation results are obtained by applying a temporal periodic analysis method to the final feature sequence. Experimental results demonstrate better robustness and efficiency of the proposed method in comparison with several state-of-the-art methods. In addition, the proposed method is more robust to various rate control modes.
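The temporal periodic analysis step can be illustrated with a toy periodicity estimator: double compression leaves periodic spikes in the residual sequence, whose period (and hence the original GOP size) can be recovered from the dominant DFT bin. DFT peak picking is our assumption for how the analysis is carried out; the paper does not specify this exact mechanism.

```python
import numpy as np

def estimate_period(residual_seq):
    """Toy sketch: return the dominant period, in frames, of a
    residual sequence via the strongest non-DC DFT bin, or None
    if the sequence carries no periodic energy."""
    x = np.asarray(residual_seq, dtype=float)
    x = x - x.mean()                    # drop the DC component
    spectrum = np.abs(np.fft.rfft(x))
    spectrum[0] = 0.0                   # ignore any residual DC
    k = int(np.argmax(spectrum))
    return len(x) / k if k else None    # bin k corresponds to period N/k
```

A residual sequence spiking every 10 frames yields an estimated period of 10, matching an original GOP length of 10.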

12.
The detection of near-duplicate video clips (NDVCs) is an area of current research interest and intense development. Most NDVC detection methods represent video clips with a unique set of low-level visual features, typically describing color or texture information. However, low-level visual features are sensitive to transformations of the video content. Given the observation that transformations tend to preserve the semantic information conveyed by the video content, we propose a novel approach for identifying NDVCs, making use of both low-level visual features (i.e., MPEG-7 visual features) and high-level semantic features (i.e., 32 semantic concepts detected using trained classifiers). Experimental results obtained for the publicly available MUSCLE-VCD-2007 and TRECVID 2008 video sets show that bimodal fusion of visual and semantic features facilitates robust NDVC detection. In particular, the proposed method is able to identify NDVCs with a low missed detection rate (3% on average) and a low false alarm rate (2% on average). In addition, the combined use of visual and semantic features outperforms the separate use of either of them in terms of NDVC detection effectiveness. Further, we demonstrate that the effectiveness of the proposed method is on par with or better than the effectiveness of three state-of-the-art NDVC detection methods making use of temporal ordinal measurement, features computed using the Scale-Invariant Feature Transform (SIFT), or bag-of-visual-words (BoVW), respectively. We also show that the influence of the effectiveness of semantic concept detection on the effectiveness of NDVC detection is limited, as long as the mean average precision (MAP) of the semantic concept detectors used is higher than 0.3. Finally, we illustrate that the computational complexity of our NDVC detection method is competitive with that of the three aforementioned NDVC detection methods.

13.
Fusion of information gathered from multiple sources is essential to build a comprehensive situation picture for autonomous ground vehicles. In this paper, an approach that performs scene parsing and data fusion for a 3D-LIDAR scanner (Velodyne HDL-64E) and a video camera is described. First, a geometry segmentation algorithm is proposed for detecting obstacles and ground areas from data collected by the Velodyne scanner. Then, the corresponding image collected by the video camera is classified patch by patch into more detailed categories. After that, the parsing result for each frame is obtained by fusing the Velodyne result and the image result using a fuzzy logic inference framework. Finally, the parsing results of consecutive frames are smoothed by a Markov random field based temporal fusion method. The proposed approach has been evaluated with datasets collected by our autonomous ground vehicle testbed in both rural and urban areas. The fused results are more reliable than those acquired from analysis of images or Velodyne data alone.

14.
Digital fingerprinting is an emerging technology to protect multimedia content from illegal redistribution, where each distributed copy is labeled with unique identification information. In video streaming, huge amounts of data must be transmitted to a large number of users under stringent latency constraints, so bandwidth-efficient distribution of uniquely fingerprinted copies is crucial. This paper investigates the secure multicast of anticollusion fingerprinted video in streaming applications and analyzes its performance. We first propose a general fingerprint multicast scheme that can be used with most spread spectrum embedding-based multimedia fingerprinting systems. To further improve bandwidth efficiency, we explore the special structure of the fingerprint design and propose a joint fingerprint design and distribution scheme. In our simulations, the two proposed schemes reduce the bandwidth requirement by 48% to 87%, depending on the number of users, the characteristics of the video sequences, and the network and computation constraints. We also show that, under the constraint that all colluders have the same probability of detection, the embedded fingerprints in the two schemes have approximately the same collusion resistance. Finally, we propose a fingerprint drift compensation scheme to improve the quality of the reconstructed sequences at the decoder's side without introducing extra communication overhead.

15.
To address the problem that directly applying a convolutional autoencoder network ignores the temporal information of video, this paper proposes a spatiotemporal two-stream anomaly detection model based on Bayesian fusion. The spatial-stream model reconstructs individual video frames with a convolutional autoencoder network, while the temporal-stream model reconstructs short optical-flow sequences with a convolutional long short-term memory (LSTM) encoder-decoder network. The per-frame reconstruction errors of the spatial-stream and temporal-stream models are then computed, an adaptive threshold is designed to binarize each reconstruction-error map, and the reconstruction errors of the two streams are fused under a Bayesian criterion to obtain a fused reconstruction-error map, on which anomalous behavior is judged. Experimental results show that the algorithm outperforms existing anomaly detection algorithms on the UCSD and Avenue video datasets.
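The fusion step can be sketched as below. The sigmoid mapping from reconstruction error to anomaly probability, the mean-plus-k-std adaptive threshold, and the parameter `k` are all our assumptions; the sketch only illustrates combining two error maps under a naive-Bayes independence assumption.

```python
import numpy as np

def bayes_fuse(err_spatial, err_temporal, k=1.0):
    """Sketch: map each stream's reconstruction-error map to a
    per-pixel anomaly probability via a sigmoid centred on an
    adaptive threshold (mean + k * std), then fuse the two streams
    with Bayes' rule assuming independence and a uniform prior."""
    def to_prob(e):
        thr = e.mean() + k * e.std()
        scale = e.std() + 1e-12  # avoid division by zero on flat maps
        return 1.0 / (1.0 + np.exp(-(e - thr) / scale))
    ps = to_prob(np.asarray(err_spatial, dtype=float))
    pt = to_prob(np.asarray(err_temporal, dtype=float))
    return ps * pt / (ps * pt + (1 - ps) * (1 - pt) + 1e-12)
```

A location with high error in both streams ends up with a fused probability well above one half; a location with low error in both ends up well below it.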

16.
The quality of the fingerprint matching algorithm directly affects the accuracy of a recognition system. A new multi-reference-center fingerprint matching algorithm based on minutia clustering is proposed. In the alignment stage, the algorithm considers not only the global characteristics of the fingerprint but also adaptively constructs different local structures according to the different minutia classes, effectively exploiting minutiae that are isolated yet highly informative, and improving alignment accuracy when minutiae in the overlapping region are few and scattered. In the matching stage, the combined use of multiple reference centers and similarity-element analysis partially overcomes the effects of nonlinear fingerprint deformation and reduces the false rejection rate of the matching algorithm. Experimental results show that the method improves matching performance.

17.
To resolve video enhancement problems, a novel gradient domain fusion method is proposed, wherein gradient-domain frames of the background in daytime video are fused with nighttime video frames. To verify the superiority of the proposed method, it is compared with conventional techniques; the implemented output of our method is shown to offer enhanced visual quality.
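The idea of fusing in the gradient domain can be illustrated in one dimension: combine the two signals' gradients (here, by keeping the stronger gradient at each position, a common simplification and our assumption, not necessarily the paper's rule) and integrate the result back into an intensity signal anchored at the nighttime frame.

```python
import numpy as np

def fuse_gradients_1d(day_bg, night):
    """1-D sketch of gradient-domain fusion: take the stronger of the
    two signals' gradients at each position, then integrate back
    starting from the nighttime frame's first sample."""
    gd = np.diff(np.asarray(day_bg, dtype=float))
    gn = np.diff(np.asarray(night, dtype=float))
    g = np.where(np.abs(gd) > np.abs(gn), gd, gn)   # keep dominant gradient
    return np.concatenate(([night[0]], night[0] + np.cumsum(g)))
```

A structure visible only in the daytime background (e.g. an intensity step) is carried over into the otherwise flat nighttime signal.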

18.
19.
20.
The emergence of deepfake face-swapping technology seriously threatens public privacy. To overcome the limitations of existing deepfake detection methods, a dual-branch detection network based on a multi-task learning strategy is proposed, which detects video forgery while also performing frame-by-frame detection. The network incorporates an attention mechanism and a temporal learning module, and improves detection performance by learning local spatial information and temporal information. On the public Celeb-DF and FaceForensics++ datasets, the method achieves higher accuracy and area under the ROC (Receiver Operating Characteristic) curve (AUC) than current state-of-the-art face-swap detection methods, and exhibits good robustness under varying illumination, face orientation, and video quality.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号