20 similar documents found; search took 31 ms.
1.
《Signal Processing: Image Communication》2014,29(7):788-806
In this paper, we propose a novel modality fusion method designed to combine spatial and temporal fingerprint information to improve video copy detection performance. Most previously developed methods have been limited to using pre-specified weights to combine spatial and temporal modality information. Hence, previous approaches cannot adaptively adjust the significance of the temporal fingerprints according to the difference between the temporal variances of the compared videos, leading to degraded video copy detection performance. To overcome this limitation, the proposed method extracts two types of fingerprint information: (1) a spatial fingerprint consisting of the signs of DCT coefficients in local areas of a keyframe and (2) a temporal fingerprint computed from the temporal variances in local areas across consecutive keyframes. In addition, a so-called temporal strength measurement technique is developed to quantitatively represent the amount of temporal variance; it can be used adaptively to weigh the significance of the compared temporal fingerprints. The experimental results show that the proposed modality fusion method outperforms other state-of-the-art fusion methods and popular spatio-temporal fingerprints in terms of video copy detection. Furthermore, the proposed method reduces the time needed to perform video fingerprint matching by 39.0%, 25.1%, and 46.1% without a significant loss of detection accuracy on our synthetic dataset, the TRECVID 2009 CCD Task, and MUSCLE-VCD 2007, respectively. This result indicates that the proposed method can readily be incorporated into real-world video copy detection systems.
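The two fingerprint types described in this abstract can be sketched in a few lines. The function names, block size, and choice of low-frequency coefficients below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np
from scipy.fft import dctn

def spatial_fingerprint(keyframe, block=8):
    """Signs of low-frequency DCT coefficients per local block (sketch)."""
    h, w = keyframe.shape
    bits = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            c = dctn(keyframe[y:y + block, x:x + block], norm="ortho")
            low = c[:3, :3].ravel()[1:]  # 8 low-frequency AC coefficients
            bits.extend((low >= 0).astype(np.uint8))
    return np.array(bits, dtype=np.uint8)

def temporal_fingerprint(keyframes, block=8):
    """Variance of per-block mean intensity across consecutive keyframes."""
    means = np.array([
        kf.reshape(kf.shape[0] // block, block, kf.shape[1] // block, block)
          .mean(axis=(1, 3))
        for kf in keyframes
    ])
    return means.var(axis=0).ravel()
```

A copy candidate would then be compared by Hamming distance on the sign bits, with the temporal variances weighting the temporal comparison.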
2.
Debabrata Dutta Sanjoy Kumar Saha Bhabatosh Chanda 《Signal, Image and Video Processing》2013,7(4):665-677
Due to advances in multimedia technology and communication, it has become easier to access, store, and edit video data. The easy manipulation of video data and its rapid distribution have made content-based video copy detection (CBVCD) an active area of research. In a CBVCD system, a reference video sequence and a query sequence are compared to detect whether the query is a copy of the reference. Thus, the generation of a video-sequence fingerprint and the sequence matching technique are the core tasks of such a system. To evade detection, a copied version may undergo different kinds of transformations, such as photometric and post-production attacks, so the detection system must be robust against them. In this work, the fingerprint is generated from the sub-bands of the wavelet-decomposed intensity image and localized intensity-gradient histograms of the low sub-band. The fingerprint thus obtained offers considerable discriminating capability and robustness against the attacks. Furthermore, to cope with the attacks, we adopt a simple pre-processing technique that further enhances the robustness of the system. A robust sequence matching technique based on the multivariate Wald–Wolfowitz test is proposed here. An experiment has been carried out with a database consisting of 642 distinct shots and 1,485 query sequences representing different attacks. The proposed methodology achieves a high copy detection rate (99.39%) and a very low false alarm rate (0.14%), and performs better than other spatio-temporal measure-based systems.
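The multivariate Wald–Wolfowitz (runs) test mentioned in this abstract is commonly realized via the Friedman–Rafsky minimum-spanning-tree construction. The sketch below follows that generic construction and may differ in detail from the paper's implementation:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def ww_runs_statistic(X, Y):
    """Friedman-Rafsky multivariate runs statistic (sketch): build the
    minimum spanning tree of the pooled samples and count the edges that
    join points from different samples. A small count suggests the two
    feature sequences come from different distributions."""
    Z = np.vstack([X, Y])
    labels = np.array([0] * len(X) + [1] * len(Y))
    mst = minimum_spanning_tree(cdist(Z, Z)).tocoo()
    return int(sum(labels[i] != labels[j] for i, j in zip(mst.row, mst.col)))
```

In a matching context, a large cross-edge count (samples well mixed) supports the hypothesis that the query is drawn from the same underlying content as the reference.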
3.
4.
Image forensics of sharpening has attracted great interest from researchers in recent decades. State-of-the-art techniques achieve high accuracy in detecting strong sharpening, while detecting weak sharpening remains a challenge. This paper proposes an algorithm based on thresholding binary coding for image sharpening detection. The overshoot artifact introduced by sharpening enlarges the difference between the local maximum and minimum of both image pixels and unsharp-mask elements; based on this, the threshold local binary pattern operator is applied to capture the trace of sharpening. The patterns are then coded according to rotation-symmetry invariance and texture type. Features are extracted from the statistical distribution of the coded patterns and fed to a classifier for sharpening detection. In practice, two classifiers are constructed, for lightweight and offline applications respectively: one is a single Fisher linear discriminant (FLD) with 182 features, and the other is an ensemble classifier (EC) with 5460 features. Experimental results on the BOSS, NRCS, and RAISE datasets show that for weak sharpening detection the FLD outperforms the CNN and SVMs with EPTC, EPBC, and LBP features, and using the EC with TBC features further improves performance, obtaining better results than ECs with TLBP and SRM features. Moreover, the proposed algorithm is robust to post-JPEG compression and noise addition, and can differentiate sharpening from other manipulations.
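A minimal threshold-LBP feature extractor in the spirit of this abstract follows; the 3×3 neighborhood, threshold value, and plain 256-bin histogram are illustrative assumptions rather than the paper's exact coding scheme:

```python
import numpy as np

def threshold_lbp(img, t=2):
    """3x3 threshold LBP (sketch): a neighbor sets its bit only when it
    exceeds the centre pixel by more than t, suppressing flat-region noise."""
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb - c) > t).astype(np.int32) << bit
    return code

def tlbp_histogram(img, t=2):
    """256-bin normalised histogram of TLBP codes, a minimal feature vector."""
    codes = threshold_lbp(img, t)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

The histogram vector would then be fed to an FLD or ensemble classifier as the abstract describes.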
5.
6.
A video signature is a set of feature vectors that compactly represents a video clip and uniquely distinguishes it from others for fast matching. To find a short duplicated region, the video signature must be robust against common video modifications and highly discriminative. The matching method must be fast and reliable at finding locations. In this paper, a frame-based video signature that uses spatial information, together with a two-stage matching method, is presented. The proposed method is pair-wise independent and robust against common video modifications. The proposed two-stage matching method is fast and works very well at finding locations. In addition, the proposed matching structure and strategy can handle the case in which a part of the query video matches a part of the target video. The proposed method is verified using video modified under the VCE7 experimental conditions found in MPEG-7. The proposed video signature method achieves a robustness of 88.7% under an independence condition of 5 parts per million, with over 1,000 clips matched per second.
7.
Video hashing is a useful technique for many multimedia systems, such as video copy detection, video authentication, tampering localization, video retrieval, and anti-piracy search. In this paper, we propose a novel video hashing scheme based on secondary frames and invariant moments. An important contribution is the secondary frame construction with the 3D discrete wavelet transform, which provides initial data compression and robustness against noise and compression. In addition, since invariant moments are robust and discriminative features, hash generation based on invariant moments extracted from secondary frames ensures good classification by the proposed video hashing. Extensive experiments on 8300 videos are conducted to validate the efficiency of the proposed video hashing. The results show that the proposed video hashing can resist many digital operations and has good discrimination. Performance comparisons with some state-of-the-art algorithms illustrate that the proposed video hashing outperforms the compared algorithms in classification in terms of receiver operating characteristic (ROC) results.
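Invariant moments of the kind mentioned in this abstract are typically Hu moments. A plain-NumPy sketch of the first two, as they might be computed on a secondary frame, follows; the paper's exact moment set is not specified here, so this is illustrative:

```python
import numpy as np

def hu_moments(img):
    """First two Hu invariant moments of a grayscale frame (sketch).
    These are invariant to translation, scaling, and rotation."""
    img = img.astype(float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00

    def mu(p, q):   # central moment
        return ((x - xc) ** p * (y - yc) ** q * img).sum()

    def eta(p, q):  # scale-normalised central moment
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return np.array([phi1, phi2])
```

Concatenating such moments over all secondary frames, then quantizing, would yield a compact hash in the spirit of the scheme.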
8.
Ta Minh Thanh Pham Thanh Hiep Ta Minh Tam Keisuke Tanaka 《AEUE-International Journal of Electronics and Communications》2014,68(10):1007-1015
In this paper, we present a frame-patch-matching-based robust semi-blind video watermarking scheme using KAZE features. The KAZE features are employed to match the feature points of a frame patch with those of all frames in the video, detecting the embedding and extraction regions. In our method, the watermark information is embedded in the Discrete Cosine Transform (DCT) domain of randomly generated blocks in the matched region. In the extraction process, we synchronize the embedded region in the distorted video using KAZE feature matching. Based on the matched KAZE feature points, the RST (rotation, scaling, translation) parameters are estimated and the watermark information can be successfully extracted. Experimental results show that our proposed method is robust against geometric attacks, video-processing attacks, temporal attacks, and so on.
9.
Hyun-seok Min Jae Young Choi Wesley De Neve Yong Man Ro 《Signal Processing: Image Communication》2011,26(10):612-627
The detection of near-duplicate video clips (NDVCs) is an area of current research interest and intense development. Most NDVC detection methods represent video clips with a unique set of low-level visual features, typically describing color or texture information. However, low-level visual features are sensitive to transformations of the video content. Given the observation that transformations tend to preserve the semantic information conveyed by the video content, we propose a novel approach for identifying NDVCs that makes use of both low-level visual features (that is, MPEG-7 visual features) and high-level semantic features (that is, 32 semantic concepts detected using trained classifiers). Experimental results obtained on the publicly available MUSCLE-VCD-2007 and TRECVID 2008 video sets show that bimodal fusion of visual and semantic features facilitates robust NDVC detection. In particular, the proposed method is able to identify NDVCs with a low missed detection rate (3% on average) and a low false alarm rate (2% on average). In addition, the combined use of visual and semantic features outperforms the separate use of either in terms of NDVC detection effectiveness. Further, we demonstrate that the effectiveness of the proposed method is on par with or better than that of three state-of-the-art NDVC detection methods making use of temporal ordinal measurement, features computed using the Scale-Invariant Feature Transform (SIFT), or bag-of-visual-words (BoVW). We also show that the influence of semantic concept detection effectiveness on NDVC detection effectiveness is limited, as long as the mean average precision (MAP) of the semantic concept detectors used is higher than 0.3. Finally, we illustrate that the computational complexity of our NDVC detection method is competitive with that of the three aforementioned NDVC detection methods.
10.
11.
12.
13.
Duan-Yu Chen Yu-Ming Chiu 《Journal of Visual Communication and Image Representation》2013,24(5):544-551
In this paper, to efficiently detect video copies, regions of interest in videos are first localized based on 3D spatiotemporal visual attention modeling. Salient feature points are then detected in the visual attention regions. Prior to evaluating the similarity between source and target video sequences using feature points, a geometric constraint measurement is employed to conduct bi-directional point matching, removing noisy feature points while maintaining robust feature point pairs. Consequently, video matching is transformed into a frame-based time-series linear search problem. Our proposed approach achieves a promising detection rate under distinct video copy attacks and thus shows its feasibility in real-world applications.
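The bi-directional point matching step in this abstract can be sketched as mutual-nearest-neighbor filtering on descriptor distances; this is a simplified stand-in for the paper's geometric constraint measurement:

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Bi-directional matching (sketch): keep a pair (i, j) only when j is
    the nearest neighbour of i in B *and* i is the nearest neighbour of j
    in A, discarding one-sided, likely-noisy correspondences."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    a_to_b = d.argmin(axis=1)   # best match in B for each point in A
    b_to_a = d.argmin(axis=0)   # best match in A for each point in B
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

The surviving pairs feed the frame-based time-series search described in the abstract.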
14.
Leelavathy Narkedamilly Venkateswara Prasad Evani Srinivas Kumar Samayamantula 《ETRI Journal》2015,37(3):595-605
This paper proposes a robust, imperceptible, block-based digital video watermarking algorithm that makes use of the Speeded-Up Robust Feature (SURF) technique. The SURF technique is used to extract the most important features of a video. A discrete multiwavelet transform (DMWT) domain, in conjunction with a discrete cosine transform, is used for embedding a watermark into feature blocks. The watermark used is a binary image. The proposed algorithm is further improved for robustness by an error-correction code that protects the watermark against bit errors. The same watermark is embedded temporally for every set of frames of an input video to improve the decoded watermark correlation. Extensive experimental results demonstrate that the proposed DMWT-domain video watermarking using SURF features is robust against common image-processing attacks, motion JPEG2000 compression, frame averaging, and frame-swapping attacks. The quality of a watermarked video under the proposed algorithm is high, demonstrating the imperceptibility of the embedded watermark.
15.
16.
This paper proposes a novel robust video watermarking scheme based on local affine-invariant features in the compressed domain. The scheme is resilient to geometric distortions and well suited to DCT-encoded compressed video data because it operates directly in the block-DCT domain. To synchronize the watermark, we use local invariant feature points obtained with the Harris-Affine detector, which is invariant to affine distortions. To decode frames from the DCT domain to the spatial domain as quickly as possible, a fast inter-transformation between block DCTs and sub-block DCTs is employed, and down-sampled frames in the spatial domain are obtained by replacing each 2×2-pixel sub-block DCT with half of the corresponding DC coefficient. This strategy significantly reduces computational cost compared with the conventional method, which accomplishes the same task via the inverse DCT (IDCT). Watermark detection is performed in the spatial domain during decoded video playback, so it is not sensitive to video format conversion. Experimental results demonstrate that the proposed scheme is transparent and robust to signal-processing attacks, geometric distortions (including rotation, scaling, aspect-ratio changes, linear geometric transforms, cropping, and combinations of several attacks), frame dropping, and frame-rate conversion.
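The DC-coefficient trick in this abstract rests on the fact that, for an orthonormal 2×2 DCT, half the DC coefficient equals the block mean, so a 2× down-sampled frame can be read straight off the DC terms with no inverse DCT. A small numerical sketch follows; normalization conventions in actual codecs may differ:

```python
import numpy as np
from scipy.fft import dctn

def downsample_from_dc(frame):
    """2x downsample by averaging each 2x2 block. Since the orthonormal
    2x2 DCT gives DC = (pixel sum)/2, the block mean equals DC/2, so this
    is exactly what reading half the DC coefficients produces."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

The check below confirms numerically that DC/2 of a 2×2 block matches its mean, which is why the scheme can skip the IDCT entirely.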
17.
Detecting visually identical regions among successive frames of noisy videos, called visual identicalness detection (VID) in this paper, is a fundamental tool in video applications for lower power consumption and higher efficiency. In this paper, instead of performing VID on the original video signal or on the de-noised video signal, a Retinex-based VID approach is proposed that performs VID on the Retinex signal to eliminate the noise introduced by the imaging system. Several Retinex output generation approaches are compared, among which the proposed Cohen–Daubechies–Feauveau wavelet-based approach is shown to have better detection efficiency and higher adaptability to video content and noise severity. Compared with approaches that perform detection on de-noised images, the proposed algorithm achieves up to 4.78 times higher detection rate for videos with moving objects and up to 30.79 times higher detection rate for videos with static scenes, at the same error rate. An application of this technique is also provided by integrating it into an H.264/AVC video encoder. Compared with compressing the de-noised videos using an existing fast algorithm, an average 1.7 dB performance improvement is achieved with up to 5.47 times higher encoding speed. Relative to the reference encoder, up to 32.47 times higher encoding speed is achieved without sacrificing subjective quality.
18.
Video synthetic aperture radar (SAR) offers high-frame-rate imaging and can serve as an important means of detecting ground moving targets. Classical SAR ground moving target indication (SAR-GMTI) relies on the target's echo energy to detect moving targets, while moving-target shadows can also serve as an important cue for moving-target detection in video SAR. However, because the target energy and the shadow may be distorted or smeared, robust moving-target detection is difficult to achieve with either cue alone. Based on the idea of joint detection in both the energy and shadow domains, this paper realizes joint moving-target detection for video SAR through two technical approaches, a fast region-based convolutional neural network and track association, presents processing results on airborne measured data, and analyzes them in detail. The proposed method fully exploits the features and spatio-temporal information of the target's shadow and energy, improving the robustness of maneuvering-target detection.
19.
Automatic moving-object segmentation for video object plane generation  (total citations: 1; self-citations: 0; citations by others: 1)
The new video coding standard MPEG-4 provides content-based functionality. It decomposes an image sequence into video object planes (VOPs), each of which represents a moving object. This paper presents a new video-sequence segmentation algorithm for extracting moving objects. The core of the algorithm is an object tracker that matches a two-dimensional binary model of the object against subsequent frames using the Hausdorff distance, after which each frame's model is updated by a new model-update method based on moving connected components. The initial model is generated automatically, filtering is used to remove the static background, and finally the binary model is used to extract the VOPs from the sequence.
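The Hausdorff matching at the core of the tracker described above can be sketched directly; representing the binary model and the frame edge map as 2-D point sets is an illustrative simplification:

```python
import numpy as np

def directed_hausdorff(A, B):
    """Directed Hausdorff distance between two point sets (sketch):
    the worst-case distance from a point of A to its nearest point of B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetric Hausdorff distance, as used to score model-to-frame fits."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

A tracker would slide the model over candidate positions in the next frame and keep the position minimizing this distance.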
20.
Zen Chen Shu-Kuo Sun 《IEEE Transactions on Image Processing》2010,19(1):205-219