Similar literature: 20 related records found.
1.
Many video fingerprints have been proposed to handle the transformations that occur when original content is copied and redistributed. However, most of them do not take flipping and rotation into account. In this paper, we propose a novel video fingerprint based on region binary patterns, aiming to realize robust and fast video copy detection against video transformations including rotation and flipping. We extract two complementary region binary patterns from several rings in keyframes. These two kinds of binary patterns are then converted into a new type of pattern that is robust against rotation and flipping. The experimental results demonstrate that the proposed video fingerprint is effective for video copy detection, particularly in the case of rotation and flipping. Furthermore, the results show that the proposed method offers high storage efficiency and low computational complexity, making it suitable for practical video copy detection systems.
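A minimal sketch of the ring-based idea described above (not the authors' exact construction; ring count, sampling density, and the per-ring bit-count summary are assumptions made here to keep the fingerprint rotation- and flip-invariant):

```python
import numpy as np

def ring_fingerprint(gray, n_rings=8, samples_per_ring=64):
    """Sample pixels on concentric rings of a keyframe and keep, per ring,
    only the number of samples above the ring mean. Rotation or horizontal
    flipping permutes the samples on a ring but leaves this count unchanged."""
    h, w = gray.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    fingerprint = []
    for k in range(1, n_rings + 1):
        r = max_r * k / n_rings
        angles = np.linspace(0, 2 * np.pi, samples_per_ring, endpoint=False)
        ys = np.clip(np.round(cy + r * np.sin(angles)).astype(int), 0, h - 1)
        xs = np.clip(np.round(cx + r * np.cos(angles)).astype(int), 0, w - 1)
        ring = gray[ys, xs].astype(np.float32)
        bits = (ring > ring.mean()).astype(np.uint8)   # region binary pattern on the ring
        fingerprint.append(int(bits.sum()))            # rotation/flip-invariant summary
    return np.array(fingerprint, dtype=np.uint8)

# Demo on a synthetic stand-in keyframe
keyframe = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
print(ring_fingerprint(keyframe))
```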

2.
The detection of near-duplicate video clips (NDVCs) is an area of current research interest and intense development. Most NDVC detection methods represent video clips with a unique set of low-level visual features, typically describing color or texture information. However, low-level visual features are sensitive to transformations of the video content. Given the observation that transformations tend to preserve the semantic information conveyed by the video content, we propose a novel approach for identifying NDVCs, making use of both low-level visual features (that is, MPEG-7 visual features) and high-level semantic features (that is, 32 semantic concepts detected using trained classifiers). Experimental results obtained for the publicly available MUSCLE-VCD-2007 and TRECVID 2008 video sets show that bimodal fusion of visual and semantic features facilitates robust NDVC detection. In particular, the proposed method is able to identify NDVCs with a low missed detection rate (3% on average) and a low false alarm rate (2% on average). In addition, the combined use of visual and semantic features outperforms the separate use of either of them in terms of NDVC detection effectiveness. Further, we demonstrate that the effectiveness of the proposed method is on par with or better than that of three state-of-the-art NDVC detection methods based on temporal ordinal measurement, features computed using the Scale-Invariant Feature Transform (SIFT), or bag-of-visual-words (BoVW). We also show that the influence of the effectiveness of semantic concept detection on the effectiveness of NDVC detection is limited, as long as the mean average precision (MAP) of the semantic concept detectors used is higher than 0.3. Finally, we illustrate that the computational complexity of our NDVC detection method is competitive with that of the three aforementioned NDVC detection methods.
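A hedged sketch of score-level bimodal fusion of a low-level visual distance and a high-level semantic distance; the weight and threshold are illustrative, not values from the paper:

```python
import numpy as np

def ndvc_distance(visual_q, visual_r, semantic_q, semantic_r, w=0.5):
    d_visual = np.linalg.norm(visual_q - visual_r)        # e.g. MPEG-7 descriptor distance
    d_semantic = np.linalg.norm(semantic_q - semantic_r)  # e.g. distance between 32 concept scores
    return w * d_visual + (1 - w) * d_semantic            # bimodal fusion

def is_near_duplicate(visual_q, visual_r, semantic_q, semantic_r, threshold=0.8):
    return ndvc_distance(visual_q, visual_r, semantic_q, semantic_r) < threshold
```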

3.
In this paper, we propose a new and novel modality fusion method designed for combining spatial and temporal fingerprint information to improve video copy detection performance. Most previously developed methods use only pre-specified weights to combine spatial and temporal modality information. Hence, previous approaches cannot adaptively adjust the significance of the temporal fingerprints, which depends on the difference between the temporal variances of the compared videos, leading to performance degradation in video copy detection. To overcome this limitation, the proposed method extracts two types of fingerprint information: (1) a spatial fingerprint consisting of the signs of DCT coefficients in local areas of a keyframe and (2) a temporal fingerprint computed from the temporal variances in local areas across consecutive keyframes. In addition, a temporal strength measurement technique is developed to quantitatively represent the amount of temporal variance; it can be used adaptively to weigh the significance of the compared temporal fingerprints. The experimental results show that the proposed modality fusion method outperforms other state-of-the-art fusion methods and popular spatio-temporal fingerprints in terms of video copy detection. Furthermore, the proposed method reduces the time needed for video fingerprint matching by 39.0%, 25.1%, and 46.1% without a significant loss of detection accuracy on our synthetic dataset, the TRECVID 2009 CCD Task, and MUSCLE-VCD 2007, respectively. This result indicates that our proposed method can be readily incorporated into real-life video copy detection systems.
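A minimal sketch of the two fingerprint types described above, under assumptions made here (8x8 block DCT, signs of the first few AC coefficients in row-major order rather than zigzag order, and mean per-pixel temporal variance per block):

```python
import numpy as np
import cv2

def spatial_fingerprint(keyframe, block=8, n_coeffs=9):
    """Signs of low-index DCT coefficients in each local area of a keyframe."""
    g = cv2.resize(keyframe, (64, 64)).astype(np.float32)
    bits = []
    for y in range(0, 64, block):
        for x in range(0, 64, block):
            d = cv2.dct(g[y:y + block, x:x + block].copy())
            ac = d.flatten()[1:1 + n_coeffs]             # skip the DC term
            bits.extend((ac >= 0).astype(np.uint8))       # keep only the signs
    return np.array(bits, dtype=np.uint8)

def temporal_fingerprint(keyframes, block=8):
    """Mean temporal variance of each local area across consecutive keyframes."""
    g = np.stack([cv2.resize(f, (64, 64)).astype(np.float32) for f in keyframes])
    return np.array([g[:, y:y + block, x:x + block].var(axis=0).mean()
                     for y in range(0, 64, block)
                     for x in range(0, 64, block)], dtype=np.float32)
```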

4.
张鹿  周荷琴 《电子技术》2010,47(8):16-18
To effectively improve the speed of video copy detection on large-scale datasets, a fast video copy detection algorithm based on combined spatio-temporal features is proposed. Temporal sequence features and spatial distribution features are first extracted from the video; copy detection is then performed with a two-level matching framework that combines coarse matching and precise matching, and an "early stop" strategy is added at each matching level so that non-copy videos are filtered out as quickly as possible, reducing the computational complexity on large-scale datasets. Experimental results show that, compared with existing algorithms, the proposed fast detection algorithm achieves high detection accuracy while greatly improving detection speed.
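A hedged sketch of the two-level matching with an "early stop": a cheap coarse comparison rejects obvious non-copies before the more expensive precise comparison runs. The feature layout, distance measures, and thresholds below are illustrative assumptions, not the paper's values:

```python
import numpy as np

def is_copy(query_feat, ref_feat, coarse_thr=0.4, fine_thr=0.15):
    """query_feat/ref_feat: dicts with 'temporal' and 'spatial' feature vectors."""
    # Level 1: coarse matching on temporal features (stop early on failure)
    coarse_dist = np.abs(query_feat["temporal"] - ref_feat["temporal"]).mean()
    if coarse_dist > coarse_thr:
        return False                      # early stop: skip precise matching
    # Level 2: precise matching on spatial features
    fine_dist = np.abs(query_feat["spatial"] - ref_feat["spatial"]).mean()
    return fine_dist < fine_thr
```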

5.
In this paper, we propose a novel and robust modus operandi for fast and accurate shot boundary detection whose design philosophy is based on human perceptual rules and the well-known "Information Seeking Mantra". By adopting a top-down approach, redundant video processing is avoided and high shot boundary detection accuracy is obtained at significantly low computational cost. Objects within shots are detected via local image features and used to reveal visual discontinuities between shots. The proposed method can detect all types of gradual transitions as well as abrupt changes. Another important feature is that the proposed method is fully generic: it can be applied to any video content without requiring any training or tuning in advance. Furthermore, it allows user interaction to direct the shot boundary detection (SBD) process to the user's "Region of Interest" or to stop it once satisfactory results are obtained. Experimental results demonstrate that the proposed algorithm achieves superior computational times compared to state-of-the-art methods without sacrificing performance.

6.
In this paper, to efficiently detect video copies, regions of interest in videos are first localized based on 3D spatiotemporal visual attention modeling. Salient feature points are then detected in the visual attention regions. Prior to evaluating the similarity between source and target video sequences using these feature points, geometric constraint measurement is employed to conduct bi-directional point matching, which removes noisy feature points while maintaining robust feature point pairs. Consequently, video matching is transformed into a frame-based time-series linear search problem. Our proposed approach achieves a promisingly high detection rate under various video copy attacks and thus demonstrates its feasibility in real-world applications.
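A hedged sketch of the bi-directional point matching step only: keypoints are matched with cross-checking, so only pairs that agree in both directions survive, discarding most noisy points. ORB descriptors are used here as a stand-in; the attention modeling and geometric constraint measurement from the paper are omitted:

```python
import cv2

def bidirectional_matches(frame_a, frame_b, max_matches=100):
    orb = cv2.ORB_create()
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return []
    # crossCheck=True keeps a match only if it is mutual (a->b and b->a agree)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    return matches[:max_matches]
```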

7.
Video semantic detection has become a research hotspot in the field of human-computer interaction. In sparse representation of video features, features from videos of the same category may not yield similar coding results. To address this, Locality-Sensitive Discriminant Sparse Representation (LSDSR) is developed, so that video samples belonging to the same category are encoded as similar sparse codes with better category discrimination. In LSDSR, a discriminative loss function based on sparse coefficients is imposed on the locality-sensitive sparse representation, which makes the dictionary optimized for sparse representation discriminative. LSDSR thus enhances the semantic discrimination power of video features, optimizing the dictionary and building a better discriminant sparse model. Moreover, to further improve the accuracy of video semantic detection after sparse representation, a weighted K-Nearest Neighbor (KNN) classification method with a loss function that integrates reconstruction error and discrimination is adopted to detect video semantic concepts. The proposed methods are evaluated on related video databases in comparison with existing sparse representation methods. The experimental results show that the proposed methods significantly enhance the discrimination power of video features and consequently improve the accuracy of video semantic concept detection.

8.
A Text Detection Method Based on Fuzzy Homogeneity Mapping
Text in video images is highly effective information for describing video content at the semantic level, and text detection provides a basis for semantics-based image retrieval. This paper proposes a text detection method that combines fuzzy logic and homogeneity mapping. First, the original image is fuzzified using the maximum-entropy criterion; then an image homogeneity measure based on edge and texture information is constructed and used to map the image into a fuzzy homogeneity space; finally, text regions are detected in the fuzzy homogeneity space through texture analysis. Compared with text detection methods that extract features directly in the image spatial domain, this method achieves better results on video images with complex backgrounds and is applicable to text detection in many types of video images.
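A simplified, hedged sketch of the homogeneity idea: edge strength and local texture variation are combined into a homogeneity map, and low-homogeneity (high-activity) regions are kept as text candidates. The maximum-entropy fuzzification step from the paper is not reproduced; the window size and threshold are assumptions:

```python
import numpy as np
import cv2

def text_candidate_mask(gray, thresh=0.5):
    f = gray.astype(np.float32)
    edge = np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0)) + np.abs(cv2.Sobel(gray, cv2.CV_32F, 0, 1))
    texture = cv2.blur((f - cv2.blur(f, (9, 9))) ** 2, (9, 9))   # local variance
    edge /= edge.max() + 1e-9
    texture /= texture.max() + 1e-9
    homogeneity = 1.0 - np.maximum(edge, texture)   # homogeneous = low edge and low texture
    return (homogeneity < thresh).astype(np.uint8)  # text regions are non-homogeneous
```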

9.
Detecting the visually identical regions among successive frames of noisy videos, called visual identicalness detection (VID) in this paper, is a fundamental tool for lowering power consumption and raising efficiency in video applications. In this paper, instead of performing VID on the original video signal or on the de-noised video signal, a Retinex-based VID approach is proposed that operates on the Retinex signal to eliminate the noise introduced by the imaging system. Several Retinex output generation approaches are compared, among which the proposed Cohen-Daubechies-Feauveau wavelet-based approach is shown to have better detection efficiency and higher adaptability to video content and noise severity. Compared with approaches that perform detection on de-noised images, the proposed algorithm achieves a detection rate up to 4.78 times higher for videos with moving objects and up to 30.79 times higher for videos with static scenes, at the same error rate. An application of this technique is also provided by integrating it into an H.264/AVC video encoder. Compared with compressing de-noised videos using an existing fast algorithm, an average performance improvement of 1.7 dB is achieved with up to 5.47 times higher encoding speed. Relative to the reference encoder, up to 32.47 times higher encoding speed is achieved without sacrificing subjective quality.
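A hedged sketch of the overall pipeline: a single-scale Retinex signal is computed for each frame (the paper uses a Cohen-Daubechies-Feauveau wavelet-based Retinex, simplified here to a Gaussian illumination estimate), and blocks whose Retinex difference stays below a threshold are flagged as visually identical. Block size, sigma, and threshold are assumptions:

```python
import numpy as np
import cv2

def retinex(gray, sigma=15):
    f = gray.astype(np.float32) + 1.0
    illumination = cv2.GaussianBlur(f, (0, 0), sigma)   # smooth illumination estimate
    return np.log(f) - np.log(illumination)             # reflectance-like Retinex signal

def identical_blocks(frame_a, frame_b, block=16, thresh=0.02):
    ra, rb = retinex(frame_a), retinex(frame_b)
    h, w = ra.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            ya, xa = by * block, bx * block
            diff = np.abs(ra[ya:ya + block, xa:xa + block] -
                          rb[ya:ya + block, xa:xa + block]).mean()
            mask[by, bx] = diff < thresh   # True = visually identical block
    return mask
```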

10.
谢丽萍  彭波 《电视技术》2014,38(7):208-212,202
To address the problem that video face recognition systems cannot effectively learn discriminative features from face images, an unconstrained image reference-set matching method is proposed. By predefining a reference set between two image sets, multiple offline local models are constructed and each is matched against the reference set, so that the similarity between the two sets can be computed without considering all pairwise combinations; this effectively converts the video face recognition problem into a quadratic programming problem. The effectiveness of the method is verified on the Honda, MoBo, and YouTube video face databases, and the experimental results show that the proposed method achieves better recognition performance than existing video face recognition methods.

11.
Humans have bilateral body symmetry such that the left and right sides are mirror images of each other. This study measures the performance of human recognition when the stored templates in the database are acquired from one side of a biometric trait, such as the left profile face, while the tested samples correspond to the other side of the same trait after applying a horizontal flip. Two different biometric traits are used in this study, namely profile face and ear biometrics. The experiments are conducted using the following feature extraction methods: Principal Component Analysis, Scale-Invariant Feature Transform, Local Binary Patterns, Local Phase Quantization, and Binarized Statistical Image Features. Several experiments are performed on identical twins and non-twin individuals using the ND-Twins-2009-2010 and UBEAR databases. Furthermore, the symmetry of the profile face and ear is used to propose a hybrid human recognition approach that involves feature-level and score-level fusion of both traits. The proposed method is superior to all the unimodal and multimodal biometric methods implemented in this study for human recognition in the case of symmetry.
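A hedged sketch of the symmetry idea only: an opposite-side probe is horizontally flipped before being compared with the gallery templates, and the profile-face and ear scores are fused at score level. Feature extraction is simplified here to a normalized grayscale histogram with intersection matching; the fusion weight is an assumption:

```python
import numpy as np
import cv2

def hist_feature(img):
    h = cv2.calcHist([img], [0], None, [64], [0, 256]).flatten()
    return h / (h.sum() + 1e-9)

def match_score(probe, template):
    return float(np.minimum(hist_feature(probe), hist_feature(template)).sum())

def symmetric_match(probe_face, probe_ear, gallery_face, gallery_ear, w=0.5):
    flipped_face = cv2.flip(probe_face, 1)   # mirror the opposite-side probe
    flipped_ear = cv2.flip(probe_ear, 1)
    face_score = match_score(flipped_face, gallery_face)
    ear_score = match_score(flipped_ear, gallery_ear)
    return w * face_score + (1 - w) * ear_score   # score-level fusion of both traits
```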

12.
Based on research and analysis of video detection in Intelligent Transport Systems (ITS), this work focuses on shadow removal, one of its key steps. The principles and characteristics of shadow formation are analyzed, existing shadow removal algorithms are reviewed, and a shadow removal algorithm based on region clustering is proposed on top of the existing algorithms. Experiments show that the method can effectively remove the shadows of moving vehicles while preserving...

13.
14.
王聪 《电视技术》2012,36(21):162-164,175
Image sharpness is an important indicator for evaluating video surveillance systems. By comparing how well grayscale-variation, spectrum, gradient-variance, and grayscale-entropy functions evaluate image sharpness, a new sharpness evaluation function is proposed. The improved algorithm is compared with other detection algorithms on a large number of images and surveillance videos; the results show that the proposed algorithm has strong unimodality, high sensitivity, no bias, and a high signal-to-noise ratio, and can automatically and effectively detect the sharpness of video images in real time.
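A hedged sketch of two of the baseline no-reference sharpness measures compared above, gradient variance and grayscale entropy; the paper's own improved evaluation function is not reproduced here:

```python
import numpy as np
import cv2

def gradient_variance_sharpness(gray):
    """Variance of the gradient magnitude: sharper images give larger values."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    return float(np.var(np.sqrt(gx ** 2 + gy ** 2)))

def entropy_sharpness(gray):
    """Grayscale entropy of the image histogram."""
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).flatten()
    p = hist / (hist.sum() + 1e-9)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```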

15.
Research on Video Summary Generation Technology
This paper studies three forms of video summaries: title-style summaries, storyboard summaries, and condensed (skim) video summaries. Making full use of various multimedia fusion analysis techniques, a video content assessment model is proposed. According to different video decomposition granularities, object importance assessment models at different levels are proposed to generate meaningful video summaries. An effective video summary generation system is designed and implemented, integrating multiple techniques and methods into a complete system for generating video indexes for retrieval.

16.
Saliency detection is widely used to pick out relevant parts of a scene as visual attention regions for various image/video applications. Since video is increasingly being captured, moved, and stored in compressed form, there is a need for detecting video saliency directly in the compressed domain. In this study, a compressed-domain video saliency detection algorithm is proposed based on discrete cosine transform (DCT) coefficients and motion information within a visual window. First, DCT coefficients and motion information are extracted from the H.264 video bitstream without full decoding. Due to a high quantization parameter setting in the encoder, skip/intra is often chosen as the best prediction mode, resulting in a large number of blocks with zero motion vectors and no residual in the video bitstream. To address this, the motion vectors of skip/intra coded blocks are calculated by interpolating from their surroundings. In addition, a visual window is constructed to enhance the contrast of features and to avoid being affected by the encoder. Second, after spatial and temporal saliency maps are generated via normalized entropy, a motion importance factor is imposed to refine the temporal saliency map. Finally, a variance-like fusion method is proposed to dynamically combine these maps into the final video saliency map. Experimental results show that the proposed approach significantly outperforms other state-of-the-art video saliency detection models.

17.
A video copy detection method based on the color and texture features of keyframes is proposed. Keyframes are first extracted using a sub-segment method, and each keyframe is divided into three sub-blocks; a three-dimensional quantized color histogram is extracted from each sub-block, and color features are matched using histogram intersection. Texture features are then extracted from the keyframes of the retrieved candidate videos, characterized by the angular second moment and entropy of their gray-level co-occurrence matrices; texture matching further filters out irrelevant videos. Experimental results show that the method is effective and robust and can be applied to many types of video.
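A hedged sketch of the two matching stages described above: per-sub-block 3-D quantized color histograms compared by histogram intersection, followed by gray-level co-occurrence matrix (GLCM) angular second moment and entropy as texture features. The quantization levels and the horizontal-only co-occurrence direction are assumptions:

```python
import numpy as np
import cv2

def block_color_hists(keyframe_bgr, bins=4):
    """Split a keyframe into 3 horizontal sub-blocks; 3-D quantized color histogram per block."""
    h = keyframe_bgr.shape[0]
    hists = []
    for i in range(3):
        blk = keyframe_bgr[i * h // 3:(i + 1) * h // 3]
        hist = cv2.calcHist([blk], [0, 1, 2], None, [bins] * 3, [0, 256] * 3).flatten()
        hists.append(hist / (hist.sum() + 1e-9))
    return hists

def hist_intersection(hists_a, hists_b):
    return float(sum(np.minimum(a, b).sum() for a, b in zip(hists_a, hists_b)))

def glcm_asm_entropy(gray, levels=16):
    """Angular second moment and entropy of a horizontal gray-level co-occurrence matrix."""
    q = gray.astype(np.int32) * levels // 256
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)   # horizontal neighbor pairs
    glcm /= glcm.sum()
    asm = float((glcm ** 2).sum())
    entropy = float(-(glcm[glcm > 0] * np.log2(glcm[glcm > 0])).sum())
    return asm, entropy
```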

18.
A double optimal projection method involving projections for intra-cluster and inter-cluster dimensionality reduction is proposed for video fingerprinting. The video is initially modeled as a graph with frames as its vertices in a high-dimensional space. A similarity measure for computing the weights of the edges is then proposed. Subsequently, the video frames are partitioned into different clusters based on the graph model. Double optimal projection is used to find the optimal mapping points in a low-dimensional space to reduce the video dimensions. Statistical and geometrical fingerprints are generated to determine whether a query video is copied from one of the videos in the database. During matching, the video is first roughly matched using the statistical fingerprint; further matching is then performed in the corresponding group using the geometrical fingerprints. Experimental results show the good performance of the proposed video fingerprinting method in terms of robustness and discrimination.

19.
黄俊伟  巴义 《电视技术》2012,36(17):159-162
A mobile video surveillance system based on ARM-Linux is designed, and the V4L2 video capture workflow under the embedded Linux operating system is described. The system consists of a capture side and a monitoring side: the capture side handles video capture, image compression, and video transmission, while the remote monitoring side connects to the capture side over a 3G mobile network to realize video surveillance. The system has a simple structure, is convenient to use, and is low in cost, making it well suited to home security applications.

20.
A content-based multi-level retrieval system for massive surveillance video is designed and implemented. The system first extracts keyframe images from surveillance video, then uses pedestrian detection, face recognition, and vehicle detection algorithms to extract objects of interest such as pedestrian, face, and vehicle images from the keyframes. Color, texture, and other features of these images are extracted, and a distributed feature library is built using an improved LIRe (Lucene Image Retrieval), forming a multi-level information database. Experiments show that the system achieves high retrieval accuracy and speed and supports retrieval over massive surveillance video.
