Similar Documents
20 similar documents retrieved.
1.
A semantic unit based event detection scheme for soccer videos is proposed in this paper. The scheme can be characterized as a three-layer framework. At the lowest layer, low-level features including color, texture, edge, shape, and motion are extracted. High-level semantic events are defined at the highest layer. To connect low-level features and high-level semantics, semantic units are designed and defined at the intermediate layer. A semantic unit is composed of a sequence of consecutive frames sharing the same cue deduced from low-level features. Based on semantic units, a Bayesian network is used to infer the probabilities of events. Experiments on shoot and card event detection in soccer videos show that the proposed method achieves encouraging performance.
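A minimal sketch of the inference step only: given binary semantic-unit cues assumed conditionally independent, Bayes' rule yields an event posterior. The cue names, priors and likelihoods below are illustrative assumptions, not the paper's trained network.

```python
# Sketch: posterior probability of an event from semantic-unit cues (naive Bayes).
def event_posterior(prior, likelihoods, observed_cues):
    """P(event | cues) for binary cues assumed conditionally independent."""
    p_event, p_not = prior, 1.0 - prior
    for cue, present in observed_cues.items():
        p_cue_given_event, p_cue_given_not = likelihoods[cue]
        p_event *= p_cue_given_event if present else (1.0 - p_cue_given_event)
        p_not *= p_cue_given_not if present else (1.0 - p_cue_given_not)
    return p_event / (p_event + p_not)

# Hypothetical cues for a "shoot" event: penalty-box view and goalmouth close-up units.
likelihoods = {"penalty_box_unit": (0.8, 0.2), "close_up_unit": (0.7, 0.3)}
print(event_posterior(0.1, likelihoods,
                      {"penalty_box_unit": True, "close_up_unit": True}))
```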

2.
Human behavior recognition is an important task in image processing and surveillance systems. One main challenge is how to model behaviors effectively in unconstrained videos, given the tremendous variation caused by camera motion, background clutter, object appearance and so on. In this paper, we propose two novel Multi-Feature Hierarchical Latent Dirichlet Allocation models for human behavior recognition by extending bag-of-word topic models such as the Latent Dirichlet Allocation model and the Multi-Modal Latent Dirichlet Allocation model. The two proposed models, with three hierarchies comprising low-level visual features, feature topics, and behavior topics, can effectively fuse motion and static visual features, avoid detecting or tracking the moving objects, and improve recognition performance even when the features are extracted with a great amount of noise. Finally, we adopt the variational EM algorithm to learn the parameters of these models. Experiments on the YouTube dataset demonstrate the effectiveness of the proposed models.
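A minimal sketch of the bag-of-visual-words starting point, assuming visual words have already been quantized into per-video count vectors; it uses plain LDA from scikit-learn, not the hierarchical multi-feature models described in the paper.

```python
# Sketch: topic mixtures over visual-word counts with standard LDA.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
# Rows: videos, columns: counts of 50 hypothetical motion/static visual words.
counts = rng.integers(0, 10, size=(100, 50))

lda = LatentDirichletAllocation(n_components=8, random_state=0)
topic_mixtures = lda.fit_transform(counts)   # per-video behavior-topic mixtures
print(topic_mixtures.shape)                  # (100, 8)
```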

3.
Stitching motions in multiple videos into a single video scene is a challenging task in current video fusion and mosaicing research and film production. In this paper, we present a novel method of video motion stitching based on the similarities of trajectory and position of foreground objects. First, multiple video sequences are registered in a common reference frame, whereby we estimate the static and dynamic backgrounds, with the former responsible for distinguishing the foreground from the background and the static region from the dynamic region, and the latter used in mosaicing the warped input video sequences into a panoramic video. Accordingly, the motion similarity is calculated with reference to trajectory and position similarity, whereby the corresponding motion parts are extracted from multiple video sequences. Finally, using the corresponding motion parts, the foregrounds of different videos and dynamic backgrounds are fused into a single video scene through Poisson editing, with the motions involved being stitched together. Our major contributions are a framework for multiple video mosaicing based on motion similarity and a method of calculating motion similarity from trajectory similarity and position similarity. Experiments on everyday videos show that the agreement of trajectory and position similarities with the real motion similarity plays a decisive role in determining whether two motions can be stitched. We obtain satisfactory results for motion stitching and video mosaicing.
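A minimal sketch of one plausible way to combine trajectory-shape and position similarity for two foreground tracks already registered in a common reference frame. The weights and the exponential mapping to [0, 1] are illustrative assumptions, not the paper's formulation.

```python
# Sketch: motion similarity from trajectory-shape and position distances.
import numpy as np

def motion_similarity(track_a, track_b, w_traj=0.5, w_pos=0.5):
    a, b = np.asarray(track_a, float), np.asarray(track_b, float)
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    # Trajectory term compares frame-to-frame displacements; position term compares locations.
    traj_dist = np.linalg.norm(np.diff(a, axis=0) - np.diff(b, axis=0), axis=1).mean()
    pos_dist = np.linalg.norm(a - b, axis=1).mean()
    return w_traj * np.exp(-traj_dist) + w_pos * np.exp(-pos_dist / 100.0)

print(motion_similarity([(0, 0), (5, 1), (10, 2)], [(2, 0), (7, 1), (12, 2)]))
```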

4.
This paper addresses the problem of detecting objectionable videos, which has not been carefully studied before. Our method can be used efficiently to filter objectionable videos on the Internet. A tensor-based key-frame selection algorithm, a cube-based color model and an objectionable-video estimation algorithm are presented. The key-frame selection is based on motion analysis using the three-dimensional structure tensor. The cube-based color model is then employed to detect skin color in each key frame. Finally, the video estimation algorithm is applied to estimate how objectionable a video is. Experimental results on a variety of real-world videos downloaded from the Internet show that this method is promising.
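A minimal sketch of a cube-style skin-colour test: a pixel is labelled skin if its (R, G, B) value falls inside an axis-aligned cube, and the skin ratio of a key frame is reported. The bounds below are illustrative assumptions, not the paper's trained model.

```python
# Sketch: fraction of skin-coloured pixels in a key frame via an RGB cube test.
import numpy as np

def skin_ratio(frame_rgb, lower=(95, 40, 20), upper=(255, 220, 180)):
    lo, hi = np.array(lower), np.array(upper)
    mask = np.all((frame_rgb >= lo) & (frame_rgb <= hi), axis=-1)
    return mask.mean()   # fraction of skin-coloured pixels in the key frame

frame = np.random.randint(0, 256, size=(120, 160, 3))
print(skin_ratio(frame))
```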

5.

6.
An approach is presented to detect faces and facial features in a video segment based on multiple cues, including gray-level distribution, color, motion, templates, algebraic features and so on. Faces are first detected across the frames using color segmentation, template matching and an artificial neural network. A PCA-based (Principal Component Analysis) feature detector for still images is then used to detect facial features on each single frame until the resulting features of three adjacent frames, named the base frames, are consistent with each other. The features of frames neighboring the base frames are first detected by the still-image feature detector, then verified and corrected according to the smoothness constraint and the planar surface motion constraint. Experiments have been performed on video segments captured under different environments, and the presented method proves robust and accurate over variable poses, ages and illumination conditions.
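A minimal sketch of the PCA idea only: learn a low-dimensional subspace from feature patches (e.g. eye or mouth regions) and score a candidate patch by its reconstruction error. The patch data here is synthetic and the multi-cue face detection stage is omitted.

```python
# Sketch: PCA subspace for facial-feature patches; low reconstruction error suggests a true feature.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
training_patches = rng.random((200, 16 * 16))          # flattened 16x16 training patches
pca = PCA(n_components=20).fit(training_patches)

candidate = rng.random((1, 16 * 16))
reconstruction = pca.inverse_transform(pca.transform(candidate))
print(np.linalg.norm(candidate - reconstruction))       # reconstruction error as a feature score
```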

7.
In this paper, a video fire detection method is proposed which demonstrates good performance in indoor environments. Three main novel ideas are introduced. First, a flame color model in RGB and HSI color space is used to extract pre-detected regions instead of the traditional motion differential method, as it is more suitable for fire detection in indoor environments. Second, according to the flicker characteristic of flames, a similarity measure and two main values of centroid motion are proposed; at the same time, a simple but effective method for tracking the same regions in consecutive frames is established. Third, a multi-expert system consisting of color component dispersion, similarity and centroid motion is established to identify flames. The proposed method has been tested on a very large dataset of fire videos acquired both in real indoor environment tests and from the Internet. The experimental results show that the proposed approach achieves a balance between the false positive rate and the false negative rate, and demonstrates better performance in terms of overall accuracy and F-measure than other, similar fire detection methods in indoor environments.
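A minimal sketch of a common RGB flame-colour rule usable as a pre-detection step (red channel dominant and above a threshold); the threshold is an illustrative assumption and this is not the paper's full RGB/HSI model or multi-expert system.

```python
# Sketch: candidate flame pixels via a simple R > G > B colour rule.
import numpy as np

def flame_candidate_mask(frame_rgb, r_threshold=180):
    r = frame_rgb[..., 0].astype(int)
    g = frame_rgb[..., 1].astype(int)
    b = frame_rgb[..., 2].astype(int)
    return (r > r_threshold) & (r >= g) & (g >= b)

frame = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
print(flame_candidate_mask(frame).sum(), "candidate flame pixels")
```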

8.
An approach based on fuzzy logic for matching both articulated and non-articulated objects across multiple non-overlapping fields of view (FoVs) from multiple cameras is proposed. We call it the fuzzy logic matching algorithm (FLMA). The approach uses information about object motion, shape and camera topology to match objects across camera views. The motion and shape information of targets is obtained by tracking them with a combination of the ConDensation and CAMShift tracking algorithms. The camera topology information is obtained and used by calculating the projective transformation of each view onto the common ground plane. The algorithm is suitable for tracking non-rigid objects with both linear and non-linear motion. We show videos of tracking objects across multiple cameras based on FLMA. Our experiments show that the system is able to correctly match targets across views with high accuracy.
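A minimal sketch of the camera-topology step only: mapping an object's image location onto a common ground plane with a homography estimated from four hypothetical reference points per camera view. The point correspondences and metric coordinates are assumptions for illustration.

```python
# Sketch: project an object's foot point into ground-plane coordinates via a homography.
import numpy as np
import cv2

image_pts = np.float32([[100, 400], [540, 400], [620, 80], [20, 80]])
ground_pts = np.float32([[0, 0], [10, 0], [10, 20], [0, 20]])     # assumed ground-plane metres
H, _ = cv2.findHomography(image_pts, ground_pts)

foot_point = np.float32([[[320, 300]]])                            # object's foot in the image
print(cv2.perspectiveTransform(foot_point, H))                     # ground-plane coordinates
```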

9.
10.
An efficient edge detection algorithm that integrates the edge gradient with an average filter is proposed, which can significantly reduce the sensitivity of the background subtraction method to noise and illumination. Taking into account features of the target such as color, size, etc., a new modified Nearest Neighbor (NN) algorithm for data association using the target features is designed. An Interacting Multiple Model (IMM) filter is designed and used to track the maneuvering motion of the target, i.e. the motion of its feature point (the centroid of the target). The algorithms are validated on an example with natural video sequences. The results show the effectiveness and validity of the algorithms for visual tracking; even in complex environments, the algorithm still works well.
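A minimal sketch of nearest-neighbour data association on target features (centroid position plus a size term); the feature weighting and gate value are illustrative assumptions, and the IMM filtering stage is omitted.

```python
# Sketch: gate-limited nearest-neighbour association between tracks and detections.
import numpy as np

def associate(tracks, detections, w_pos=1.0, w_size=0.2, gate=50.0):
    """Return {track_index: detection_index} for the closest detection within a gate."""
    pairs = {}
    for i, t in enumerate(tracks):
        costs = [w_pos * np.linalg.norm(np.subtract(t[:2], d[:2]))
                 + w_size * abs(t[2] - d[2]) for d in detections]
        j = int(np.argmin(costs))
        if costs[j] < gate:
            pairs[i] = j
    return pairs

tracks = [(100, 120, 30), (300, 80, 55)]        # (x, y, size)
detections = [(305, 83, 52), (98, 126, 28)]
print(associate(tracks, detections))             # {0: 1, 1: 0}
```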

11.
Video segmentation can be defined as the process of partitioning video into spatio-temporal objects that are homogeneous in some feature space, with the choice of features being very important to the success of the segmentation process. Fuzzy segmentation is a semi-automatic region-growing segmentation algorithm that assigns to each element in an image a grade of membership in an object. In this paper, we propose an extension of the multi-object fuzzy segmentation algorithm to segment pre-acquired color video shots. The color features are selected from channels belonging to different color models using two different heuristics: one that uses the correlations between the color channels to maximize the amount of information used in the segmentation process, and one that chooses the color channels based on the separation of the clusters formed by the seed spels for all possible color spaces. Motion information is also incorporated into the segmentation process by making use of dense optical flow maps. We performed experiments on synthetic videos, with and without noise, as well as on some real videos. The experiments show promising results, with the segmentations of real videos produced using hybrid color spaces being more accurate than the ones produced using three other color models. We also show that our method compares favorably to a state-of-the-art video segmentation algorithm.
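A minimal sketch of the dense optical-flow features used as motion information: a Farneback flow map between two consecutive grey-level frames. The frames here are synthetic and the flow parameters are typical defaults, not values from the paper.

```python
# Sketch: per-pixel dense optical flow as an extra feature channel for segmentation.
import numpy as np
import cv2

prev_frame = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
next_frame = np.roll(prev_frame, 2, axis=1)          # synthetic 2-pixel horizontal shift

flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
print(flow.shape)        # (120, 160, 2): per-pixel (dx, dy) motion features
```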

12.
Deep-learning-based video super-resolution reconstruction methods often suffer from either low reconstruction accuracy or excessive reconstruction time, making it difficult to obtain high-accuracy results in real time. To address this problem, this paper proposes a video super-resolution reconstruction method based on a deep residual network, which reconstructs videos quickly with high accuracy and meets real-time requirements when reconstructing lower-resolution videos. An adaptive key-frame discrimination sub-network adaptively identifies key frames among the video frames, and the key frames are reconstructed by a high-accuracy key-frame reconstruction sub-network. For non-key frames, their features are obtained directly by fusing, layer by layer, their own features with the motion-estimation features between them and neighboring key frames and with the features of those key frames, so that reconstruction results for non-key frames are obtained quickly. Experiments on public datasets show that the method achieves fast, high-accuracy and robust video reconstruction.
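A minimal sketch of the kind of residual block that a residual super-resolution sub-network stacks; the layer sizes are illustrative and this is not the paper's key-frame / non-key-frame architecture.

```python
# Sketch: a basic residual block with an identity skip connection (PyTorch).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)     # identity skip connection

features = torch.randn(1, 64, 32, 32)
print(ResidualBlock()(features).shape)    # torch.Size([1, 64, 32, 32])
```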

13.
A decision-tree-based shot segmentation algorithm for MPEG video   (Total citations: 1; self-citations: 0; citations by others: 1)
Shot segmentation of compressed video is a difficult problem in video content analysis. Since shots play a key role in organizing and indexing video, a decision-tree-based shot segmentation algorithm for MPEG video is proposed. The algorithm trains a decision tree, a machine-learning method, on sample videos and obtains the optimal thresholds for shot segmentation by fusing motion information, color, edge and other features. It handles well the difficult problems of detecting abrupt and gradual shot transitions in compressed video, and can also detect whether flashlight effects or camera motion occur. Experiments show that the algorithm achieves good results for shot detection in compressed video.
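A minimal sketch of the learning interface, assuming per-frame features (colour-histogram difference, edge difference, motion magnitude) have already been extracted; a decision tree is trained to label transition frames. The data below is synthetic and only illustrates the training step.

```python
# Sketch: decision tree over frame-difference features for shot-boundary labelling.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
features = rng.random((300, 3))                 # [hist_diff, edge_diff, motion]
labels = (features[:, 0] > 0.8).astype(int)     # toy rule standing in for annotations

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(features, labels)
print(clf.predict([[0.9, 0.5, 0.2], [0.1, 0.1, 0.1]]))   # [1 0] -> cut, no cut
```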

14.
15.
Video stabilization is an important technique in present-day digital cameras, as most cameras are hand-held, mounted on moving platforms or subjected to atmospheric vibrations. In this paper we propose a novel video stabilization scheme based on estimating the camera motion using maximally stable extremal region (MSER) features. These features, traditionally used in wide-baseline stereo problems, were never explored for video stabilization purposes. Through extensive experiments we show how some properties of these region features make them suitable for the stabilization task. After estimating the global camera motion parameters using these region features, we smooth the motion parameters with a Gaussian filter to retain the desired motion. Finally, motion compensation is carried out to obtain a stabilized video sequence. A number of examples on real and synthetic videos demonstrate the effectiveness of our proposed approach. We compare our results to existing techniques and show how our proposed approach compares favorably to them. Interframe Transformation Fidelity is used for objective evaluation of our proposed approach.
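A minimal sketch of two steps described above, under the assumption that region centroids have already been matched between frames (e.g. from MSER matching): estimating a global similarity transform, then Gaussian-smoothing the accumulated motion parameters to separate intended motion from jitter.

```python
# Sketch: global motion estimation from matched centroids, then trajectory smoothing.
import numpy as np
import cv2
from scipy.ndimage import gaussian_filter1d

# Hypothetical matched region centroids in frame t and frame t+1.
pts_prev = np.float32([[50, 60], [200, 80], [120, 220], [300, 240]])
pts_curr = pts_prev + np.float32([3, -2])                  # synthetic jitter

M, _ = cv2.estimateAffinePartial2D(pts_prev, pts_curr)     # 2x3 similarity transform
print(M[:, 2])                                             # estimated (dx, dy)

# Smooth a trajectory of per-frame translations to keep only the intended motion.
raw_dx = np.cumsum(np.random.randn(100))                   # jittery camera path
smoothed_dx = gaussian_filter1d(raw_dx, sigma=5)
correction = smoothed_dx - raw_dx                          # per-frame compensation shift
print(correction[:5])
```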

16.
王剑峰  杜奎然 《计算机工程》2011,37(24):269-271
To handle dissolve and fade-in/fade-out effects in video, a gradual-transition shot detection algorithm based on three-step filtering is proposed. The luminance and variance of video frames are extracted as features, and an initial gradual-transition detection is performed with a finite state machine; the color, co-occurrence matrix and motion features of the frames are then computed for a three-step filtering stage that ensures detection accuracy. Experiments on TRECVID videos show that the algorithm detects gradual transitions well and is robust to motion and flashlight effects.
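A minimal sketch of the finite-state-machine idea: watch per-frame mean luminance and flag a candidate fade when it declines for several frames in a row. The thresholds are illustrative, and the subsequent three-step verification is omitted.

```python
# Sketch: two-state (steady / declining) scan over luminance to find candidate fade-outs.
def detect_fade_out(luminance, drop=2.0, min_run=5):
    candidates, start, run = [], None, 0
    for i in range(1, len(luminance)):
        if luminance[i] < luminance[i - 1] - drop:      # "declining" state
            start = i - 1 if run == 0 else start
            run += 1
        else:                                           # back to "steady" state
            if run >= min_run:
                candidates.append((start, i - 1))
            run = 0
    return candidates

frames = [200, 199, 195, 190, 183, 176, 168, 160, 161, 162]
print(detect_fade_out(frames))      # [(1, 7)] -> candidate fade-out span
```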

17.
尹路通  于炯  鲁亮  英昌甜  郭刚 《计算机应用》2015,35(11):3247-3251
To address the severe lack of metadata for online videos and the difficulty of extracting features from the multimedia data itself, an online video recommendation algorithm combining comment analysis and a latent factor model is proposed. Starting from video comments, the sentiment orientation of each user's comments on different videos is analyzed and quantified, and a virtual user-item rating matrix is constructed, which alleviates the sparsity of explicit rating data. Considering the diversity and high dimensionality of online videos, and in order to mine users' latent interest in them, a latent factor model (LFM) is applied to the virtual rating matrix to categorize the videos, and virtual category information is added to the traditional user-item recommendation framework to further exploit user-category-item relationships. Experiments under multiple criteria on a YouTube comment dataset show that the proposed recommendation method achieves high recommendation accuracy.
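A minimal sketch of factorising the virtual user-video rating matrix that the comment sentiment analysis would produce; plain SGD matrix factorisation stands in for the paper's LFM with category information, and the rating values are synthetic.

```python
# Sketch: latent-factor prediction of missing user-video scores by SGD matrix factorisation.
import numpy as np

def factorize(R, k=4, steps=200, lr=0.01, reg=0.05, seed=0):
    rng = np.random.default_rng(seed)
    users, items = R.shape
    P = rng.normal(scale=0.1, size=(users, k))
    Q = rng.normal(scale=0.1, size=(items, k))
    observed = np.argwhere(R > 0)
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P @ Q.T          # predicted scores, including unseen user-video pairs

R = np.array([[5, 0, 3], [4, 2, 0], [0, 1, 5]], dtype=float)   # sentiment-derived virtual ratings
print(np.round(factorize(R), 2))
```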

18.
Video text often contains highly useful semantic information that can contribute significantly to video retrieval and understanding. Video text can be classified into scene text and superimposed text. Most previous methods detect superimposed or scene text separately due to different text alignments. Moreover, because characters of different languages have different edge and texture features, it is very difficult to detect multilingual text. In this paper, we first perform a detailed analysis of the motion patterns of video text, and show that superimposed and scene text exhibit different motion patterns on consecutive frames, a property that is insensitive to language and text alignment. Based on our analysis, we define a Motion Perception Field (MPF) to represent the text motion patterns. Finally, we propose text detection algorithms using MPF for both superimposed and scene text with multiple languages and multiple alignments. Experimental results on diverse videos demonstrate that our algorithms are robust and outperform previous methods for detecting both superimposed and scene text with multiple languages and multiple alignments.
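A minimal sketch of the underlying observation only: superimposed text stays still between frames while scene text moves with the background, so comparing the mean optical flow inside a detected text box with a threshold gives a crude version of that motion cue. This is not the paper's Motion Perception Field; the box and flow values are synthetic.

```python
# Sketch: classify a text box as superimposed or scene text from its mean flow magnitude.
import numpy as np

def classify_text_box(flow, box, still_threshold=0.5):
    x, y, w, h = box
    local = np.linalg.norm(flow[y:y + h, x:x + w], axis=-1).mean()
    return "superimposed" if local < still_threshold else "scene"

flow = np.full((120, 160, 2), 2.0)        # synthetic global motion of 2 px/frame
flow[10:30, 20:100] = 0.0                 # a region that does not move: a caption
print(classify_text_box(flow, (20, 10, 80, 20)))   # superimposed
```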

19.
Deep-learning-based video super-resolution methods focus mainly on the spatio-temporal relationships within and between video frames, but previous methods suffer from imprecise motion estimation and insufficient feature fusion when aligning and fusing frame features. To address these problems, a video super-resolution model based on an Attention Fusion Network (AFN) was built using the back-projection principle combined with multiple attention mechanisms and fusion strategies. First, in the feature extraction stage, a back-projection structure is used to obtain error feedback on the motion information in order to handle the various motions between neighboring frames and the reference frame. Then, temporal, spatial and channel attention fusion modules perform multi-dimensional feature mining and fusion. Finally, in the reconstruction stage, the resulting high-dimensional features are passed through convolutional layers to reconstruct high-resolution video frames. By learning different weights for intra-frame and inter-frame features, the correlations between video frames are fully exploited, and an iterative network structure processes the extracted features progressively from coarse to fine. Experimental results on two public benchmark datasets show that AFN can effectively handle videos with various motions and occlusions, and improves considerably over several mainstream methods on quantitative metrics: for the 4x reconstruction task, the Peak Signal-to-Noise Ratio (PSNR) of frames produced by AFN is 13.2% higher than that of the Frame-Recurrent Video Super-Resolution network (FRVSR) on the Vid4 dataset, and 15.3% higher than that of the Video Super-Resolution network with Dynamic Upsampling Filters (VSR-DUF) on the SPMCS dataset.
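A minimal sketch of a channel-attention unit of the kind such a fusion module might combine with temporal and spatial attention; it is not the paper's AFN, and the channel and reduction sizes are illustrative.

```python
# Sketch: squeeze-and-excitation style channel attention (PyTorch).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (batch, channels, H, W)
        weights = self.fc(x.mean(dim=(2, 3)))   # squeeze to per-channel statistics
        return x * weights[:, :, None, None]    # re-weight feature channels

features = torch.randn(2, 64, 32, 32)
print(ChannelAttention()(features).shape)       # torch.Size([2, 64, 32, 32])
```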

20.
To address the heavy computation, difficult threshold selection and limited applicability to certain video types of existing key-frame extraction algorithms, a key-frame extraction method based on the dominant colors of images is proposed. The method extracts dominant-color features with an octree-based color quantization algorithm, performs shot boundary detection by computing the similarity of the color features, and finally clusters the extracted representative frames with the K-means algorithm to accurately extract a specified number of key frames. Experimental results show that the algorithm is computationally simple, has a small memory footprint, and exhibits good generality and adaptability.
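A minimal sketch of the final clustering step, assuming one dominant-colour feature vector per candidate frame has already been extracted (e.g. from octree colour quantisation); K-means then yields the requested number of key frames by picking the frame nearest each cluster centre.

```python
# Sketch: K-means over per-frame colour features to select a fixed number of key frames.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
frame_features = rng.random((60, 8))            # 60 candidate frames, 8-D colour features

k = 5                                           # number of key frames requested
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frame_features)
key_frames = [int(np.argmin(np.linalg.norm(frame_features - c, axis=1)))
              for c in km.cluster_centers_]     # frame nearest each cluster centre
print(sorted(key_frames))
```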
