Similar literature
Found 20 similar documents; search took 31 ms
1.
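A moving-target tracking method based on a visual attention mechanism is proposed. Drawing on research into the human visual attention mechanism, the method builds a computational model of visual attention and computes the visual saliency of each part of the video. Salient objects are then extracted from the video frames according to the saliency results. A color distribution model serves as the target's feature representation and is matched against the salient objects in the video to track the target. Experiments on multiple video sequences are reported together with results and analysis. The results indicate that the proposed detection and tracking algorithm is correct and effective.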
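The matching step described above, comparing a target's color distribution against each salient candidate, can be illustrated with a small sketch. The abstract does not state the histogram layout or the similarity measure, so the joint RGB histogram, the Bhattacharyya coefficient, and the names `color_histogram`, `bhattacharyya`, and `best_match` are all assumptions made for illustration.

```python
import math


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram; pixels are (r, g, b) tuples in 0..255."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def bhattacharyya(p, q):
    """Similarity of two normalized histograms: 1.0 identical, 0.0 disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def best_match(target_hist, candidates):
    """candidates maps a region name to its pixel list (hypothetical input).

    Returns the name and score of the candidate most similar to the target.
    """
    scores = {
        name: bhattacharyya(target_hist, color_histogram(pixels))
        for name, pixels in candidates.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]
```

In a tracker, the candidate regions would come from the saliency stage, and the best match in each frame would be taken as the target's new position.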

2.
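Multimodal detection is an effective means of filtering adult videos, but existing methods lack an accurate semantic representation of audio. This paper therefore proposes an adult-video detection method that fuses audio words with visual features. First, a periodicity-based segmentation algorithm for energy envelope (EE) units is proposed, which accurately segments the audio stream into a sequence of EEs. Next, an audio semantic representation based on EE and BoW (Bag-of-Words) is proposed, which describes EE features as occurrence probabilities of audio words. A composite weighting method fuses the detection results of audio words and visual features. A periodicity-based adult-video discrimination algorithm is also proposed; it works in tandem with the periodicity-based EE segmentation algorithm to make full use of periodicity in detection. Experimental results show that, compared with methods based on visual features alone, the proposed method significantly improves detection performance: at a false-positive rate of 9.76%, the detection rate reaches 94.44%.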
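As a rough illustration of the Bag-of-Words representation and score fusion mentioned above, the sketch below assigns each segment feature to its nearest codeword and reports the occurrence probability of every audio word. The codebook, the squared Euclidean distance, and the plain linear fusion in `fuse_scores` are assumptions; the abstract does not specify the paper's composite weighting scheme.

```python
def nearest_word(feature, codebook):
    """Index of the codeword closest to feature (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(feature, word))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def audio_word_probabilities(features, codebook):
    """Occurrence probability of each audio word over a sequence of features."""
    counts = [0] * len(codebook)
    for feature in features:
        counts[nearest_word(feature, codebook)] += 1
    n = len(features)
    return [c / n for c in counts] if n else [0.0] * len(codebook)


def fuse_scores(audio_score, visual_score, audio_weight=0.5):
    """Assumed linear late fusion of two detector scores in [0, 1]."""
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score
```

The probability vector would then feed an audio classifier whose score is fused with the visual detector's score.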

3.
Due to the prevalence of digital video camcorders, home videos have become an important part of life-logs of personal experiences. To enable efficient video parsing, a critical step is to automatically extract the objects, events and scene characteristics present in videos. This paper addresses the problem of extracting objects from home videos. Automatic detection of objects is a classical yet difficult vision problem, particularly for videos with complex scenes and unrestricted domains. Compared with edited and surveillance videos, home videos captured in uncontrolled environments usually exhibit several notable features such as shaking artifacts, irregular motions, and arbitrary settings. These characteristics have prevented the effective parsing of semantic video content using conventional vision analysis. In this paper, we propose a new approach to automatically locate multiple objects in home videos by taking into account how and when to initialize objects. Previous approaches mostly consider the problem of how but not when, because of efficiency or real-time requirements. In home-video indexing, online processing is optional. By considering when, some difficult problems can be alleviated and, most importantly, parsing semantic video objects becomes possible. In our proposed approach, the how part is formulated as an object detection and association problem, while the when part is a saliency measurement that determines the best few locations at which to start multiple-object initialization.

4.
Salient object detection aims to extract the attractive objects in images and videos. It can support various robotics tasks and multimedia applications, such as object detection, action recognition and scene analysis. However, efficient detection of salient objects in videos still faces many challenges compared with detection in still images. In this paper, we propose a novel video-based salient object detection method that explores spatio-temporal characteristics of video content, i.e., spatio-temporal difference and spatio-temporal coherence. First, we initialize the saliency map for each keyframe by deriving the spatio-temporal difference from color and motion cues. Next, we generate the saliency maps of the other frames by propagating saliency within and across frames under the constraint of spatio-temporal coherence. Finally, the saliency maps of both keyframes and non-keyframes are refined during saliency propagation. In this way, we can detect salient objects in videos efficiently by exploring their spatio-temporal characteristics. We evaluate the proposed method on two public datasets, SegTrackV2 and UVSD. The experimental results show that our method outperforms state-of-the-art methods when both effectiveness and efficiency are taken into account.

5.
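Traditional traffic-accident detection and review currently rely mainly on manual monitoring, which is inefficient and has poor real-time performance. This paper proposes a vehicle abnormal-event detection method that works in the compressed domain of the latest video coding standard, HEVC (High Efficiency Video Coding). First, motion vector information extracted from the HEVC bitstream is preprocessed by iterative motion-vector accumulation and median filtering. Next, the motion intensity of moving objects is computed from the extracted block-partition information and motion vectors. Moving objects are then extracted using the motion-intensity values and the eight-connected region method. Finally, abnormal vehicle events in the video sequence are detected using a spatial-distance method and a motion-intensity criterion. Experiments show that the method accurately detects abnormal vehicle events in video sequences and performs better on videos with fast-moving objects or multiple moving objects.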
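The object-extraction step, thresholding motion intensity and grouping blocks by eight-connectivity, can be sketched as follows. This is an illustration under assumptions: the grid of per-block intensities, the threshold, and the function name `extract_moving_objects` are hypothetical, and the grid is assumed to be non-empty and rectangular.

```python
from collections import deque


def extract_moving_objects(intensity, threshold):
    """Group blocks with intensity >= threshold into 8-connected regions.

    intensity is a non-empty rectangular list of rows of numbers.
    Returns a list of regions, each a list of (row, col) block coordinates.
    """
    rows, cols = len(intensity), len(intensity[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or intensity[r][c] < threshold:
                continue
            # Breadth-first flood fill over the 8 neighbours of each block.
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and intensity[ny][nx] >= threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            objects.append(region)
    return objects
```

Diagonal neighbours count as connected here, which is what distinguishes eight-connectivity from four-connectivity.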

6.
7.
Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object-level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometries to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow to create plausible editing and mixing of raw video footage using only sparse structure points (SSP) directly recovered from the raw sequences. First, we utilize user scribbles to structure the point representations obtained using structure-from-motion on the input videos. The resultant structure points, even when noisy and sparse, are then used to enable various video edits in 3D, including view perturbation, keyframe animation, and object duplication and transfer across videos. Specifically, we describe how to synthesize object images from new views by adopting a novel image-based rendering technique that uses the SSPs as a proxy for the missing 3D scene information. We propose a structure-preserving image warping on multiple input frames adaptively selected from the object video, followed by a spatio-temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate a plausible video sequence mimicking real-world interactions. We demonstrate our system on a variety of input videos to produce complex edits, which are otherwise difficult to achieve.

8.
9.
In this paper, we propose a new real-time content filtering framework for live broadcasts in TV terminals. Content filtering in TV terminals is a necessary provision of personalized broadcasting services in that it enables a TV viewer to obtain desired scenes from multiple channel broadcasts. A stable and reliable filtering structure and an algorithm for multiple inputs are proposed. Moreover, real-time filtering requirements such as the frame sampling rate per channel, the number of input channels, and the buffer condition are analyzed to achieve real-time processing in terminals with limited computing power. Based on queueing theory, we model the system and resolve the filtering requirements. To verify the proposed system and analysis, a filtering algorithm for soccer videos, modified for real-time processing, is applied. Through analysis of visual features (e.g., dominant color and edge components) and detection of spatial objects (e.g., a score board), it recognizes a temporal pattern between successive video frames and filters desired scenes. Experiments on soccer videos have been performed, and the results validate the effectiveness of the proposed approach and system.
Yong Man Ro (Corresponding author)
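To make the real-time requirement concrete, the relation between channel count, per-channel sampling rate, and processing speed can be sketched with a textbook M/M/1 queue. The abstract says only that queueing theory is used, so the M/M/1 model and the names `filtering_load` and `max_channels` are assumptions, not the paper's actual model.

```python
def filtering_load(num_channels, frames_per_sec_per_channel, service_rate):
    """Utilization and mean number of frames in an assumed M/M/1 system.

    service_rate is the number of frames the terminal can filter per second.
    Returns (rho, mean_in_system); mean_in_system is None when rho >= 1,
    meaning the buffer grows without bound.
    """
    arrival_rate = num_channels * frames_per_sec_per_channel
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        return rho, None
    return rho, rho / (1.0 - rho)


def max_channels(frames_per_sec_per_channel, service_rate):
    """Largest channel count that keeps utilization strictly below 1."""
    if frames_per_sec_per_channel <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    n = 0
    while (n + 1) * frames_per_sec_per_channel < service_rate:
        n += 1
    return n
```

Under these assumptions, a terminal that filters 10 frames per second while sampling 3 frames per second per channel can sustain at most 3 channels.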

10.
With the recent popularization of mobile video cameras, including camera phones, a new technology has been emerging: mobile video surveillance, which uses mobile video cameras for video surveillance. Such videos, however, may infringe upon the privacy of others by disclosing privacy sensitive information (PSI), i.e., their appearances. To prevent videos from infringing on the right to privacy, new techniques are required that automatically obscure PSI regions. The problem is how to determine the PSI regions to be obscured while maintaining enough video content to present the camera persons’ capture-intentions, i.e., what they want to record in their videos to achieve their surveillance tasks. To this end, we introduce a new concept called intended human objects, defined as human objects essential for capture-intentions, and develop a new method called intended human object detection that automatically detects the intended human objects in videos taken by different camera persons. Through the process of intended human object detection, we develop a system for automatically obscuring PSI regions. We experimentally show the performance of intended human object detection and the contributions of the features used. Our user study shows the potential applicability of our proposed system.

11.
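Surveillance video is an important component of security systems. Today, in virtually every industry, anything involving safety depends on surveillance video. However, analysis of surveillance video content still relies mainly on extensive manual work, at great cost in labor and time. As the volume of surveillance video grows, improving the efficiency of content analysis and reducing users' cognitive load are key to making better use of video. To address the large amount of redundant information in surveillance video and the low efficiency of manually extracting key content, a visual analysis system for surveillance video content is developed using spiral video summaries and corresponding interaction techniques. The system combines moving-object detection results with the display advantages of the spiral summary to visualize object statistics from multiple perspectives, and it supports navigation and positioning operations on the spiral summary as well as sketch interaction, enabling fast and effective access to surveillance video content.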

12.
This work addresses the development of a computational model of visual attention to perform the automatic summarization of digital videos from television archives. Although the television system represents one of the most fascinating media phenomena ever created, effective solutions for content-based information retrieval from video recordings of programs produced by this media universe are still lacking. This relates to the high complexity of the content-based video retrieval problem, which involves several challenges, among which we highlight the usual demand for video summaries to facilitate indexing, browsing and retrieval operations. To achieve this goal, we propose a new computational visual attention model, inspired by the human visual system and based on computer vision methods (face detection, motion estimation and saliency map computation), to estimate static video abstracts, that is, collections of salient images or key frames extracted from the original videos. Experimental results with videos from the Open Video Project show that our approach represents an effective solution to the problem of automatic video summarization, producing video summaries of similar quality to the ground truth manually created by a group of 50 users.

13.
欧伟奇, 尹辉, 许宏丽, 刘志浩. 《智能系统学报》, 2019, 14(2): 246-253
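Egocentric videos are characterized by violent target motion, frequent occlusion, large differences in target scale, and strongly time-varying viewpoints, all of which make target tracking very difficult. Starting from the reconstruction of each target's motion trajectory in egocentric videos taken from different viewpoints, this paper proposes a multi-target tracking algorithm based on trajectory reconstruction from Multi-Egocentric videos. The method uses the homography constraint between synchronized frames of multiple views to handle target occlusion and loss, then further refines target localization through trajectory reconstruction based on the spatial-position constraints among targets across views, and uses Kalman filtering to build a target motion model that optimizes the trajectories. Comparative experiments on the BJMOT and EPLF-campus4 datasets verify the effectiveness of the algorithm in resolving discontinuous trajectories in Multi-Egocentric multi-target tracking.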
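The role a Kalman filter plays in smoothing a trajectory and bridging short gaps can be illustrated with a one-dimensional constant-velocity filter. The abstract does not give the paper's state vector or noise settings, so the class `ConstantVelocityKalman`, the diagonal process noise `q`, and the measurement noise `r` are assumptions chosen to keep the sketch small.

```python
class ConstantVelocityKalman:
    """1-D Kalman filter with state (position, velocity); a minimal sketch."""

    def __init__(self, position, velocity=0.0, q=1e-2, r=1.0):
        self.p, self.v = position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # process / measurement noise

    def predict(self, dt=1.0):
        """Advance the state by dt; call this alone while a target is occluded."""
        (a, b), (c, d) = self.P
        self.p += self.v * dt
        # P = F P F^T + q I with F = [[1, dt], [0, 1]]
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation variance
        k0, k1 = a / s, c / s        # Kalman gain
        y = z - self.p               # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P = (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]
        return self.p
```

A tracker would typically run one such filter per coordinate, or a single filter with a larger state, and call `predict` without `update` for frames in which the target is not observed.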

14.
郭洋, 马翠霞, 滕东兴, 杨祎, 王宏安. 《软件学报》, 2016, 27(5): 1151-1162
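With the spread of public-security surveillance systems, more and more cameras are installed along roads and in public places, producing large volumes of surveillance video every day. Today, surveillance video is mostly analyzed by having people watch it to spot anomalies, which consumes a great deal of labor and time. Current research on video analysis mostly targets abnormal-behavior detection and tracking of individual targets and lacks analysis of the associations among objects; there is still no effective way to represent and analyze the associations between objects and scenes in video. To address this, an associated-video visual analysis method based on three-dimensional trajectories of moving targets is proposed to assist manual video analysis. The video is first preprocessed to obtain the motion trajectory of each target object. Because two-dimensional trajectories handle self-intersection, looping motion, and dwelling poorly, and because without time information it is difficult to analyze associations among the trajectories of multiple objects in the same space, the trajectories are extended to three dimensions by adding the time dimension. The method supports sketch-based interaction, and sketch annotations can be added during analysis to assist it. Associations among trajectories are computed from the spatio-temporal relations between scenes and objects, yielding an association model of objects and scenes. By compiling statistics on each object's appearances in each scene and applying manually predefined rules, the method can raise alarms on abnormal behavior and support user decisions.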
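Why the time dimension matters for association can be shown with a small sketch: two paths that cross in the image plane are related only if the objects are close at the same moment. The track format `(x, y, t)`, the distance threshold, and the function `co_occurrence` are assumptions for illustration and are not the paper's association model.

```python
import math


def co_occurrence(track_a, track_b, max_dist):
    """Times at which two objects are within max_dist of each other.

    Each track is a list of (x, y, t) samples with one sample per time t.
    """
    pos_b = {t: (x, y) for x, y, t in track_b}
    times = []
    for x, y, t in track_a:
        if t in pos_b:
            bx, by = pos_b[t]
            if math.hypot(x - bx, y - by) <= max_dist:
                times.append(t)
    return sorted(times)
```

Two tracks that pass through the same location at different times yield an empty list, whereas a purely two-dimensional comparison would report a crossing.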

15.
Converting unconstrained video sequences into videos that loop seamlessly is an extremely challenging problem. In this work, we take the first steps towards automating this process by focusing on an important subclass of videos containing a single dominant foreground object. Our technique makes two novel contributions over previous work: first, we propose a correspondence‐based similarity metric to automatically identify a good transition point in the video where the appearance and dynamics of the foreground are most consistent. Second, we develop a technique that aligns both the foreground and background about this transition point using a combination of global camera path planning and patch‐based video morphing. We demonstrate that this allows us to create natural, compelling, loopy videos from a wide range of videos collected from the internet.

16.
17.
With the widespread use of smartphones, a large number of user-generated videos are produced every day. Embedded sensors, e.g., GPS and the digital compass, make it possible to access videos based on their geo-properties. In our previous work, we created a framework for integrated, sensor-rich video acquisition (with one instantiation implemented in the form of smartphone applications) that associates a continuous stream of location and viewing direction information with the collected videos, hence allowing them to be expressed and manipulated as spatio-temporal objects. These sensor meta-data are considerably smaller in size than the visual content and are helpful in effectively and efficiently searching for geo-tagged videos in large-scale repositories. In this study, we propose a novel three-level grid-based index structure and introduce a number of related query types, including typical spatial queries and ones based on bounded radius and viewing direction restriction. These two criteria are important in many video applications, and we demonstrate their importance with a real-world dataset. Moreover, experimental results on a large-scale synthetic dataset show that our approach can provide a significant speed improvement of at least 30%, considering a mix of queries, compared to a multi-dimensional R-tree implementation.
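A query restricted by bounded radius and viewing direction can be sketched with a simple grid. This is a single-level grid over planar coordinates, not the three-level structure the abstract proposes; the class `GridIndex`, the compass convention (0 degrees is north, clockwise), and the `half_angle_deg` parameter are assumptions made for illustration.

```python
import math
from collections import defaultdict


class GridIndex:
    """Single-level grid over planar coordinates; a simplified sketch."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, video_id, x, y, heading_deg):
        """Store a camera position and compass heading for one video frame."""
        self.cells[self._cell(x, y)].append((video_id, x, y, heading_deg))

    def query(self, qx, qy, radius, half_angle_deg=None):
        """Videos shot within radius of (qx, qy).

        If half_angle_deg is given, the camera must also face the query
        point to within that many degrees of its heading.
        """
        cx0, cy0 = self._cell(qx - radius, qy - radius)
        cx1, cy1 = self._cell(qx + radius, qy + radius)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for vid, x, y, heading in self.cells.get((cx, cy), ()):
                    if math.hypot(qx - x, qy - y) > radius:
                        continue
                    if half_angle_deg is not None and (x, y) != (qx, qy):
                        # Compass bearing from the camera to the query point.
                        bearing = math.degrees(math.atan2(qx - x, qy - y)) % 360
                        diff = abs((bearing - heading + 180) % 360 - 180)
                        if diff > half_angle_deg:
                            continue
                    hits.append(vid)
        return sorted(hits)
```

Only the grid cells overlapping the query's bounding square are scanned, which is the basic reason a grid can answer such queries without touching the whole repository.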

18.
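In recent years, deep learning has shown excellent performance in artificial intelligence. Deep-learning-based face generation and manipulation techniques can now synthesize realistic forged face videos, also known as deepfakes, that the human eye can hardly tell from real ones. Such videos may pose serious threats to society; for example, they may be used to fabricate political fake news, inciting political violence or interfering with elections. Detection methods that proactively identify forged face videos are therefore urgently needed. Existing forgery methods tend to leave subtle spatial and temporal traces, such as distortions in texture and color or flickering of the face. Mainstream detection methods also use deep learning and fall into two categories: frame-based and clip-based. The former use a Convolutional Neural Network (CNN) to find spatial forgery traces in individual frames, while the latter add a Recurrent Neural Network (RNN) to capture temporal forgery traces across frames. These methods all make decisions from global image information, yet forgery traces usually lie in local regions around the facial features. This paper therefore proposes a unified forged-face-video detection framework that uses global temporal features and local spatial features. The framework consists of an image feature extraction module, a global temporal feature classification module, and a local spatial feature classification module. Experiments on the FaceForensics++ dataset show that the proposed method detects forgeries better than previous methods.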

19.
Automatic content analysis of sports videos is a valuable and challenging task. Motivated by analogies between a class of sports videos and languages, the authors propose a novel approach for sports video analysis based on compiler principles. It integrates both semantic analysis and syntactic analysis to automatically create an index and a table of contents for a sports video. Each shot of the video sequence is first annotated and indexed with semantic labels through detection of events using domain knowledge. A grammar-based parser is then constructed to identify the tree structure of the video content based on the labels. Meanwhile, the grammar can be used to detect and recover errors during the analysis. As a case study, a sports video parsing system is presented in the particular domain of diving. Experimental results indicate the proposed approach is effective.
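The idea of parsing a sequence of shot labels with a grammar, and recovering from labelling errors, can be illustrated with a toy parser. The grammar below (a dive is an `action` shot, any number of `replay` shots, then a `score` shot) and the label names are hypothetical; the abstract does not give the actual grammar used for diving videos.

```python
def parse_dives(labels):
    """Parse shot labels with the assumed grammar  DIVE -> action replay* score.

    Returns (dives, errors). Each dive records the shot indices of its parts.
    Unexpected shots are skipped and a missing score is reported, so the
    parse continues after an error instead of stopping.
    """
    dives, errors = [], []
    i = 0
    while i < len(labels):
        if labels[i] != "action":
            errors.append((i, labels[i]))  # shot that fits no rule: skip it
            i += 1
            continue
        dive = {"action": i, "replays": [], "score": None}
        i += 1
        while i < len(labels) and labels[i] == "replay":
            dive["replays"].append(i)
            i += 1
        if i < len(labels) and labels[i] == "score":
            dive["score"] = i
            i += 1
        else:
            errors.append((i, "missing score"))
        dives.append(dive)
    return dives, errors
```

The list of parsed dives acts as a table of contents for the video, and the error list marks shots where the detected labels violate the grammar.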

20.
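Affordance refers to the set of interaction possibilities that objects in an environment offer, and it describes the connection between environmental properties and individuals. Visual affordance research uses visual data such as images and videos to explore the possible interactions between a visual agent and the environment or objects, and it touches on related fields such as scene recognition, action recognition, and object detection. Visual affordance can be widely applied in robotics, scene understanding, and other areas. Based on existing research, visual affordance is classified into functional affordance, behavioral affordance, and social affordance, and for each class the detection methods are discussed in detail, divided into traditional machine learning and deep learning approaches. Typical current visual affordance datasets are summarized and analyzed, and application directions and possible future research directions are discussed.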
