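Objective: Satellite video, an emerging type of remote sensing data, offers high-resolution spatial detail and rich temporal information about the observed area, giving applications such as traffic monitoring and tracking of specific vehicles a viewpoint that conventional video lacks. Compared with conventional video, vehicle targets in satellite video have low resolution, are small in scale, and carry limited information. As a result, when target boundaries are unclear, the target is partially occluded, or the appearance of the surroundings is blurred, existing trackers often lose the target. To address this, the paper proposes a feature-fusion kernelized correlation tracking method for vehicles in satellite video. Method: Features with complementary discriminative power are extracted from the vehicle target using raw pixels and the histogram of oriented gradient (HOG), and a kernelized correlation tracker produces one response map with invariance and one with discriminability. The two response maps are fused to combine the complementary information of the two features and obtain the target position. A response distribution criterion (RDC) judges the stability of the current target features and decides whether to update the tracker's representation model. The correlation filtering used has low computational cost and runs fast, so the method can be extended to track multiple vehicles. Results: The method is compared with six mainstream correlation filter trackers on eight satellite video sequences covering illumination change, fast turning, partial occlusion, shadow interference, road color change, and nearby similar targets. Tracking accuracy is evaluated with the area under curve (AUC) of the precision plot and the success plot. The results show that the method balances the discriminative power of the base trackers that use different features (ranked second in performance): the precision-plot AUC improves by 2.9% and the success-plot AUC improves by 4.1%, and the vehicle targets are tracked successfully without loss, which demonstrates that the method is advanced and effective. Conclusion: The proposed feature-fusion kernelized correlation tracking method for vehicles in satellite video balances the complementary information of different feature extractors, largely resolves the target loss caused by insufficient vehicle information in satellite video, and improves accuracy.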
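The response-map fusion step in the satellite-video tracking abstract can be illustrated with a small sketch. The abstract does not state the fusion rule, so the min-max normalization, the weighted sum, and the names `normalize`, `fuse_responses`, `peak_location`, and `weight` are all assumptions made for illustration. The RDC update gate is not sketched because the abstract does not define it.

```python
def normalize(resp):
    """Scale a 2-D response map (a list of rows) to the range [0, 1]."""
    flat = [v for row in resp for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A flat map carries no positional evidence.
        return [[0.0 for _ in row] for row in resp]
    return [[(v - lo) / span for v in row] for row in resp]


def fuse_responses(resp_pixel, resp_hog, weight=0.5):
    """Weighted sum of two normalized response maps of equal shape.

    Assumed rule: `weight` goes to the raw-pixel map, the rest to the HOG map.
    """
    a, b = normalize(resp_pixel), normalize(resp_hog)
    return [[weight * x + (1.0 - weight) * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def peak_location(resp):
    """Return (row, col) of the first maximum in a response map."""
    best_value, best_pos = None, None
    for r, row in enumerate(resp):
        for c, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_pos = value, (r, c)
    return best_pos


# Toy maps (made-up numbers): the two features disagree about the peak.
pixel_map = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.1]]
hog_map = [[0.0, 0.1, 0.0], [0.1, 0.6, 0.8], [0.0, 0.1, 0.0]]
print(peak_location(fuse_responses(pixel_map, hog_map)))  # (1, 1)
```

With equal weights the fused peak stays at the centre cell because the raw-pixel map is strongly peaked there, while the HOG map still assigns that cell a high score.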

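To address the limited Trojan feature sets, low detection accuracy, and poor generality of existing signal-feature-based hardware Trojan detection methods, a hardware Trojan identification method based on multi-signal feature fusion is proposed. By analyzing the stealthiness of hardware Trojans, stealthiness models are built for trigger-node insertion and payload-node insertion, and a hardware Trojan feature set is constructed from low static toggle rate, low dynamic toggle rate, low combinational 0-controllability, low combinational 1-controllability, and low combinational observability. A hardware Trojan detection model is then built with the KNN algorithm. Experiments show that the method achieves an average Trojan signal recognition rate of 98.23%, which is 16.30% and 10.24% higher than the methods of references [3] and [15], respectively, a substantial improvement in Trojan detection capability.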
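As a rough illustration of the KNN step in the hardware Trojan abstract, the sketch below classifies a signal from a five-value feature vector by majority vote among its nearest neighbours. The feature ordering, the Euclidean distance, the value of `k`, and every number in the toy data are assumptions for illustration only; they are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs.
    query: feature vector of the same length.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, made-up feature vectors in the assumed order:
# (static toggle rate, dynamic toggle rate, 0-controllability,
#  1-controllability, observability), all scaled to [0, 1].
train = [
    ((0.02, 0.01, 0.10, 0.15, 0.05), "trojan"),
    ((0.03, 0.02, 0.12, 0.10, 0.08), "trojan"),
    ((0.45, 0.50, 0.70, 0.65, 0.80), "normal"),
    ((0.50, 0.40, 0.75, 0.70, 0.85), "normal"),
    ((0.55, 0.60, 0.80, 0.60, 0.90), "normal"),
]

print(knn_predict(train, (0.04, 0.03, 0.11, 0.12, 0.06)))  # trojan
```

In practice the features would need to be scaled consistently before computing distances, since KNN is sensitive to differences in feature range.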

Automatic indexing of images and videos is a highly relevant and important research area in multimedia information retrieval. The difficulty of this task is well established. In the past, most efforts of the research community focused on detecting single concepts in images and videos, which is already a hard task. As information retrieval systems evolve, users' needs become more abstract and queries contain more words. To help retrieval systems answer such complex queries, multimedia documents should be indexed with more than individual concepts. Few studies have specifically addressed the detection of multiple concepts (multi-concept) in images and videos, and most of them concern concept pairs. These studies showed that this challenge is even greater than single-concept detection. In this work, we address multi-concept detection in images and videos through a detailed comparative study. Three types of approaches are considered: 1) building detectors for multi-concept, 2) fusing single-concept detectors, and 3) exploiting detectors of a set of single concepts in a stacking scheme. We conducted our evaluations on the PASCAL VOC'12 collection for the detection of pairs and triplets of concepts, and extended the evaluation to the TRECVid 2013 dataset for the detection of infrequent concept pairs. Our results show that the three types of approaches give globally comparable results for images, but differ for specific kinds of pairs and triplets. For videos, late fusion of detectors seems to be more effective and efficient when the single-concept detectors perform well. Otherwise, directly building bi-concept detectors remains the best alternative, especially if a well-annotated dataset is available. The third approach did not bring additional gain or efficiency.
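The second approach in the multi-concept abstract, fusing single-concept detectors, can be sketched as a late-fusion function over per-concept scores. The abstract does not say which fusion rules were evaluated, so the three rules below (mean, product, minimum) and the name `fuse_scores` are assumptions chosen because they are common choices.

```python
import math


def fuse_scores(scores, rule="mean"):
    """Combine single-concept detector scores into one multi-concept score.

    scores: iterable of detector outputs in [0, 1], one per concept.
    rule: "mean", "product", or "min" (assumed rules, for illustration).
    """
    scores = list(scores)
    if not scores:
        raise ValueError("at least one score is required")
    if rule == "mean":
        return sum(scores) / len(scores)
    if rule == "product":
        return math.prod(scores)
    if rule == "min":
        return min(scores)
    raise ValueError(f"unknown fusion rule: {rule}")


# Hypothetical scores for a concept pair in one image.
pair = [0.8, 0.5]
for rule in ("mean", "product", "min"):
    print(rule, fuse_scores(pair, rule))
```

The product and minimum rules penalize a pair when either concept is weakly detected, whereas the mean lets one strong detector compensate for a weak one.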

Research in video surveillance is gaining popularity because of its widespread applications and social impact. In this paper, we present an intelligent framework for detecting multiple events in surveillance videos. Based on the principle of compositionality, we modularize surveillance problems into a set of variables comprising regions of interest, classes (e.g. human, vehicle), attributes (e.g. speed, locality), and a set of notions (i.e. rules) associated with each attribute, to construct a knowledge-based understanding of the environment. The final output of the reasoning process, which combines the definition domains of the various variables, allows a broader and integrated understanding of complex patterns of activity in the scene. This contrasts with state-of-the-art solutions, which can perform only a single task at a time. Experimental results on both public and real-time datasets demonstrate the effectiveness and robustness of the proposed framework in detecting multiple events in surveillance videos.

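To locate targets in a test image by template matching when only a few samples are available, a two-stage voting object detection method is proposed. A probabilistic model first constructs the voting space offline from the few samples. Detection in the test image then proceeds in two voting stages: the first stage detects image patches of the target by voting and records the positions these part patches occupy in the samples; the second stage uses the patches from the first stage to vote on the overall similarity to the sample, thereby locating the target. Theoretical derivation and experimental results confirm that the proposed method has lower time complexity and higher detection accuracy than previous work.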

Multimedia Tools and Applications - Violent scenes detection (VSD) is a challenging problem because of the heterogeneous content, large variations in video quality, and complex semantic meanings of...

Yi Yun, Wang Hanli, Zhang Bowen. Multimedia Tools and Applications, 2017, 76(18): 18891-18913
Multimedia Tools and Applications - Human action recognition in realistic videos is an important and challenging task. Recent studies demonstrate that multi-feature fusion can significantly improve...

Autonomous mobile vehicles are becoming more common in outdoor agricultural applications. They must be equipped with a robot navigation system for sensing, mapping, localization, path planning, and obstacle avoidance. Safety is a major challenge for autonomous vehicles, which must deal properly with unexpected obstacles in the working area. Of particular interest are people or animals crossing in front of the vehicle and fixed or moving uncatalogued elements in specific positions. The main contribution of this research is the detection of unexpected obstacles or elements in video sequences acquired by a machine vision system on board a tractor moving in cornfields. We propose a new strategy for automatic video analysis that detects static and dynamic obstacles in agricultural environments via spatial-temporal analysis. In the first stage, obstacles are detected using spatial information based on spectral colour analysis and texture data. In the second stage, temporal information is used to detect moving objects or obstacles in the scene, which is of particular interest for elements camouflaged within the environment. A main feature of our method is that it does not require any training process. Another feature is that spatial analysis provides an initial segmentation of objects of interest, after which temporal information discriminates between moving and static objects. To the best of our knowledge, classical approaches in agricultural image analysis use either spatial or temporal information but not both at the same time, so combining them is an important contribution. Our method shows favourable results when tested in different outdoor agricultural scenarios, which are highly complex, mainly because of the high variability in illumination conditions that causes undesired effects such as shadows and alternating light and dark areas. Dynamic backgrounds, camera vibrations, and static and dynamic objects further complicate the situation. The results are comparable to those obtained with other state-of-the-art techniques reported in the literature.

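To address shortcomings in positive/negative sample assignment during the training of object detection models, namely that the aspect ratio of the ground-truth box is ignored and adaptability to different object distributions is poor, a ratio-prior and loss-aware assignment algorithm, RLA, is proposed. RLA leaves the structure of the original detection model unchanged. It first selects a center region with the same proportions as the ground-truth box's aspect ratio, then computes a composite anchor loss that takes the actual distribution of the object inside the ground-truth box into account, and finally separates positive from negative samples with a dynamic loss threshold. The algorithm resolves problems of IoU-based assignment such as poor adaptability and difficulty selecting the best positive samples, and assigns samples more sensibly for off-center objects and objects with extreme aspect ratios. Compared with existing sample assignment algorithms, it performs better on the MS COCO dataset, improving AP by 1.66% over the FCOS baseline; with the same model structure, it improves AP by 0.76% and 0.24% over ATSS and PAA, respectively, demonstrating the effectiveness of RLA.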

Automatic annotation of semantic events allows effective retrieval of video content. In this work, we present solutions for highlights detection in sports videos. This application is particularly interesting for broadcasters, since they rely extensively on manual annotation to select interesting highlights that are edited to create new programmes. The proposed approach exploits the typical structure of a wide class of sports videos, namely those of sports played in delimited venues with playfields of well-known geometry, such as soccer, basketball, swimming, and track and field disciplines. For this class of sports, a modeling scheme is presented that is based on a limited set of visual cues and on finite state machines (FSM) that encode the temporal evolution of highlights. Algorithms for model checking and for visual cue estimation are discussed, as well as applications of the representation to different sport domains.

Neural Computing and Applications - Human action recognition (HAR) is a topic widely studied in computer vision and pattern recognition. Despite the success of recent models for this issue, most of...

Automatic classification of shots extracted from news videos plays an important role in news video segmentation, which is an essential step towards effective indexing of broadcasters' digital databases. Despite the efforts reported by researchers in this field, no technique providing fully satisfactory performance has been presented so far. In this paper, we propose a multi-expert approach for unsupervised shot classification. The proposed multi-expert system (MES) combines three algorithms that are model-free and do not require a specific training phase. To assess the performance of the MES, we built a database significantly larger than those typically used in the field. Experimental results demonstrate the effectiveness of the proposed approach in terms of both shot classification and news story detection capability.

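Wei Wei, Ma Rui, Wang Xiaofang. Journal of Computer Applications (《计算机应用》), 2017, 37(3): 801-805
Existing evaluation criteria for face detection usually test only qualitatively whether a face is present, and there is no strict standard for quantitatively describing face position in video. In addition, some current research, such as face replacement in video, places high demands on the continuity of face position across the video stream. To address these two problems, a quantitative evaluation criterion for face position in video is proposed that goes beyond earlier face detection and face tracking evaluation criteria, together with a method for detecting face position in video. The method first detects the initial face position in the target region with an improved Haar-like cascade classifier, then predicts the face position with pyramidal optical flow while introducing a forward-backward error check so that the result is self-verified, and finally determines the face position. Experimental results show that the evaluation criterion can judge how well the tested algorithms describe face position quantitatively in video, and that the proposed detection algorithm improves the temporal consistency of face position.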
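The forward-backward error check mentioned in the face-position abstract can be sketched as follows. The idea is to track each point forward to the next frame, track the result back to the first frame, and measure how far the round trip lands from the starting point. The optical flow itself is not computed here: the function takes already-tracked coordinates as input, and the names `forward_backward_errors`, `reliable_mask`, and the 1-pixel `max_error` threshold are assumptions for illustration.

```python
import math


def forward_backward_errors(points, back_tracked):
    """Distance between each start point and its round-trip position.

    points: (x, y) positions in the first frame.
    back_tracked: positions obtained by tracking forward, then backward.
    """
    return [math.dist(p, q) for p, q in zip(points, back_tracked)]


def reliable_mask(points, back_tracked, max_error=1.0):
    """True for points whose forward-backward error is at most `max_error` pixels."""
    errors = forward_backward_errors(points, back_tracked)
    return [e <= max_error for e in errors]


# Hypothetical coordinates: the first point returns close to where it
# started, the second drifts by 5 pixels and is rejected.
start = [(10.0, 10.0), (20.0, 20.0)]
round_trip = [(10.3, 10.4), (25.0, 20.0)]
print(reliable_mask(start, round_trip))  # [True, False]
```

Points that fail the check would be discarded before the face position is estimated, which is one way such a self-check can keep the predicted position from drifting.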

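A fast method for detecting and locating headline captions in news video   Cited by: 1 (self-citations: 0, citations by others: 1)
Captions in news video carry rich semantic information, and headline captions in particular play an important role in analyzing and understanding the high-level semantic content of news video. Exploiting the spatio-temporal distribution of headline captions, a fast method for detecting and locating them in news video is proposed. The method first uses the fact that a headline caption persists over many frames to reduce the number of frames to process. Then, based on the edge and position features of headline captions, it marks candidate caption blocks in each frame image and statistically analyzes the images in the frame sequence to detect the position of each headline caption and the times at which it appears and disappears. Experimental results show that the method is simple and effective, and detects and locates headline captions in news video quickly and robustly.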

Zhang Xufan, Wang Yong, Chen Zhenxing, Yan Jun, Wang Dianhong. Multimedia Tools and Applications, 2020, 79(31-32): 23147-23159

Saliency detection is a technique that analyzes an image to separate relevant regions from the background. In this paper, we propose a simple and effective saliency detection method based on image sparse representation and a combination of color features. First, the input image is segmented into non-overlapping super-pixels, so that saliency detection is performed at the region level to reduce computational complexity. Then, a background optimization selection scheme is used to construct an appropriate background template. Based on this template, a primary saliency map is obtained by image sparse representation. Next, a linear combination of color coefficients generates an improved saliency map with more prominent salient regions. Finally, the two saliency maps are integrated within a Bayesian framework to obtain the final saliency map. Experimental results show that the proposed method performs well in terms of both detection accuracy and running time.


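For pedestrian detection against natural backgrounds, a pedestrian detection algorithm combining multiple features with Hough forests is proposed. In the feature extraction stage, histograms of oriented gradients, local binary patterns, and the LAB color space are used to extract the gradient, texture, and color frequency features of pedestrians, forming a rich feature set to describe them. The classifier is built with the Hough forest algorithm, whose voting scheme is improved with a region-weighted voting scheme based on a Gaussian template, which raises detection accuracy. Experiments show that at a false positives per window (FPPW) rate of 10^-4 the detection rate is 90.12%, and that the ROC curve outperforms both HOG+SVM and the original Hough forest algorithm.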

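Recognizing and detecting shot-scale sequence patterns in films (景别音阶, literally "shot-scale musical scales") makes it possible to analyze and retrieve film segments in which emotion changes. Building on a systematic review of previous work, film domain knowledge is used to construct new features such as local motion occupancy, camera motion, and inter-shot similarity. Combined with commonly used video features, a Bayesian classifier recognizes the shot scale of each film shot. Based on the relationship between shot-scale changes and audience emotion, five shot-scale sequence patterns that can arouse audience emotion are designed, and they are detected on top of shot-scale recognition. Experiments show that the selected features yield good detection results and that, compared with other methods, both precision and recall for long shots and close shots improve to varying degrees.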
