期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Abnormal visual event detection based on multi-instance learning and autoregressive integrated moving average model in edge-based Smart City surveillance

Xianghua Xu LiQiming Liu Lingjun Zhang Ping Li Jinjun Chen 《Software》2020,50(5):476-488

The abnormal visual event detection is an important subject in Smart City surveillance where a lot of data can be processed locally in edge computing environment. Real-time and detection effectiveness are critical in such an edge environment. In this paper, we propose an abnormal event detection approach based on multi-instance learning and autoregressive integrated moving average model for video surveillance of crowded scenes in urban public places, focusing on real-time and detection effectiveness. We propose an unsupervised method for abnormal event detection by combining multi-instance visual feature selection and the autoregressive integrated moving average model. In the proposed method, each video clip is modeled as a visual feature bag containing several subvideo clips, each of which is regarded as an instance. The time-transform characteristics of the optical flow characteristics within each subvideo clip are considered as a visual feature instance, and time-series modeling is carried out for multiple visual feature instances related to all subvideo clips in a surveillance video clip. The abnormal events in each surveillance video clip are detected using the multi-instance fusion method. This approach is verified on publically available urban surveillance video datasets and compared with state-of-the-art alternatives. Experimental results demonstrate that the proposed method has better abnormal event detection performance for crowded scene of urban public places with an edge environment. 相似文献

2.

Semantic trajectory-based event detection and event pattern mining

Xiaofeng Wang Gang Li Guang Jiang Zhongzhi Shi 《Knowledge and Information Systems》2013,37(2):305-329

相似文献

3.

Hierarchical visual event pattern mining and its applications 总被引：1，自引：0，他引：1

Peng Cui Zhi-Qiang Liu Li-Feng Sun Shi-Qiang Yang 《Data mining and knowledge discovery》2011,22(3):467-492

In this paper, we propose a hierarchical visual event pattern mining approach and utilize the patterns to address the key problems in video mining and understanding field. We classify events into primitive events (PEs) and compound events (CEs), where PEs are the units of CEs, and CEs serve as smooth priors and rules for PEs. We first propose a tensor-based video representation and Joint Matrix Factorization (JMF) for unsupervised primitive event categorization. Then we apply frequent pattern mining techniques to discover compound event pattern structures. After that, we utilize the two kinds of event patterns to address the applications of event recognition and anomaly detection. First we extend the Sequential Monte Carlo (SMC) method to recognition of live, sequential visual events. To accomplish this task we present a scheme that alternatively recognizes primitive and compound events in one framework. Then, we categorize the anomalies into abnormal events (never seen events) and abnormal contexts (rule breakers), and the two kinds of anomalies are detected simultaneously by embedding a deviation criterion into the SMC framework. Extensive experiments have been conducted which demonstrate that the proposed approach is effective as compared to other major approaches. 相似文献

4.

Anomalous sound event detection: A survey of machine learning based methods and applications

Mnasri Zied Rovetta Stefano Masulli Francesco 《Multimedia Tools and Applications》2022,81(4):5537-5586

With the development of multi-modal man-machine interaction, audio signal analysis is gaining importance in a field traditionally dominated by video. In particular, anomalous sound event detection offers novel options to improve audio-based man-machine interaction, in many useful applications such as surveillance systems, industrial fault detection and especially safety monitoring, either indoor or outdoor. Event detection from audio can fruitfully integrate visual information and can outperform it in some respects, thus representing a complementary perceptual modality. However, it also presents specific issues and challenges. In this paper, a comprehensive survey of anomalous sound event detection is presented, covering various aspects of the topic, ?.e.feature extraction methods, datasets, evaluation metrics, methods, applications, and some open challenges and improvement ideas that have been recently raised in the literature.

相似文献

5.

基于本体的监控视频语义事件探测*

陈姣姣张晓如周永梅《计算机应用研究》2012,29(1):112-115

为了探测视频高层复杂事件,架构了一个视频事件分析框架,采用本体和Petri网进行推理从而获取复合事件;运用视频语义本体标注算法分析低层视频语义,在高层构建一个视频事件分析本体,将低层本体映射到事件分析本体表示高层视频事件;通过本体和扩展Petri网结合的方法对监控视频中的事件进行图形化异步事件推理;最后用semantic Web rule language(SWRL)规则表示视频监控事件的探测。实验证明,提出的方法比基于模式识别的事件探测方法更加有效。相似文献

6.

A novel unsupervised 3D skeleton detection in RGB-D images for video surveillance

Cheng Shyi-Chyi Hsiao Kuei-Fang Yang Chen-Kuei Hsiao Po-Fu Yu Wan-Hsuan 《Multimedia Tools and Applications》2020,79(23-24):15829-15857

In this paper we present a novel moment-based skeleton detection for representing human objects in RGB-D videos with animated 3D skeletons. An object often consists of several parts, where each of them can be concisely represented with a skeleton. However, it remains as a challenge to detect the skeletons of individual objects in an image since it requires an effective part detector and a part merging algorithm to group parts into objects. In this paper, we present a novel fully unsupervised learning framework to detect the skeletons of human objects in a RGB-D video. The skeleton modeling algorithm uses a pipeline architecture which consists of a series of cascaded operations, i.e., symmetry patch detection, linear time search of symmetry patch pairs, part and symmetry detection, symmetry graph partitioning, and object segmentation. The properties of geometric moment-based functions for embedding symmetry features into centers of symmetry patches are also investigated in detail. As compared with the state-of-the-art deep learning approaches for skeleton detection, the proposed approach does not require tedious human labeling work on training images to locate the skeleton pixels and their associated scale information. Although our algorithm can detect parts and objects simultaneously, a pre-learned convolution neural network (CNN) can be used to locate the human object from each frame of the input video RGB-D video in order to achieve the goal of constructing real-time applications. This much reduces the complexity to detect the skeleton structure of individual human objects with our proposed method. Using the segmented human object skeleton model, a video surveillance application is constructed to verify the effectiveness of the approach. Experimental results show that the proposed method gives good performance in terms of detection and recognition using publicly available datasets.

相似文献

7.

A Novel Framework for Semantic Annotation and Personalized Retrieval of Sports Video 总被引：5，自引：0，他引：5

Changsheng Xu Jinjun Wang Hanqing Lu Yifan Zhang 《Multimedia, IEEE Transactions on》2008,10(3):421-436

Sports video annotation is important for sports video semantic analysis such as event detection and personalization. In this paper, we propose a novel approach for sports video semantic annotation and personalized retrieval. Different from the state of the art sports video analysis methods which heavily rely on audio/visual features, the proposed approach incorporates web-casting text into sports video analysis. Compared with previous approaches, the contributions of our approach include the following. 1) The event detection accuracy is significantly improved due to the incorporation of web-casting text analysis. 2) The proposed approach is able to detect exact event boundary and extract event semantics that are very difficult or impossible to be handled by previous approaches. 3) The proposed method is able to create personalized summary from both general and specific point of view related to particular game, event, player or team according to user's preference. We present the framework of our approach and details of text analysis, video analysis, text/video alignment, and personalized retrieval. The experimental results on event boundary detection in sports video are encouraging and comparable to the manually selected events. The evaluation on personalized retrieval is effective in helping meet users' expectations. 相似文献

8.

Activity based surveillance video content modelling

Tao Xiang Shaogang Gong 《Pattern recognition》2008,41(7):2309-2326

This paper tackles the problem of surveillance video content modelling. Given a set of surveillance videos, the aims of our work are twofold: firstly a continuous video is segmented according to the activities captured in the video; secondly a model is constructed for the video content, based on which an unseen activity pattern can be recognised and any unusual activities can be detected. To segment a video based on activity, we propose a semantically meaningful video content representation method and two segmentation algorithms, one being offline offering high accuracy in segmentation, and the other being online enabling real-time performance. Our video content representation method is based on automatically detected visual events (i.e. ‘what is happening in the scene’). This is in contrast to most previous approaches which represent video content at the signal level using image features such as colour, motion and texture. Our segmentation algorithms are based on detecting breakpoints on a high-dimensional video content trajectory. This differs from most previous approaches which are based on shot change detection and shot grouping. Having segmented continuous surveillance videos based on activity, the activity patterns contained in the video segments are grouped into activity classes and a composite video content model is constructed which is capable of generalising from a small training set to accommodate variations in unseen activity patterns. A run-time accumulative unusual activity measure is introduced to detect unusual behaviour while usual activity patterns are recognised based on an online likelihood ratio test (LRT) method. This ensures robust and reliable activity recognition and unusual activity detection at the shortest possible time once sufficient visual evidence has become available. Comparative experiments have been carried out using over 10 h of challenging outdoor surveillance video footages to evaluate the proposed segmentation algorithms and modelling approach. 相似文献

9.

SOINN-Based Abnormal Trajectory Detection for Efficient Video Condensation

Chin-Shyurng Fahn Chang-Yi Kao Meng-Luen Wu Hao-En Chueh 《计算机系统科学与工程》2022,42(2):451-463

With the evolution of video surveillance systems, the requirement of video storage grows rapidly; in addition, safe guards and forensic officers spend a great deal of time observing surveillance videos to find abnormal events. As most of the scene in the surveillance video are redundant and contains no information needs attention, we propose a video condensation method to summarize the abnormal events in the video by rearranging the moving trajectory and sort them by the degree of anomaly. Our goal is to improve the condensation rate to reduce more storage size, and increase the accuracy in abnormal detection. As the trajectory feature is the key to both goals, in this paper, a new method for feature extraction of moving object trajectory is proposed, and we use the SOINN (Self-Organizing Incremental Neural Network) method to accomplish a high accuracy abnormal detection. In the results, our method is able to shirk the video size to 10% storage size of the original video, and achieves 95% accuracy of abnormal event detection, which shows our method is useful and applicable to the surveillance industry. 相似文献

10.

融合自编码器和one-class SVM的异常事件检测

下载免费PDF全文

胡海洋张力李忠金《中国图象图形学报》2020,25(12):2614-2629

目的在自动化和智能化的现代生产制造过程中,视频异常事件检测技术扮演着越来越重要的角色,但由于实际生产制造中异常事件的复杂性及无关生产背景的干扰,使其成为一项非常具有挑战性的任务。很多传统方法采用手工设计的低级特征对视频的局部区域进行特征提取,然而此特征很难同时表示运动与外观特征。此外,一些基于深度学习的视频异常事件检测方法直接通过自编码器的重构误差大小来判定测试样本是否为正常或异常事件,然而实际情况往往会出现一些原本为异常的测试样本经过自编码得到的重构误差也小于设定阈值,从而将其错误地判定为正常事件,出现异常事件漏检的情形。针对此不足,本文提出一种融合自编码器和one-class支持向量机（support vector machine,SVM）的异常事件检测模型。方法通过高斯混合模型（Gaussian mixture model,GMM）提取固定大小的时空兴趣块（region of interest,ROI）;通过预训练的3维卷积神经网络（3D convolutional neural network,C3D）对ROI进行高层次的特征提取;利用提取的高维特征训练一个堆叠的降噪自编码器,通过比较重构误差与设定阈值的大小,将测试样本判定为正常、异常和可疑3种情况之一;对自编码器降维后的特征训练一个one-class SVM模型,用于对可疑测试样本进行二次检测,进一步排除异常事件。结果本文对实际生产制造环境下的机器人工作场景进行实验,采用AUC （area under ROC）和等错误率（equal error rate,EER）两个常用指标进行评估。在设定合适的误差阈值时,结果显示受试者工作特征（receiver operating characteristic,ROC）曲线下AUC达到91.7%,EER为13.8%。同时,在公共数据特征集USCD （University of California,San Diego） Ped1和USCD Ped2上进行了模型评估,并与一些常用方法进行了比较,在USCD Ped1数据集中,相比于性能第2的方法,AUC在帧级别和像素级别分别提高了2.6%和22.3%;在USCD Ped2数据集中,相比于性能第2的方法,AUC在帧级别提高了6.7%,从而验证了所提检测方法的有效性与准确性。结论本文提出的视频异常事件检测模型,结合了传统模型与深度学习模型,使视频异常事件检测结果更加准确。相似文献

11.

A framework for flexible summarization of racquet sports video using multiple modalities

《Computer Vision and Image Understanding》2009,113(3):415-424

While most existing sports video research focuses on detecting event from soccer and baseball etc., little work has been contributed to flexible content summarization on racquet sports video, e.g. tennis, table tennis etc. By taking advantages of the periodicity of video shot content and audio keywords in the racquet sports video, we propose a novel flexible video content summarization framework. Our approach combines the structure event detection method with the highlight ranking algorithm. Firstly, unsupervised shot clustering and supervised audio classification are performed to obtain the visual and audio mid-level patterns respectively. Then, a temporal voting scheme for structure event detection is proposed by utilizing the correspondence between audio and video content. Finally, by using the affective features extracted from the detected events, a linear highlight model is adopted to rank the detected events in terms of their exciting degrees. Experimental results show that the proposed approach is effective. 相似文献

12.

Automated camera sabotage detection for enhancing video surveillance systems

Sitara K. Mehtre B. M. 《Multimedia Tools and Applications》2019,78(5):5819-5841

Surveillance cameras are vital source of information in crime investigations. A surveillance video must be recorded with correct field of view and be of good quality, otherwise, it may not be suitable for investigation or analysis purposes. Perpetrators may tamper the recorded video or the physical device itself, in order to conceal their dubious activities. Generally, surveillance systems are unmanned due to limitations of manual monitoring. Automatic detection of camera tamper events is crucial for timely operator intervention. We propose a new method for detecting video camera tampering events like occlusion, defocus and displacement. The features used are edge information, frame count, foreground objects’ coverage area and its static nature. Effectiveness of our method is tested through experimentation on public datasets. The results obtained are encouraging with high detection and low false alarm rates. The proposed method automatically detects routine problems with cameras like dirt on camera lens, fog and smoke.

相似文献

13.

Multisource surveillance video data coding with hierarchical knowledge library

Chen Yu Hu Ruimin Xiao Jing Xu Liang Wang Zhongyuan 《Multimedia Tools and Applications》2019,78(11):14705-14731

The rapidly increasing surveillance video data has challenged the existing video coding standards. Even though knowledge based video coding scheme has been proposed to remove redundancy of moving objects across multiple videos and achieved great coding efficiency improvement, it still has difficulties to cope with complicated visual changes of objects resulting from various factors. In this paper, a novel hierarchical knowledge extraction method is proposed. Common knowledge on three coarse-to-fine levels, namely category level, object level and video level, are extracted from history data to model the initial appearance, stable changes and temporal changes respectively for better object representation and redundancy removal. In addition, we apply the extracted hierarchical knowledge to surveillance video coding tasks and establish a hybrid prediction based coding framework. On the one hand, hierarchical knowledge is projected to the image plane to generate reference for I frames to achieve better prediction performance. On the other hand, we develop a transform based prediction for P/B frames to reduce the computational complexity while improve the coding efficiency. Experimental results demonstrate the effectiveness of our proposed method.

相似文献

14.

Large scale continuous visual event recognition using max-margin Hough transformation framework

Bhaskar Chakraborty Jordi Gonzàlez F. Xavier Roca 《Computer Vision and Image Understanding》2013,117(10):1356-1368

In this paper we propose a novel method for continuous visual event recognition (CVER) on a large scale video dataset using max-margin Hough transformation framework. Due to high scalability, diverse real environmental state and wide scene variability direct application of action recognition/detection methods such as spatio-temporal interest point (STIP)-local feature based technique, on the whole dataset is practically infeasible. To address this problem, we apply a motion region extraction technique which is based on motion segmentation and region clustering to identify possible candidate “event of interest” as a preprocessing step. On these candidate regions a STIP detector is applied and local motion features are computed. For activity representation we use generalized Hough transform framework where each feature point casts a weighted vote for possible activity class centre. A max-margin frame work is applied to learn the feature codebook weight. For activity detection, peaks in the Hough voting space are taken into account and initial event hypothesis is generated using the spatio-temporal information of the participating STIPs. For event recognition a verification Support Vector Machine is used. An extensive evaluation on benchmark large scale video surveillance dataset (VIRAT) and as well on a small scale benchmark dataset (MSR) shows that the proposed method is applicable on a wide range of continuous visual event recognition applications having extremely challenging conditions. 相似文献

15.

An efficient subsequence search for video anomaly detection and localization

Kai-Wen Cheng Yie-Tarng Chen Wen-Hsien Fang 《Multimedia Tools and Applications》2016,75(22):15101-15122

This paper presents a novel framework for anomaly event detection and localization in crowded scenes. For anomaly detection, one-class support vector machine with Bayesian derivation is applied to detect unusual events. We also propose a novel event representation, called subsequence, which refers to a time series of spatial windows in proximity. Unlike recent works encoded an event with a 3D bounding box which may contain irrelevant information, e.g. background, a subsequence can concisely capture the unstructured property of an event. To efficiently locate anomalous subsequences in a video space, we propose the maximum subsequence search. The proposed search algorithm integrates local anomaly scores into a global consistent detection so that the start and end of an abnormal event can be determined under false and missing detections. Experimental results on two public datasets show that our method is robust to the illumination change and achieve at least 80% localization rate which approximately doubles the accuracy of recent works. This study concludes that anomaly localization is crucial in finding abnormal events. 相似文献

16.

Scenario-based query processing for video-surveillance archives

Ediz Şaykol Uğur Güdükbay Özgür Ulusoy 《Engineering Applications of Artificial Intelligence》2010,23(3):331-345

Automated video surveillance has emerged as a trendy application domain in recent years, and accessing the semantic content of surveillance video has become a challenging research area. The results of a considerable amount of research dealing with automated access to video surveillance have appeared in the literature; however, significant semantic gaps in event models and content-based access to surveillance video remain. In this paper, we propose a scenario-based query-processing system for video surveillance archives. In our system, a scenario is specified as a sequence of event predicates that can be enriched with object-based low-level features and directional predicates. We introduce an inverted tracking scheme, which effectively tracks the moving objects and enables view-based addressing of the scene. Our query-processing system also supports inverse querying and view-based querying, for after-the-fact activity analysis. We propose a specific surveillance query language to express the supported query types in a scenario-based manner. We also present a visual query-specification interface devised to facilitate the query-specification process. We have conducted performance experiments to show that our query-processing technique has a high expressive power and satisfactory retrieval accuracy in video surveillance. 相似文献

17.

Sniffer-Net: quantitative evaluation of smoke in the wild based on spatial–temporal motion spectrum

Mi Zeyang Zhang Weiwei Wu Xuncheng Gao Qiaoming Luo Suyun 《Neural computing & applications》2020,32(13):9165-9180

Smoke detection plays an essential role in the wild video surveillance systems for abnormal events warning. In this paper, we introduced a dedicated neural network structure named Sniffer-Net to simultaneously extract smoke dynamic feature robustly and evaluate the smoke concentration accurately. Firstly, we utilize an improved LiteFlowNet to estimate the global optical flow from image sequence. Meanwhile, a Marr–Hildreth method is brought up and fused into this network to distinguish and eliminate occluded regions from global flow map. Then, an evaluation module based on Context-Encoder network is put forward specially to quantify smoke concentration levels. This network, following the improved LiteFlowNet, is modified through replacing the loss function and removing the multiscale scheme and trained to infer approximate smoke optical flow behind occlusion regions. Starting from the statistical view, the irregular RGB/HSV feature spaces are converted into a specific quantitative evaluation space. As a result, the whole evaluation system is responsible to transform the distribution of irregular smoke motion feature into a quantified form of representation. In turn, this transformation endows the system with a novel numerical standard for smoke concentration evaluation. Finally, an accuracy assessment method is applied to compare the results of detected smoke concentration with the human experience prior model, which feedback the accuracy and false detection rate of system algorithm. In the experiments of five smoke datasets, our proposed smoke detection approach is superior to other state-of-the-art methods, and concentration algorithm achieves the satisfactory performance of 97.3% accuracy on some specialized dataset.

相似文献

18.

Attention-based framework for weakly supervised video anomaly detection

Ma Hualin Zhang Liyan 《The Journal of supercomputing》2022,78(6):8409-8429

Video anomaly detection automatically recognizes abnormal events in surveillance videos. Existing works have made advances in recognizing whether a video contains abnormal events; however, they cannot temporally localize the abnormal events within videos. This paper presents a novel anomaly attention-based framework for accurately temporally localize the abnormal events. Benefiting from the proposed framework, we can achieve frame-level VAD using video-level labels, which significantly reduces the burden of data annotation. Our method is an end-to-end deep neural network-based approach, which contains three modules: anomaly attention module (AAM), discriminative anomaly attention module (DAAM) and generative anomaly attention module (GAAM). Specifically, AAM is trained to generate the anomaly attention, which is used to measure the abnormal degree of each frame. Whereas, DAAM and GAAM are used to alternately augmenting AAM from two different aspects. On the one hand, DAAM enhancing AAM by optimizing the video-level video classification. On the other hand, GAAM adopts a conditional variational autoencoder to model the likelihood of each frame given the attention for refining AAM. As a result, AAM can generate higher anomaly scores for abnormal frames while lower anomaly scores for normal frames. Experimental results show that our proposed approach outperforms state-of-the-art methods, which validates the superiority of our AAVAD.

相似文献

19.

Collective Representation for Abnormal Event Detection

下载免费PDF全文

Renzhen Ye Xuelong Li 《计算机科学技术学报》2017,32(3):470-479

相似文献

20.

基于HEVC的车辆异常事件检测

常同伟梁久祯吴秦王念兵《数据采集与处理》2018,33(2):370-378

当前传统交通事故检测和查阅主要通过人工监测的方法,这种方法效率低且实时性差,本文提出一种基于最新压缩域视频编码标准HEVC（High-efficiency video coding）的车辆异常事件检测方法。首先对HEVC码流中提取出的运动矢量信息进行运动矢量累积迭代和中值滤波的预处理,之后根据提取出的块划分信息和运动矢量信息计算运动对象的运动强度,然后根据运动强度值和八连通区域法提取出运动对象,最后根据空间距离法和运动强度判别法检测出视频序列中发生的车辆异常事件。实验证明,该方法可以准确地检测出视频序列中发生的车辆异常事件;对于有着快速移动的运动目标以及多个运动目标的视频效果更好。相似文献