Similar Literature
Found 20 similar articles (search time: 46 ms)
1.
In a typical surveillance installation, a human operator must constantly monitor a large array of video feeds for suspicious behaviour. As the number of cameras grows, information overload makes manual surveillance increasingly difficult, compounding other factors such as human fatigue and boredom. The objective of an intelligent vision-based surveillance system is to automate the monitoring and event-detection components of surveillance, alerting the operator only when unusual behaviour or other events of interest are detected. While most traditional methods for trajectory-based unusual-behaviour detection rely on low-level trajectory features such as flow vectors or control points, this paper builds upon a recently introduced approach that makes use of higher-level features of intentionality. Individuals in the scene are modelled as intentional agents, and unusual behaviour is detected by evaluating the explicability of an agent's trajectory with respect to known spatial goals. The proposed method extends the original goal-based approach in three ways: first, the spatial scene structure is learned in a training phase; second, a region transition model is learned to describe normal movement patterns between spatial regions; and third, classification of trajectories in progress is performed in a probabilistic framework using particle filtering. Experimental validation on three published third-party datasets demonstrates the validity of the proposed approach.
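To make the third extension concrete, here is a minimal Python sketch of classifying a trajectory in progress with a particle filter over goal hypotheses. The goal locations, the likelihood model, and the names `step_likelihood` and `explicability` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch (not the authors' code): score how "explicable" a
# trajectory-in-progress is, given a set of known spatial goals, using a
# simple particle filter over goal hypotheses.

rng = np.random.default_rng(0)

goals = np.array([[0.0, 10.0], [10.0, 10.0], [10.0, 0.0]])  # hypothetical goals

def step_likelihood(pos, nxt, goal, kappa=4.0):
    """Likelihood that a move pos -> nxt is aimed at `goal`.

    Uses the cosine between the observed step and the goal direction;
    kappa controls how sharply deviations are penalised.
    """
    step, to_goal = nxt - pos, goal - pos
    ns, ng = np.linalg.norm(step), np.linalg.norm(to_goal)
    if ns < 1e-9 or ng < 1e-9:
        return 1.0
    cos = step @ to_goal / (ns * ng)
    return np.exp(kappa * (cos - 1.0))  # = 1 when heading straight at the goal

def explicability(trajectory, n_particles=300):
    """Return, per step, the posterior mass of the best-supported goal."""
    particles = rng.integers(0, len(goals), size=n_particles)  # goal hypotheses
    weights = np.ones(n_particles) / n_particles
    scores = []
    for pos, nxt in zip(trajectory[:-1], trajectory[1:]):
        lik = np.array([step_likelihood(pos, nxt, goals[g]) for g in particles])
        weights *= lik
        weights /= weights.sum()
        post = np.bincount(particles, weights=weights, minlength=len(goals))
        scores.append(post.max())          # high = trajectory is explicable
        # Resample to keep the particle set healthy, then reset weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles, weights = particles[idx], np.full(n_particles, 1.0 / n_particles)
    return scores

# A trajectory heading to goal 1 scores high; a wandering one would score low.
traj = np.linspace([0, 0], [10, 10], 20) + rng.normal(0, 0.1, (20, 2))
print(explicability(traj)[-1])
```

A low running score would flag the trajectory as unexplained by any known goal, which is the anomaly signal the abstract describes.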

2.
We present a new set of interface techniques for visualizing and editing animation directly in a single three-dimensional scene. Motion is edited using direct-manipulation tools which satisfy high-level goals such as “reach this point at this time” or “go faster at this moment”. These tools can be applied over an arbitrary temporal range and maintain arbitrary degrees of spatial and temporal continuity. We separate spatial and temporal control of position by using two curves for each animated object: the motion path which describes the 3D spatial path along which an object travels, and the motion graph, a function describing the distance traveled along this curve over time. Our direct-manipulation tools are implemented using displacement functions, a straightforward and scalable technique for satisfying motion constraints by composition of the displacement function with the motion graph or motion path. This paper will focus on applying displacement functions to positional change. However, the techniques presented are applicable to the animation of orientation, color, or any other attribute that varies over time.
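A minimal sketch of the displacement-function idea, assuming a 1-D motion graph s(t) and a cosine bump for the displacement (both our choices, not the paper's): composing the displacement with the motion graph satisfies a "reach this point at this time" constraint while leaving motion outside the edited range untouched.

```python
import numpy as np

def smooth_bump(t, t0, t1):
    """C1 bump: 0 (with zero slope) at t0 and t1, 1 at the midpoint."""
    u = np.clip((t - t0) / (t1 - t0), 0.0, 1.0)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * u)

def edit_motion_graph(s, t_star, s_star, t0, t1):
    """Edited motion graph satisfying s_edited(t_star) == s_star.

    s: the original motion graph (distance travelled over time). The
    displacement is non-zero only on [t0, t1], so motion elsewhere and
    C1 continuity at the interval ends are preserved.
    """
    scale = smooth_bump(t_star, t0, t1)     # bump value at the constraint time
    delta = (s_star - s(t_star)) / scale    # displacement needed there
    return lambda t: s(t) + delta * smooth_bump(t, t0, t1)

s = lambda t: 2.0 * t                       # original: constant speed
s_edit = edit_motion_graph(s, t_star=5.0, s_star=14.0, t0=3.0, t1=7.0)
print(s_edit(5.0), s_edit(2.0), s_edit(8.0))  # 14.0, 4.0, 16.0
```

The same composition applies to a motion path for spatial edits; only the curve being displaced changes.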

3.
In this paper we introduce VideoGraph, a novel non-linear representation of the scene structure of a video. Unlike the classical linear sequential organization, VideoGraph condenses the video content along the timeline by structuring it into scenes and materializes as a two-dimensional graph, enabling non-linear exploration of the scenes and their transitions. To construct VideoGraph, we adopt a sub-shot-induced method to evaluate the spatio-temporal similarity between shot segments of the video. Scene structure is then derived by grouping similar shots and identifying the valid transitions between scenes. The final stage is to represent the scene structure as a graph reflecting the scene-transition topology. VideoGraph provides a condensed representation at the scene level and facilitates non-linear browsing of videos. Experimental results demonstrate its effectiveness and efficiency for exploring and accessing video content.
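As a rough sketch of the final construction stage, assuming per-shot scene labels have already been obtained by grouping similar shots, the transition graph can be accumulated as follows (the data layout and `build_scene_graph` are illustrative, not the authors' code).

```python
from collections import defaultdict

def build_scene_graph(shot_scene_labels, min_count=1):
    """Edges (a, b) mean scene a was followed by scene b at least min_count times."""
    counts = defaultdict(int)
    for a, b in zip(shot_scene_labels[:-1], shot_scene_labels[1:]):
        if a != b:                       # only count genuine scene changes
            counts[(a, b)] += 1
    return {e: c for e, c in counts.items() if c >= min_count}

# Shot sequence visiting scenes A -> B -> A -> C -> A:
print(build_scene_graph(["A", "A", "B", "A", "C", "C", "A"]))
# {('A', 'B'): 1, ('B', 'A'): 1, ('A', 'C'): 1, ('C', 'A'): 1}
```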

4.
5.
With the evolution of video surveillance systems, video storage requirements grow rapidly; in addition, security guards and forensic officers spend a great deal of time observing surveillance videos to find abnormal events. As most scenes in surveillance video are redundant and contain no information requiring attention, we propose a video condensation method that summarizes the abnormal events in a video by rearranging the moving trajectories and sorting them by degree of anomaly. Our goals are to improve the condensation rate, thereby reducing storage size, and to increase the accuracy of abnormality detection. As the trajectory features are key to both goals, this paper proposes a new method for extracting features from moving-object trajectories and uses SOINN (Self-Organizing Incremental Neural Network) to achieve highly accurate abnormality detection. In our results, the method is able to shrink a video to 10% of its original storage size and achieves 95% accuracy in abnormal event detection, showing that it is useful and applicable to the surveillance industry.
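The paper's detector is SOINN; as a hedged stand-in, the sketch below illustrates the same pipeline shape with a toy trajectory feature vector and a plain nearest-neighbour novelty score. The feature choices and threshold are assumptions, not the paper's.

```python
import numpy as np

def trajectory_features(traj):
    """A toy feature vector: start point, end point, mean velocity, path length."""
    traj = np.asarray(traj, dtype=float)
    vel = np.diff(traj, axis=0)
    return np.concatenate([traj[0], traj[-1], vel.mean(axis=0),
                           [np.linalg.norm(vel, axis=1).sum()]])

def anomaly_score(train_feats, feat):
    """Distance to the closest normal exemplar; large = anomalous."""
    return np.min(np.linalg.norm(train_feats - feat, axis=1))

# Normal traffic moves left-to-right; one trajectory goes straight up.
normal = np.stack([trajectory_features(np.linspace([0, 0], [10, 0], 20)
                                       + np.random.normal(0, 0.1, (20, 2)))
                   for _ in range(50)])
odd = trajectory_features(np.linspace([0, 0], [0, 10], 20))
print(anomaly_score(normal, odd) > 3.0)   # True: flagged as abnormal
```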

6.
Context: As trajectory analysis is widely used in video surveillance, crowd monitoring, behavioural prediction, and anomaly detection, finding motion patterns is a fundamental task in pedestrian trajectory analysis.
Objective: In this paper, we focus on learning dominant motion patterns in an unstructured scene.
Methods: As an invisible, implicit indicator of scene structure, latent structural information is first defined and learned by clustering source/sink points using the CURE algorithm. Under the basic assumption that most pedestrians take similar paths through an unstructured scene when their entry and exit areas are fixed, trajectories are then grouped based on the latent structural information. Finally, motion patterns are learned for each group, characterized by a series of statistical temporal and spatial properties including length, duration, and envelopes in polar coordinate space.
Results: Experimental results demonstrate the feasibility and effectiveness of our method, and the learned motion patterns efficiently describe statistical spatio-temporal models of typical pedestrian behaviours in a real scene. Based on the learned motion patterns, abnormal or suspicious trajectories are detected.
Conclusion: Our approach achieves high spatial accuracy at low computational cost.
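A minimal sketch of the grouping step, assuming a clusterer for source/sink points is already available (the paper uses CURE; the grid quantiser and `group_by_entry_exit` below are illustrative stand-ins).

```python
from collections import defaultdict

def group_by_entry_exit(trajectories, cluster_of):
    """cluster_of: point -> cluster id for the learned source/sink clusters."""
    groups = defaultdict(list)
    for traj in trajectories:
        key = (cluster_of(traj[0]), cluster_of(traj[-1]))  # (entry, exit) pair
        groups[key].append(traj)
    return groups

# Toy stand-in clusterer: quantise points to a coarse grid.
cluster_of = lambda p: (round(p[0] / 5), round(p[1] / 5))
trajs = [[(0, 0), (2, 1), (5, 5)], [(0, 1), (3, 2), (5, 4)], [(5, 5), (2, 2), (0, 0)]]
groups = group_by_entry_exit(trajs, cluster_of)
print({k: len(v) for k, v in groups.items()})
# {((0, 0), (1, 1)): 2, ((1, 1), (0, 0)): 1}
```

Per-group statistics (length, duration, polar envelopes) would then be computed over each bucket.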

7.
Techniques for video object motion analysis, behaviour recognition, and event detection are becoming increasingly important with the rapid increase in demand for, and deployment of, video surveillance systems. Motion trajectories provide rich spatio-temporal information about an object's activity. This paper presents a novel technique for classifying motion activity and detecting anomalies using object motion trajectories. In the proposed motion learning system, trajectories are treated as time series and modelled using a modified DFT-based coefficient feature-space representation. A modelling technique, referred to as m-mediods, is proposed that models a class containing n members with m mediods. Once the m-mediods-based models for all classes have been learnt, classification of new trajectories and anomaly detection can be performed by checking the closeness of a trajectory to the models of known classes. A mechanism based on an agglomerative approach is proposed for anomaly detection. Four anomaly detection algorithms using the m-mediods-based representation of classes are proposed: (i) global merged anomaly detection (GMAD), (ii) localized merged anomaly detection (LMAD), (iii) global un-merged anomaly detection (GUAD), and (iv) localized un-merged anomaly detection (LUAD). The proposed techniques are validated using a variety of simulated and complex real-life trajectory datasets.
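A simplified sketch of the m-mediods idea: represent each class by m representative samples and score a new sample by its distance to the nearest mediod of each class. The greedy farthest-point selection here is a stand-in for the paper's modelling procedure, and all names are illustrative.

```python
import numpy as np

def m_mediods(samples, m):
    """Greedy farthest-point choice of m representative samples."""
    mediods = [samples[0]]
    while len(mediods) < m:
        d = np.min([np.linalg.norm(samples - c, axis=1) for c in mediods], axis=0)
        mediods.append(samples[int(np.argmax(d))])
    return np.stack(mediods)

def nearest_model_distance(models, x):
    """Distance from x to the closest mediod of each class model."""
    return {cls: np.min(np.linalg.norm(M - x, axis=1)) for cls, M in models.items()}

rng = np.random.default_rng(1)
models = {
    "loop":  m_mediods(rng.normal([0, 0], 1, (100, 2)), m=5),
    "cross": m_mediods(rng.normal([10, 0], 1, (100, 2)), m=5),
}
dists = nearest_model_distance(models, np.array([5.0, 5.0]))
# Classify by nearest model; declare an anomaly if even that is too far.
print(min(dists, key=dists.get), dists)
```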

8.
A video segmentation algorithm that exploits either a background subtraction (BS) model with a low learning rate (LLR) or a BS model with a high learning rate (HLR), depending on the video scene dynamics, is presented in this paper. These BS models are based on a neural network architecture, the self-organizing map (SOM), and the algorithm is termed temporal modular self-adaptive SOM (TMSA_SOM). Depending on the type of scenario, TMSA_SOM automatically classifies and processes each video with one of four different specialized modules based on an initial sequence analysis. This approach is convenient because, unlike state-of-the-art (SoA) models, our proposed model handles the different situations that may occur in a video scene (severely dynamic background, initial frames with dynamic objects, static background, stationary objects, etc.) with a specialized module. Furthermore, TMSA_SOM automatically identifies whether the scene has changed drastically (e.g., stationary objects of interest become dynamic, or drastic illumination changes occur), automatically detects when the scene has become stable again, and uses this information to update the background model quickly. The proposed model was validated on three video databases: Change Detection, BMC, and Wallflower. It showed very competitive performance on metrics commonly used in the literature to compare SoA models. TMSA_SOM also achieved the best results on two perceptual metrics, Ssim and D-Score, and the best performance on the global quality measure FSD (based on F-Measure, Ssim, and D-Score), demonstrating its robustness in varied and complicated uncontrolled scenarios. TMSA_SOM was also compared against SoA neural network approaches, obtaining the best average performance on Re, Pr, and F-Measure.
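The module-specific details are beyond an abstract, but the low/high learning-rate switch can be sketched with a plain running-average background model (a stand-in for the SOM-based model; the thresholds are assumptions).

```python
import numpy as np

LOW_LR, HIGH_LR, CHANGE_FRAC = 0.01, 0.25, 0.5  # assumed values

def update_background(bg, frame, fg_threshold=30.0):
    """Return (new background, foreground mask) for one frame."""
    fg = np.abs(frame - bg) > fg_threshold
    # If most pixels look like foreground, the scene has probably changed
    # drastically, so relearn fast; otherwise adapt slowly.
    lr = HIGH_LR if fg.mean() > CHANGE_FRAC else LOW_LR
    return (1 - lr) * bg + lr * frame, fg

bg = np.zeros((4, 4))
frame = np.full((4, 4), 100.0)      # sudden global change (e.g. lights on)
bg, fg = update_background(bg, frame)
print(fg.mean(), bg[0, 0])          # 1.0, 25.0: fast relearning kicked in
```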

9.
A novel method for behaviour recognition in visual surveillance is proposed. The method models scene events corresponding to object behaviours as a set of autonomous pixel-level events detected using Pixel Change Histories (PCH). Combined with automatic model order selection based on a modified Minimum Description Length (MDL) criterion, an Expectation-Maximisation (EM) algorithm is employed to cluster these autonomous pixel-level events into semantically more meaningful region-level scene events. The method is computationally efficient, and experimental results verify its effectiveness in automatically recognising scene events without matching object trajectories.

10.
11.
This paper tackles the problem of surveillance video content modelling. Given a set of surveillance videos, the aims of our work are twofold: firstly, a continuous video is segmented according to the activities captured in the video; secondly, a model is constructed for the video content, based on which an unseen activity pattern can be recognised and any unusual activities can be detected. To segment a video based on activity, we propose a semantically meaningful video content representation method and two segmentation algorithms, one being offline, offering high accuracy in segmentation, and the other being online, enabling real-time performance. Our video content representation method is based on automatically detected visual events (i.e. ‘what is happening in the scene’). This is in contrast to most previous approaches, which represent video content at the signal level using image features such as colour, motion and texture. Our segmentation algorithms are based on detecting breakpoints on a high-dimensional video content trajectory. This differs from most previous approaches, which are based on shot change detection and shot grouping. Having segmented continuous surveillance videos based on activity, the activity patterns contained in the video segments are grouped into activity classes and a composite video content model is constructed which is capable of generalising from a small training set to accommodate variations in unseen activity patterns. A run-time accumulative unusual activity measure is introduced to detect unusual behaviour, while usual activity patterns are recognised based on an online likelihood ratio test (LRT) method. This ensures robust and reliable activity recognition and unusual activity detection at the shortest possible time once sufficient visual evidence has become available. Comparative experiments have been carried out using over 10 h of challenging outdoor surveillance video footage to evaluate the proposed segmentation algorithms and modelling approach.
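A minimal sketch of an online likelihood ratio test of the kind the abstract describes, assuming toy log-densities for "usual" and "unusual" activity; the accumulate-until-a-bound decision rule shown here is a generic sequential LRT, not the paper's exact measure.

```python
import numpy as np

def online_lrt(observations, log_p_usual, log_p_unusual, bound=5.0):
    """Return (decision, frame index) at the first crossing of +/- bound."""
    llr = 0.0
    for i, x in enumerate(observations):
        llr += log_p_usual(x) - log_p_unusual(x)   # accumulate evidence
        if llr >= bound:
            return "usual", i
        if llr <= -bound:
            return "unusual", i
    return "undecided", len(observations) - 1

# Toy log-densities: usual activity centred at 0, unusual at 3.
lpu = lambda x: -0.5 * (x - 0.0) ** 2
lpa = lambda x: -0.5 * (x - 3.0) ** 2
obs = np.random.default_rng(2).normal(3.0, 1.0, 50)   # actually unusual
print(online_lrt(obs, lpu, lpa))                      # ('unusual', small index)
```

The early-exit structure is what gives a decision "at the shortest possible time once sufficient visual evidence has become available".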

12.
13.
Some theoretical and practical aspects of the score function (SF) approach for estimating the sensitivities of computer simulation models and solving the so-called “what if” problem (performance extrapolation) are considered. It is shown that both the sensitivities (gradients, Hessians, etc.) and the performance extrapolation can be derived simultaneously by simulating only a single sample path from the nominal system. It is also shown that the SF approach can be applied efficiently to DESS (discrete-event static systems, e.g., reliability models and stochastic networks) and to DEDS (discrete-event dynamic systems, e.g., queueing networks) under light traffic. A control variates procedure for variance reduction is presented as well.
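A worked example of the score-function estimator, using the standard identity d/dθ E[f(X)] = E[f(X) · d/dθ log p(X; θ)], on an exponential system where the answer is known in closed form (the distribution and f are our choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 2.0
x = rng.exponential(1.0 / theta, size=200_000)  # one batch from the nominal system

# log p(x; theta) = log(theta) - theta * x, so the score is 1/theta - x.
score = 1.0 / theta - x

perf = x.mean()              # estimates E[X] = 1/theta = 0.5
grad = (x * score).mean()    # estimates d/dtheta E[X] = -1/theta**2 = -0.25
print(perf, grad)
```

Both the performance estimate and its sensitivity come from the same sample path, which is exactly the property the abstract highlights.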

14.
Pattern Recognition Letters, 2003, 24(1–3): 113–128
This paper presents an efficient region-based motion segmentation method for segmenting moving objects in a traffic scene, with a focus on video monitoring systems (VMS). The method consists of two phases. First, in the motion detection phase, the positions of moving objects in the scene are determined using an adaptive thresholding method: instead of setting the threshold manually, we choose it automatically so that regions varying due to moving objects are detected. Second, in the motion segmentation phase, pixels with similar intensity and motion information are segmented by applying a weighted k-means clustering algorithm to the binary region of the motion mask obtained in the detection phase. In this way, the whole image need not be processed, reducing computation time. Experimental results demonstrate robustness not only to variations in luminance and changes in environmental conditions, but also to occlusions among multiple moving objects.
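The abstract does not name the adaptive thresholding rule; as one concrete possibility, Otsu's method picks the motion-detection threshold automatically from the frame-difference histogram.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Threshold maximising between-class variance of a 1-D sample."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (p[:k] * centers[:k]).sum() / w0    # class means below/above k
        m1 = (p[k:] * centers[k:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[k]
    return best_t

diff = np.abs(np.random.default_rng(4).normal(0, 5, 10_000))  # background noise
diff[:500] += 80                                              # moving-object pixels
t = otsu_threshold(diff)
print(t, (diff > t).mean())   # threshold lands between the two populations
```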

15.
Stitching motions from multiple videos into a single video scene is a challenging task in current video fusion and mosaicing research and in film production. In this paper, we present a novel method for video motion stitching based on the similarities of the trajectories and positions of foreground objects. First, multiple video sequences are registered in a common reference frame, where we estimate the static and dynamic backgrounds; the former serves to distinguish the foreground from the background and the static region from the dynamic region, while the latter is used to mosaic the warped input video sequences into a panoramic video. Motion similarity is then calculated from trajectory and position similarity, and the corresponding motion parts are extracted from the multiple video sequences. Finally, using these corresponding motion parts, the foregrounds of the different videos and the dynamic backgrounds are fused into a single video scene through Poisson editing, with the motions stitched together. Our major contributions are a framework for multiple-video mosaicing based on motion similarity and a method for calculating motion similarity from trajectory similarity and position similarity. Experiments on everyday videos show that the agreement of the trajectory and position similarities with the real motion similarity plays a decisive role in determining whether two motions can be stitched. We obtain satisfactory results for motion stitching and video mosaicing.
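A hedged sketch of combining trajectory and position similarity; the specific formulas (centred-trajectory distance, exponential kernels, equal-length trajectories) are our assumptions, not the paper's definitions.

```python
import numpy as np

def motion_similarity(a, b, w_traj=0.5, w_pos=0.5, sigma=5.0):
    """Combined similarity in (0, 1]; assumes a and b are resampled
    to the same number of points in the common reference frame."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    # Trajectory-shape similarity: compare the centred paths.
    shape = np.exp(-np.linalg.norm((a - a.mean(0)) - (b - b.mean(0)), axis=1).mean() / sigma)
    # Position similarity: compare where the paths sit in the frame.
    pos = np.exp(-np.linalg.norm(a.mean(0) - b.mean(0)) / sigma)
    return w_traj * shape + w_pos * pos

t = np.linspace(0, 2 * np.pi, 30)
walk1 = np.stack([t, np.sin(t)], axis=1)
walk2 = walk1 + [0.0, 0.3]    # same shape, slightly shifted: high similarity
print(motion_similarity(walk1, walk2))
```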

16.
In this paper, we present a framework for parsing video events with a stochastic Temporal And–Or Graph (T-AOG) and for unsupervised learning of the T-AOG from video. The T-AOG represents a stochastic event grammar. Its alphabet consists of a set of grounded spatial relations, including the poses of agents and their interactions with objects in the scene. The terminal nodes of the T-AOG are atomic actions, which are specified by a number of grounded relations over image frames. An And-node represents a sequence of actions; an Or-node represents a number of alternative ways of such concatenations. The And–Or nodes in the T-AOG can generate a set of valid temporal configurations of atomic actions, which can be equivalently represented as the language of a stochastic context-free grammar (SCFG). For each And-node we model the temporal relations of its children nodes to distinguish events with similar structures but different temporal patterns and to interpolate missing portions of events. This makes the T-AOG grammar context-sensitive. We propose an unsupervised learning algorithm that learns the atomic actions, the temporal relations and the And–Or nodes under the information projection principle in a coherent probabilistic framework. We also propose an event parsing algorithm based on the T-AOG which can understand events, infer the goals of agents, and predict their plausible intended actions. In comparison with existing methods, our paper makes the following contributions. (i) We represent events by a T-AOG with hierarchical compositions of events and the temporal relations between the sub-events. (ii) We learn the grammar, including atomic actions and temporal relations, automatically from the video data without manual supervision. (iii) Our algorithm infers the goals of agents and predicts their intents by a top-down process, handles event insertion and multi-agent events, keeps all possible interpretations of the video to preserve ambiguities, and achieves the globally optimal parsing solution in a Bayesian framework. (iv) The algorithm uses event context to improve the detection of atomic actions and to segment and recognize objects in the scene. Extensive experiments, covering indoor and outdoor scenes and single- and multi-agent events, were conducted to validate the effectiveness of the proposed approach.
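A toy illustration, not the paper's T-AOG, of how And/Or nodes generate the valid temporal configurations that form the grammar's language: And-nodes concatenate child expansions in order, Or-nodes choose one child.

```python
# Nodes are ("terminal", action), ("and", [children]) or ("or", [children]).
def expand(node):
    kind, children = node
    if kind == "terminal":
        return [[children]]
    if kind == "or":                       # one alternative per child
        return [s for c in children for s in expand(c)]
    seqs = [[]]                            # "and": concatenate in order
    for c in children:
        seqs = [s + t for s in seqs for t in expand(c)]
    return seqs

T = lambda a: ("terminal", a)
event = ("and", [T("approach"),
                 ("or", [T("open_door"), T("knock")]),
                 T("enter")])
print(expand(event))
# [['approach', 'open_door', 'enter'], ['approach', 'knock', 'enter']]
```

Attaching probabilities to Or-branches and temporal relations to And-children is what turns this enumeration into the stochastic, context-sensitive grammar the abstract describes.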

17.
《Real》2005,11(3):186-203
The accuracy of object-tracking methodologies can be significantly improved by exploiting knowledge about the monitored scene. Such scene knowledge includes the homography between the camera and ground planes and the occlusion landscape, i.e., the depth map associated with the static occlusions in the scene. Using the ground plane, a simple method relating the projected height and width of person objects to image location is used to constrain the dimensions of appearance models. Moreover, trajectory modelling can be greatly improved by performing tracking on the ground plane using global real-world noise models for the observation and dynamic processes. Finally, the occlusion landscape allows the tracker to predict the complete or partial occlusion of object observations. To facilitate plug-and-play functionality, this scene knowledge must be learnt automatically. The paper demonstrates how, over a sufficient length of time, observations from the monitored scene itself can be used to parameterize the semantic landscape.
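The ground-plane mapping the abstract relies on is ordinary projective geometry; a minimal sketch, with a hypothetical calibration matrix H.

```python
import numpy as np

def to_ground(H, image_pt):
    """Apply the image-to-ground homography H (3x3) to an image point (x, y)."""
    p = H @ np.array([image_pt[0], image_pt[1], 1.0])
    return p[:2] / p[2]          # dehomogenise

H = np.array([[0.05, 0.0,   -10.0],   # hypothetical calibration, not from the paper
              [0.0,  0.08,  -20.0],
              [0.0,  0.001,   1.0]])
print(to_ground(H, (320, 400)))        # ground-plane position of a foot point
```

Tracking in these ground coordinates is what lets the noise models be expressed in real-world units, as the abstract suggests.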

18.
This paper presents a method of synchronizing video sequences that exploits the non-rigidity of sets of 3D point features (e.g., anatomical joint locations) within the scene. The theory is developed for homography, perspective and affine projection models within a unified rank constraint framework that is computationally cheap. An efficient method is then presented that recovers potential frame correspondences, estimates possible synchronization parameters via the Hough transform and refines these parameters using non-linear optimization methods in order to recover synchronization to sub-frame accuracy, even for sequences of unknown and different frame rates. The method is evaluated quantitatively using synthetic data and demonstrated qualitatively on several real sequences.

19.
This paper presents a robust technique for temporally aligning multiple video sequences that have no spatial overlap between their fields of view. It is assumed that (i) a moving target with known trajectory is viewed by all cameras at non-overlapping periods in time, (ii) the target trajectory is estimated with a limited error at a constant sampling rate, and (iii) the sequences are recorded by stationary cameras with constant frame rates and fixed intrinsic and extrinsic parameters. The proposed approach reduces the problem of synchronizing N non-overlapping sequences to the problem of robustly estimating a single line from a set of appropriately generated points in ℝ^{N+1}. This line describes all temporal relations between the N sequences and the moving target. Our technique can handle arbitrarily large misalignments between the sequences and does not require any a priori information about their temporal relations. Experimental results with real-world and synthetic sequences demonstrate that our method can accurately align the videos.
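A generic RANSAC line fit, standing in for the paper's robust estimator: for N = 2 sequences the temporal relations correspond to a line in ℝ^3 that can be recovered despite gross outliers (the tolerance and data below are illustrative).

```python
import numpy as np

def ransac_line(points, iters=500, tol=0.5, seed=5):
    """Fit a line p = a + t*d to the rows of `points`, robust to outliers.

    Returns the boolean inlier mask of the best hypothesis found.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        a, d = points[i], points[j] - points[i]
        d = d / np.linalg.norm(d)
        r = points - a
        dist = np.linalg.norm(r - np.outer(r @ d, d), axis=1)  # point-line distance
        inliers = dist < tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return best

t = np.arange(100.0)
pts = np.stack([t, 0.5 * t + 3.0, t + 7.0], axis=1)  # consistent temporal relations
pts[::10] += np.random.default_rng(6).normal(0, 20.0, (10, 3))  # gross outliers
print(ransac_line(pts).sum())   # close to 90: the line survives the outliers
```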

20.
Algorithms for the maximum-flow problem can be grouped into two categories: augmenting-path algorithms [Ford LR, Fulkerson DR. Flows in networks. Princeton, NJ: Princeton University Press; 1962] and preflow-push algorithms [Goldberg AV, Tarjan RE. A new approach to the maximum flow problem. In: Proceedings of the 18th annual ACM symposium on theory of computing, 1986. p. 136–46]. Preflow-push algorithms suffer from a drawback known as the ping-pong effect. In this paper we present a technique that avoids this effect and can be considered an approach combining the augmenting-path and preflow-push methods. Extensive experimentation shows the effectiveness of the proposed approach.
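For reference, a textbook augmenting-path algorithm (Edmonds-Karp, BFS for the shortest augmenting path) of the first family the abstract cites; this is not the paper's combined technique.

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: {u: {v: capacity}}; returns the value of the maximum s -> t flow."""
    # Give every edge a reverse entry (capacity 0) so BFS can traverse
    # residual edges created by earlier augmentations.
    for u in list(cap):
        for v in list(cap[u]):
            cap.setdefault(v, {}).setdefault(u, 0)
    flow = {u: {v: 0 for v in cap[u]} for u in cap}
    total = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:           # BFS: shortest augmenting path
            u = q.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total                       # no augmenting path remains
        path, v = [], t                        # recover the path s -> t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:                      # augment, with residual bookkeeping
            flow[u][v] += push
            flow[v][u] -= push
        total += push

cap = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(cap, "s", "t"))   # 5
```

Preflow-push methods instead maintain node excesses and push flow locally; the ping-pong effect the abstract mentions arises when flow is pushed back and forth between two nodes.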
