Found 20 similar documents (search time: 0 ms)
2.
This paper demonstrates innovative techniques for estimating the trajectory of a soccer ball from multiple fixed cameras. Since the ball is nearly always moving and frequently occluded, its apparent size and shape vary over time and between cameras. Knowledge about the soccer domain is expressed in terms of field, object and motion models, which are used to distinguish the ball from other movements in the tracking and matching processes. Using ground plane velocity, longevity, normalized size and color features, each of the tracks obtained from a Kalman filter is assigned a likelihood measure of representing the ball. This measure is further refined by reasoning through occlusions and back-tracking in the track history, which is shown to improve the accuracy and continuity of the results. Finally, a simple 3D trajectory model is presented, and the estimated 3D ball positions are fed back to constrain the 2D processing for more efficient and robust detection and tracking. Experimental results with quantitative evaluations from several long sequences are reported.
4.
A new algorithm is introduced for tracking multiple features in an image sequence. First, the proposed method iteratively reduces the disparity of each possible match by relaxation labeling. All trajectories are assumed to be smooth, and smoothness is used as the measure for correspondence. Some wrong correspondences can be recovered during tracking by a proposed scheme called constraint-aided exchange. The algorithm can also detect and predict occluded or missing feature points. Finally, the algorithm is applied to data obtained from real-world scenes, where it is used for human motion analysis.
5.
Multimedia Tools and Applications - Particle filters have been proven very successful for non-linear and non-Gaussian estimation problems and extensively used in object tracking. However, high...
7.
Tracking pedestrians is a vital component of many computer vision applications, including surveillance, scene understanding, and behavior analysis. Videos of crowded scenes present significant challenges to tracking due to the large number of pedestrians and the frequent partial occlusions that they produce. The movement of each pedestrian, however, contributes to the overall crowd motion (i.e., the collective motions of the scene's constituents over the entire video) that exhibits an underlying spatially and temporally varying structured pattern. In this paper, we present a novel Bayesian framework for tracking pedestrians in videos of crowded scenes using a space-time model of the crowd motion. We represent the crowd motion with a collection of hidden Markov models trained on local spatio-temporal motion patterns, i.e., the motion patterns exhibited by pedestrians as they move through local space-time regions of the video. Using this unique representation, we predict the next local spatio-temporal motion pattern a tracked pedestrian will exhibit based on the observed frames of the video. We then use this prediction as a prior for tracking the movement of an individual in videos of extremely crowded scenes. We show that our approach of leveraging the crowd motion enables tracking in videos of complex scenes that present unique difficulty to other approaches.
8.
Multimedia Tools and Applications - Vehicle re-identification (re-ID) plays an important role in the automatic analysis of the increasing urban surveillance videos and has become a hot topic in...
9.
This work presents an approach to behavior understanding using multiple cameras. This approach is appropriate for monitoring
people in an assistive environment for the purpose of issuing alerts in cases of abnormal behavior. The output of multiple
classifiers is used to model and extract abnormal behavior from both the target trajectory and the target short-term activity
(i.e., walking, running, abrupt motion, etc.). Spatial information is obtained after an offline camera registration using
homography information. The proposed approach is verified experimentally in an indoor environment. The experiments are performed
with a single moving target; however, the method can be generalized to multiple moving targets, which may occlude each other,
due to the use of multiple cameras.
10.
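To address the problem that the SURF algorithm extracts relatively few image feature points, image features are extracted by reconstructing the SURF scale space on the basis of brightness-preserving bi-histogram equalization. This method is combined with Kalman filtering for target tracking, with the center of the feature points used as the tracking point. The Kalman filter predicts the position of the moving target, and this prediction is used to determine whether occlusion has occurred. Finally, the method is applied to match target feature vectors. Experimental results show that the algorithm stably tracks multiple moving targets undergoing rotation, scaling, and occlusion, and that its tracking speed is 20% higher than that of the R-SURF algorithm; at comparable tracking speed, its tracking accuracy is higher than that of the Kalman filter tracking algorithm.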
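The predict-then-correct loop mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional constant-velocity motion model, treats a missing measurement (`None`) as an occluded frame, and uses illustrative noise values. The class name `ConstantVelocityKalman1D` and the helper `track` are hypothetical.

```python
class ConstantVelocityKalman1D:
    """1D Kalman filter with state (position, velocity); assumed motion model."""

    def __init__(self, process_var=1e-4, meas_var=1e-2, init_var=100.0):
        self.p = 0.0                      # position estimate
        self.v = 0.0                      # velocity estimate
        self.P = [[init_var, 0.0], [0.0, init_var]]  # state covariance
        self.q = process_var              # process noise (illustrative value)
        self.r = meas_var                 # measurement noise (illustrative value)

    def predict(self, dt=1.0):
        """Propagate state and covariance one step: P <- F P F^T + Q."""
        self.p += self.v * dt
        P = self.P
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.p

    def update(self, z):
        """Correct the prediction with a position measurement z (H = [1, 0])."""
        P = self.P
        s = P[0][0] + self.r              # innovation covariance
        k0, k1 = P[0][0] / s, P[1][0] / s  # Kalman gain
        y = z - self.p                    # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]

    def step(self, z, dt=1.0):
        """Predict, then update only if the target was observed (not occluded)."""
        predicted = self.predict(dt)
        if z is not None:
            self.update(z)
        return predicted


def track(measurements):
    """Run the filter over a sequence; None marks an occluded frame."""
    kf = ConstantVelocityKalman1D()
    return [kf.step(z) for z in measurements], kf
```

For example, `track([0.0, 1.0, 2.0, 3.0, 4.0, None])` coasts through the final occluded frame on the prediction alone, so the last returned position stays close to where a constant-velocity target would be.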
11.
This paper presents a multiple model real-time tracking technique for video sequences, based on the mean-shift algorithm.
The proposed approach incorporates spatial information from several connected regions into the histogram-based representation
model of the target, and enables multiple models to be used to represent the same object. The use of several regions to capture
the color spatial information into a single combined model allows us to increase the object tracking efficiency. By using multiple models, we can make the tracking scheme more
robust in order to work with sequences with illumination and pose changes. We define a model selection function that takes
into account both the similarity of the model with the information present in the image, and the target dynamics. In the tracking
experiments presented, our method successfully coped with lighting changes, occlusion, and clutter.
13.
This paper describes a visual tool for teleoperative experimentation involving remote manipulation and contact tasks. Using modest hardware, it recovers in real time the pose of moving polyhedral objects, and presents a synthetic view of the scene to the operator of a teleoperated robot using any chosen viewpoint and viewing direction. To recover pose, the method of line tracking first introduced by Harris (1992) is extended to multiple calibrated cameras, and its dynamic performance is improved using robust methods and iterative filtering. Experiments are reported which determine the static and dynamic performance of the vision system, and its use in teleoperation is illustrated in two experiments: a peg-in-hole manipulation task and an impact control task.
14.
This paper presents an approach for detecting suspicious events in videos by using only the video itself as the training samples for valid behaviors. These salient events are obtained in real-time by detecting anomalous spatio-temporal regions in a densely sampled video. The method codes a video as a compact set of spatio-temporal volumes, while considering the uncertainty in the codebook construction. The spatio-temporal compositions of video volumes are modeled using a probabilistic framework, which calculates their likelihood of being normal in the video. This approach can be considered as an extension of the Bag of Video words (BOV) approaches, which represent a video as an order-less distribution of video volumes. The proposed method imposes spatial and temporal constraints on the video volumes so that an inference mechanism can estimate the probability density functions of their arrangements. Anomalous events are assumed to be video arrangements with very low frequency of occurrence. The algorithm is very fast and does not employ background subtraction, motion estimation or tracking. It is also robust to spatial and temporal scale changes, as well as some deformations. Experiments were performed on four video datasets of abnormal activities in both crowded and non-crowded scenes and under difficult illumination conditions. The proposed method outperformed all other approaches based on BOV that do not account for contextual information.
15.
In recent years, several methods have been proposed to combine multiple kernels using a weighted linear sum of kernels. These different kernels may be using information coming from multiple sources or may correspond to using different notions of similarity on the same source. We note that such methods, in addition to the usual ones of the canonical support vector machine formulation, introduce new regularization parameters that affect the solution quality and, in this work, we propose to optimize them using response surface methodology on cross-validation data. On several bioinformatics and digit recognition benchmark data sets, we compare multiple kernel learning and our proposed regularized variant in terms of accuracy, support vector count, and the number of kernels selected. We see that our proposed variant achieves statistically similar or higher accuracy results by using fewer kernel functions and/or support vectors through suitable regularization; it also allows better knowledge extraction because unnecessary kernels are pruned and the favored kernels reflect the properties of the problem at hand.
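The "weighted linear sum of kernels" in the abstract above can be written as k(x, y) = Σ_m w_m · k_m(x, y). The sketch below only illustrates that combination; it does not learn the weights or implement the regularized variant the abstract proposes. The function names, the choice of a linear and an RBF base kernel, and the `gamma` value are assumptions made for illustration.

```python
import math


def linear_kernel(x, y):
    """Dot-product kernel."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, kernels, weights):
    """Weighted linear sum of base kernels: sum_m w_m * k_m(x, y)."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    return sum(w * k(x, y) for k, w in zip(kernels, weights))
```

A kernel whose weight is zero contributes nothing to the sum, which is the sense in which an unnecessary kernel can be pruned.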
16.
We propose an algorithm for automatically obtaining a segmentation of a rigid object in a sequence of images that are calibrated for camera pose and intrinsic parameters. Until recently, the best segmentation results have been obtained by interactive methods that require manual labelling of image regions. Our method requires no user input but instead relies on the camera fixating on the object of interest during the sequence. We begin by learning a model of the object’s colour, from the image pixels around the fixation points. We then extract image edges and combine these with the object colour information in a volumetric binary MRF model. The globally optimal segmentation of 3D space is obtained by a graph-cut optimisation. From this segmentation an improved colour model is extracted and the whole process is iterated until convergence.
17.
Real-time estimation of crowd size is a central task in civilian surveillance. In this paper we present a novel system for counting people in a crowd scene observed by overlapping cameras. The system fuses the foreground information from all single views to localize each person present in the scene. The purpose of our fusion strategy is to use the foreground pixels of each single view to improve real-time object association between the cameras of the network. The foreground pixels are obtained using a codebook-based algorithm. In this work, we aggregate the resulting silhouettes over the camera network and compute a planar homography projection of each camera’s visual hull onto the ground plane. The visual hull is obtained by finding the convex hull of the foreground pixels. After the projection onto the ground plane, we fuse the resulting polygons using the geometric properties of the scene and the quality of each camera's detection. We also propose a region-based tracking strategy that keeps track of people's movements and identities over time and tolerates occasional misdetections. This tracking strategy is applied to the result of the view fusion and allows the crowd size to be estimated in each frame. Experiments on public datasets proposed for evaluating people-counting systems demonstrate the performance of our fusion approach. The results show that the fusion strategy can run in real time and is efficient for data association. We also show that combining our fusion approach with the proposed tracking improves people counting.
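The planar homography projection mentioned in the abstract above maps an image point (x, y) to a ground-plane point by multiplying its homogeneous coordinates by a 3×3 matrix H and dividing by the third component. The sketch below shows only that mapping, under the assumption that H has already been estimated by some calibration step; the function names and the example matrix are hypothetical and not taken from the paper.

```python
def project_to_ground(H, point):
    """Apply a 3x3 homography H (nested lists) to an image point (x, y)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)  # back from homogeneous to Cartesian coordinates


def project_polygon(H, polygon):
    """Project every vertex of an image-plane polygon onto the ground plane."""
    return [project_to_ground(H, p) for p in polygon]
```

As a usage example, with the made-up matrix `H = [[2, 0, 1], [0, 2, 3], [0, 0, 1]]` the image point `(1, 1)` maps to `(3.0, 5.0)`. Projecting each camera's polygon with its own H places all views in one common ground-plane frame, where they can then be fused.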
18.
Matching objects across multiple cameras with non-overlapping views is a necessary but difficult task in wide-area video surveillance. Owing to the lack of spatio-temporal information, only the visual information can be used in some scenarios, especially when the cameras are widely separated. This paper proposes a novel framework based on multi-feature fusion and incremental learning to match objects across disjoint views in the absence of space–time cues. We first develop a competitive major feature histogram fusion representation (CMFH) to formulate the appearance model for characterizing the potentially matching objects. The appearances of the objects can change over time, and hence the models should be continuously updated. We then adopt an improved incremental general multicategory support vector machine algorithm (IGMSVM) to update the appearance models online and match the objects based on a classification method. Only a small number of samples is needed to build an accurate classification model in our method. Several tests are performed on the CAVIAR, ISCAPS and VIPeR databases, where the objects change significantly due to variations in viewpoint, illumination and pose. Experimental results demonstrate the advantages of the proposed methodology in terms of computational efficiency, computation storage, and matching accuracy over other state-of-the-art classification-based matching approaches. The system developed in this research can be used in real-time video surveillance applications.
19.
We study the problem of rewriting queries using views in the presence of access patterns, integrity constraints, disjunction and negation. We provide asymptotically optimal algorithms for (1) finding minimally containing and (2) maximally contained rewritings respecting the access patterns (which we call executable) and for (3) deciding whether an exact executable rewriting exists. We show that rewriting queries using views in this case reduces (a) to rewriting queries with access patterns and constraints without views and also (b) to rewriting queries using views under constraints without access patterns. We show how to solve (a) directly and how to reduce (b) to rewriting queries under constraints only (semantic optimization). These reductions provide two separate routes to a unified solution for problems 1, 2 and 3 based on an extension of the relational chase theory to queries and constraints with disjunction and negation. We also handle equality and arithmetic comparisons. We also show that in an information integration setting, maximally contained rewritings are given by the certain answers (under the usual semantics) for a set of constraints derived from the binding patterns. That is, except for defining the appropriate constraints, binding patterns do not need special treatment. Finally, we show that if there is an exact executable rewriting, there is an executable rewriting which is a union of conjunctive queries with negation.
20.
Accurate prediction of protein-ligand binding affinities for lead optimization in drug discovery remains an important and challenging problem for scoring functions in docking simulation. In this paper, we propose a data-driven approach that integrates multiple scoring functions to predict protein-ligand binding affinity directly. We then propose a new method called multiple instance regression based scoring (MIRS) that incorporates unbound ligand conformations using multiple scoring functions. We evaluated the predictive performance of MIRS using 100 protein-ligand complexes and their binding affinities. The experimental results showed that MIRS outperformed the 11 conventional scoring functions, including LigScore, PLP, AutoDock, G-Score, D-Score, LUDI, F-Score, ChemScore, X-Score, PMF, and DrugScore. In addition, we confirmed that MIRS performed well on binding pose prediction. Our results reveal that it is indispensable to incorporate unbound ligand conformations in both binding affinity prediction and binding pose prediction. The proposed method will accelerate efficient lead optimization in structure-based drug design and provide a new direction for designing new scoring functions.