Similar Literature: 20 similar results found.
1.
Simultaneously tracking poses of multiple people is a difficult problem because of inter-person occlusions and self occlusions. This paper presents an approach that circumvents this problem by performing tracking based on observations from multiple wide-baseline cameras. The proposed global occlusion estimation approach can deal with severe inter-person occlusions in one or more views by exploiting information from other views. Image features from non-occluded views are given more weight than image features from occluded views. Self occlusion is handled by local occlusion estimation. The local occlusion estimation is used to update the image likelihood function by sorting body parts as a function of distance to the cameras. The combination of the global and the local occlusion estimation leads to accurate tracking results at much lower computational costs. We evaluate the performance of our approach on a pose estimation data set in which inter-person and self occlusions are present. The results of our experiments show that our approach is able to robustly track multiple people during large movements with severe inter-person occlusions and self occlusions, whilst maintaining near real-time performance.
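The multi-view idea in this entry, weighting non-occluded views more heavily than occluded ones when fusing per-camera evidence, can be sketched as follows. This is a minimal illustration, not the paper's actual likelihood model: the linear down-weighting by occlusion fraction and the function name are assumptions.

```python
import numpy as np

def fused_log_likelihood(view_loglik, occlusion):
    """Combine per-view log-likelihoods of a pose hypothesis,
    down-weighting occluded views (hypothetical weighting scheme).

    view_loglik : log-likelihood of the hypothesis in each camera view
    occlusion   : estimated fraction of the person occluded per view, in [0, 1]
    """
    w = 1.0 - np.asarray(occlusion, dtype=float)  # non-occluded views count more
    w = w / w.sum()                               # normalize weights across views
    return float(np.dot(w, view_loglik))
```

With two views, a hypothesis scored well in a clear view and poorly in a heavily occluded one fuses to a better score than the reverse, which is the qualitative behavior the approach relies on.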

2.
A multi-object tracking method incorporating SPA-based occlusion segmentation
Multi-object video tracking in complex environments is a difficult problem in computer vision, and effective handling of inter-object occlusion is key to solving it. This paper introduces motion segmentation into the object tracking domain and proposes a multi-object tracking method that incorporates skeleton point assignment (SPA) based occlusion segmentation. Skeleton points are derived from low-level optical flow and their occlusion states are estimated; high-level semantic cues such as object appearance, motion, and color are then combined to assign the skeleton points to individual objects. Finally, using the skeleton points as kernels, the moving foreground is densely classified to obtain accurate per-object foreground pixels, and multi-object tracking is performed with a probabilistic appearance model within a particle filter framework. Experimental results on the PETS2009 dataset show that the proposed method remedies the poor adaptability of existing multi-object trackers to inter-object interaction and handles dynamic occlusion better.

3.
AD-HOC (Appearance Driven Human tracking with Occlusion Classification) is a complete framework for multiple people tracking in video surveillance applications in the presence of large occlusions. The appearance-based approach allows the estimation of the pixel-wise shape of each tracked person even during the occlusion. This peculiarity can be very useful for higher level processes, such as action recognition or event detection. A first step predicts the position of all the objects in the new frame while a MAP framework provides a solution for best placement. A second step associates each candidate foreground pixel to an object according to mutual object position and color similarity. A novel definition of non-visible regions accounts for the parts of the objects that are not detected in the current frame, classifying them as dynamic, scene or apparent occlusions. Results on surveillance videos are reported, using in-house produced videos and the PETS2006 test set.

4.
This work presents a new method for tracking and segmenting interacting objects over time within an image sequence. One major contribution of the paper is the formalization of the notion of visible and occluded parts. For each object, we aim at tracking these two parts. Assuming that the velocity of each object is driven by a dynamical law, predictions can be used to guide the successive estimations. Separating these predicted areas into good and bad parts with respect to the final segmentation and representing the objects with their visible and occluded parts permit handling partial and complete occlusions. To achieve this tracking, a label is assigned to each object and an energy function representing the multilabel problem is minimized via a graph cuts optimization. This energy contains terms based on image intensities which enable segmenting and regularizing the visible parts of the objects. It also includes terms dedicated to the management of the occluded and disappearing areas, which are defined on the areas of prediction of the objects. The results on several challenging sequences prove the strength of the proposed approach.

5.
This paper deals with the problem of position-based visual servoing in a multiarm robotic cell equipped with a hybrid eye-in-hand/eye-to-hand multicamera system. The proposed approach is based on the real-time estimation of the pose of a target object by using the extended Kalman filter. The data provided by all the cameras are selected by a suitable algorithm on the basis of the prediction of the object self-occlusions, as well as of the mutual occlusions caused by the robot links and tools. Only an optimal subset of image features is considered for feature extraction, thus ensuring high estimation accuracy with a computational cost independent of the number of cameras. A salient feature of the paper is the implementation of the proposed approach in the case of a robotic cell composed of two industrial robot manipulators. Two different case studies are presented to test the effectiveness of the hybrid camera configuration and the robustness of the visual servoing algorithm with respect to the occurrence of occlusions.
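The extended Kalman filter at the core of this entry alternates a predict step (propagate the pose estimate through the motion model) with an update step (correct it against extracted image features). A generic single-cycle sketch is below; the linear matrices stand in for the Jacobians of the paper's actual (unspecified) motion and camera models, and all names are assumptions.

```python
import numpy as np

def ekf_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a (linearized) Kalman filter.

    x, P : prior state estimate (e.g. object pose) and its covariance
    z    : current measurement (e.g. image-feature coordinates)
    F, H : state-transition and measurement Jacobians at the estimate
    Q, R : process and measurement noise covariances
    """
    # Predict: propagate state and covariance through the motion model
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement
    y = z - H @ x_pred                    # innovation
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

Iterating `ekf_step` over a feature stream drives the estimate toward the measurements while the covariance tracks the remaining uncertainty.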

6.
Vision for Robotics: a tool for model-based object tracking
Vision for Robotics (V4R) is a software package for tracking rigid objects in unknown surroundings. Its output is the 3-D pose of the target object, which can be further used as an input to control, e.g., the end effector of a robot. The major goals are tracking at camera frame rate and robustness. The latter is achieved by performing cue integration in order to compensate for weaknesses of individual cues. Therefore, features such as lines and ellipses are not only extracted from 2-D images, but the 3-D model and the pose of the object are also exploited.

7.
In this article we present the integration of 3-D shape knowledge into a variational model for level set based image segmentation and contour based 3-D pose tracking. Given the surface model of an object that is visible in the image of one or multiple cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is applied to estimate the 3-D pose parameters of the object. Vice versa, the surface model projected to the image plane helps in a top-down manner to improve the extraction of the contour. While common alternative segmentation approaches, which integrate 2-D shape knowledge, face the problem that an object can look very different from various viewpoints, a 3-D free form model ensures that for each view the model can fit the data in the image very well. Moreover, one additionally solves the problem of determining the object’s pose in 3-D space. The performance is demonstrated by numerous experiments with a monocular and a stereo camera system.

8.
Tracking multiple objects is more challenging than tracking a single object. Some problems arise in multiple-object tracking that do not exist in single-object tracking, such as object occlusion, the appearance of a new object and the disappearance of an existing object, updating the occluded object, etc. In this article, we present an approach to handling multiple-object tracking in the presence of occlusions, background clutter, and changing appearance. The occlusion is handled by considering the predicted trajectories of the objects based on a dynamic model and likelihood measures. We also propose target-model-update conditions, ensuring the proper tracking of multiple objects. The proposed method is implemented in a probabilistic framework such as a particle filter in conjunction with a color feature. The particle filter has proven very successful for nonlinear and non-Gaussian estimation problems. It approximates a posterior probability density of the state, such as the object’s position, by using samples or particles, where each state is denoted as the hypothetical state of the tracked object and its weight. The observation likelihood of the objects is modeled based on a color histogram. The sample weight is measured based on the Bhattacharyya coefficient, which measures the similarity between each sample’s histogram and a specified target model. The algorithm can successfully track multiple objects in the presence of occlusion and noise. Experimental results show the effectiveness of our method in tracking multiple objects.
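The weighting scheme this entry describes, a color histogram per particle, compared to the target model with the Bhattacharyya coefficient, can be sketched directly. This is a minimal single-channel illustration under assumed parameters (8 bins, Gaussian kernel with bandwidth `sigma`); the paper's exact histogram space and kernel are not specified here.

```python
import numpy as np

def color_histogram(patch, bins=8):
    # Normalized intensity histogram of an image patch (values in [0, 1)).
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def bhattacharyya(p, q):
    # Similarity in [0, 1] between two normalized histograms.
    return float(np.sum(np.sqrt(p * q)))

def particle_weights(patches, target_hist, sigma=0.2):
    # Weight each particle by the Bhattacharyya similarity of its
    # histogram to the target model, via the usual distance
    # d = sqrt(1 - BC) pushed through a Gaussian kernel.
    d = np.array([np.sqrt(1.0 - bhattacharyya(color_histogram(p), target_hist))
                  for p in patches])
    w = np.exp(-d**2 / (2.0 * sigma**2))
    return w / w.sum()   # normalized importance weights
```

A particle whose patch matches the target's color distribution receives nearly all the normalized weight, which is what keeps the filter locked onto the object through clutter.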

9.
We propose an edge-based method for 6DOF pose tracking of rigid objects using a monocular RGB camera. One of the critical problems for edge-based methods is searching for the object contour points in the image corresponding to the known 3D model points. However, previous methods often produce false object contour points in cases of cluttered backgrounds and partial occlusions. In this paper, we propose a novel edge-based 3D object tracking method to tackle this problem. To search the object contour points, foreground and background clutter points are first filtered out using an edge color cue, then object contour points are searched by maximizing their edge confidence, which combines edge color and distance cues. Furthermore, the edge confidence is integrated into the edge-based energy function to reduce the influence of false contour points caused by cluttered backgrounds and partial occlusions. We also extend our method to multi-object tracking which can handle mutual occlusions. We compare our method with recent state-of-the-art methods on challenging public datasets. Experiments demonstrate that our method improves robustness and accuracy against cluttered backgrounds and partial occlusions.

10.
This paper presents a flexible framework to build a target-specific, part-based representation for arbitrary articulated or rigid objects. The aim is to successfully track the target object in 2D, through multiple scales and occlusions. This is realized by employing a hierarchical, iterative optimization process on the proposed representation of structure and appearance. Therefore, each rigid part of an object is described by a hierarchical spring system represented by an attributed graph pyramid. Hierarchical spring systems encode the spatial relationships of the features (attributes of the graph pyramid) describing the parts and enforce them by spring-like behavior during tracking. Articulation points connecting the parts of the object allow position information to be transferred from reliable to ambiguous parts. Tracking is done in an iterative process by combining the hypotheses of simple trackers with the hypotheses extracted from the hierarchical spring systems.

12.
《Advanced Robotics》2013,27(10):1057-1072
It is an easy task for the human visual system to gaze continuously at an object moving in three-dimensional (3-D) space. While tracking the object, human vision seems able to comprehend its 3-D shape with binocular vision. We conjecture that, in the human visual system, the function of comprehending the 3-D shape is essential for robust tracking of a moving object. In order to examine this conjecture, we constructed an experimental system of binocular vision for motion tracking. The system is composed of a pair of active pan-tilt cameras and a robot arm. The cameras are for simulating the two eyes of a human while the robot arm is for simulating the motion of the human body below the neck. The two active cameras are controlled so as to fix their gaze at a particular point on an object surface. The shape of the object surface around the point is reconstructed in real-time from the two images taken by the cameras based on the differences in the image brightness. If the two cameras successfully gaze at a single point on the object surface, it is possible to reconstruct the local object shape in real-time. At the same time, the reconstructed shape is used for keeping a fixation point on the object surface for gazing, which enables robust tracking of the object. Thus these two processes, reconstruction of the 3-D shape and maintaining the fixation point, must be mutually connected and form one closed loop. We demonstrate the effectiveness of this framework for visual tracking through several experiments.

13.
A camera mounted on an aerial vehicle provides an excellent means for monitoring large areas of a scene. Utilizing several such cameras on different aerial vehicles allows further flexibility, in terms of increased visual scope and in the pursuit of multiple targets. In this paper, we address the problem of associating objects across multiple airborne cameras. Since the cameras are moving and often widely separated, direct appearance-based or proximity-based constraints cannot be used. Instead, we exploit geometric constraints on the relationship between the motion of each object across cameras, to test multiple association hypotheses, without assuming any prior calibration information. Given our scene model, we propose a likelihood function for evaluating a hypothesized association between observations in multiple cameras that is geometrically motivated. Since multiple cameras exist, ensuring coherency in association is an essential requirement, e.g. that transitive closure is maintained between more than two cameras. To ensure such coherency we pose the problem of maximizing the likelihood function as a k-dimensional matching and use an approximation to find the optimal assignment of association. Using the proposed error function, canonical trajectories of each object and optimal estimates of inter-camera transformations (in a maximum likelihood sense) are computed. Finally, we show that as a result of associating objects across the cameras, a concurrent visualization of multiple aerial video streams is possible and that, under special conditions, trajectories interrupted due to occlusion or missing detections can be repaired. Results are shown on a number of real and controlled scenarios with multiple objects observed by multiple cameras, validating our qualitative models, and through simulation quantitative performance is also reported.

14.
This paper proposes a robust tracking method combining appearance modeling and sparse representation. In this method, the appearance of an object is modeled by multiple linear subspaces. Then, within the sparse representation framework, we construct a similarity measure to evaluate the distance between a target candidate and the learned appearance model. Finally, tracking is achieved by Bayesian inference, in which a particle filter is used to estimate the target state sequentially over time. With the tracking result, the learned appearance model is updated adaptively. The combination of appearance modeling and sparse representation makes our tracking algorithm robust to most possible target variations due to illumination changes, pose changes, deformations and occlusions. Theoretical analysis and experiments compared with state-of-the-art methods demonstrate the effectiveness of the proposed algorithm.
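The "distance between a candidate and a multi-subspace appearance model" idea above can be illustrated with reconstruction residuals: score the candidate against each learned subspace and keep the smallest error. Note this sketch substitutes plain least squares for the paper's sparse coding step, and all names are assumptions.

```python
import numpy as np

def subspace_distance(candidate, basis):
    """Reconstruction error of a candidate feature vector against the
    linear subspace spanned by the columns of `basis` (least-squares
    stand-in for a sparse-coding solver)."""
    coef, *_ = np.linalg.lstsq(basis, candidate, rcond=None)
    return float(np.linalg.norm(candidate - basis @ coef))

def best_subspace(candidate, subspaces):
    # The appearance model keeps several subspaces; score the candidate
    # against each and keep the smallest residual as the similarity measure.
    errs = [subspace_distance(candidate, B) for B in subspaces]
    return int(np.argmin(errs)), min(errs)
```

A candidate lying in one of the modeled subspaces gets a near-zero residual there, so the tracker can both score the candidate and identify which appearance mode it matches.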

15.
Multi-camera coordination in surveillance systems
This paper describes a distributed surveillance system for tracking multiple targets in indoor settings. The system is built from several inexpensive fixed-lens cameras, with one processing module per camera and a central module that coordinates the tracking tasks among the cameras. Because each moving target may be tracked by several cameras simultaneously, selecting the most suitable camera for a given target, especially when system resources are scarce, becomes a problem. The proposed algorithm assigns targets to cameras according to the distance between target and camera while taking occlusion into account, so that when occlusion occurs, the system assigns the occluded target to the nearest camera that can still see it. Experiments show that the system coordinates multiple cameras well for target tracking and handles occlusion properly.
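The camera-assignment rule in this entry, give each target to the nearest camera that can still see it, reduces to a small selection loop. A minimal sketch (the function name and data layout are assumptions):

```python
def assign_cameras(distances, visible):
    """Assign each target to the nearest camera that still sees it.

    distances[c][t] : distance from camera c to target t
    visible[c][t]   : False if target t is occluded in camera c's view
    Returns a list mapping target index -> camera index (None if no
    camera currently sees the target).
    """
    n_cams, n_targets = len(distances), len(distances[0])
    assignment = []
    for t in range(n_targets):
        # Consider only cameras with an unoccluded view of this target.
        candidates = [(distances[c][t], c)
                      for c in range(n_cams) if visible[c][t]]
        assignment.append(min(candidates)[1] if candidates else None)
    return assignment
```

When the nearest camera loses a target to occlusion, the target simply falls through to the next-nearest camera with a clear view, which matches the hand-off behavior described above.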

16.
Model-based 3-D object tracking has earned significant importance in areas such as augmented reality, surveillance, visual servoing, robotic object manipulation and grasping. Key problems for robust and precise object tracking are the outliers caused by occlusion, self-occlusion, cluttered background, reflections and complex appearance properties of the object. Two of the most common solutions to the above problems have been the use of robust estimators and the integration of visual cues. The tracking system presented in this paper achieves robustness by integrating model-based and model-free cues together with robust estimators. As a model-based cue, a wireframe edge model is used. As model-free cues, automatically generated surface texture features are used. The particular contribution of this work is the integration framework, where not only polyhedral objects are considered. In particular, we also deal with spherical, cylindrical and conical objects, for which the complete pose cannot be estimated using only wireframe models. Using the integration with the model-free features, we show how a full pose estimate can be obtained. Experimental evaluation demonstrates robust system performance in realistic settings with highly textured objects and natural backgrounds.

17.
Robust visual tracking remains a technical challenge in real-world applications, as an object may involve many appearance variations. In existing tracking frameworks, objects in an image are often represented as vector observations, which discounts the 2-D intrinsic structure of the image. By considering an image in its actual form as a matrix, we construct a third-order tensor based object representation to preserve the spatial correlation within the 2-D image and fully exploit the useful temporal information. We perform incremental update of the object template using the N-mode SVD to model the appearance variations, which reduces the influence of template drifting and object occlusions. The proposed scheme efficiently learns a low-dimensional tensor representation through adaptively updating the eigenbasis of the tensor. Tensor based Bayesian inference in the particle filter framework is then utilized to realize tracking. We present the validation of the proposed tracking system by conducting real-time facial expression recognition with video data and a live camera. Experimental evaluation on challenging benchmark image sequences undergoing appearance variations demonstrates the significance and effectiveness of the proposed algorithm.
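The N-mode SVD that this entry uses to maintain the eigenbasis works on mode-n unfoldings of the image tensor: flatten the tensor along one mode, then take the leading left singular vectors. A minimal batch sketch (the incremental-update machinery of the paper is omitted; function names are assumptions):

```python
import numpy as np

def mode_unfold(T, mode):
    # Mode-n unfolding: the mode-n fibers of the tensor become the
    # columns of a (shape[mode] x prod(other dims)) matrix.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_basis(T, mode, rank):
    # Leading left singular vectors of the mode-n unfolding: the
    # per-mode eigenbasis a tensor tracker would update over time.
    U, _, _ = np.linalg.svd(mode_unfold(T, mode), full_matrices=False)
    return U[:, :rank]
```

For a height x width x time stack of object templates, `mode_basis` along each of the three modes yields the low-dimensional spatial and temporal bases that the appearance model retains.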

18.
The majority of existing tracking algorithms are based on the maximum a posteriori solution of a probabilistic framework using a Hidden Markov Model, where the distribution of the object state at the current time instance is estimated based on current and previous observations. However, this approach is prone to errors caused by distractions such as occlusions, background clutter and multi-object confusions. In this paper, we propose a multiple object tracking algorithm that seeks the optimal state sequence that maximizes the joint multi-object state-observation probability. We call this algorithm trajectory tracking since it estimates the state sequence or “trajectory” instead of the current state. The algorithm is capable of tracking an unknown, time-varying number of objects. We also introduce a novel observation model which is composed of the original image, the foreground mask given by background subtraction and the object detection map generated by an object detector. The image provides the object appearance information. The foreground mask enables the likelihood computation to consider the multi-object configuration in its entirety. The detection map consists of pixel-wise object detection scores, which drives the tracking algorithm to perform joint inference on both the number of objects and their configurations efficiently. The proposed algorithm has been implemented and tested extensively in a complete CCTV video surveillance system to monitor entries and detect tailgating and piggy-backing violations at access points for over six months. The system achieved 98.3% precision in event classification. The violation detection rate is 90.4% and the detection precision is 85.2%. The results clearly demonstrate the advantages of the proposed detection based trajectory tracking framework.

19.
Intelligent visual surveillance — A survey
Detection, tracking, and understanding of moving objects of interest in dynamic scenes have been active research areas in computer vision over the past decades. Intelligent visual surveillance (IVS) refers to an automated visual monitoring process that involves analysis and interpretation of object behaviors, as well as object detection and tracking, to understand the visual events of the scene. Main tasks of IVS include scene interpretation and wide-area surveillance control. Scene interpretation aims at detecting and tracking moving objects in an image sequence and understanding their behaviors. In the wide-area surveillance control task, multiple cameras or agents are controlled in a cooperative manner to monitor tagged objects in motion. This paper reviews recent advances and future research directions of these tasks. The article consists of two parts: the first part surveys image enhancement, moving object detection and tracking, and motion behavior understanding; the second part reviews wide-area surveillance techniques based on the fusion of multiple visual sensors, camera calibration and cooperative camera systems.

20.
In the spirit of recent work on contextual recognition and estimation, we present a method for estimating the pose of human hands, employing information about the shape of the object in the hand. Despite the fact that most applications of human hand tracking involve grasping and manipulation of objects, the majority of methods in the literature assume a free hand, isolated from the surrounding environment. Occlusion of the hand from grasped objects does in fact often pose a severe challenge to the estimation of hand pose. In the presented method, object occlusion is not only compensated for, it contributes to the pose estimation in a contextual fashion; this without an explicit model of object shape. Our hand tracking method is non-parametric, performing a nearest neighbor search in a large database (.. entries) of hand poses with and without grasped objects. The system, which operates in real time, is robust to self occlusions, object occlusions and segmentation errors, and provides full hand pose reconstruction from monocular video. Temporal consistency in hand pose is taken into account, without explicitly tracking the hand in the high-dimensional pose space. Experiments show the non-parametric method to outperform other state-of-the-art regression methods, while operating at a significantly lower computational cost than comparable model-based hand tracking methods.
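At its core, the non-parametric lookup this entry describes is a nearest-neighbor query over a database of pose feature vectors. A minimal brute-force sketch (real-time systems would use an approximate index instead; names are assumptions):

```python
import numpy as np

def nearest_pose(query, database):
    """Non-parametric pose lookup: return the index of the database
    entry (one feature vector per row) closest to the query image
    descriptor in Euclidean distance."""
    d = np.linalg.norm(database - query, axis=1)
    return int(np.argmin(d))
```

The retrieved index points at a stored hand pose (with or without a grasped object), which becomes the pose estimate for the current frame; temporal consistency can then be imposed by restricting or re-ranking candidates near the previous frame's match.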
