Similar Documents
20 similar documents found.
1.
How far can human detection and tracking go in real-world crowded scenes? Many algorithms often fail in such scenes due to frequent and severe occlusions as well as viewpoint changes. To handle these difficulties, we propose Scene Aware Detection (SAD) and Block Assignment Tracking (BAT), which incorporate available scene models (e.g. background, layout, ground plane and camera models). SAD achieves accurate detection by utilizing 1) the camera model to deal with viewpoint changes by rectifying sub-images, 2) a structural filter approach to handle occlusions based on a feature-sharing mechanism in which a three-level hierarchical structure is built for humans, and 3) foregrounds for pruning negative and false-positive samples and merging intermediate detection results. Many detection- or appearance-based tracking systems are prone to errors in occluded scenes because of detector failures and interactions of multiple objects. In contrast, BAT formulates tracking as a block assignment process, where blocks with the same label form the appearance of one object. In BAT, we model objects on two levels: the ensemble level, which measures how likely a region is to be an object using discriminative models, and the block level, which measures how well it matches a target object using appearance and motion models. The main advantage of BAT is that it can track an object even when all the part detectors fail, as long as the object has assigned blocks. Extensive experiments in many challenging real-world scenes demonstrate the efficiency and effectiveness of our approach.

2.
Objective: The tracking accuracy of existing human pose tracking algorithms still leaves room for improvement, especially for highly mobile parts such as the arms. To improve pose tracking accuracy, this paper proposes, for the first time, a human pose tracking method that combines visual spatio-temporal information with a deep learning network. Method: During tracking, temporal information in the video is used to compute motion information for the human target region, which propagates the part pose models between frames. Since methods based on image spatial features detect relatively rigid parts such as the torso and head well but perform poorly on the arms, a lightweight deep network is constructed and trained to generate additional candidate samples for the arm parts. The network also produces an arm feature-consistency probability map, which is combined with the spatial information of the video to compute the optimal part poses, and the parts are reassembled into the complete pose tracking result. Results: The method is evaluated on two challenging pose tracking datasets, VideoPose2.0 and YouTube Pose, achieving average arm-joint tracking accuracies of 81.4% and 84.5% respectively, a clear improvement over existing methods. Further experiments on VideoPose2.0 verify that the proposed additional lower-arm sampling and arm feature-consistency computation effectively improve joint tracking accuracy. Conclusion: The proposed method combining spatio-temporal information with a deep learning network effectively improves human pose tracking accuracy, with a particularly notable gain for the lower-arm joints of highly mobile poses.

3.
Multiple human tracking in high-density crowds
In this paper, we introduce a fully automatic algorithm to detect and track multiple humans in high-density crowds in the presence of extreme occlusion. Typical approaches such as background modeling and body part-based pedestrian detection fail when most of the scene is in motion and most body parts of most of the pedestrians are occluded. To overcome this problem, we integrate human detection and tracking into a single framework and introduce a confirmation-by-classification method for tracking that associates detections with tracks, tracks humans through occlusions, and eliminates false positive tracks. We use a Viola and Jones AdaBoost detection cascade, a particle filter for tracking, and color histograms for appearance modeling. To further reduce false detections due to dense features and shadows, we introduce a method for estimation and utilization of a 3D head plane that reduces false positives while preserving high detection rates. The algorithm learns the head plane from observations of human heads incrementally, without any a priori extrinsic camera calibration information, and only begins to utilize the head plane once confidence in the parameter estimates is sufficiently high. In an experimental evaluation, we show that confirmation-by-classification and head plane estimation together enable the construction of an excellent pedestrian tracker for dense crowds.
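
As a rough illustration of the color-histogram appearance model used for particle weighting and track confirmation in approaches like the one above (not the authors' implementation), the sketch below scores a candidate image patch against a track's reference histogram via the Bhattacharyya distance; all function names and the weighting constant are illustrative.

```python
# Hedged sketch: color-histogram appearance scoring for particle weighting /
# track confirmation. Not the paper's code; names and parameters are illustrative.
import cv2
import numpy as np

def hsv_histogram(patch_bgr):
    """Normalized 2D hue-saturation histogram of an image patch."""
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)
    return hist

def appearance_weight(reference_hist, candidate_patch):
    """Particle/detection weight from Bhattacharyya distance to the track model."""
    d = cv2.compareHist(reference_hist,
                        hsv_histogram(candidate_patch),
                        cv2.HISTCMP_BHATTACHARYYA)  # 0 = identical, 1 = disjoint
    return np.exp(-20.0 * d ** 2)  # weighting constant chosen arbitrarily

# Usage: weight each particle's image patch against the track's reference
# histogram, then resample particles proportionally to their weights.
```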

4.
Detection and Tracking of Occluded People
We consider the problem of detection and tracking of multiple people in crowded street scenes. State-of-the-art methods perform well in scenes with relatively few people, but are severely challenged by scenes with many subjects that partially occlude each other. This limitation is due to the fact that current people detectors fail when persons are strongly occluded. We observe that typical occlusions are due to overlaps between people and propose a people detector tailored to various occlusion levels. Instead of treating partial occlusions as distractions, we leverage the fact that person/person occlusions result in very characteristic appearance patterns that can help to improve detection results. We demonstrate the performance of our occlusion-aware person detector on a new dataset of people with controlled but severe levels of occlusion and on two challenging publicly available benchmarks, outperforming single-person detectors in each case.

5.
王磊, 吴俊, 周志敏, 赵旭, 刘允才. 《软件学报》 (Journal of Software), 2015, 26(S2): 128-136
Semantic-level analysis of human actions from visual information is important and challenging in computer vision and multimedia. This paper proposes a method that describes the semantic information of human actions using the low-level responses of a detector. Under a specific human action, the detection results of a deformable part model implicitly contain key information about the body parts and can form features for action recognition. The filter responses of the detector are used to generate a human descriptor that encodes the position and appearance of the whole body and its parts. Because this feature exploits statistics of part positions relative to the whole body, it is robust to mis-detected parts, and it allows human detection and action recognition to be fused into a unified framework. Experimental results on three datasets show the effectiveness of the method, which achieves results comparable to or better than other approaches.

6.
The field of Human Robot Interaction (HRI) encompasses many difficult challenges, as robots need a better understanding of human actions. Human detection and tracking play a major role in such scenarios. One of the main challenges is tracking people through long-term occlusions caused by the agile nature of human navigation. In general, however, humans do not move randomly: they tend to follow common motion patterns depending on their intentions and environmental or physical constraints. Knowledge of such common motion patterns could therefore allow a robotic device to robustly track people even through long-term occlusions. Conversely, once robust tracking is achieved, the resulting tracks can be used to refine the motion pattern models, allowing robots to adapt to new motion patterns that appear in the environment. This paper therefore proposes to learn human motion patterns with a Sampled Hidden Markov Model (SHMM) while simultaneously tracking people using a particle filter tracker. The proposed simultaneous people tracking and motion pattern learning not only improves tracking robustness compared to more conservative approaches, but also proves robust to prolonged occlusions and maintains identity. Furthermore, the integration of people tracking and on-line SHMM learning improves learning performance. These claims are supported by real-world experiments carried out on a robot with a suite of sensors including a laser range finder.

7.
Simultaneously tracking poses of multiple people is a difficult problem because of inter-person occlusions and self occlusions. This paper presents an approach that circumvents this problem by performing tracking based on observations from multiple wide-baseline cameras. The proposed global occlusion estimation approach can deal with severe inter-person occlusions in one or more views by exploiting information from other views. Image features from non-occluded views are given more weight than image features from occluded views. Self occlusion is handled by local occlusion estimation. The local occlusion estimation is used to update the image likelihood function by sorting body parts as a function of distance to the cameras. The combination of the global and the local occlusion estimation leads to accurate tracking results at much lower computational costs. We evaluate the performance of our approach on a pose estimation data set in which inter-person and self occlusions are present. The results of our experiments show that our approach is able to robustly track multiple people during large movement with severe inter-person occlusions and self occlusions, whilst maintaining near real-time performance.

8.
Robust detection and tracking of pedestrians in image sequences are essential for many vision applications. In this paper, we propose a method to detect and track multiple pedestrians using motion, color information and the AdaBoost algorithm. Our approach detects pedestrians in a walking pose from a single camera on a mobile or stationary system. In the case of mobile systems, ego-motion of the camera is compensated for by corresponding feature sets. The region of interest is calculated from the difference image between two consecutive frames using the compensated image. The pedestrian detector is learned by boosting a number of weak classifiers based on Histogram of Oriented Gradients (HOG) features. Pedestrians are tracked by a block-matching method using color information. Using information stored in advance, our tracking system can track pedestrians through partial occlusions and continue without misses even after an occlusion ends. The proposed approach has been tested on a number of image sequences and shown to detect and track multiple pedestrians very well.
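
As a hedged illustration of HOG-based pedestrian detection (using OpenCV's stock HOG + linear SVM people detector rather than the boosted detector described above), the sketch below returns pedestrian boxes for a single frame; the stride, padding, scale and score threshold are illustrative.

```python
# Hedged sketch: HOG pedestrian detection with OpenCV's default people detector.
# Not the paper's boosted detector; parameter values are illustrative.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_pedestrians(frame_bgr):
    """Return pedestrian bounding boxes (x, y, w, h) found in the frame."""
    boxes, weights = hog.detectMultiScale(frame_bgr,
                                          winStride=(8, 8),
                                          padding=(8, 8),
                                          scale=1.05)
    return [tuple(b) for b, w in zip(boxes, weights) if float(w) > 0.5]

# Usage: run detect_pedestrians on each frame, then hand the boxes to a
# block-matching / color-based tracker as described in the abstract.
```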

9.
Objective: Frequent and long-lasting occlusion and similar target appearance in complex scenes cause identity switches and pose many challenges for multi-object tracking. To address the identity switches and trajectory fragmentation caused by long-term occlusion, a hierarchical-association multi-object tracking algorithm based on adaptive online discriminative appearance learning is proposed. Method: Tracking is divided into local and global association according to trajectory confidence. In local association, reliable high-confidence trajectories are associated with detections in the current frame using appearance and position-size similarity; in global association, unreliable low-confidence trajectories are further associated with fragmented trajectories by introducing a motion model and an effective association range. When extracting appearance features, incremental linear discriminant analysis is introduced to resolve identity switches, and the appearance model is updated adaptively according to the difference between new samples and the mean appearance of the target samples. Results: Compared with ten current multi-object tracking algorithms on the PETS09-S2L1, TUD-Stadtmitte and Town-Center sequences of the public 2D MOT2015 benchmark, the method reduces identity switches and trajectory fragments on every sequence; on Town-Center, identity switches are reduced by 60, trajectory fragments by 84, and tracking accuracy improves by more than 5.2%. Conclusion: The proposed method tracks multiple targets stably and effectively in complex scenes and reduces trajectory fragmentation; the online linear discriminative appearance learning it introduces handles occlusion-induced identity switches well.

10.
Objective: Kalman-filter-based video target tracking requires the process-noise and measurement-noise covariances to be known in advance, but their exact values are unavailable in practice. Moreover, because target motion is random and video backgrounds are complex, the noise covariances also change over time. If the assumed covariances are inaccurate, tracking accuracy suffers and, in severe cases, tracking fails. A new solution is proposed for this problem. Method: The extended recursive least squares algorithm with a forgetting factor (EFRLS) is applied to video target tracking. No noise covariances are needed: the Mean Shift algorithm first provides a rough estimate of the target position, and EFRLS then estimates the target position in the next frame. Results: The algorithm is clearly better than the classical Mean Shift tracker and performs on par with Mean Shift combined with a Kalman filter. Moreover, under severe occlusion it outperforms the Kalman-plus-Mean-Shift combination and maintains good tracking performance. Conclusion: The algorithm requires no noise parameters, tracks targets accurately through severe occlusion and after the target reappears, improves robustness, and has practical engineering value.
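
A minimal sketch of the general idea of recursive least squares with a forgetting factor, here fitting a constant-velocity model to recent position measurements and extrapolating one frame ahead. This only illustrates the forgetting-factor RLS recursion; it is not the paper's EFRLS formulation, and the model and forgetting factor are assumptions.

```python
# Hedged sketch: forgetting-factor recursive least squares for position prediction.
# Fits x(t) = v*t + x0 online; not the paper's formulation, parameters illustrative.
import numpy as np

class ForgettingRLS:
    def __init__(self, n_params=2, lam=0.95):
        self.theta = np.zeros(n_params)     # parameters [v, x0]
        self.P = np.eye(n_params) * 1e3     # inverse information matrix
        self.lam = lam                      # forgetting factor (0 < lam <= 1)

    def update(self, phi, y):
        """One RLS step with regressor phi and scalar measurement y."""
        Pphi = self.P @ phi
        k = Pphi / (self.lam + phi @ Pphi)          # gain vector
        self.theta += k * (y - phi @ self.theta)    # parameter correction
        self.P = (self.P - np.outer(k, Pphi)) / self.lam

    def predict(self, phi):
        return phi @ self.theta

# Usage: feed the target's x-coordinate (e.g. from Mean Shift) frame by frame,
# then extrapolate one frame ahead; do the same independently for y.
rls = ForgettingRLS()
for t, x in enumerate([100.0, 102.1, 104.3, 105.9]):
    rls.update(np.array([float(t), 1.0]), x)
next_x = rls.predict(np.array([4.0, 1.0]))
```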

11.
At present, yak husbandry on the Qinghai-Tibet Plateau still relies mainly on traditional manual herding. To solve the problem that manual husbandry cannot quickly track and count yaks, this paper proposes a yak tracking method based on an improved YOLOv5 and ByteTrack, which detects and tracks yaks rapidly from video input. The deep-learning YOLOv5 detection network is combined with coordinate attention (CA), cross-scale feature fusion and atrous spatial pyramid pooling to reduce the missed and false detections caused by occlusion and to detect yaks in video more accurately. The ByteTrack tracker associates targets between frames using a Kalman filter and the Hungarian algorithm and assigns a matching ID to each target. The model is trained on part of the yak data in ImageNet and on yak images collected in the Yushu area of Qinghai. Experimental results show that the improved model reaches a mean detection precision of 98.7%, which is 1.1, 1.89, 8.33 and 0.4 percentage points higher than the original YOLOv5s, SSD, YOLOX and Faster R-CNN models respectively, converges quickly and gives the best detection performance; the improved YOLOv5s with ByteTrack also gives the best tracking results, improving MOTA by 7.1646%. The improved model can detect, track and count yaks more quickly and accurately, providing technical support for the intelligent development of animal husbandry in Qinghai.
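
As a hedged sketch of the association step used by ByteTrack-style trackers (Kalman-predicted track boxes matched to new detections with the Hungarian algorithm over an IoU cost), simplified and with an arbitrary threshold:

```python
# Hedged sketch: Hungarian-algorithm association of detections to predicted
# track boxes by IoU cost, as in ByteTrack-style trackers. Simplified; the
# threshold and two-stage high/low-score logic of ByteTrack are omitted.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(track_boxes, det_boxes, iou_thresh=0.3):
    """Return (track_idx, det_idx) matches whose IoU exceeds the threshold."""
    cost = np.array([[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= iou_thresh]

# Usage: track_boxes come from each track's Kalman prediction; matched
# detections update the corresponding Kalman states and keep their IDs.
```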

12.
The use of hand gestures provides an attractive means of interacting naturally with a computer-generated display. Using one or more video cameras, the hand movements can potentially be interpreted as meaningful gestures. One key problem in building such an interface without a restricted setup is the ability to localize and track the human arm robustly in video sequences. This paper proposes a multiple-cue localization scheme combined with a tracking framework to reliably track the dynamics of the human arm in unconstrained environments. The localization scheme integrates the multiple cues of motion, shape, and color for locating a set of key image features. Using constraint fusion, these features are tracked by a modified extended Kalman filter that exploits the articulated structure of the human arm. Moreover, an interaction scheme between tracking and localization is used to improve the estimation process while reducing the computational requirements. The performance of the localization/tracking framework is validated with extensive experiments and simulations, including tracking with a calibrated stereo camera and with uncalibrated broadcast video.
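
A minimal sketch of the Kalman predict/update cycle underlying trackers like the one above. The paper uses a modified extended Kalman filter over the articulated arm model; this plain linear, constant-velocity version for a single 2D image point is only illustrative, and the noise values are assumptions.

```python
# Hedged sketch: constant-velocity Kalman filter for a 2D image point.
# Illustrates predict/update only; not the paper's modified extended KF.
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0],     # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],      # we observe only (x, y)
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2             # process noise (assumed)
R = np.eye(2) * 1.0              # measurement noise (assumed)

def kf_step(x, P, z):
    """One predict+update step; z is the measured (x, y) feature location."""
    x, P = F @ x, F @ P @ F.T + Q                   # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x = x + K @ (z - H @ x)                         # update with innovation
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4) * 10.0
for z in [np.array([120.0, 80.0]), np.array([123.0, 82.0])]:
    x, P = kf_step(x, P, z)
```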

13.
This paper describes an approach to human action recognition based on a probabilistic optimization model of body parts using hidden Markov models (HMMs). Our method is able to distinguish between similar actions by considering only the body parts that contribute most to the actions, for example, legs for walking, jogging and running; arms for boxing, waving and clapping. We apply HMMs to model the stochastic movement of the body parts for action recognition. The HMM construction uses an ensemble of body-part detectors, followed by grouping of part detections, to perform human identification. Three example-based body-part detectors are trained to detect three components of the human body: the head, legs and arms. These detectors cope with viewpoint changes and self-occlusions through the use of ten sub-classifiers that detect body parts over a specific range of viewpoints. Each sub-classifier is a support vector machine trained on features selected for their discriminative power for each particular part/viewpoint combination. Grouping of these detections is performed using a simple geometric constraint model that yields a viewpoint-invariant human detector. We test our approach on three publicly available action datasets: the KTH dataset, Weizmann dataset and HumanEva dataset. Our results illustrate that with a simple and compact representation we can achieve robust recognition of human actions comparable to the most complex, state-of-the-art methods.
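
As a hedged sketch of the core computation behind per-action HMMs of the kind described above: the forward algorithm scores an observation sequence under a discrete HMM, and in recognition one HMM would be trained per action with the highest-likelihood model winning. The matrices and observation symbols below are toy values, not the paper's models.

```python
# Hedged sketch: scaled forward algorithm for a discrete-observation HMM.
# Toy parameters only; illustrates per-action likelihood scoring.
import numpy as np

def forward_loglik(pi, A, B, obs):
    """log P(obs | HMM): pi initial dist, A transition matrix, B emission probs."""
    alpha = pi * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        logp += np.log(c)          # accumulate scaling to avoid underflow
        alpha /= c
    return logp

# Toy example: 2 hidden states, 3 discrete "part movement" symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
print(forward_loglik(pi, A, B, obs=[0, 1, 2, 1]))
```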

14.
We present a system that tracks an articulated body performing 3D movement with occlusions using a combination of cameras and mirrors. By integrating cameras and mirrors we obtain simultaneous coverage of almost every point on the target and avoid occlusions. The suggested setup is much simpler and easier to handle than the equivalent camera-only setup. Our tracking algorithm is model-based, and errors in the model are treated using the bundle adjustment procedure. To deal with the problem of feature visibility, each feature is set to be valid or invalid based on the model and on its expected appearance; this ensures that the system always tracks a set of distinguishable features. The proposed algorithm tracks targets in 3D using the Gauss–Newton method to minimize geometric errors. We tested our setup by tracking a chameleon's eyes, which can be considered as estimating the 3D pose of an articulated body: the head of the chameleon is treated as a rigid body, and each of the two eyes has two additional degrees of freedom. The proposed algorithm can easily be extended to cope with more complex objects.

15.
In this paper, we address the problem of 2D–3D pose estimation. Specifically, we propose an approach to jointly track a rigid object in a 2D image sequence and estimate its pose (position and orientation) in 3D space. We revisit a joint 2D segmentation/3D pose estimation technique and extend the framework by incorporating a particle filter to robustly track the object in a challenging environment, and by developing an occlusion detection and handling scheme to continuously track the object in the presence of occlusions. In particular, we focus on partial occlusions that prevent the tracker from extracting the exact region properties of the object, which play a pivotal role in maintaining the track for region-based tracking methods. To this end, a dynamic choice of how to invoke the objective functional is made online, based on the degree of dependency between the predictions and measurements of the system according to the degree of occlusion and the variation of the object's pose. This scheme provides the robustness to deal with occlusions by an obstacle whose statistical properties differ from those of the object of interest. Experimental results demonstrate the practical applicability and robustness of the proposed method in several challenging scenarios.

16.
Multiple-target tracking is a challenging field, especially when dealing with uncontrolled scenarios. Two approaches are commonly used: one based on low-level techniques that detect each object's size, position and velocity, and the other based on high-level techniques that deal with object appearance. Neither can handle all the problems that arise in multiple-target tracking: total and partial occlusions by the environment, and collisions such as grouping and splitting events. One solution is therefore to merge these techniques to improve their performance. Based on an existing hierarchical architecture, we present a novel technique that deals with all the mentioned problems in multiple-target tracking. Blob detection, low-level tracking using adaptive filters, high-level tracking based on a fixed pool of histograms, and an event-management stage that detects every collision event and performs occlusion recovery are combined to track every object for as long as it appears in the scene. Experimental results show the performance of this technique in multiple situations: every object in the scene is tracked without losing its initial identification. The processing speed exceeds 50 frames per second, which allows use in real-time scenarios.
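
A hedged sketch of the low-level stage named above, blob detection, here realized as background subtraction plus contour extraction with OpenCV; the subtractor choice and thresholds are illustrative, not the authors' exact pipeline.

```python
# Hedged sketch: background-subtraction blob detection with OpenCV.
# Illustrative parameters only; not the authors' low-level module.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=True)

def detect_blobs(frame_bgr, min_area=400):
    """Return bounding boxes of moving blobs in the current frame."""
    mask = subtractor.apply(frame_bgr)
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop shadow pixels
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

# Each blob would then feed the adaptive low-level filters, while the histogram
# pool handles appearance at the higher level.
```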

17.
We propose a method that detects and segments multiple, partially occluded objects in images. A part hierarchy is defined for the object class. Both the segmentation and detection tasks are formulated as binary classification problems. A whole-object segmentor and several part detectors are learned by boosting weak classifiers based on local shape features. Given a new image, the part detectors are applied to obtain a number of part responses. All the edge pixels in the image that positively contribute to the part responses are extracted. A joint likelihood of multiple objects is defined based on the part detection responses and the object edges. Computation of the joint likelihood includes inter-object occlusion reasoning based on the object silhouettes extracted with the whole-object segmentor. By maximizing the joint likelihood, part detection responses are grouped, merged, and assigned to multiple object hypotheses. The proposed approach is demonstrated on the class of pedestrians, and the experimental results show that our method outperforms previous ones.

18.
Conventional tracking methods encounter difficulties as the number of objects, the clutter, and the number of sensors increase, because of the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative that avoids data association. It estimates the number of trips made from one region to another in a scene based on inter-region boundary traffic counts accumulated over time; it is not necessary to track an object through the scene to determine when it crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of pedestrian and vehicular traffic intensities at an intersection over a period of one month. We compare the results with those from a multiple-hypothesis tracker and with manually counted ground-truth estimates.

19.
Multi-target tracking is a key and still-open problem in intelligent visual surveillance systems. Tracking with 2D visual features loses target features when targets occlude one another, making tracking difficult. In recent years 3D visual tracking systems have therefore attracted increasing attention: 3D features make it possible to identify and track targets more reliably when multiple targets occlude each other, enabling accurate tracking under occlusion and improving the overall accuracy of intelligent visual surveillance. This paper surveys recent multi-target tracking methods based on 3D vision systems, divides them into three categories according to the 3D vision system employed, reviews representative methods in each category, analyses their strengths and weaknesses, and finally outlines the main directions and trends for further research.

20.