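To predict pedestrian walking trajectories in complex scenes, an interpretable model based on generative adversarial networks (GANs) is proposed. The model takes the historical trajectories of pedestrians in the scene and information about the scene environment as input, and introduces a physical attention mechanism and a social attention mechanism into the GAN to predict pedestrian trajectories. The physical attention mechanism helps model the overall layout of a complex scene and extract salient, path-related features from the image, while the social attention mechanism models how interactions among different pedestrians affect future trajectories. Within the overall GAN framework, combining physical and social attention enables the model to predict multiple acceptable future paths that respect physical constraints and social norms. Experiments on simulated data and on real standard datasets show that the model can effectively predict pedestrians' future trajectories.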

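Trajectory tracking and recognition based on bidirectional nonlinear learning   Total citations: 1, self-citations: 0, citations by others: 1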
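A target's motion trajectory is one of the important features for tracking and recognizing target behavior, and it is widely used in visual tracking and related fields. However, because trajectory data are high-dimensional and nonlinear, modeling a target's motion trajectory directly is difficult. To address this, a bidirectional deep neural network called an autoencoder is introduced and combined with particle filtering to give a trajectory tracking and recognition algorithm. First, the autoencoder network embeds high-dimensional trajectories into a two-dimensional plane according to certain learning rules; the inverse mapping of the network yields a generative model of trajectories, from which a set of feasible trajectories can be obtained. During tracking, at each time step the particles of the particle filter are sampled from these feasible trajectories, and a color likelihood function is used to weight and resample the drawn particles in order to estimate the target state. Finally, the tracked trajectory is recognized in the two-dimensional plane using a "minimum-distance classifier". In particular, the autoencoder network provides a bidirectional mapping between the high-dimensional trajectory space and the low-dimensional embedded structure, which supplies the inverse mapping that most nonlinear dimensionality reduction methods, such as locally linear embedding (LLE) and isometric mapping (ISOMAP), lack. Experiments on tracking and recognizing handwritten digits show that the proposed method can track targets accurately against complex backgrounds and correctly recognize their trajectories.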
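The final recognition step in the abstract above, a minimum-distance classifier in the two-dimensional embedding, can be sketched as follows. This is only an illustration under stated assumptions: each class is assumed to be summarized by a single centroid in the plane, distance is assumed to be Euclidean, and the function name, labels, and coordinates are hypothetical rather than taken from the paper.

```python
import math


def min_distance_classify(point, centroids):
    """Return the label whose centroid is nearest (Euclidean) to a 2-D point."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Hypothetical class centroids in the 2-D embedding (illustrative values only).
centroids = {
    "digit_0": (0.0, 0.0),
    "digit_1": (4.0, 0.0),
    "digit_7": (0.0, 4.0),
}

# An embedded trajectory point is assigned to the closest class.
label = min_distance_classify((3.2, 0.5), centroids)
```

In practice the embedded point would come from the autoencoder's forward mapping; here it is supplied by hand.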

Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large number of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for representing these motion patterns and study a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences. Our generative model consists of three components. The first is a photometric model which represents an image as a linear superposition of image bases selected from a generic and overcomplete dictionary. The dictionary contains Gabor and LoG bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images and transfer them to a token (base) representation with an O(10^2)-fold dimension reduction. The second component is a geometric model which groups spatially adjacent tokens (bases) and their motion trajectories into a number of moving elements, called "motons." A moton is a deformable template in time-space representing a moving element, such as a falling snowflake or a flying bird. The third component is a dynamic model which characterizes the motion of particles, waves, and their interactions. For example, the motion of particle objects floating in a river, such as leaves and balls, should be coupled with the motion of waves. The trajectories of these moving elements are represented by coupled Markov chains. The dynamic model also includes probabilistic representations for the birth/death (source/sink) of the motons. We adopt a stochastic gradient algorithm for learning and inference. Given an input video sequence, the algorithm iterates two steps: 1) computing the motons and their trajectories by a number of reversible Markov chain jumps, and 2) learning the parameters that govern the geometric deformations and motion dynamics. Novel video sequences are synthesized from the learned models and, by editing the model parameters, we demonstrate the controllability of the generative model.

In this work we propose algorithms to learn the locations of static occlusions and reason about both static and dynamic occlusion scenarios in multi-camera scenes for 3D surveillance (e.g., reconstruction, tracking). We will show that this leads to a computer system which is able to more effectively track (follow) objects in video when they are obstructed from some of the views. Because of the nature of the application area, our algorithm will be under the constraints of using few cameras (no more than 3) that are configured wide-baseline. Our algorithm consists of a learning phase, where a 3D probabilistic model of occlusions is estimated per-voxel, per-view over time via an iterative framework. In this framework, at each frame the visual hull of each foreground object (person) is computed via a Markov Random Field that integrates the occlusion model. The model is then updated at each frame using this solution, providing an iterative process that can accurately estimate the occlusion model over time and overcome the few-camera constraint. We demonstrate the application of such a model to a number of areas, including visual hull reconstruction, the reconstruction of the occluding structures themselves, and 3D tracking.

The multi-target tracking problem is challenging when there are occlusions, detector failures, and severe interference between detections. In this paper, we propose a novel detection-based tracking method that links detections into tracklets and then forms long trajectories. Unlike many previous hierarchical frameworks, which split the data association into two separate optimization problems (linking detections locally and linking tracklets globally), we introduce a unified algorithm that can automatically relearn the trajectory models from the local and global information for finding the joint optimal assignment. In each temporal window, the trajectory models are initialized by the local information to link easy-to-connect detections into a set of tracklets. Then the trajectory models are updated by the reliable tracklets and reused to link separated tracklets into long trajectories. We iteratively update the trajectory models with more information from more frames until the result converges. The iterative process gradually improves the accuracy of the trajectory models, which in turn improves the target ID inferences for all detections by the MRF model. Experimental results show that the proposed method achieves state-of-the-art multi-target tracking performance.

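In clinical medicine, determining how pain transitions from acute to chronic is key to effective prevention and treatment. A quantitative, predictive method for assessing pain classification is therefore urgently needed to enable better therapeutic intervention. This paper uses computational modeling, employing a model of the hypothalamic-pituitary-adrenal (HPA) axis to mimic pain trajectories. Based on the cortisol value, we divide the trajectory into a high state and a low state: the high-cortisol state simulates high-intensity pain, and the low-cortisol state simulates mild or no pain. By studying the simulated data, the transition between acute and chronic pain is analyzed. The simulation results indicate that computational neuroscience is feasible and has great potential for evaluating patient models.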

孔锐 (Kong Rui), 黄钢 (Huang Gang). 《自动化学报》 (Acta Automatica Sinica), 2020, 46(1): 94-107
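Generative adversarial networks (GANs) are one of the main methods for learning deep generative models in an unsupervised manner. Generative modeling based on differentiable generator networks is currently among the most active research areas, but because of the complexity of real sample distributions, GAN models still have many problems with training stability and generation quality. Exploring network architectures is an important research direction in generative modeling. This paper uses capsule networks (CapsNets) to restructure the GAN architecture, trains with the Earth-mover-distance-based loss function proposed in Wasserstein GAN (WGAN), and adds conditional constraints to stabilize the generation process, yielding a conditionally constrained capsule generative adversarial network (Conditional-CapsuleGAN, C-CapsGAN). Multiple experiments on the MNIST and CIFAR-10 datasets show that applying CapsNets to generative modeling is feasible; compared with existing similar models, C-CapsGAN not only generates high-quality images stably in image generation tasks but also suppresses mode collapse more effectively.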
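As a rough illustration of the Earth-mover-based objective mentioned above, the standard WGAN losses can be written directly in terms of critic scores. The sketch below is not the paper's implementation: the function names and sample scores are hypothetical, the capsule architecture and conditional inputs are omitted, and the Lipschitz constraint that WGAN requires on the critic (for example weight clipping) is not shown.

```python
def wgan_critic_loss(real_scores, fake_scores):
    """Critic loss to minimize: mean score on fakes minus mean score on reals."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


def wgan_generator_loss(fake_scores):
    """Generator loss to minimize: negative mean critic score on generated samples."""
    return -sum(fake_scores) / len(fake_scores)


# Hypothetical critic outputs for a small batch (illustrative values only).
real_scores = [0.9, 1.1, 1.0]
fake_scores = [-0.5, -0.3, -0.4]

critic_loss = wgan_critic_loss(real_scores, fake_scores)
generator_loss = wgan_generator_loss(fake_scores)
```

The negated critic loss serves as an estimate of the Earth-mover distance between the real and generated distributions, which is why it is often tracked as a training signal.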

王星 (Wang Xing), 杜伟 (Du Wei), 陈吉 (Chen Ji), 陈海涛 (Chen Haitao). 《控制与决策》 (Control and Decision), 2020, 35(8): 1887-1894
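As one of the important methods for sample generation, the generative adversarial network (GAN) can generate samples according to the data distribution of any given dataset, but in practice its training suffers from blurry textures in generated samples, unstable training, and mode collapse. To address these problems, a sample generation method based on a deep residual generative adversarial network, RGAN, is designed by combining residual networks with the deep convolutional generative adversarial network (DCGAN). The method uses a residual network and a convolutional network to build the generator and the discriminator, respectively, and trains them with a learning strategy that fuses positive and negative samples. The deep residual network can recover rich image textures, and the fused positive-negative training increases the robustness of the adversarial network and effectively alleviates unstable training and mode collapse. Several simulation experiments on the 102 Category Flower Dataset show that RGAN effectively improves the quality of generated samples.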

We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover trajectories of targets over time. Due to occlusions by targets and static objects, as well as noisy segmentation and false alarms, one foreground region may not correspond faithfully to one target. Therefore, the one-to-one assumption used in most data association algorithms is not always satisfied. Our method overcomes the one-to-one assumption by formulating the visual tracking problem in terms of finding the best spatial and temporal association of observations, which maximizes the consistency of both motion and appearance of trajectories. To avoid enumerating all possible solutions, we take a data-driven Markov Chain Monte Carlo (DD-MCMC) approach to sample the solution space efficiently. The sampling is driven by an informed proposal scheme controlled by a joint probability model combining motion and appearance. Comparative experiments with quantitative evaluations are provided.

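To address image translation between different scenes, this paper proposes an improved generative adversarial network model that can generate high-quality images of the target scene. Because downsampling during generation loses the spatial position information of the original image, a generator containing skip connections and residual blocks is designed; the multiple skip connections carry the image's spatial position information through the network. To improve the structural stability of generated images during training, the structural similarity index (SSIM) is introduced as a structural reconstruction loss to guide the model toward target images with better structure. In addition, an identity-preserving loss is added so that the translated target-scene image retains more color detail, which markedly enhances the color expressiveness of the generated images. Experimental results show that the proposed improved GAN model can be applied effectively to scene image translation.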
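To make the structural reconstruction loss above more concrete, the sketch below computes SSIM over a single global window and turns it into a loss as 1 - SSIM. Several things here are assumptions rather than details from the paper: standard SSIM is usually averaged over local (often Gaussian-weighted) windows, the form 1 - SSIM is only one common way to build a loss, and the function names and pixel values are hypothetical.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-window SSIM of two equal-length sequences of pixel intensities."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Usual stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(x, y, dynamic_range=255.0):
    """Assumed loss form: 0 for structurally identical inputs, larger otherwise."""
    return 1.0 - ssim_global(x, y, dynamic_range)


# Hypothetical flattened grayscale patches (illustrative values only).
target_patch = [10.0, 20.0, 30.0, 40.0]
reversed_patch = [40.0, 30.0, 20.0, 10.0]

same_loss = ssim_loss(target_patch, target_patch)
different_loss = ssim_loss(target_patch, reversed_patch)
```

A loss of this kind would be added to the adversarial and identity-preserving terms with a weighting factor, which the abstract does not specify.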

This paper presents a computer vision system for tracking high-speed non-rigid skaters over a large rink in short track speed skating competitions. The outputs of the tracking system are spatio-temporal trajectories of the skaters which can be further processed and analyzed by sports experts. To capture highly complex and dynamic scenes, the camera pans very fast, so tracking amorphous skaters becomes a challenging task. We propose a new method for (1) automatically computing the transformation matrices to map each frame to the globally consistent model of the rink; (2) incorporating the hierarchical model based on the contextual knowledge and multiple cues into the unscented Kalman filter to improve the tracking performance when occlusions occur; (3) evaluating the precision of our practical system objectively. Experimental results show that the proposed algorithm is very efficient and effective on the video recorded in the World Short Track Speed Skating Championships.

Data-driven generation of spatio-temporal routines in human mobility   Total citations: 1, self-citations: 0, citations by others: 1
The generation of realistic spatio-temporal trajectories of human mobility is of fundamental importance in a wide range of applications, such as the development of protocols for mobile ad-hoc networks or what-if analysis in urban ecosystems. Current generative algorithms fail to reproduce individuals' recurrent schedules accurately while also accounting for the possibility that individuals may break the routine during periods of variable duration. In this article we present Ditras (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility. Ditras operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. We propose a data-driven algorithm which constructs a diary generator from real data, capturing the tendency of individuals to follow or break their routine. We also propose a trajectory generator based on the concept of preferential exploration and preferential return. We instantiate Ditras with the proposed diary and trajectory generators and compare the resulting algorithm with real data and synthetic data produced by other generative algorithms, built by instantiating Ditras with several combinations of diary and trajectory generators. We show that the proposed algorithm reproduces the statistical properties of real trajectories in the most accurate way, taking a step forward in understanding the origin of the spatio-temporal patterns of human mobility.
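The idea of preferential exploration and preferential return can be illustrated with a small simulation step. This sketch follows the generic exploration and preferential return scheme, in which the chance of visiting a new place decays with the number of distinct places already seen; it is not the Ditras trajectory generator itself. The function name, location ids, and the parameter values `rho` and `gamma` are assumptions chosen for illustration, and spatial distance between locations is ignored.

```python
import random


def epr_step(visit_counts, rho, gamma, rng):
    """One step: explore a new location, or return to a previously visited one."""
    distinct = len(visit_counts)
    # Exploration probability rho * S^(-gamma); always explore on the first step.
    p_explore = rho * distinct ** (-gamma) if distinct > 0 else 1.0
    if rng.random() < p_explore:
        location = f"loc_{distinct}"  # hypothetical id for a never-visited place
    else:
        # Preferential return: pick a known place in proportion to past visits.
        known = list(visit_counts)
        weights = [visit_counts[name] for name in known]
        location = rng.choices(known, weights=weights)[0]
    visit_counts[location] = visit_counts.get(location, 0) + 1
    return location


rng = random.Random(42)
visit_counts = {}
trajectory = [epr_step(visit_counts, rho=0.6, gamma=0.21, rng=rng) for _ in range(50)]
```

In a diary-based simulator, a step like this would only choose where to go; when to move would come from the mobility diary.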

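Sentiment classification predicts the sentiment orientation conveyed by data by analyzing the sentiment information it contains. Combining linguistic lexicons with generative classifiers to build classification models that carry prior knowledge is an important research topic. By studying the domain dependence of sentiment words and their differing weights, a new sentiment classification method incorporating sentiment prior knowledge is proposed. Domain-specific sentiment words and their weights are constructed through automatic analysis and, as sentiment prior knowledge, are incorporated into the generative classification model...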

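Long short-term memory (LSTM) networks applied to pedestrian trajectory prediction consider each pedestrian in isolation and cannot predict multiple possible outcomes. To address this, an attention-based generative model for pedestrian trajectory prediction (AttenGAN) is proposed, which models pedestrian interaction patterns and probabilistically predicts multiple plausible futures. AttenGAN consists of a generator and a discriminator. The generator probabilistically predicts multiple possible futures from a pedestrian's past trajectory; the discriminator judges whether a trajectory is real or produced by the generator, which pushes the generator to produce predicted trajectories that conform to social norms. The generator consists of an encoder and a decoder. At each time step, the encoder LSTM combines the states of other pedestrians supplied by the attention mechanism and encodes the current pedestrian's information into a hidden state. For prediction, the decoder LSTM's hidden state is first initialized by concatenating the encoder LSTM's hidden state with Gaussian noise, and the decoder LSTM then decodes it into a prediction of the future trajectory. Experiments on the ETH and UCY datasets show that AttenGAN not only produces multiple plausible, socially compliant trajectory predictions but also improves prediction accuracy over the traditional linear model (Linear), the LSTM model, the Social LSTM model (S-LSTM), and the Social GAN model (S-GAN), with particularly high accuracy in scenes with dense pedestrian interaction. Visualizations of trajectories obtained by sampling the generator multiple times show that the proposed model can integrate pedestrian interaction patterns and make joint, multi-modal predictions of the future.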
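Two generator steps described above, pooling other pedestrians' states with attention and initializing the decoder from the encoder state plus Gaussian noise, can be sketched without any deep learning library. This is a simplified illustration, not the AttenGAN implementation: dot-product scoring is an assumption (the abstract does not specify the attention form), the LSTMs are omitted, and all names, dimensions, and values are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, others):
    """Assumed dot-product attention: weighted sum of other pedestrians' states."""
    scores = [sum(q * o for q, o in zip(query, other)) for other in others]
    weights = softmax(scores)
    pooled = [
        sum(w * other[i] for w, other in zip(weights, others))
        for i in range(len(query))
    ]
    return pooled, weights


def init_decoder_state(encoder_hidden, noise_dim, rng):
    """Concatenate the encoder hidden state with standard Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return list(encoder_hidden) + noise


rng = random.Random(0)
h_self = [0.2, -0.1, 0.4]                          # hypothetical encoder state
h_others = [[0.1, 0.0, 0.3], [-0.2, 0.5, 0.1]]     # hypothetical neighbor states

context, weights = attend(h_self, h_others)
decoder_init = init_decoder_state(h_self, noise_dim=2, rng=rng)
```

Drawing a fresh noise vector for each call is what lets repeated sampling of the generator produce different plausible futures for the same observed history.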

Person re-identification aims to recognize the same person viewed by disjoint cameras at different time instants and locations. In this paper, after an extensive review of state-of-the-art approaches, we propose a re-identification method that takes into account the appearance of people, the spatial location of cameras and potential paths a person can choose to follow. This choice is modeled with a set of areas of interest (landmarks) that constrain the propagation of people trajectories in non-observed regions between the fields of view of cameras. We represent people with a selective patch around their upper body to work in crowded scenes when occlusions are frequent. We demonstrate the proposed method in a challenging scenario from London Gatwick airport and compare it to well-known person re-identification methods, highlighting their strengths and limitations. Finally, we show using the Cumulative Matching Characteristic curve that the best performance results from modeling people movements in non-observed regions combined with appearance methods, achieving an average improvement of 6% over using appearance alone and 15% over using motion alone for the association of people across cameras.
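The Cumulative Matching Characteristic (CMC) curve used for the evaluation above reports, for each rank k, the fraction of queries whose correct match appears among the top k candidates. A minimal sketch, assuming the 1-based rank of the correct match is already known for every query; the function name and the rank values are hypothetical.

```python
def cmc_curve(match_ranks, max_rank):
    """CMC(k): fraction of queries whose correct match is ranked k or better."""
    total = len(match_ranks)
    return [
        sum(1 for rank in match_ranks if rank <= k) / total
        for k in range(1, max_rank + 1)
    ]


# Hypothetical 1-based ranks of the correct gallery match for five queries.
ranks = [1, 3, 2, 1, 5]
curve = cmc_curve(ranks, max_rank=5)
```

The first entry of the curve is the rank-1 recognition rate, and the curve is non-decreasing by construction.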

The integration of reinforcement learning (RL) and imitation learning (IL) is an important problem that has long been studied in the field of intelligent robotics. RL optimizes policies to maximize the cumulative reward, whereas IL attempts to extract general knowledge about the trajectories demonstrated by experts, i.e., demonstrators. Because each has its own drawbacks, many methods combining them and compensating for each set of drawbacks have been explored thus far. However, many of these methods are heuristic and do not have a solid theoretical basis. This paper presents a new theory for integrating RL and IL by extending the probabilistic graphical model (PGM) framework for RL, control as inference. We develop a new PGM for RL with multiple types of rewards, called probabilistic graphical model for Markov decision processes with multiple optimality emissions (pMDP-MO). Furthermore, we demonstrate that the integrated learning method of RL and IL can be formulated as a probabilistic inference of policies on pMDP-MO by considering the discriminator in generative adversarial imitation learning (GAIL) as an additional optimality emission. We adapt the GAIL and task-achievement reward to our proposed framework, achieving significantly better performance than policies trained with baseline methods.

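In real-world settings, occlusion between objects causes missing features for the object to be detected, which lowers detection accuracy. In view of this, a generative adversarial network for learning occluded features (GANLOF) is proposed. The occluded-feature learning network is divided into an occluded-feature generator and a discriminator, two...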

Action recognition is one of the most important components for video analysis. In addition to objects and atomic actions, temporal relationships are important characteristics for many actions and are not fully exploited in many approaches. We model the temporal structures of midlevel actions (referred to as components) based on dense trajectory components, obtained by clustering individual trajectories. The trajectory components are a higher-level and more stable representation than raw individual trajectories. Based on the temporal ordering of trajectory components, we describe the temporal structure using Allen's temporal relationships in a discriminative manner and combine it with a generative model using bag of components. The main idea behind the model is to extract midlevel features from domain-independent dense trajectories and classify the actions by exploring the temporal structure among these midlevel features based on a set of relationships. We evaluate the proposed approach on public data sets and compare it with a bag-of-words-based approach and a state-of-the-art application of the Markov logic network for action recognition. The results demonstrate that the proposed approach produces better recognition accuracy.
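Allen's temporal relationships, which the abstract above uses to describe how trajectory components are ordered in time, classify any pair of intervals into one of thirteen relations. The sketch below assumes each component is summarized by a proper interval `(start, end)` with `start < end`; the function name and the relation labels are illustrative choices, and how the paper turns these relations into features is not shown.

```python
def allen_relation(a, b):
    """Return Allen's relation of interval a to interval b, each (start, end)."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains at this point.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Hypothetical frame intervals of two trajectory components.
relation = allen_relation((0, 4), (3, 6))
```

Counting how often each relation occurs between pairs of component types is one simple way such relations could feed a classifier.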

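The original extended-target Gaussian mixture probability hypothesis density (ET-GM-PHD) filter cannot handle maneuvering-target tracking. To address this, a modified input estimation (MIE) algorithm is introduced within the Gaussian mixture probability hypothesis density (GM-PHD) filtering framework, which effectively handles maneuvers of multiple extended targets. However, although the resulting algorithm can track an unknown number of maneuvering extended targets, it cannot produce the track of each individual target. To solve this, a Gaussian-component labeling method is further introduced, which associates the tracks of multiple maneuvering extended targets accurately and yields each target's track. Experimental results show that the proposed algorithm tracks weakly maneuvering extended targets well and effectively estimates the tracks of multiple extended targets.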
