Similar articles
1.
We consider the problem of predicting a sequence of real-valued multivariate states that are correlated by some unknown dynamics, from a given measurement sequence. Although dynamic systems such as State-Space Models are popular probabilistic models for the problem, their joint modeling of states and observations, as well as the traditional generative learning by maximizing a joint likelihood, may not be optimal for the ultimate prediction goal. In this paper, we suggest two novel discriminative approaches to dynamic state prediction: 1) learning generative state-space models with discriminative objectives and 2) developing an undirected conditional model. These approaches are motivated by the success of recent discriminative approaches to structured output classification in discrete-state domains, namely, discriminative training of Hidden Markov Models and Conditional Random Fields (CRFs). Extending CRFs to real multivariate state domains generally entails imposing density integrability constraints on the CRF parameter space, which can make the parameter learning difficult. We introduce an efficient convex learning algorithm to handle this task. Experiments on several problem domains, including human motion and robot-arm state estimation, indicate that the proposed approaches yield high prediction accuracy comparable to or better than state-of-the-art methods.

2.
3.
This correspondence presents a two-stage classification learning algorithm. The first stage approximates the class-conditional distribution of a discrete space using a separate mixture model, and the second stage investigates the class posterior probabilities by training a network. The first stage explores the generative information that is inherent in each class by using the Chow-Liu (CL) method, which approximates a high-dimensional probability distribution with a tree structure, namely, a dependence tree, whereas the second stage concentrates on discriminative learning to distinguish between classes. The resulting learning algorithm integrates the advantages of both generative learning and discriminative learning. Because it uses CL dependence-tree estimation, we call our algorithm CL-Net. Empirical tests indicate that the proposed learning algorithm makes significant improvements when compared with the related classifiers that are constructed by either generative learning or discriminative learning.
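The Chow-Liu step named in this abstract can be illustrated with a minimal sketch. It estimates pairwise mutual information from discrete samples and keeps a maximum-weight spanning tree over the variables (built here with Kruskal's algorithm). The function names and the toy data layout (a list of equal-length tuples) are assumptions made for illustration; this is not the CL-Net implementation, and it omits the mixture model and the network stage.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written in terms of raw counts.
    return sum(
        (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )


def chow_liu_tree(data):
    """Return the edges of a maximum-mutual-information spanning tree."""
    d = len(data[0])
    edges = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest for Kruskal's algorithm

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree
```

For example, on samples where the first two variables always agree and the third is unrelated, the edge between variables 0 and 1 is selected first because it carries the largest mutual information.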

4.
Tracking People on a Torus   Total citations: 1, self-citations: 0, citations by others: 1
We present a framework for monocular 3D kinematic pose tracking and viewpoint estimation of periodic and quasi-periodic human motions from an uncalibrated camera. The approach we introduce here is based on learning both the visual observation manifold and the kinematic manifold of the motion using a joint representation. We show that the visual manifold of the observed shape of a human performing a periodic motion, observed from different viewpoints, is topologically equivalent to a torus manifold. The approach we introduce here is based on the supervised learning of both the visual and kinematic manifolds. Instead of learning an embedding of the manifold, we learn the geometric deformation between an ideal manifold (a conceptually equivalent topological structure) and a twisted version of the manifold (the data). Experimental results show accurate estimation of the 3D body posture and the viewpoint from a single uncalibrated camera.

5.
In this paper, we propose a robust tracking algorithm to handle the drifting problem. The algorithm consists of two parts: the first is the G&D part, which combines a Generative model and a Discriminative model for tracking, and the second is the View-Based model for target appearance, which corrects the result of the G&D part if necessary. In the G&D part, we use Maximum Margin Projection (MMP) to construct a graph model that preserves both the local geometrical and the discriminant structures of the data manifold in low dimensions. Such a discriminative subspace, combined with a traditional generative subspace, can therefore benefit from both models. In addition, we address the problem of learning the maximum margin projection under Spectral Regression (SR), which results in significant savings in computational time. To further reduce drift, an online-learned, sparsely represented view-based model of the target complements the G&D part. When the result of the G&D part is unreliable, the view-based model can rectify the result in order to avoid drifting. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.

6.
Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large number of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for representing these motion patterns and study a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences. Our generative model consists of three components. The first is a photometric model which represents an image as a linear superposition of image bases selected from a generic and overcomplete dictionary. The dictionary contains Gabor and LoG bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images and transfer them to a token (base) representation with an O(10^2)-fold dimension reduction. The second component is a geometric model which groups spatially adjacent tokens (bases) and their motion trajectories into a number of moving elements called "motons." A moton is a deformable template in time-space representing a moving element, such as a falling snowflake or a flying bird. The third component is a dynamic model which characterizes the motion of particles, waves, and their interactions. For example, the motion of particle objects floating in a river, such as leaves and balls, should be coupled with the motion of waves. The trajectories of these moving elements are represented by coupled Markov chains. The dynamic model also includes probabilistic representations for the birth/death (source/sink) of the motons. We adopt a stochastic gradient algorithm for learning and inference. Given an input video sequence, the algorithm iterates two steps: 1) computing the motons and their trajectories by a number of reversible Markov chain jumps, and 2) learning the parameters that govern the geometric deformations and motion dynamics. Novel video sequences are synthesized from the learned models and, by editing the model parameters, we demonstrate the controllability of the generative model.

7.
Minyoung Kim. Pattern Recognition, 2011, 44(10-11): 2325-2333
We introduce novel discriminative semi-supervised learning algorithms for dynamical systems, and apply them to the problem of 3D human motion estimation. Our recent work on discriminative learning of dynamical systems has been shown to outperform traditional generative learning approaches. However, one of the main issues in learning dynamical systems is gathering labeled output sequences, which are typically obtained from precise motion capture tools and are hence expensive. In this paper we utilize a large amount of unlabeled (input) video data to improve the prediction performance of the dynamical systems significantly. We suggest two discriminative semi-supervised learning approaches that extend well-known algorithms in static domains to sequential, real-valued multivariate output domains: (i) self-training, which we derive as coordinate ascent optimization of a proper discriminative objective over both model parameters and the unlabeled state sequences, and (ii) a minimum entropy approach, which maximally reduces the model's uncertainty in state prediction for unlabeled data points. These approaches are shown to achieve significant improvement over traditional generative semi-supervised learning methods. We demonstrate the benefits of our approaches on 3D human motion estimation problems.

8.
Manifold regularization (MR) based semi-supervised learning can explore structural relationships from both labeled and unlabeled data. However, the model selection of MR seriously affects its predictive performance due to the inherent additional geometry regularizer of labeled and unlabeled data. In this paper, two continuous and two inherently discrete hyperparameters are selected as optimization variables, and a leave-one-out cross-validation (LOOCV) based Predicted REsidual Sum of Squares (PRESS) criterion is first presented for model selection of MR to choose appropriate regularization coefficients and kernel parameters. Considering the inherent discontinuity of the two discrete hyperparameters, the minimization process is implemented using an improved Nelder-Mead simplex algorithm to solve the hybrid set of inherently discrete and continuous variables. The manifold regularization and model selection algorithm are applied to six synthetic and real-life benchmark datasets. The proposed approach, by effectively exploiting the embedded intrinsic geometric manifolds and unbiased LOOCV estimation, outperforms the original MR and supervised learning approaches in the empirical study.

9.
Recognizing and tracking multiple activities are extremely challenging machine vision tasks due to the diverse motion types involved and the high-dimensional (HD) state space. To overcome these difficulties, a novel generative model called the composite motion model (CMM) is proposed. This model contains a set of independent, low-dimensional (LD), activity-specific manifold models that effectively constrain the state search space for 3D human motion recognition and tracking. This separate modeling of activity-specific movements not only allows each manifold model to be optimized for its respective movement alone, but also improves the scalability of the models. For accurate tracking with our CMM, a particle filter (PF) method is employed, and the particles can be distributed across all manifold models at each time step. In addition, an efficient activity switching strategy is proposed to govern the particle distribution on all LD manifolds. To diffuse the particles amongst manifold models and respond quickly to sudden changes in the activity, a set of visually reasonable and kinematically realistic transition bridges is synthesized by using the good properties of the LD latent space and the HD observation space, which makes the inter-activity motions seem more natural and realistic. Finally, the pose hypothesis that best interprets the visual observation is selected and then used to recognize the activity that is currently observed. Extensive experiments, via qualitative and quantitative analyses, verify the effectiveness and robustness of our proposed CMM in the tasks of multi-activity 3D human motion recognition and tracking.

10.
This work assesses the use of GPU computation in dynamic texture segmentation under the mixture of dynamic textures (MDT) model. In this generative video model, the observed texture is a time-varying process driven by a hidden state process. The use of mixtures in this model allows simultaneous handling of different visual processes. The use of GPU computing is growing in high-performance applications, but adapting existing algorithms so that they benefit from it is not an easy task. In this paper, we made two implementations, one on the CPU and the other on the GPU, of a known segmentation algorithm based on MDT. The MDT algorithm contains a matrix inversion process that is highly demanding in terms of computing power. We compare the performance gain obtained by porting only this matrix inversion process to the GPU with the gain obtained by porting the whole MDT segmentation process. We also study real-time motion segmentation performance by separating the learning part of the algorithm from the segmentation part, leaving the learning stage as an off-line process and keeping the segmentation as an online process. The results of the performance analyses allow us to decide the cases in which the full GPU implementation of the motion segmentation process is worthwhile.

11.
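Locally linear embedding (LLE) is a classic manifold learning method. Applying this conventional, unsupervised LLE directly to estimate head pose in images has two shortcomings: it does not consider the spatial information of image pixels, and it does not use sample label information. This paper therefore combines the image Euclidean distance with a biased LLE manifold learning method to reduce the dimensionality of head pose images, and estimates the pose of head images using a generalized regression neural network (GRNN) and multiple linear regression. Comparative experiments on the FacePix head pose database show that the method achieves good head pose estimation results.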

12.
We develop a method for the estimation of articulated pose, such as that of the human body or the human hand, from a single (monocular) image. Pose estimation is formulated as a statistical inference problem, where the goal is to find a posterior probability distribution over poses as well as a maximum a posteriori (MAP) estimate. The method combines two modeling approaches, one discriminative and the other generative. The discriminative model consists of a set of mapping functions that are constructed automatically from a labeled training set of body poses and their respective image features. The discriminative formulation allows for modeling ambiguous, one-to-many mappings (through the use of multi-modal distributions) that may yield multiple valid articulated pose hypotheses from a single image. The generative model is defined in terms of a computer graphics rendering of poses. While the generative model offers an accurate way to relate observed (image features) and hidden (body pose) random variables, it is difficult to use it directly in pose estimation, since inference is computationally intractable. In contrast, inference with the discriminative model is tractable, but considerably less accurate for the problem of interest. A combined discriminative/generative formulation is derived that leverages the complementary strengths of both models in a principled framework for articulated pose inference. Two efficient MAP pose estimation algorithms are derived from this formulation; the first is deterministic and the second non-deterministic. Performance of the framework is quantitatively evaluated in estimating articulated pose of both the human hand and human body. Most of this work was done while the first author was with Boston University.

13.
This paper presents a new procedure for learning mixtures of independent component analyzers. The procedure includes non-parametric estimation of the source densities, supervised-unsupervised learning of the model parameters, incorporation of any independent component analysis (ICA) algorithm into the learning of the ICA mixtures, and estimation of residual dependencies after training to correct the posterior probability of every class given the test observation vector. We demonstrate the performance of the procedure in the classification of ICA mixtures of two, three, and four classes of synthetic data, and in the classification of defective materials, consisting of 3D finite element models and lab specimens, in non-destructive testing using the impact-echo technique. The application of the proposed posterior probability correction demonstrates an improvement in the classification accuracy. Semi-supervised learning shows that unlabeled data can degrade the performance of the classifier when they do not fit the generative model. Comparative results of the proposed method and standard ICA algorithms for blind source separation in one and multiple ICA data mixtures show the suitability of the non-parametric ICA mixture-based method for data modeling.

14.
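Motion estimation at motion occlusion boundaries is a difficult problem. The epipolar-plane image method converts motion estimation into the detection of trajectory lines. Trajectory lines of man-made objects are easily obtained by edge tracking, but for natural scenes with complex textures, trajectory tracking is more difficult.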

15.
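As a new-generation video standard, H.264 introduces improvements to block-matching motion estimation. Motion estimation is an extremely important stage of video encoding, and the efficiency of its algorithm strongly affects overall coding efficiency. For the quarter-pixel-precision motion estimation proposed in the H.264 standard, this paper analyzes and compares the basic algorithm and several fast algorithms, and on that basis proposes a fast vector synthesis algorithm. Experimental results show that the algorithm increases search speed while preserving image quality.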

16.
Electricity spot prices are complex processes characterized by nonlinearity and extreme volatility. Previous work on nonlinear modeling of electricity spot prices has shown encouraging results, and we build on this area by proposing an Expectation Maximization algorithm for maximum likelihood estimation of recurrent neural networks utilizing the Kalman filter and smoother. This involves inference of both the parameters and the hyper-parameters of the model, which takes into account the model uncertainty and the noise in the data. The Expectation Maximization algorithm uses a forward filtering and backward smoothing (Expectation) step, followed by a hyper-parameter estimation (Maximization) step. The model is validated on two data sets from different power exchanges. It is found that after learning a posteriori hyper-parameters, the proposed algorithm outperforms the real-time recurrent learning and the extended Kalman filtering algorithms for recurrent networks, as well as other contemporary models that have been previously applied to the modeling of electricity spot prices.
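The forward filtering half of the Expectation step mentioned in this abstract can be sketched for the simplest possible case. The example below assumes a scalar linear-Gaussian model, x_t = a*x_{t-1} + w_t and y_t = c*x_t + v_t with noise variances q and r. All of these names are illustrative; the paper applies the idea to recurrent neural networks, which this sketch does not attempt, and the backward smoothing and Maximization steps are left out.

```python
def kalman_filter_1d(observations, a, c, q, r, m0, p0):
    """Forward Kalman filter for a scalar linear-Gaussian state-space model.

    Returns the filtered state means and variances, one pair per observation.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Predict: push the previous posterior through the dynamics.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update: correct the prediction with the new observation.
        gain = p_pred * c / (c * c * p_pred + r)
        m = m_pred + gain * (y - c * m_pred)
        p = (1.0 - gain * c) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances
```

With a static state (a = 1, q = 0), unit observation noise, and a unit-variance prior centered at zero, each observation pulls the mean toward the data while the variance shrinks, which is the behavior the smoother later refines by running backward over the same sequence.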

17.
Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. We describe an unsupervised algorithm for learning both localized features and their transformations directly from images using a sparse bilinear generative model. We show that from an arbitrary set of natural images, the algorithm produces oriented basis filters that can simultaneously represent features in an image and their transformations. The learned generative model can be used to translate features to different locations, thereby reducing the need to learn the same feature at multiple locations, a limitation of previous approaches to sparse coding and ICA. Our results suggest that by explicitly modeling the interaction between local image features and their transformations, the sparse bilinear approach can provide a basis for achieving transformation-invariant vision.

18.
In this paper, we consider modeling data lying on multiple continuous manifolds. In particular, we model the shape manifold of a person performing a motion observed from different viewpoints along a view circle at a fixed camera height. We introduce a model that ties together the body configuration (kinematics) manifold and the visual (observations) manifold in a way that facilitates tracking the 3D configuration with continuous relative view variability. The model exploits the low-dimensional nature of both the body configuration manifold and the view manifold, each of which is represented separately. The resulting representation is used for tracking complex motions within a Bayesian framework, in which the model provides a low-dimensional state representation as well as a constrained dynamic model for both body configuration and view variations. Experimental results estimating the 3D body posture from a single camera are presented for the HUMANEVA dataset and other complex motion video sequences.

19.
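The em Algorithm for Multilayer Stochastic Neural Networks   Total citations: 3, self-citations: 1, citations by others: 2
This paper discusses a learning algorithm for stochastic neural networks based on the framework of differential manifolds, called the em learning algorithm. For the multilayer stochastic neural network model, we analyze its dually flat manifold structure from the viewpoint of differential manifolds, and describe the implementation of the em algorithm as a learning algorithm for multilayer feedforward stochastic neural networks, along with acceleration techniques.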

20.
Probabilistic Detection and Tracking of Motion Boundaries   Total citations: 5, self-citations: 1, citations by others: 4
We propose a Bayesian framework for representing and recognizing local image motion in terms of two basic models: translational motion and motion boundaries. Motion boundaries are represented using a non-linear generative model that explicitly encodes the orientation of the boundary, the velocities on either side, the motion of the occluding edge over time, and the appearance/disappearance of pixels at the boundary. We represent the posterior probability distribution over the model parameters given the image data using discrete samples. This distribution is propagated over time using a particle filtering algorithm. To efficiently represent such a high-dimensional space, we initialize samples using the responses of a low-level motion discontinuity detector. The formulation and computational model provide a general probabilistic framework for motion estimation with multiple non-linear models.
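A particle filter of the kind this abstract describes represents the posterior by weighted samples and periodically resamples them so that high-weight samples are duplicated and low-weight ones are dropped. The sketch below shows systematic resampling, one common scheme. The abstract does not say which resampling scheme the authors use, and the function name and the explicit offset argument u (normally drawn uniformly from [0, 1)) are assumptions made so that the example is deterministic.

```python
def systematic_resample(weights, u):
    """Return indices of the particles kept by systematic resampling.

    weights: non-negative particle weights (need not be normalized).
    u: offset in [0, 1); in practice drawn once from a uniform distribution.
    """
    n = len(weights)
    total = sum(weights)
    indices = []
    i = 0
    cumulative = weights[0] / total
    for k in range(n):
        # Evenly spaced positions that share the single random offset u.
        position = (u + k) / n
        # Advance to the particle whose cumulative weight covers this position.
        while position > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

A call such as `systematic_resample([0.1, 0.2, 0.7], 0.5)` keeps the heaviest particle twice and discards the lightest one. Using one shared offset rather than independent draws keeps the variance of the resampling step low.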
