Similar Literature
20 similar documents found.
1.
The ill-posed nature of missing variable models offers a challenging testing ground for new computational techniques. This is the case for mean-field variational Bayesian inference. The behavior of this approach in the setting of the Bayesian probit model is illustrated. It is shown that the mean-field variational method always underestimates the posterior variance and that, for small sample sizes, the mean-field variational approximation to the posterior location can be poor.
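A minimal sketch of the mean-field update cycle for a Bayesian probit model, assuming the standard latent-variable formulation y_i = 1{z_i > 0} with z_i ~ N(x_i'beta, 1), a N(0, tau^2 I) prior on beta, and the factorization q(beta)q(z); the prior variance, iteration count, and function name are illustrative choices, not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def mfvb_probit(X, y, tau2=100.0, n_iter=50):
    """Mean-field variational Bayes for a Bayesian probit model (sketch).

    Assumed model: y_i = 1{z_i > 0}, z_i ~ N(x_i' beta, 1), beta ~ N(0, tau2 * I),
    with the factorized approximation q(beta, z) = q(beta) q(z)."""
    n, p = X.shape
    S = np.linalg.inv(X.T @ X + np.eye(p) / tau2)   # q(beta) covariance (does not change)
    m = np.zeros(p)                                  # q(beta) mean
    for _ in range(n_iter):
        eta = X @ m
        # E[z_i] under q(z_i): mean of a normal truncated to z > 0 (y = 1) or z < 0 (y = 0)
        pos = eta + norm.pdf(eta) / np.clip(norm.cdf(eta), 1e-12, None)
        neg = eta - norm.pdf(eta) / np.clip(norm.cdf(-eta), 1e-12, None)
        Ez = np.where(y == 1, pos, neg)
        m = S @ (X.T @ Ez)                           # update q(beta) mean
    return m, S
```

The fixed covariance S is what the abstract refers to: the variational posterior variance does not adapt to the data beyond X, which is why it systematically understates the true posterior variance.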

2.
Convex multi-task feature learning
We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across-tasks sparse representations for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select—not learn—a few common variables across the tasks. Editors: Daniel Silver, Kristin Bennett, Richard Caruana. This is a longer version of the conference paper (Argyriou et al. in Advances in neural information processing systems, vol. 19, 2007a). It includes new theoretical and experimental results.
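A rough sketch of the alternating supervised/unsupervised scheme described above for squared-loss tasks, assuming the variational form of the regularizer with a shared matrix D (positive semidefinite, trace at most one); the epsilon perturbation, the fixed iteration count, and the function name are illustrative, not the paper's implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def multitask_feature_learning(Xs, ys, gamma=1.0, eps=1e-6, n_iter=50):
    """Alternating minimization sketch for convex multi-task feature learning.

    Assumed objective: sum_t ||X_t w_t - y_t||^2 + gamma * sum_t w_t' D^{-1} w_t,
    over D >= 0 with trace(D) <= 1 (eps keeps D invertible)."""
    d = Xs[0].shape[1]
    D = np.eye(d) / d                       # start from the isotropic matrix
    W = np.zeros((d, len(Xs)))
    for _ in range(n_iter):
        Dinv = np.linalg.inv(D)
        # supervised step: per-task generalized ridge regression given D
        for t, (X, y) in enumerate(zip(Xs, ys)):
            W[:, t] = np.linalg.solve(X.T @ X + gamma * Dinv, X.T @ y)
        # unsupervised step: closed-form update of the shared matrix D
        C = np.real(sqrtm(W @ W.T + eps * np.eye(d)))
        D = C / np.trace(C)
    return W, D
```

As D concentrates its trace on a few directions, the corresponding rows of W shrink jointly across tasks, which is the mechanism behind the shared sparse representation.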

3.
We propose a simulation-based method for calculating maximum likelihood estimators in latent variable models. The proposed method integrates a recently developed sampling strategy, the so-called Sample Average Approximation method, to efficiently compute high quality solutions of the estimation problem. Theoretical and algorithmic properties of the method are discussed. A computational study, involving two numerical examples, is presented to highlight a significant improvement of the proposed approach over existing methods.
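To illustrate the Sample Average Approximation idea, the sketch below fixes one set of Monte Carlo draws for the latent variables (common random numbers) so that the simulated marginal log-likelihood becomes a deterministic function that a standard optimizer can maximize. The random-intercept logistic model, the variable names, and the number of draws are illustrative assumptions; the paper's own numerical examples are not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def saa_negloglik(params, X, y, groups, U):
    """SAA of the marginal negative log-likelihood of an illustrative latent
    variable model: y_ij ~ Bernoulli(expit(x_ij' beta + b_i)), b_i ~ N(0, sigma^2).
    U holds a FIXED list of standard normal draws per group, so this function is
    deterministic in the parameters."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    ll = 0.0
    for i, idx in enumerate(groups):
        eta = X[idx] @ beta                          # (n_i,)
        lin = eta[:, None] + sigma * U[i][None, :]   # (n_i, K) latent draws
        p = expit(lin)
        lik_k = np.prod(np.where(y[idx, None] == 1, p, 1 - p), axis=0)
        ll += np.log(lik_k.mean() + 1e-300)          # Monte Carlo average per group
    return -ll

# usage sketch: K = 200 fixed draws per group, then a deterministic optimizer
# rng = np.random.default_rng(0); U = [rng.standard_normal(200) for _ in groups]
# res = minimize(saa_negloglik, np.zeros(X.shape[1] + 1), args=(X, y, groups, U))
```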

4.
In the behavioral, biomedical, and social-psychological sciences, mixed data types such as continuous, ordinal, count, and nominal are common. Subpopulations also often exist and contribute to heterogeneity in the data. In this paper, we propose a mixture of generalized latent variable models (GLVMs) to handle mixed types of heterogeneous data. Different link functions are specified to model data of multiple types. A Bayesian approach, together with the Markov chain Monte Carlo (MCMC) method, is used to conduct the analysis. A modified DIC is used for model selection of mixture components in the GLVMs. A simulation study shows that our proposed methodology performs satisfactorily. An application of mixture GLVM to a data set from the National Longitudinal Surveys of Youth (NLSY) is presented.

5.
In this paper, we propose a novel visual tracking algorithm that combines generative and discriminative trackers under the particle filter framework. Each particle denotes a single task, and we encode all the tasks simultaneously in a structured multi-task learning manner. We then implement generative and discriminative trackers, respectively. The discriminative tracker uses the overall information of the object to represent its appearance, while the generative tracker takes the local information of the object into account to handle partial occlusions; the two models are therefore complementary during tracking. Furthermore, we design an effective dictionary updating mechanism. The dictionary is composed of fixed and variational parts, and the variational parts are progressively updated using a Metropolis–Hastings strategy. Experiments on different challenging video sequences demonstrate that the proposed tracker performs favorably against several state-of-the-art trackers.

6.
Parameter constraints in generalized linear latent variable models are discussed. Both linear equality and inequality constraints are considered. Maximum likelihood estimators for the parameters of the constrained model and corrected standard errors are derived. A significant reduction in the dimension of the optimization problem is achieved with the proposed methodology for fitting models subject to linear equality constraints.
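The dimension reduction for linear equality constraints can be pictured with a generic null-space reparameterization: with A theta = b, writing theta = theta_p + N phi (columns of N spanning the null space of A) turns the constrained fit into an unconstrained one over fewer parameters. This is a generic sketch under that assumption, not the paper's estimator; it omits inequality constraints and the corrected standard errors.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import minimize

def fit_with_equality_constraints(neg_loglik, A, b, **kwargs):
    """Fit theta by maximum likelihood subject to A @ theta = b.

    theta = theta_p + N @ phi, where theta_p is a particular solution and N spans
    null(A); the constraint then holds for every phi, and the optimization runs
    over only dim(theta) - rank(A) free parameters."""
    theta_p = np.linalg.lstsq(A, b, rcond=None)[0]   # particular solution
    N = null_space(A)                                # basis of the null space of A
    obj = lambda phi: neg_loglik(theta_p + N @ phi)  # unconstrained reduced objective
    res = minimize(obj, np.zeros(N.shape[1]), **kwargs)
    return theta_p + N @ res.x, res
```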

7.
The identification of linear parameter-varying systems in an input-output setting is investigated, focusing on the case when the noise part of the data-generating system is additive colored noise. In the Box-Jenkins and output-error cases, it is shown that the currently available linear regression and instrumental variable methods from the literature are far from optimal in terms of the bias and variance of the estimates. To overcome the underlying problems, a refined instrumental variable method is introduced. The proposed approach is compared to the existing methods via a representative simulation example.
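For orientation, the basic instrumental variable step underlying such methods is shown below for a linear-in-parameters regression form; the refined scheme in the paper additionally prefilters the data with the current noise model and rebuilds the instruments from the current plant model at each iteration, which this minimal sketch does not attempt.

```python
import numpy as np

def iv_estimate(Phi, Z, y):
    """Basic instrumental variable estimate for y = Phi @ theta + v with colored
    noise v: theta_hat = (Z' Phi)^{-1} Z' y, where the instrument matrix Z is
    correlated with the regressors in Phi but uncorrelated with the noise."""
    return np.linalg.solve(Z.T @ Phi, Z.T @ y)
```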

8.
To address the low accuracy of Bayesian network (BN) parameter learning on small data sets, the necessity of variable-weight design for BN parameter learning in the small-sample setting is analyzed, and a BN parameter learning algorithm based on variable-weight fusion, VWPL, is proposed. First, inequality constraints are determined from expert experience, the minimum sample-size threshold for parameter learning is computed, and a weight-factor function that varies with the sample size is designed. Then an initial parameter set is computed from the samples, the parameters are expanded by the Bootstrap method to obtain a candidate parameter set satisfying the constraints, and the final BN parameters are obtained by substituting the candidates into the variable-weight parameter computation model. Experimental results show that, when the learning data are scarce, the learning accuracy of VWPL is higher than that of the MLE and QMAP algorithms, and also better than that of fixed-weight learning. In addition, VWPL was successfully applied to a bearing fault diagnosis experiment, providing a method for BN parameter estimation on small data sets.
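A minimal sketch of the weighted-fusion step implied by the abstract: a data-driven estimate and a constraint-driven estimate are blended with a weight that grows with the sample size. The linear ramp weight, the function name, and the single-CPT-column scope are illustrative assumptions, not the paper's exact VWPL formula.

```python
import numpy as np

def variable_weight_fusion(theta_mle, theta_constraint, n, n_min):
    """Fuse one conditional distribution for a BN node (sketch).

    theta_mle: estimate from the available data; theta_constraint: estimate derived
    from expert constraints (e.g. an average of bootstrap-expanded candidates that
    satisfy the inequality constraints); n: observed sample count; n_min: minimum
    sample-size threshold.  The linear, capped weight below is an assumption."""
    w = min(n / float(n_min), 1.0)           # data weight grows with sample size
    theta = w * np.asarray(theta_mle, float) + (1 - w) * np.asarray(theta_constraint, float)
    return theta / theta.sum()               # renormalize to a valid distribution
```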

9.
Antti, Jeremias, Esa. Neurocomputing, 2008, 71(7-9): 1311-1320
Independent variable group analysis (IVGA) is a method for grouping dependent variables together while keeping mutually independent or weakly dependent variables in separate groups. In this paper two variants of an agglomerative method for learning a hierarchy of IVGA groupings are presented. The method resembles hierarchical clustering, but the choice of clusters to merge is based on variational Bayesian model comparison. This is approximately equivalent to using a distance measure based on a model-based approximation of mutual information between groups of variables. The approach also allows determining optimal cutoff points for the hierarchy. The method is demonstrated to find sensible groupings of variables that can be used for feature selection and ease construction of a predictive model.
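The agglomerative idea can be sketched as hierarchical clustering driven by a model-comparison score rather than a geometric distance. In the sketch below, `group_cost` is assumed to return a score such as a negative variational lower bound for a model of the given variable group (lower is better); the exhaustive pair search and the returned merge history are illustrative simplifications, not the paper's two variants.

```python
def agglomerative_grouping(variables, group_cost):
    """Greedy bottom-up grouping: repeatedly merge the pair of groups whose joint
    model improves most (or degrades least) over modeling them separately.
    Returns the merge history, from which a hierarchy and cutoff can be read off."""
    groups = [frozenset([v]) for v in variables]
    history = []
    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                merged = groups[i] | groups[j]
                # cost change of merging, relative to keeping the groups apart
                delta = group_cost(merged) - (group_cost(groups[i]) + group_cost(groups[j]))
                if best is None or delta < best[0]:
                    best = (delta, i, j, merged)
        delta, i, j, merged = best
        history.append((groups[i], groups[j], delta))
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return history
```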

10.
A Bayesian approach to variable selection which is based on the expected Kullback-Leibler divergence between the full model and its projection onto a submodel has recently been suggested in the literature. For generalized linear models an extension of this idea is proposed by considering projections onto subspaces defined via some form of L1 constraint on the parameter in the full model. This leads to Bayesian model selection approaches related to the lasso. In the posterior distribution of the projection there is positive probability that some components are exactly zero, and the posterior distribution on the model space induced by the projection allows exploration of model uncertainty. Use of the approach in structured variable selection problems such as ANOVA models is also considered, where it is desired to incorporate main effects in the presence of interactions. Projections related to the non-negative garotte are able to respect the hierarchical constraints. A consistency result is given concerning the posterior distribution on the model induced by the projection, showing that for some projections related to the adaptive lasso and non-negative garotte the posterior distribution concentrates on the true model asymptotically.
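In the Gaussian linear model case, the Kullback-Leibler projection of a posterior draw beta onto an L1-constrained subspace reduces to a lasso-type fit with pseudo-response X @ beta, which the sketch below exploits; the fixed penalty level `alpha` and the use of scikit-learn's Lasso are illustrative choices rather than the paper's procedure, which works with the constraint level itself.

```python
import numpy as np
from sklearn.linear_model import Lasso

def posterior_l1_projection(X, beta_draws, alpha=0.1):
    """Project each posterior draw of the full-model coefficients onto an
    L1-constrained subspace (Gaussian linear model sketch).

    beta_draws: array of shape (n_draws, p) from the full-model posterior.
    Returns the projected draws and the induced inclusion probabilities."""
    proj = np.empty_like(beta_draws)
    for s, beta in enumerate(beta_draws):
        lasso = Lasso(alpha=alpha, fit_intercept=False)
        lasso.fit(X, X @ beta)            # KL projection = lasso on pseudo-response
        proj[s] = lasso.coef_
    inclusion_prob = (np.abs(proj) > 1e-10).mean(axis=0)
    return proj, inclusion_prob
```

Because each projected draw can have exact zeros, the empirical frequencies in `inclusion_prob` give the posterior distribution over models that the abstract refers to.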

11.
Transfer in variable-reward hierarchical reinforcement learning
Transfer learning seeks to leverage previously learned tasks to achieve faster learning in a new task. In this paper, we consider transfer learning in the context of related but distinct Reinforcement Learning (RL) problems. In particular, our RL problems are derived from Semi-Markov Decision Processes (SMDPs) that share the same transition dynamics but have different reward functions that are linear in a set of reward features. We formally define the transfer learning problem in the context of RL as learning an efficient algorithm to solve any SMDP drawn from a fixed distribution after experiencing a finite number of them. Furthermore, we introduce an online algorithm to solve this problem, Variable-Reward Reinforcement Learning (VRRL), that compactly stores the optimal value functions for several SMDPs, and uses them to optimally initialize the value function for a new SMDP. We generalize our method to a hierarchical RL setting where the different SMDPs share the same task hierarchy. Our experimental results in a simplified real-time strategy domain show that significant transfer learning occurs in both flat and hierarchical settings. Transfer is especially effective in the hierarchical setting where the overall value functions are decomposed into subtask value functions which are more widely amenable to transfer across different SMDPs.
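A bare-bones sketch of the value-function reuse that variable-reward RL builds on: since the reward is r = w . phi(s, a), a fixed policy's value decomposes as V_pi(s) = w . Psi_pi(s), so stored feature-value tables can price any new weight vector. The class interface and the pointwise-maximum initialization below are illustrative simplifications, not the full VRRL algorithm.

```python
import numpy as np

class ValueFunctionLibrary:
    """Stores, for each previously solved SMDP's policy, the expected discounted
    reward-feature totals Psi_pi (an array of shape (n_states, n_features))."""

    def __init__(self):
        self.feature_values = []

    def add_policy(self, psi):
        self.feature_values.append(np.asarray(psi, dtype=float))

    def init_value_function(self, w):
        """Initial value estimate for a new SMDP with reward weights w: the
        pointwise maximum over stored policies of w . Psi_pi(s)."""
        if not self.feature_values:
            return None
        candidates = np.stack([psi @ w for psi in self.feature_values])
        return candidates.max(axis=0)
```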

12.
Bayesian neural networks are useful tools for estimating the functional structure of nonlinear systems. However, they suffer from complications such as controlling model complexity, long training times, efficient parameter estimation, random-walk behavior, and getting stuck in local optima in high-dimensional parameter cases. In this paper, to alleviate these problems, a novel hybrid Bayesian learning procedure is proposed. The approach is based on full Bayesian learning and integrates Markov chain Monte Carlo procedures with genetic algorithms and fuzzy membership functions. In the application sections, to examine the performance of the proposed approach, nonlinear time series and regression analyses are handled separately, and the method is compared with traditional training techniques in terms of estimation and prediction ability.

13.
We propose a model for a point-referenced spatially correlated ordered categorical response and methodology for inference. Models and methods for spatially correlated continuous response data are widespread, but models for spatially correlated categorical data, and especially ordered multi-category data, are less developed. Bayesian models and methodology have been proposed for the analysis of independent and clustered ordered categorical data, and also for binary and count point-referenced spatial data. We combine and extend these methods to describe a Bayesian model for point-referenced (as opposed to lattice) spatially correlated ordered categorical data. We include simulation results and show that our model offers superior predictive performance as compared to a non-spatial cumulative probit model and a more standard Bayesian generalized linear spatial model. We demonstrate the usefulness of our model in a real-world example to predict ordered categories describing stream health within the state of Maryland.

14.
Recently, there has been an increasing interest in directed probabilistic logical models, and a variety of formalisms for describing such models have been proposed. Although many authors provide high-level arguments to show that in principle models in their formalism can be learned from data, most of the proposed learning algorithms have not yet been studied in detail. We introduce an algorithm, generalized ordering-search, to learn both structure and conditional probability distributions (CPDs) of directed probabilistic logical models. The algorithm is based on the ordering-search algorithm for Bayesian networks. We use relational probability trees as a representation for the CPDs. We present experiments on a genetics domain, blocks world domains and the Cora dataset. Editors: Stephen Muggleton, Ramon Otero, Simon Colton.

15.
The success of the Semantic Web will rely heavily on the availability of formal ontologies to structure machine-understandable data. However, there is still a lack of general methodologies for automatic ontology learning and population, i.e. the generation of domain ontologies from various kinds of resources by applying natural language processing and machine learning techniques. In this paper, the authors present an ontology learning and population system that combines both statistical and semantic methodologies. Several experiments have been carried out, demonstrating the effectiveness of the proposed approach.

16.
In transfer learning the aim is to solve new learning tasks using fewer examples by using information gained from solving related tasks. Existing transfer learning methods have been used successfully in practice, and PAC analyses of these methods have been developed. But the key notion of relatedness between tasks has not yet been defined clearly, which makes it difficult to understand, let alone answer, questions that naturally arise in the context of transfer, such as how much information to transfer, whether to transfer information, and how to transfer information across tasks. In this paper, we look at transfer learning from the perspective of Algorithmic Information Theory/Kolmogorov complexity theory, and formally solve these problems in the same sense that Solomonoff Induction solves the problem of inductive inference. We define universal measures of relatedness between tasks, and use these measures to develop universally optimal Bayesian transfer learning methods. We also derive results in AIT that are interesting by themselves. To address a concern that arises from the theory, we also briefly look at the notion of Kolmogorov complexity of probability measures. Finally, we present a simple practical approximation to the theory to do transfer learning and show that even this is quite effective, allowing us to transfer across tasks that are superficially unrelated. The latter is an experimental feat which has not been achieved before, and thus shows the theory is also useful in constructing practical transfer algorithms.

17.
Transfer learning is the ability to apply previously learned knowledge to new problems or domains. In qualitative reasoning, model formulation is the process of moving from the unruly, broad set of concepts used in everyday life to a concise, formal vocabulary of abstractions, assumptions, causal relationships, and models that support problem-solving. Approaching transfer learning from a model formulation perspective, we found that analogy with examples can be used to learn how to solve AP Physics style problems. We call this process analogical model formulation and implement it in the Companion cognitive architecture. A Companion begins with some basic mathematical skills, a broad common sense ontology, and some qualitative mechanics, but no equations. The Companion uses worked solutions, explanations of example problems at the level of detail appearing in textbooks, to learn what equations are relevant, how to use them, and the assumptions necessary to solve physics problems. We present an experiment, conducted by the Educational Testing Service, demonstrating that analogical model formulation enables a Companion to learn to solve AP Physics style problems. Across six different variations of relationships between base and target problems, or transfer levels, a Companion exhibited a 63% improvement in initial performance. While this is already a significant result, we describe an in-depth analysis of the experiment to pinpoint the causes of failures. Interestingly, the sources of failures were primarily errors in the externally generated problem and worked solution representations, as well as some domain-specific problem-solving strategies, not analogical model formulation. To verify this, we describe a second experiment which was performed after fixing these problems. In this second experiment, a Companion achieved a 95.8% improvement in initial performance due to transfer, which is nearly perfect. We know of no other problem-solving experiments which demonstrate performance of analogical learning over systematic variations of relationships between problems at this scale.

18.
This article presents an overview of Probabilistic Automata (PA) and discrete Hidden Markov Models (HMMs), and aims at clarifying the links between them. The first part of this work concentrates on probability distributions generated by these models. Necessary and sufficient conditions for an automaton to define a probabilistic language are detailed. It is proved that probabilistic deterministic automata (PDFA) form a proper subclass of probabilistic non-deterministic automata (PNFA). Two families of equivalent models are described next. On one hand, HMMs and PNFA with no final probabilities generate distributions over complete finite prefix-free sets. On the other hand, HMMs with final probabilities and probabilistic automata generate distributions over strings of finite length. The second part of this article presents several learning models, which formalize the problem of PA induction or, equivalently, the problem of HMM topology induction and parameter estimation. These learning models include the PAC and identification with probability 1 frameworks. Links with Bayesian learning are also discussed. The last part of this article presents an overview of induction algorithms for PA or HMMs using state merging, state splitting, parameter pruning and error-correcting techniques.
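The second family of equivalent models, distributions over finite-length strings, can be made concrete with the forward recursion below; the state-emission convention and variable names are assumptions of this sketch, not taken from the article.

```python
import numpy as np

def string_probability(pi, A, B, final, symbols):
    """Probability of a finite string under a state-emission HMM with final
    (stopping) probabilities, i.e. a probabilistic automaton over finite strings.

    pi: initial state distribution (n,); A: state transition matrix (n, n);
    B: emission probabilities (n, n_symbols); final: stopping probabilities (n,);
    symbols: sequence of observed symbol indices.
    Consistency assumption: A.sum(axis=1) + final == 1 for every state."""
    if len(symbols) == 0:
        return float(pi @ final)
    alpha = pi * B[:, symbols[0]]          # emit the first symbol in the start state
    for o in symbols[1:]:
        alpha = (alpha @ A) * B[:, o]      # transition, then emit the next symbol
    return float(alpha @ final)            # stop after the last symbol
```

Summing this quantity over all strings gives 1, which is exactly what the final probabilities buy compared to the prefix-free-set case without them.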

19.
This article addresses some problems in outlier detection and variable selection in linear regression models. First, in outlier detection there are problems known as smearing and masking. Smearing means that one outlier makes another, non-outlier observation appear as an outlier, and masking means that one outlier prevents another from being detected. Detecting outliers one by one may therefore give misleading results. In this article a genetic algorithm is presented which considers different possible groupings of the data into outlier and non-outlier observations. In this way all outliers are detected at the same time. Second, it is known that outlier detection and variable selection can influence each other, and that different results may be obtained depending on the order in which these two tasks are performed. It may therefore be useful to consider these tasks simultaneously, and a genetic algorithm for simultaneous outlier detection and variable selection is suggested. Two real data sets are used to illustrate the algorithms, which are shown to work well. In addition, the scalability of the algorithms is considered in an experiment using generated data. I would like to thank Dr Tero Aittokallio and an anonymous referee for useful comments.
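One way to picture the simultaneous search is a chromosome that concatenates an outlier flag per observation with an inclusion flag per variable, scored by a fit criterion on the retained data. The BIC-plus-penalty fitness, the GA operators, and all parameter values below are illustrative assumptions, not the article's exact algorithm.

```python
import numpy as np

def fitness(chrom, X, y, n_obs, outlier_penalty=2.0):
    """Score one chromosome: the first n_obs bits flag outliers (1 = drop), the
    remaining bits select variables (1 = keep).  Score = BIC of an OLS fit on the
    retained data plus a per-outlier penalty (penalty form is an assumption)."""
    keep_obs = chrom[:n_obs] == 0
    keep_var = chrom[n_obs:] == 1
    if keep_var.sum() == 0 or keep_obs.sum() <= keep_var.sum() + 1:
        return np.inf                                    # degenerate subset
    Xs, ys = X[keep_obs][:, keep_var], y[keep_obs]
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    rss = float(np.sum((ys - Xs @ beta) ** 2))
    n, k = Xs.shape
    bic = n * np.log(rss / n + 1e-12) + k * np.log(n)
    return bic + outlier_penalty * np.log(len(y)) * (~keep_obs).sum()

def genetic_search(X, y, pop_size=60, n_gen=100, p_mut=0.02, seed=0):
    """Plain GA over joint outlier/variable masks: tournament selection, uniform
    crossover, bit-flip mutation, with elitism.  Returns the best masks found."""
    rng = np.random.default_rng(seed)
    n_obs, n_var = X.shape
    L = n_obs + n_var
    pop = rng.integers(0, 2, size=(pop_size, L))
    for _ in range(n_gen):
        scores = np.array([fitness(c, X, y, n_obs) for c in pop])
        new = [pop[scores.argmin()].copy()]              # keep the current best
        while len(new) < pop_size:
            p = rng.integers(pop_size, size=4)           # two binary tournaments
            a = pop[p[0]] if scores[p[0]] < scores[p[1]] else pop[p[1]]
            b = pop[p[2]] if scores[p[2]] < scores[p[3]] else pop[p[3]]
            mask = rng.integers(0, 2, size=L).astype(bool)
            child = np.where(mask, a, b)                 # uniform crossover
            flip = rng.random(L) < p_mut
            child = np.where(flip, 1 - child, child)     # bit-flip mutation
            new.append(child)
        pop = np.array(new)
    scores = np.array([fitness(c, X, y, n_obs) for c in pop])
    best = pop[scores.argmin()]
    return best[:n_obs] == 1, best[n_obs:] == 1          # outlier flags, selected variables
```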

20.
We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max–Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC’s ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
