Similar Documents
20 similar documents found (search time: 31 ms)
1.
We present a unified probabilistic framework for statistical language modeling which can simultaneously incorporate various aspects of natural language, such as local word interaction, syntactic structure and semantic document information. Our approach is based on a recent statistical inference principle we have proposed—the latent maximum entropy principle—which allows relationships over hidden features to be effectively captured in a unified model. Our work extends previous research on maximum entropy methods for language modeling, which only allow observed features to be modeled. The ability to conveniently incorporate hidden variables allows us to extend the expressiveness of language models while alleviating the necessity of pre-processing the data to obtain explicitly observed features. We describe efficient algorithms for marginalization, inference and normalization in our extended models. We then use these techniques to combine two standard forms of language models: local lexical models (Markov N-gram models) and global document-level semantic models (probabilistic latent semantic analysis). Our experimental results on the Wall Street Journal corpus show that we obtain an 18.5% reduction in perplexity compared to the baseline tri-gram model with Good-Turing smoothing. Editors: Dan Roth and Pascale Fung
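The combination in the paper is carried out inside a latent maximum entropy model, but the baseline idea of mixing a local N-gram with a global document-level model, and scoring the result by perplexity, can be sketched with simple linear interpolation. All probabilities and the interpolation weight below are made-up toy values, not figures from the paper:

```python
import math

def perplexity(probs):
    """Perplexity = exp of the average negative log-probability."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Toy per-word probabilities assigned to the same test sentence by a
# local N-gram model and a global document-level model (illustrative only).
p_ngram  = [0.10, 0.20, 0.05, 0.15]
p_global = [0.08, 0.30, 0.10, 0.12]

lam = 0.7  # interpolation weight (hypothetical)
p_mixed = [lam * a + (1 - lam) * b for a, b in zip(p_ngram, p_global)]

ppl_ngram = perplexity(p_ngram)
ppl_mixed = perplexity(p_mixed)
```

A lower `ppl_mixed` than `ppl_ngram` mirrors, in miniature, the kind of perplexity reduction the abstract reports for the full model.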

2.
We study the problem of learning to infer hidden-state sequences of processes whose states and observations are propositionally or relationally factored. Unfortunately, standard exact inference techniques such as Viterbi and graphical model inference exhibit exponential complexity for these processes. The main motivation behind our work is to identify a restricted space of models, which facilitate efficient inference, yet are expressive enough to remain useful in many applications. In particular, we present the penalty-logic simple-transition model, which utilizes a very simple transition structure where the transition cost between any two states is constant. While not appropriate for all complex processes, we argue that it is often rich enough in many applications of interest, and when it is applicable there can be inference and learning advantages compared to more general models. In particular, we show that sequential inference for this model, that is, finding a minimum-cost state sequence, efficiently reduces to a single-state minimization (SSM) problem. We then show how to define atemporal cost models in terms of penalty logic, or weighted logical constraints, and how to use this representation for practically efficient SSM computation. We present a method for learning the weights of our model from labeled training data based on Perceptron updates. Finally, we give experiments in both propositional and relational video-interpretation domains showing advantages compared to more general models.
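The key computational point, that a constant transition cost collapses sequential inference to per-step minimization plus one shared running minimum, can be sketched as follows. This uses a toy explicit state space; the paper works with factored states and penalty logic rather than this enumeration:

```python
def min_cost_sequence(local_cost, c):
    """Minimum total cost of a state sequence when every transition between
    two *different* states costs a constant c (simple-transition structure).

    local_cost: list over time steps of dicts mapping state -> local cost.
    Runs in O(T * S) instead of the O(T * S^2) of general Viterbi, because
    a single running minimum over the previous step serves all states.
    """
    states = list(local_cost[0].keys())
    dp = dict(local_cost[0])
    for t in range(1, len(local_cost)):
        best_prev = min(dp.values())  # one shared minimum replaces the S^2 scan
        dp = {s: local_cost[t][s] + min(dp[s], best_prev + c) for s in states}
    return min(dp.values())

# Toy 3-step, 2-state problem (costs are illustrative).
costs = [{"a": 1, "b": 5}, {"a": 4, "b": 0}, {"a": 1, "b": 5}]
total = min_cost_sequence(costs, c=2)
```

Staying in a state costs nothing extra (`dp[s]`), while switching pays `c` from whichever previous state was cheapest (`best_prev + c`).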

3.
Latent variable models, such as the GPLVM and related methods, help mitigate overfitting when learning from small or moderately sized training sets. Nevertheless, existing methods suffer from several problems: 1) complexity, 2) the lack of explicit mappings to and from the latent space, 3) an inability to cope with multimodality, and 4) the lack of a well-defined density over the latent space. We propose an LVM called the Kernel Information Embedding (KIE) that defines a coherent joint density over the input and a learned latent space. Learning is quadratic, and it works well on small data sets. We also introduce a generalization, the shared KIE (sKIE), that allows us to model multiple input spaces (e.g., image features and poses) using a single, shared latent representation. KIE and sKIE permit missing data during inference and partially labeled data during learning. We show that with data sets too large to learn a coherent global model, one can use the sKIE to learn local online models. We use sKIE for human pose inference.

4.
An Introduction to Variational Methods for Graphical Models
This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields). We present a number of examples of graphical models, including the QMR-DT database, the sigmoid belief network, the Boltzmann machine, and several variants of hidden Markov models, in which it is infeasible to run exact inference algorithms. We then introduce variational methods, which exploit laws of large numbers to transform the original graphical model into a simplified graphical model in which inference is efficient. Inference in the simplified model provides bounds on probabilities of interest in the original model. We describe a general framework for generating variational transformations based on convex duality. Finally, we return to the examples and demonstrate how variational algorithms can be formulated in each case.
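A minimal instance of the idea, replacing an intractable joint distribution by a tractable factorized one, is coordinate-ascent mean field on a toy three-node Boltzmann machine. The weights below are invented, and exact marginals are computable here only because the model is tiny:

```python
import itertools
import math

# Toy Boltzmann machine: p(x) ∝ exp( Σ_i b_i x_i + Σ_{i<j} W_ij x_i x_j ), x_i ∈ {0,1}.
b = [0.5, -0.2, 0.1]                           # biases (illustrative values)
W = {(0, 1): 1.0, (1, 2): -0.5, (0, 2): 0.3}   # pairwise couplings

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_field(iters=200):
    """Coordinate-ascent mean field: mu_i <- sigmoid(b_i + Σ_j W_ij mu_j)."""
    mu = [0.5, 0.5, 0.5]
    for _ in range(iters):
        for i in range(3):
            field = b[i]
            for (j, k), w in W.items():
                if i == j:
                    field += w * mu[k]
                elif i == k:
                    field += w * mu[j]
            mu[i] = sigmoid(field)
    return mu

def exact_marginals():
    """Brute-force marginals P(x_i = 1); feasible only for tiny models."""
    Z, m = 0.0, [0.0, 0.0, 0.0]
    for x in itertools.product([0, 1], repeat=3):
        e = sum(b[i] * x[i] for i in range(3))
        e += sum(w * x[j] * x[k] for (j, k), w in W.items())
        p = math.exp(e)
        Z += p
        for i in range(3):
            m[i] += p * x[i]
    return [mi / Z for mi in m]

mf, ex = mean_field(), exact_marginals()
```

For weak-to-moderate couplings like these, the factorized marginals land close to the exact ones; the tutorial's point is that the factorized model additionally yields bounds on quantities in the original model.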

5.
In this work, different global optimization techniques are assessed for the automated development of molecular force fields, as used in molecular dynamics and Monte Carlo simulations. The quest of finding suitable force field parameters is treated as a mathematical minimization problem. Intricate problem characteristics such as extremely costly and even abortive simulations, noisy simulation results, and especially multiple local minima naturally lead to the use of sophisticated global optimization algorithms. Five diverse algorithms (pure random search, recursive random search, CMA-ES, differential evolution, and taboo search) are compared to our own tailor-made solution named CoSMoS. CoSMoS is an automated workflow. It models the parameters’ influence on the simulation observables to detect a globally optimal set of parameters. It is shown how and why this approach is superior to other algorithms. Applied to suitable test functions and simulations for phosgene, CoSMoS effectively reduces the number of required simulations and real time for the optimization task.
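As a concrete reference point, the simplest of the compared baselines, pure random search, fits in a few lines. The quadratic toy objective below merely stands in for an expensive simulation-based loss, and the bounds and sample budget are arbitrary:

```python
import random

def pure_random_search(f, bounds, n_samples, seed=0):
    """Pure random search: sample uniformly in the box, keep the best point."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_samples):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Toy stand-in for a simulation loss: distance to a "true" parameter set at (1, 1).
def sphere(x):
    return sum((xi - 1.0) ** 2 for xi in x)

x, fx = pure_random_search(sphere, [(-5, 5)] * 2, 2000)
```

Methods like CMA-ES or a surrogate-modeling workflow earn their keep precisely because each evaluation of the real objective is a costly simulation, where spending 2000 samples this way would be prohibitive.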

6.
Renewable energy technologies are developing rapidly, and in the last decade wind energy has attracted great interest, driven by the energy crisis and the serious environmental problems caused by fossil fuels; as a result, a large number of wind farms have been installed around the world. Meanwhile, the ability of nature-inspired algorithms to handle combinatorial optimization problems efficiently has been proved by their successful implementation in many fields of engineering science. In this study, a new problem formulation for the optimum layout design of onshore wind farms is presented, in which the wind load is modeled using stochastic fields. A metaheuristic search algorithm based on a discrete variant of the harmony search method is used to solve the problem. The farm layout problem is by nature a constrained optimization problem in which the contribution of wake effects is significant; therefore, two of the formulations presented in this study also take the influence of wind direction into account and are compared with a scenario in which the wake effect is ignored. The results prove the applicability of the proposed formulations and the efficiency of combining metaheuristic optimization with stochastic wind loading for the optimal layout design of wind farms.
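The discrete harmony search at the core of the method can be sketched generically. The control parameters (`hms`, `hmcr`, `par`) are standard harmony-search knobs with illustrative values, and the one-dimensional "layout" objective below is a made-up surrogate that penalizes turbine closeness, not the paper's stochastic wake model:

```python
import random

def harmony_search(cost, domain, hms=10, hmcr=0.9, par=0.3, iters=500, seed=1):
    """Minimal discrete harmony search: keep a memory of hms solutions and
    build new ones by memory consideration, pitch adjustment, or random picks.
    domain: list of allowed discrete values per decision variable."""
    rng = random.Random(seed)
    memory = [[rng.choice(vals) for vals in domain] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for i, vals in enumerate(domain):
            if rng.random() < hmcr:                    # memory consideration
                v = rng.choice(memory)[i]
                if rng.random() < par:                 # pitch adjustment
                    j = vals.index(v) + rng.choice([-1, 1])
                    v = vals[max(0, min(len(vals) - 1, j))]
            else:                                      # random selection
                v = rng.choice(vals)
            new.append(v)
        worst = max(memory, key=cost)
        if cost(new) < cost(worst):
            memory[memory.index(worst)] = new
    return min(memory, key=cost)

# Toy surrogate: choose x-positions for 3 turbines on a line of 10 cells,
# maximizing total pairwise separation (a crude stand-in for wake losses).
cells = list(range(10))
def layout_cost(xs):
    return -sum(abs(a - b) for i, a in enumerate(xs) for b in xs[i + 1:])

best = harmony_search(layout_cost, [cells] * 3)
```

In the real problem each cost evaluation would involve the wake model under stochastic wind loading, which is why a derivative-free metaheuristic is the natural fit.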

7.
Until recently, the lack of ground truth data has hindered the application of discriminative structured prediction techniques to the stereo problem. In this paper we use ground truth data sets that we have recently constructed to explore different model structures and parameter learning techniques. To estimate parameters in Markov random fields (MRFs) via maximum likelihood one usually needs to perform approximate probabilistic inference. Conditional random fields (CRFs) are discriminative versions of traditional MRFs. We explore a number of novel CRF model structures including a CRF for stereo matching with an explicit occlusion model. CRFs require expensive inference steps for each iteration of optimization and inference is particularly slow when there are many discrete states. We explore belief propagation, variational message passing and graph cuts as inference methods during learning and compare with learning via pseudolikelihood. To accelerate approximate inference we have developed a new method called sparse variational message passing which can reduce inference time by an order of magnitude with negligible loss in quality. Learning using sparse variational message passing improves upon previous approaches using graph cuts and allows efficient learning over large data sets when energy functions violate the constraints imposed by graph cuts.

8.
We present the design and implementation of a parallel exact inference algorithm on the Cell Broadband Engine (Cell BE) processor, a heterogeneous multicore architecture. Exact inference is a key problem in exploring probabilistic graphical models, where the computation complexity increases dramatically with the network structure and clique size. In this paper, we exploit parallelism in exact inference at multiple levels. We propose a rerooting method to minimize the critical path for exact inference, and an efficient scheduler to dynamically allocate SPEs. In addition, we explore potential table representation and layout to optimize DMA transfer between local store and main memory. We implemented the proposed method and conducted experiments on the Cell BE processor in the IBM QS20 Blade. We achieved speedups of up to 10× on the Cell, compared to state-of-the-art processors. The methodology proposed in this paper can be used for online scheduling of directed acyclic graph (DAG) structured computations.

9.
Simplifying Inference in Multiply Sectioned Bayesian Networks
Multiply sectioned Bayesian networks (MSBNs) introduce modularity and object-oriented ideas and are a powerful tool for modeling large, complex systems. At present, reducing the time and space complexity of local and global inference in MSBNs has become a key issue affecting their application. We first analyze the time and space complexity of two classical algorithms for inference in local Bayesian networks, prove that they are essentially equivalent, and give a unified theoretical explanation. We then show experimentally that the decisive factor in inference complexity is the induced width of the model's induced graph, and identify families of Bayesian networks that admit exact inference. Finally, we analyze the feasibility of reducing the complexity of global inference in MSBNs and give guiding principles for simplifying it.
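The quantity identified above as decisive, the induced width under a given elimination ordering, can be computed with a short routine: when a node is eliminated, its remaining neighbors are connected pairwise, and the induced width is the largest neighbor count seen at any elimination. The graphs and orderings below are toy examples:

```python
def induced_width(adj, order):
    """Induced width of an undirected graph under an elimination ordering.
    adj: dict mapping node -> set of neighbor nodes."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    width = 0
    for v in order:
        ns = adj.pop(v)
        width = max(width, len(ns))
        for a in ns:                 # connect v's neighbors pairwise (fill-in)
            adj[a] |= ns - {a}
            adj[a].discard(v)        # v is eliminated
    return width

# A 4-cycle a-b-c-d: eliminating around the cycle gives induced width 2.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
w = induced_width(cycle, ["a", "b", "c", "d"])
```

Exact inference cost is exponential in this width, which is why networks whose induced width stays small under some ordering are exactly the ones where exact inference remains practical.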

10.
We describe a method to use Spherical Gaussians with free directions and arbitrary sharpness and amplitude to approximate the precomputed local light field for any point on a surface in a scene. This allows for a high-quality reconstruction of these light fields in a manner that can be used to render the surfaces with precomputed global illumination in real-time with very low cost both in memory and performance. We also extend this concept to represent the illumination-weighted environment visibility, allowing for high-quality reflections of the distant environment with both surface-material properties and visibility taken into account. We treat obtaining the Spherical Gaussians as an optimization problem for which we train a Convolutional Neural Network to produce appropriate values for each of the Spherical Gaussians' parameters. We define this CNN in such a way that the produced parameters can be interpolated between adjacent local light fields while keeping the illumination in the intermediate points coherent.

11.
Objects can exhibit different dynamics at different spatio-temporal scales, a property that is often exploited by visual tracking algorithms. A local dynamic model is typically used to extract image features that are then used as inputs to a system for tracking the object using a global dynamic model. Approximate local dynamics may be brittle—point trackers drift due to image noise and adaptive background models adapt to foreground objects that become stationary—and constraints from the global model can make them more robust. We propose a probabilistic framework for incorporating knowledge about global dynamics into the local feature extraction processes. A global tracking algorithm can be formulated as a generative model and used to predict feature values thereby influencing the observation process of the feature extractor, which in turn produces feature values that are used in high-level inference. We combine such models utilizing a multichain graphical model framework. We show the utility of our framework for improving feature tracking as well as shape and motion estimates in a batch factorization algorithm. We also propose an approximate filtering algorithm appropriate for online applications and demonstrate its application to tasks in background subtraction, structure from motion and articulated body tracking.

12.
This paper deals with the wind speed prediction in wind farms, using spatial information from remote measurement stations. Owing to the temporal complexity of the problem, we employ local recurrent neural networks with internal dynamics, as advanced forecast models. To improve the prediction performance, the training task is accomplished using on-line learning algorithms based on the recursive prediction error (RPE) approach. A global RPE (GRPE) learning scheme is first developed where all adjustable weights are simultaneously updated. In the following, through weight grouping we devise a simplified method, the decoupled RPE (DRPE), with reduced computational demands. The partial derivatives required by the learning algorithms are derived using the adjoint model approach, adapted to the architecture of the networks being used. The efficiency of the proposed approach is tested on a real-world wind farm problem, where multi-step ahead wind speed estimates from 15 min to 3 h are sought. Extensive simulation results demonstrate that our models exhibit superior performance compared to other network types suggested in the literature. Furthermore, it is shown that the suggested learning algorithms outperform three gradient descent algorithms, in training of the recurrent forecast models.

13.
Accurate software estimation such as cost estimation, quality estimation and risk analysis is a major issue in software project management. In this paper, we present a soft computing framework to tackle this challenging problem. We first use a preprocessing neuro-fuzzy inference system to handle the dependencies among contributing factors and decouple the effects of the contributing factors into individual effects. Then we use a neuro-fuzzy bank to calibrate the parameters of contributing factors. In order to extend our framework into fields that lack an appropriate algorithmic model of their own, we propose a default algorithmic model that can be replaced when a better model is available. One feature of this framework is that the architecture is inherently independent of the choice of algorithmic models or the nature of the estimation problems. By integrating neural networks, fuzzy logic and algorithmic models into one scheme, this framework has learning ability, integration capability of both expert knowledge and project data, good interpretability, and robustness to imprecise and uncertain inputs. Validation using industry project data shows that the framework produces good results when used to predict software cost.

14.
Several algorithms have been proposed to retrieve near-surface wind fields from C-band synthetic aperture radar (SAR) images acquired over the ocean. They mainly differ in the way they retrieve the wind direction. Conventionally, the wind direction is taken from atmospheric models or is extracted from the linear features sometimes visible in SAR images. Recently, a new wind retrieval algorithm has been proposed, which also includes the Doppler shift induced by motions of the sea surface. In this article, we apply three wind retrieval algorithms, including the one using Doppler information, to three complex wind events encountered over the Black Sea and compare the SAR-derived wind fields with model wind fields calculated using the high-resolution weather research and forecasting (WRF) model. It is shown that the new algorithm is very efficient in resolving the 180° ambiguity in the wind direction, which is often a problem in the streak-based wind retrieval algorithms. However, the Doppler-based algorithm only yields good results for wind directions that have a significant component in the look direction of the SAR antenna. Furthermore, it is dependent on good separation of the contributions to the Doppler shift induced by surface currents and wind-related effects (wind drift and wind-sea components of the ocean wave spectrum). We conclude that an optimum wind retrieval algorithm should consist of a combination of the algorithms based on linear features and Doppler information.

15.
We consider in this paper the nonconvex mixed-integer nonlinear programming problem. We present a mixed local search method to find a local minimizer of an unconstrained nonconvex mixed-integer nonlinear programming problem. Then an auxiliary function which has the same global minimizers and the same global minimal value as the original problem is constructed. Minimization of the auxiliary function using our local search method can escape successfully from previously converged local minimizers by taking increasing values of parameters. For the constrained nonconvex mixed-integer nonlinear programming problem, we develop a penalty based method to convert the problem into an unconstrained one, and then use the above method to solve the latter problem. Numerical experiments and comparisons on a set of MINLP benchmark problems show the effectiveness of the proposed algorithm.
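The constrained-to-unconstrained conversion step can be sketched with a textbook quadratic penalty; the paper's exact penalty construction differs in detail, and the crude grid search below merely stands in for its mixed local search:

```python
def penalized(f, constraints, rho):
    """Quadratic penalty: add rho * sum(max(0, g(x))^2) for each constraint
    g(x) <= 0, turning a constrained problem into an unconstrained one."""
    def fp(x):
        return f(x) + rho * sum(max(0.0, g(x)) ** 2 for g in constraints)
    return fp

# Toy problem: minimize (x - 2)^2 subject to x <= 1 (optimum at x = 1).
def f(x):
    return (x - 2.0) ** 2

g = [lambda x: x - 1.0]            # g(x) <= 0  encodes  x <= 1
fp = penalized(f, g, rho=100.0)

# Search over a coarse mixed grid (integers and halves), standing in for
# the paper's mixed local search over continuous and integer variables.
grid = [i / 2 for i in range(-10, 11)]
best = min(grid, key=fp)
```

With a large enough `rho`, the unconstrained minimizer of `fp` coincides with the constrained optimum, here `x = 1`.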

16.
Probabilistic graphical models have had a tremendous impact in machine learning and approaches based on energy function minimization via techniques such as graph cuts are now widely used in image segmentation. However, the free parameters in energy function-based segmentation techniques are often set by hand or using heuristic techniques. In this paper, we explore parameter learning in detail. We show how probabilistic graphical models can be used for segmentation problems to illustrate Markov random fields (MRFs), their discriminative counterparts conditional random fields (CRFs) as well as kernel CRFs. We discuss the relationships between energy function formulations, MRFs, CRFs, hybrids based on graphical models and their relationships to key techniques for inference and learning. We then explore a series of novel 3D graphical models and present a series of detailed experiments comparing and contrasting different approaches for the complete volumetric segmentation of multiple organs within computed tomography imagery of the abdominal region. Further, we show how these modeling techniques can be combined with state of the art image features based on histograms of oriented gradients to increase segmentation performance. We explore a wide variety of modeling choices, discuss the importance and relationships between inference and learning techniques and present experiments using different levels of user interaction. We go on to explore a novel approach to the challenging and important problem of adrenal gland segmentation. We present a 3D CRF formulation and compare with a novel 3D sparse kernel CRF approach we call a relevance vector random field. The method yields state of the art performance and avoids the need to discretize or cluster input features. 
We believe our work is the first to provide quantitative comparisons between traditional MRFs with edge-modulated interaction potentials and CRFs for multi-organ abdominal segmentation and the first to explore the 3D adrenal gland segmentation problem. Finally, along with this paper we provide the labeled data used for our experiments to the community.

17.
The Markov and conditional random fields (CRFs) used in computer vision typically model only local interactions between variables, as this is generally thought to be the only case that is computationally tractable. In this paper we consider a class of global potentials defined over all variables in the CRF. We show how they can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field. This result can be directly used for the problem of class based image segmentation which has seen increasing recent interest within computer vision. Here the aim is to assign a label to each pixel of a given image from a set of possible object classes. Typically these methods use random fields to model local interactions between pixels or super-pixels. One of the cues that helps recognition is global object co-occurrence statistics, a measure of which classes (such as chair or motorbike) are likely to occur in the same image together. There have been several approaches proposed to exploit this property, but all of them suffer from different limitations and typically carry a high computational cost, preventing their application on large images. We find that the new model we propose produces a significant improvement in the labelling compared to just using a pairwise model and that this improvement increases as the number of labels increases.

18.
We present a closed-form solution for the symmetrization problem, solving for the optimal deformation that reconciles a set of local bilateral symmetries. Given as input a set of point-pairs which should be symmetric, we first compute for each local neighborhood a transformation which would produce an approximate bilateral symmetry. We then solve for a single global symmetry which includes all of these local symmetries, while minimizing the deformation within each local neighborhood. Our main motivation is the symmetrization of digitized fossils, which are often deformed by a combination of compression and bending. In addition, we use the technique to symmetrize articulated models.

19.
The fuzzy inference system proposed by Takagi, Sugeno, and Kang, known as the TSK model in fuzzy system literature, provides a powerful tool for modeling complex nonlinear systems. Unlike conventional modeling where a single model is used to describe the global behavior of a system, TSK modeling is essentially a multimodel approach in which simple submodels (typically linear models) are combined to describe the global behavior of the system. Most existing learning algorithms for identifying the TSK model are based on minimizing the square of the residual between the overall outputs of the real system and the identified model. Although these algorithms can generate a TSK model with good global performance (i.e., the model is capable of approximating the given system with arbitrary accuracy, provided that sufficient rules are used and sufficient training data are available), they cannot guarantee the resulting model to have a good local performance. Often, the submodels in the TSK model may exhibit an erratic local behavior, which is difficult to interpret. Since one of the important motivations of using the TSK model (also other fuzzy models) is to gain insights into the model, it is important to investigate the interpretability issue of the TSK model. We propose a new learning algorithm that integrates global learning and local learning in a single algorithmic framework. This algorithm uses the idea of local weighted regression and local approximation in nonparametric statistics, but retains the global-fitting component of the existing learning algorithms. The algorithm is capable of adjusting its parameters based on the user's preference, generating models with good tradeoff in terms of global fitting and local interpretation. We illustrate the performance of the proposed algorithm using a motorcycle crash modeling example.
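The multimodel structure being discussed, linear submodels blended by fuzzy memberships, can be sketched for a first-order TSK system with Gaussian membership functions. The two rules and all their parameters below are invented for illustration:

```python
import math

def tsk_predict(x, rules):
    """First-order TSK inference: each rule has a Gaussian membership
    (center, sigma) and a linear consequent y = a*x + b; the output is the
    membership-weighted average of the rule consequents."""
    ws, ys = [], []
    for center, sigma, a, b in rules:
        ws.append(math.exp(-((x - center) / sigma) ** 2))  # firing strength
        ys.append(a * x + b)                               # local linear model
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

# Two toy rules: one active near x = 0, one near x = 4.
rules = [(0.0, 1.0, 1.0, 0.0),    # near 0: y ≈ x
         (4.0, 1.0, -1.0, 8.0)]   # near 4: y ≈ 8 - x
```

The interpretability concern in the abstract is visible even here: the global fit depends only on the blended output, so each rule's `(a, b)` can drift away from the local slope of the data unless learning also constrains the submodels locally.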

20.
Linear discriminant analysis (LDA) is one of the most popular techniques for extracting features in face recognition. LDA captures the global geometric structure. However, local geometric structure has recently been shown to be effective for face recognition. In this paper, we propose a novel feature extraction algorithm which integrates both global and local geometric structures. We first cast LDA as a least squares problem based on spectral regression, and then a regularization technique is used to model the global and local geometric structures. Furthermore, we impose a penalty on the parameters to tackle the singularity problem and design an efficient model selection algorithm to choose the optimal tuning parameter which balances the tradeoff between the global and local structures. Experimental results on four well-known face data sets show that the proposed integration framework is competitive with traditional face recognition algorithms, which use either global or local structure only.
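For contrast with the least-squares formulation above, classical two-class Fisher LDA is small enough to write out directly in two dimensions: the discriminant direction is w ∝ Sw^{-1}(m1 - m0), with Sw the pooled within-class scatter. The data points below are toy values:

```python
def lda_direction(class0, class1):
    """Two-class Fisher LDA in 2-D: w ∝ Sw^{-1} (m1 - m0)."""
    def mean(pts):
        n = len(pts)
        return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class scatter matrix Sw (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for pts, m in ((class0, m0), (class1, m1)):
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    # Solve Sw w = (m1 - m0) via the 2x2 inverse.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    return [(s[1][1] * dm[0] - s[0][1] * dm[1]) / det,
            (-s[1][0] * dm[0] + s[0][0] * dm[1]) / det]

# Two toy clusters separated along the diagonal.
w = lda_direction([(0, 0), (1, 1), (0, 1), (1, 0)],
                  [(4, 4), (5, 5), (4, 5), (5, 4)])
```

When Sw is singular, as it typically is for face images with far more pixels than samples, this inverse does not exist, which is exactly the singularity problem the regularization penalty in the paper addresses.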
