Similar Articles (20 results)
1.
We estimate parameters in the context of a discrete-time hidden Markov model with two latent states and two observed states through a Bayesian approach. We provide a Gibbs sampling algorithm for longitudinal data that ensures parameter identifiability. We examine two approaches to start the algorithm for estimation. The first approach generates the initial latent data from transition probability estimates under the false assumption of perfect classification. The second approach requires an initial guess of the classification probabilities and obtains bias-adjusted approximated estimators of the latent transition probabilities based on the observed data. These probabilities are then used to generate the initial latent data set based on the observed data set. Both approaches are illustrated on medical data and the performance of estimates is examined through simulation studies. The approach using bias-adjusted estimators is the best choice of the two options, since it generates a plausible initial latent data set. Our situation is particularly applicable to diagnostic testing, where specifying the range of plausible classification rates may be more feasible than specifying initial values for transition probabilities.
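The bias-adjustment idea can be sketched with a simple moment identity: if latent states are misclassified with known rates, the observed joint distribution of consecutive states is a linear distortion of the latent one, which can be inverted. This is an illustrative sketch under that assumption, not the paper's exact estimator:

```python
# Illustrative sketch (not the paper's exact estimator): recover latent
# 2x2 transition structure from observed data when states are
# misclassified with known rates, via the moment identity
#   J_obs = C^T @ J_lat @ C,  so  J_lat = inv(C^T) @ J_obs @ inv(C),
# where C[i][j] = P(observe state j | latent state i).

def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def debias_joint(j_obs, c):
    """Bias-adjust the observed joint distribution of consecutive states."""
    return matmul(matmul(inv2(transpose(c)), j_obs), inv2(c))

# Under perfect classification the adjustment changes nothing.
identity = [[1.0, 0.0], [0.0, 1.0]]
j_obs = [[0.4, 0.1], [0.2, 0.3]]
adjusted = debias_joint(j_obs, identity)
```

Row-normalizing the adjusted joint distribution would then give the latent transition probabilities used to seed the Gibbs sampler.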

2.
We study the problem of answering spatial queries in databases where objects exist with some uncertainty and are associated with an existential probability. The goal of a thresholding probabilistic spatial query is to retrieve the objects that satisfy the spatial predicates with probability exceeding a threshold. Accordingly, a ranking probabilistic spatial query selects the objects with the highest probabilities of satisfying the spatial predicates. We propose adaptations of spatial access methods and search algorithms for probabilistic versions of range queries, nearest neighbors, spatial skylines, and reverse nearest neighbors, and conduct an extensive experimental study that evaluates the effectiveness of the proposed solutions.
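The contrast between the two query types can be sketched on toy data. All names here are illustrative, and a rectangular range stands in for the spatial predicate:

```python
# Hypothetical sketch of thresholding vs. ranking probabilistic queries:
# each object carries an existential probability p.

def in_range(pt, rect):
    (x, y), ((x1, y1), (x2, y2)) = pt, rect
    return x1 <= x <= x2 and y1 <= y <= y2

def threshold_range_query(objs, rect, t):
    """Objects that satisfy the range predicate with probability > t."""
    return [oid for oid, pt, p in objs if in_range(pt, rect) and p > t]

def ranking_range_query(objs, rect, k):
    """The k qualifying objects with the highest probabilities."""
    hits = [(p, oid) for oid, pt, p in objs if in_range(pt, rect)]
    return [oid for p, oid in sorted(hits, reverse=True)[:k]]

objs = [("a", (1, 1), 0.9), ("b", (2, 2), 0.4), ("c", (9, 9), 0.99)]
rect = ((0, 0), (5, 5))
```

The paper's contribution is answering such queries efficiently through adapted spatial access methods rather than the linear scan shown here.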

3.
When learning a naive Bayes classifier, estimating probabilities from a given set of training samples is crucial. However, when the training samples are not adequate, probability estimation inevitably suffers from the zero-frequency problem. To avoid this problem, the Laplace-estimate and the M-estimate are the two main methods used to estimate probabilities. The settings of two important parameters in these methods, m (an integer) and p (a probability), have a direct impact on the experimental results. In this paper, we study the existing probability estimation methods and carry out a parameter cross-test by experimentally analyzing the performance of the M-estimate under different settings of m and p. These experiments show that the optimal parameter values vary across data sets. Motivated by these results, we propose an estimation model based on self-adaptive differential evolution, together with an approach to calculate the optimal m and p values for each conditional probability so as to avoid the zero-frequency problem. We experimentally test our approach in terms of classification accuracy on 36 benchmark machine learning repository data sets, and compare it to naive Bayes with the Laplace-estimate and with the M-estimate under a variety of parameter settings from the literature, as well as the near-optimal settings found in our experimental analysis. The results show that the estimation model is efficient and that our approach significantly outperforms the traditional probability estimation approaches, especially for large data sets (large numbers of instances and attributes).
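The two baseline estimators the abstract refers to have standard closed forms, shown here in their commonly used versions (k is the number of distinct values of the attribute):

```python
# Standard smoothed probability estimators for naive Bayes, which avoid
# the zero-frequency problem by adding virtual counts.

def laplace_estimate(n_ac, n_c, k):
    """Laplace estimate: one virtual count per attribute value."""
    return (n_ac + 1) / (n_c + k)

def m_estimate(n_ac, n_c, m, p):
    """M-estimate: m virtual samples distributed according to prior p."""
    return (n_ac + m * p) / (n_c + m)

# An attribute value never seen with this class still gets
# non-zero probability under both estimators.
unseen_laplace = laplace_estimate(0, 100, 2)
unseen_m = m_estimate(0, 100, m=2, p=0.5)
```

Note that with m = k and a uniform prior p = 1/k, the M-estimate reduces to the Laplace estimate, which is why the paper treats m and p as the tunable quantities.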

4.
Spatio-temporal predicates
We investigate temporal changes of topological relationships and thereby integrate two important research areas: first, 2D topological relationships, which have been investigated quite intensively, and second, the change of spatial information over time. We investigate spatio-temporal predicates, which describe developments of well-known spatial topological relationships. A framework is developed in which spatio-temporal predicates can be obtained by temporal aggregation of elementary spatial predicates and sequential composition. We compare our framework with two other possible approaches: one is based on the observation that spatio-temporal objects correspond to 3D spatial objects, for which existing topological predicates can be exploited; the other considers possible transitions between spatial configurations. These considerations help to identify a canonical set of spatio-temporal predicates.

5.
We propose a model for a point-referenced spatially correlated ordered categorical response and methodology for inference. Models and methods for spatially correlated continuous response data are widespread, but models for spatially correlated categorical data, and especially ordered multi-category data, are less developed. Bayesian models and methodology have been proposed for the analysis of independent and clustered ordered categorical data, and also for binary and count point-referenced spatial data. We combine and extend these methods to describe a Bayesian model for point-referenced (as opposed to lattice) spatially correlated ordered categorical data. We include simulation results and show that our model offers superior predictive performance as compared to a non-spatial cumulative probit model and a more standard Bayesian generalized linear spatial model. We demonstrate the usefulness of our model in a real-world example to predict ordered categories describing stream health within the state of Maryland.
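The non-spatial cumulative probit baseline mentioned above can be sketched in a few lines: category probabilities arise by slicing a standard normal latent variable at ordered cutpoints. This is a minimal illustration, not the paper's spatial model:

```python
import math

# Minimal cumulative-probit sketch: P(Y = j) is the mass of a latent
# N(mu, 1) variable between consecutive cutpoints.

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordinal_probs(mu, cutpoints):
    """Probabilities of the ordered categories defined by the cutpoints."""
    cdf = [phi(c - mu) for c in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]

probs = ordinal_probs(mu=0.3, cutpoints=[-1.0, 0.0, 1.0])
```

The spatial extension in the paper correlates the latent variables across locations; the slicing step itself stays the same.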

6.
We consider the problem of semi-supervised segmentation of textured images. Existing model-based approaches model the intensity field of textured images as a Gauss-Markov random field to take into account the local spatial dependencies between pixels. Classical Bayesian segmentation also models the label field as a Markov random field to ensure that neighboring pixels correspond to the same texture class with high probability. Well-known relaxation techniques are available that find the optimal label field with respect to the maximum a posteriori or the maximum posterior mode criterion, but these techniques are usually computationally intensive because they require a large number of iterations to converge. In this paper, we propose a new Bayesian framework that models two-dimensional textured images as the concatenation of two one-dimensional hidden Markov autoregressive models, for the lines and the columns respectively. A segmentation algorithm, similar to turbo decoding in the context of error-correcting codes, is obtained based on a factor graph approach. The proposed method estimates the unknown parameters using the expectation-maximization algorithm.

7.
We give an overview of two approaches to probability theory where lower and upper probabilities, rather than probabilities, are used: Walley's behavioural theory of imprecise probabilities, and Shafer and Vovk's game-theoretic account of probability. We show that the two theories are more closely related than would be suspected at first sight, and we establish a correspondence between them that (i) has an interesting interpretation, and (ii) allows us to freely import results from one theory into the other. Our approach leads to an account of probability trees and random processes in the framework of Walley's theory. We indicate how our results can be used to reduce the computational complexity of dealing with imprecision in probability trees, and we prove an interesting and quite general version of the weak law of large numbers.

8.
Feedforward neural networks, particularly multilayer perceptrons, are widely used in regression and classification tasks. A reliable and practical measure of prediction confidence is essential. In this work three alternative approaches to prediction confidence estimation are presented and compared: maximum likelihood, approximate Bayesian, and the bootstrap technique. We consider prediction uncertainty owing to both data noise and model parameter misspecification. The methods are tested on a number of controlled artificial problems and a real, industrial regression application, the prediction of paper "curl". Confidence estimation performance is assessed by calculating the mean and standard deviation of the prediction interval coverage probability. We show that treating data noise variance as a function of the inputs is appropriate for the curl prediction task. Moreover, we show that the mean coverage probability can only gauge confidence estimation performance as an average over the input space, i.e., global performance, and that the standard deviation of the coverage is unreliable as a measure of local performance. The approximate Bayesian approach is found to perform better in terms of global performance.
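The assessment metric used above, prediction interval coverage probability (PICP), is simply the fraction of test targets that fall inside their predicted intervals. A minimal sketch with made-up numbers:

```python
# Prediction-interval coverage probability: the fraction of targets
# covered by their predicted [lower, upper] intervals.

def picp(targets, lowers, uppers):
    hits = sum(1 for y, lo, hi in zip(targets, lowers, uppers) if lo <= y <= hi)
    return hits / len(targets)

# Toy example: three of four targets fall inside their intervals.
targets = [1.0, 2.5, 3.0, 10.0]
lowers  = [0.5, 2.0, 2.8, 3.0]
uppers  = [1.5, 3.0, 3.2, 4.0]
coverage = picp(targets, lowers, uppers)
```

A well-calibrated confidence estimator should give coverage close to the nominal level (e.g. 0.95 for 95% intervals) not just on average, which is the global-versus-local distinction the abstract draws.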

9.
The need to reason with imprecise probabilities arises in a wealth of situations ranging from pooling of knowledge from multiple experts to abstraction-based probabilistic planning. Researchers have typically represented imprecise probabilities using intervals and have developed a wide array of different techniques to suit their particular requirements. In this paper we provide an analysis of some of the central issues in representing and reasoning with interval probabilities. At the focus of our analysis is the probability cross-product operator and its interval generalization, the cc-operator. We perform an extensive study of these operators relative to manipulation of sets of probability distributions. This study provides insight into the sources of the strengths and weaknesses of various approaches to handling probability intervals. We demonstrate the application of our results to the problems of inference in interval Bayesian networks and projection and evaluation of abstract probabilistic plans.
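The core idea of interval-generalizing a probability product can be sketched for subintervals of [0, 1], where the bounds multiply directly. This illustrates the underlying arithmetic only, not the paper's full cc-operator:

```python
# Sketch of interval probability multiplication: for intervals within
# [0, 1], the product's bounds are the products of the bounds.

def interval_mul(a, b):
    """Product of two probability intervals [a_lo, a_hi] * [b_lo, b_hi]."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (a_lo * b_lo, a_hi * b_hi)

# Two imprecisely known independent events (illustrative numbers):
p_rain = (0.2, 0.4)
p_late = (0.5, 0.7)
lo, hi = interval_mul(p_rain, p_late)
```

The subtlety the paper analyzes is that such bound-by-bound operations can be looser than manipulating the underlying sets of probability distributions, which is where the strengths and weaknesses of interval representations come from.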

10.
Existing work does not determine the mixture proportions of the probability vector model, and so cannot resolve the iterative convergence problem of MCMC methods. Building on a Gaussian mixture model (GMM) with spatial smoothness constraints, we propose a new Bayesian network model and apply it to image segmentation. The model combines the probability density model of latent Dirichlet allocation (LDA) with a Gauss-Markov random field (MRF) mixing process over the latent Dirichlet parameters to achieve parameter smoothing, with the following advantages: the mixture proportions of the probability vector model are regularized with respect to the spatial smoothness constraints, and closed-form parameter updates are performed using maximum a posteriori (MAP) estimation and the expectation-maximization (EM) algorithm. Experiments show that the model segments images better than other GMM-based methods. It has been successfully applied to the segmentation of natural images and of noise-corrupted natural-art images.

11.
A model and approximate performance analysis of resource-sharing systems
Lin Chuang, Chinese Journal of Computers, 1997, 20(10): 865-871
This paper proposes a stochastic Petri net (SPN) model of resource-sharing systems and presents two approximate solution techniques based on model decomposition and iterative solution of the submodels: exchange of marking probabilities and exchange of mean token counts. Examples demonstrate the effectiveness and the relative error of the two methods. The paper also proves the existence of a fixed-point solution for both methods under fixed-point iteration. The approximate performance-solution approach for complex models can be applied to the performance analysis of many complex systems.

12.
Advanced Robotics, 2013, 27(1): 45-69
We present a Bayesian CAD modeler for robotic applications. We address the problem of taking into account the propagation of geometric uncertainties when solving inverse geometric problems. The proposed method may be seen as a generalization of constraint-based approaches in which we explicitly model geometric uncertainties. Using our methodology, a geometric constraint is expressed as a probability distribution on the system parameters and the sensor measurements, instead of a simple equality or inequality. To solve geometric problems in this framework, we propose an original resolution method able to adapt to problem complexity. Using two examples, we show how to apply our approach by providing simulation results using our modeler.

13.
We describe approaches for positive data modeling and classification using both finite inverted Dirichlet mixture models and support vector machines (SVMs). Inverted Dirichlet mixture models are used to tackle an outstanding challenge in SVMs, namely the generation of accurate kernels. The kernel generation approaches we consider, grounded in ideas from information theory, allow the incorporation of data structure and its structural constraints. Inverted Dirichlet mixture models are learned within a principled Bayesian framework using both Gibbs sampling and Metropolis-Hastings for parameter estimation, and the Bayes factor for model selection (i.e., determining the number of mixture components). Our Bayesian learning approach uses priors over the model parameters, which we derive by showing that the inverted Dirichlet distribution belongs to the exponential family, and then combines these priors with information from the data to build posterior distributions. We illustrate the merits and effectiveness of the proposed method with two challenging real-world applications, namely object detection and visual scene analysis and classification.

14.
An algorithm using an unsupervised Bayesian online learning process is proposed for the segmentation of object-based video images. The video image segmentation is solved using a classification method. First, different visual features (spatial location, colour and optical-flow vectors) are fused in a probability framework for image pixel clustering. The probability distribution function (PDF) of each feature cluster is modelled by a Gaussian distribution. Each image pixel is then assigned a cluster number in a maximum a posteriori probability framework. Unlike previous segmentation methods, the unsupervised Bayesian online learning algorithm learns each cluster's PDF parameters through the image sequence. This online learning process uses the pixels of the previously clustered image and information from the feature clusters to update the PDF parameters for segmentation of the current image. The algorithm has shown satisfactory experimental results on different video sequences.
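The MAP assignment step described above can be sketched in one dimension: each pixel's feature value is scored under a Gaussian per cluster, weighted by the cluster prior, and the pixel joins the most probable cluster. A toy sketch (the paper fuses location, colour and optical flow rather than a single scalar feature):

```python
import math

# Illustrative MAP cluster assignment with Gaussian feature-cluster PDFs.

def gauss_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def map_assign(x, clusters):
    """clusters: list of (prior, mean, var); returns the MAP cluster index."""
    scores = [prior * gauss_pdf(x, mean, var) for prior, mean, var in clusters]
    return max(range(len(scores)), key=scores.__getitem__)

# Two clusters with equal priors, centred at 0 and 5.
clusters = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]
```

The online learning contribution of the paper is updating the (prior, mean, var) triples from one frame to the next rather than keeping them fixed.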

15.
To effectively evaluate and analyze R&D performance, it is necessary to measure the relative importance of performance analysis factors with quantitative methods that consider the objectivity and relevance of the detailed factors constituting a performance evaluation. This study suggests a framework for R&D performance evaluation that computes weights through an AHP (Analytic Hierarchy Process) expert survey and applies a Bayesian network approach, which provides objectivity and allows inference analyses. The framework can be used as a performance analysis indicator that uses input and output performance factors to perform quantitative analysis for projects. We can quantitatively define the satisfaction level of each project and each performance analysis factor by assigning probability values, and analyze the relationship between project evaluation results (qualitative evaluation) and the performance analysis indicator (quantitative performance). The framework infers posterior probabilities using the prior probability and the likelihood function of each performance factor. In addition, by inferring the relationships among performance factors, it allows probability analyses of the successful and unsuccessful factors, which can provide further feedback. In conclusion, the framework would improve the national R&D program in terms of financial investment efficiency by aligning budget allocation and performance evaluation.
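The AHP weighting step can be sketched with the geometric-mean approximation of the principal eigenvector, a standard way to turn a pairwise-comparison survey into factor weights (a sketch of the general AHP technique, not this study's specific survey):

```python
import math

# AHP weights via the geometric-mean (logarithmic least squares)
# approximation: pairwise[i][j] says how many times more important
# factor i is than factor j.

def ahp_weights(pairwise):
    n = len(pairwise)
    gmeans = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(gmeans)
    return [g / total for g in gmeans]

# Two factors, A judged three times as important as B (illustrative):
w = ahp_weights([[1.0, 3.0], [1.0 / 3.0, 1.0]])
```

For the 2x2 case this gives weights of exactly 0.75 and 0.25; with more factors, a consistency check on the comparison matrix is usually added before the weights are trusted.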

16.
Naive Bayesian Classification of Structured Data

17.
In fault diagnosis, intermittent failure models are an important tool for adequately dealing with realistic failure behavior. Current model-based diagnosis approaches account for the fact that a component c_j may fail intermittently by introducing a parameter g_j that expresses the probability that the component exhibits correct behavior. This component parameter g_j, in conjunction with the a priori fault probability, is used in a Bayesian framework to compute the posterior fault candidate probabilities. Usually, information on g_j is not known a priori. While proper estimation of g_j can be critical to diagnostic accuracy, at present only approximations have been proposed. We present a novel framework, coined Barinel, that computes estimates of the g_j as an integral part of the posterior candidate probability computation, using a maximum likelihood estimation approach. Barinel's diagnostic performance is evaluated on synthetic systems, the Siemens software diagnosis benchmark, and real-world programs. Our results show that our approach is superior to reasoning approaches based on classical persistent failure models, as well as to previously proposed intermittent failure models.

18.
A large number of distance metrics have been proposed to measure the difference between two instances. Among these, the Short and Fukunaga metric (SFM) and the minimum risk metric (MRM) are two probability-based metrics widely used to find reasonable distances between pairs of instances with nominal attributes only. For simplicity, existing works use naive Bayes (NB) classifiers to estimate the class membership probabilities in SFM and MRM. However, it has been proved that the ability of NB classifiers at class probability estimation is poor. To scale up the classification performance of NB classifiers, many augmented NB classifiers have been proposed. In this paper, we study the class probability estimation performance of these augmented NB classifiers and then use them to estimate the class membership probabilities in SFM and MRM. Experimental results based on a large number of University of California, Irvine (UCI) data sets show that using these augmented NB classifiers to estimate the class membership probabilities in SFM and MRM can significantly enhance their generalisation ability.

19.
Control-loop performance assessment methods have evolved over the past two decades, with many different monitor algorithms being used to single out specific problems and determine the operating mode. However, a change in operating mode may affect multiple monitors, resulting in the possibility of conflicting assessments. Data-driven Bayesian methods were previously proposed that use multiple monitors to yield probabilistic assessments; however, training data for Bayesian methods requires complete knowledge of the underlying operational modes. This paper proposes an approach based on proportionality parameters θ to address the problem of incomplete mode information in the training data; values in θ can be used to fill in missing information, and by varying θ one can determine the boundaries of a probabilistic diagnosis. Two diagnostic approaches are considered. The first is a direct probability approach, which can only be applied when historical data on the operating modes is sufficient and representative. The second is a likelihood approach, which can be applied to more general cases, including when the historical data is too limited to adequately represent mode frequency; to represent mode frequency, the likelihood approach takes into account prior probabilities of the operating modes. The proposed methods are evaluated in two simulated chemical processes.
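The complete-information case underlying both approaches is plain Bayes' rule over operating modes; the θ parameters in the paper stand in for unknown mode splits in the training data. A minimal sketch of the complete-information posterior, with illustrative mode names:

```python
# Posterior over operating modes from priors and monitor likelihoods
# (the complete-information case; theta handles the incomplete case).

def mode_posterior(priors, likelihoods):
    """priors[m] = P(mode m); likelihoods[m] = P(monitor data | mode m)."""
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    z = sum(joint.values())
    return {m: j / z for m, j in joint.items()}

# Hypothetical numbers: the monitor pattern is much more likely
# under valve stiction than under normal operation.
priors = {"normal": 0.7, "valve_stiction": 0.3}
likelihoods = {"normal": 0.1, "valve_stiction": 0.6}
post = mode_posterior(priors, likelihoods)
```

Varying θ in the paper amounts to sweeping the unknown portions of such priors and likelihoods, which yields bounds on the posterior rather than a single value.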

20.
Consumer credit scoring is often considered a classification task in which clients receive either a good or a bad credit status. Default probabilities provide more detailed information about the creditworthiness of consumers, and they are usually estimated by logistic regression. Here, we present a general framework for estimating individual consumer credit risks with machine learning methods. Since a probability is an expected value, all nonparametric regression approaches that are consistent for the mean are consistent for the probability estimation problem. Among others, random forests (RF), k-nearest neighbors (kNN), and bagged k-nearest neighbors (bNN) belong to this class of consistent nonparametric regression approaches. We apply these machine learning methods and an optimized logistic regression to a large dataset of complete payment histories of short-term installment credits. We demonstrate probability estimation in Random Jungle, an RF package written in C++ with a generalized framework for fast tree growing, probability estimation, and classification, and describe an algorithm for tuning the terminal node size for probability estimation. We demonstrate that regression RF outperforms the optimized logistic regression model, kNN, and bNN on the test data of the short-term installment credits.
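The "probability as an expected value" argument can be made concrete with the simplest member of the class, kNN regression: since P(default | x) is the conditional mean of the 0/1 default label, averaging the labels of the k nearest neighbours is a consistent estimator. A toy 1-D sketch with made-up data:

```python
# Default-probability estimation by kNN regression on 0/1 labels.

def knn_default_prob(x, data, k):
    """data: list of (feature, label) pairs with label 1 = default."""
    nearest = sorted(data, key=lambda d: abs(d[0] - x))[:k]
    return sum(label for _, label in nearest) / k

# Toy history (illustrative): low feature values rarely default,
# high feature values often do.
history = [(0.1, 0), (0.2, 0), (0.3, 1), (0.7, 1), (0.8, 1), (0.9, 0)]
low_risk = knn_default_prob(0.15, history, k=3)
high_risk = knn_default_prob(0.85, history, k=3)
```

The same reasoning is what licenses regression RF and bNN as probability estimators in the paper; the tree-based methods mainly change how "nearest" is defined.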
