Similar literature
1.
Non-stationary fuzzy Markov chain   (total citations: 1; self-citations: 0; citations by others: 1)
This paper deals with a recent statistical model based on fuzzy Markov random chains for image segmentation, in the context of stationary and non-stationary data. On the one hand, the fuzzy scheme takes into account discrete and continuous classes by modeling the imprecision of the hidden data; on the other hand, the Markovian Bayesian scheme models the uncertainty in the observed data. A non-stationary fuzzy Markov chain model is proposed in an unsupervised way, based on a recent Markov triplet approach. The method is compared with the stationary fuzzy Markov chain model. Both the stationary and non-stationary methods are enriched with a parameterized joint density, which governs the attractiveness of neighboring states. The segmentation task is carried out with Bayesian tools, such as the well-known MPM (Mode of Posterior Marginals) criterion. To validate both models, we perform and compare the segmentation on synthetic images and raw optical patterns which present diffuse structures.

2.
Computers & Chemistry, 1994, 18(3): 259-267
Non-homogeneous Markov chain models can represent biologically important regions of DNA sequences. The statistical pattern that is described by these models is usually weak and was found primarily because of strong biological indications. The general method for extracting similar patterns is presented in the current paper. The algorithm incorporates cluster analysis, multiple alignment and entropy minimization. The method was first tested using the set of DNA sequences produced by Markov chain generators. It was shown that artificial gene sequences, which initially have been randomly set up along the multiple alignment panels, are aligned according to the hidden triplet phase. Then the method was applied to real protein-coding sequences and the resulting alignment clearly indicated the triplet phase and produced the parameters of the optimal 3-periodic non-homogeneous Markov chain model. These Markov models were already employed in the GeneMark gene prediction algorithm, which is used in genome sequencing projects. The algorithm can also handle the case in which the sequences to be aligned reveal different statistical patterns, such as Escherichia coli protein-coding sequences belonging to Class II and Class III. The algorithm accepts a random mix of sequences from different classes, and is able to separate them into two groups (clusters), align each cluster separately, and define a non-homogeneous Markov chain model for each sequence cluster.

3.
This paper presents an automatic identification of the defect spatial wafer map using a growing wavelet-based hidden Markov tree (gHMT) statistical model. The hierarchical tree-based model, gHMT, utilizes the growing and learning procedure to increase successively the size of the wavelet tree. It can characterize image processing masks from the defect spatial patterns. Like the standard hidden Markov tree, gHMT not only captures the statistical behavior of the real-world measurements at multiple scales in space and frequency but can also accurately identify the locations of the defect regions using the smallest possible size. These regions provide essential information and intrinsic features of each pattern. When all the possible defect patterns are modeled by gHMT, the maximum likelihood classifier is applied to the wavelet energy features extracted from each trained model. Accordingly, defect spatial patterns are identified. The effectiveness of the proposed classifier based on gHMT is illustrated through the experimental data from a wafer foundry plant. It can identify different defect patterns on wafers to help readers delve into the matter.

4.
With scientific data available at geocoded locations, investigators are increasingly turning to spatial process models for carrying out statistical inference. However, fitting spatial models often involves expensive matrix decompositions, whose computational complexity increases in cubic order with the number of spatial locations. This situation is aggravated in Bayesian settings where such computations are required once at every iteration of the Markov chain Monte Carlo (MCMC) algorithms. In this paper, we describe the use of Variational Bayesian (VB) methods as an alternative to MCMC to approximate the posterior distributions of complex spatial models. Variational methods, which have been used extensively in Bayesian machine learning for several years, provide a lower bound on the marginal likelihood, which can be computed efficiently. We provide results for the variational updates in several models especially emphasizing their use in multivariate spatial analysis. We demonstrate estimation and model comparisons from VB methods by using simulated data as well as environmental data sets and compare them with inference from MCMC.

5.
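Fast and effective statistical modeling and pattern mining of historical data is of great importance. Given the limitations of existing models in practical data mining applications and the good statistical properties of Markov models, a variable-order Markov model based on suffix arrays and suffix automata is designed and implemented. Building on a suffix-tree-like structure, the algorithm introduces suffix links so that it can jump quickly between the subsequences of each state and adapt dynamically to the need to compute probabilities for different order lengths. Experimental results show that, compared with the traditional Markov model, the model builds the probabilistic statistics of the historical data and the link relations among the suffix subsequences of each state in linear time and space, greatly reducing storage and time costs and enabling online learning and application on large-scale data.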
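The abstract gives no implementation details, so the following is only a minimal sketch of how a suffix automaton can back a variable-order Markov model. The class name `SuffixAutomaton` and the method `prob` are hypothetical, and two simplifications are assumed: back-off to a shorter context re-walks the automaton from the root instead of following suffix links as the paper describes, and the occurrence counts are propagated with a sort, which costs O(n log n) rather than linear time.

```python
class SuffixAutomaton:
    """Suffix automaton with occurrence counts (illustrative, not the authors' code)."""

    def __init__(self, text):
        # Parallel arrays: transitions, suffix link, longest length, occurrence count.
        self.next, self.link, self.len, self.cnt = [{}], [-1], [0], [0]
        last = 0
        for ch in text:
            cur = self._new(self.len[last] + 1, 1)
            p = last
            while p != -1 and ch not in self.next[p]:
                self.next[p][ch] = cur
                p = self.link[p]
            if p == -1:
                self.link[cur] = 0
            else:
                q = self.next[p][ch]
                if self.len[p] + 1 == self.len[q]:
                    self.link[cur] = q
                else:
                    # Split state q: the clone keeps the shorter substrings.
                    clone = self._new(self.len[p] + 1, 0)
                    self.next[clone] = dict(self.next[q])
                    self.link[clone] = self.link[q]
                    while p != -1 and self.next[p].get(ch) == q:
                        self.next[p][ch] = clone
                        p = self.link[p]
                    self.link[q] = self.link[cur] = clone
            last = cur
        # Propagate end-position counts up the suffix links, longest states first.
        for v in sorted(range(1, len(self.len)), key=lambda s: -self.len[s]):
            self.cnt[self.link[v]] += self.cnt[v]

    def _new(self, length, count):
        self.next.append({})
        self.link.append(-1)
        self.len.append(length)
        self.cnt.append(count)
        return len(self.len) - 1

    def _walk(self, s):
        """Return the state reached by reading s from the root, or None."""
        state = 0
        for ch in s:
            state = self.next[state].get(ch)
            if state is None:
                return None
        return state

    def prob(self, context, ch):
        """P(ch | longest suffix of context that was seen with a successor)."""
        for start in range(len(context) + 1):
            state = self._walk(context[start:])
            if state is None:
                continue
            followers = self.next[state]
            total = sum(self.cnt[t] for t in followers.values())
            if total:
                return self.cnt[followers[ch]] / total if ch in followers else 0.0
        return 0.0


sam = SuffixAutomaton("abracadabra")
print(sam.prob("ab", "r"))   # order-2 context
print(sam.prob("zza", "b"))  # unseen context, backs off to "a"
```

The effective order is chosen per query: the longest suffix of the context that has been observed with at least one successor determines the estimate, so no maximum order has to be fixed in advance.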

6.
The quaternion wavelet transform is regarded as a new multi-scale tool for signal and image processing, which can effectively capture local shifts and image texture information. The marginal and joint distributions of the quaternion wavelet transform coefficients are measured by the histogram. The mutual information is utilized to measure the dependence between the coefficients. After intensive analysis of the statistical properties of the decomposition coefficients, the authors conclude that the quaternion coefficients can be modeled by a Gaussian mixture model conditioned on the magnitudes of generalized coefficients. In this paper a new hidden Markov tree model utilizing quaternion wavelet transforms is proposed based on the authors’ findings. In order to demonstrate its effectiveness, the new statistical model was applied to image de-noising. The experimental results show that the proposed statistical model exhibits better performance than other related image de-noising algorithms that are also based on hidden Markov tree models.

7.
Markov models have been widely used to represent and analyze user Web navigation data. In previous work, we have proposed a method to dynamically extend the order of a Markov chain model and a complementary method for assessing the predictive power of such a variable-length Markov chain. Herein, we review these two methods and propose a novel method for measuring the ability of a variable-length Markov model to summarize user Web navigation sessions up to a given length. Although the summarization ability of a model is important to enable the identification of user navigation patterns, the ability to make predictions is important in order to foresee the next link choice of a user after following a given trail so as, for example, to personalize a Web site. We present an extensive experimental evaluation providing strong evidence that prediction accuracy increases linearly with summarization ability.

8.
We examine the parallel execution of a class of stochastic algorithms called Markov chain Monte-Carlo (MCMC) algorithms. We focus on MCMC algorithms in the context of image processing, using Markov random field models. Our parallelisation approach is based on several, concurrently running, instances of the same stochastic algorithm that deal with the whole data set. Firstly we show that the speed-up of the parallel algorithm is limited because of the statistical properties of the MCMC algorithm. We examine coupled MCMC as a remedy for this problem. Secondly, we exploit the parallel execution to monitor the convergence of the stochastic algorithms in a statistically reliable manner. This new convergence measure for MCMC algorithms performs well, and is an improvement on known convergence measures. We also link our findings with recent work in the statistical theory of MCMC.

9.
Input-output HMMs for sequence processing   (total citations: 2; self-citations: 0; citations by others: 2)
We consider problems of sequence processing and propose a solution based on a discrete-state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call input-output hidden Markov model (IOHMM). It can be trained by the expectation-maximization (EM) or generalized EM (GEM) algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.

10.
Markov chains provide quite attractive features for simulating a system’s behavior under consideration of uncertainties. However, their use is somewhat limited because of their deterministic transition matrices. Vague probabilistic information and imprecision appear in the modeling of real-life systems, thus causing difficulties in the pure probabilistic model set-up. Moreover, their accuracy suffers due to implementations on computers with floating-point arithmetic. Our goal is to address these problems by extending the Dempster-Shafer with Intervals toolbox for MATLAB with novel verified algorithms for modeling that work with Markov chains with imprecise transition matrices, known as Markov set-chains. Additionally, in order to provide a statistical estimation tool that can handle imprecision to set up Markov chain models, we develop a new verified algorithm for computing relations between the mean and the standard deviation of fuzzy sets.

11.
In many machine learning settings, labeled examples are difficult to collect while unlabeled data are abundant. Also, for some binary classification problems, positive examples which are elements of the target concept are available. Can these additional data be used to improve accuracy of supervised learning algorithms? We investigate in this paper the design of learning algorithms from positive and unlabeled data only. Many machine learning and data mining algorithms, such as decision tree induction algorithms and naive Bayes algorithms, use examples only to evaluate statistical queries (SQ-like algorithms). Kearns designed the statistical query learning model in order to describe these algorithms. Here, we design an algorithm scheme which transforms any SQ-like algorithm into an algorithm based on positive statistical queries (estimate for probabilities over the set of positive instances) and instance statistical queries (estimate for probabilities over the instance space). We prove that any class learnable in the statistical query learning model is learnable from positive statistical queries and instance statistical queries only if a lower bound on the weight of any target concept f can be estimated in polynomial time. Then, we design a decision tree induction algorithm POSC4.5, based on C4.5, that uses only positive and unlabeled examples and we give experimental results for this algorithm. In the case of imbalanced classes in the sense that one of the two classes (say the positive class) is heavily underrepresented compared to the other class, the learning problem remains open. This problem is challenging because it is encountered in many real-world applications.

12.
杨震 (Yang Zhen), 王红军 (Wang Hongjun). 《计算机应用》 (Journal of Computer Applications), 2019, 39(3): 675-680
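To address the low prediction accuracy and sparse matching of Markov models in location prediction, a mobile-user location prediction method based on an Adaboost-Markov model is proposed. First, the raw trajectory data are preprocessed with a trajectory partitioning method based on turning-angle offset and distance offset to extract feature points; a density-based clustering algorithm then groups the feature points into the user's regions of interest, and the raw trajectory data are discretized into trajectory sequences composed of regions of interest. Next, the model order k is determined adaptively from the degree of matching between the prefix trajectory sequence and the pattern tree of historical trajectory sequences. Finally, the Adaboost algorithm assigns weight coefficients to the 1st- to k-th-order Markov models according to their importance, forming a multi-order fused Markov model that predicts the mobile user's future regions of interest. Experiments on a large-scale dataset of real user trajectories show that, compared with the 1st-order Markov model, the 2nd-order Markov model, and the multi-order fused Markov model with averaged weight coefficients, the Adaboost-Markov model improves average prediction accuracy by 20.83%, 11.3%, and 5.38%, respectively, and has good generality and multi-step prediction performance.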
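The fusion step can be illustrated with a small sketch. It is not the authors' algorithm: the function names (`fit_order`, `predict_order`, `model_weight`, `fused_predict`) are hypothetical, the regions of interest are plain labels, and each order's weight is assumed to come from the textbook AdaBoost formula 0.5 * ln((1 - err) / err) applied to its validation error, without the sample re-weighting rounds of full AdaBoost.

```python
import math
from collections import Counter, defaultdict


def fit_order(seqs, k):
    """Count which region follows each length-k context (order-k Markov model)."""
    table = defaultdict(Counter)
    for seq in seqs:
        for i in range(k, len(seq)):
            table[tuple(seq[i - k:i])][seq[i]] += 1
    return table


def predict_order(table, k, prefix):
    """Most frequent successor of the last k regions, or None if unseen."""
    if len(prefix) < k:
        return None
    successors = table.get(tuple(prefix[-k:]))
    return successors.most_common(1)[0][0] if successors else None


def model_weight(table, k, val_seqs, eps=1e-6):
    """AdaBoost-style weight from the validation error rate (assumed formula)."""
    hits = total = 0
    for seq in val_seqs:
        for i in range(k, len(seq)):
            total += 1
            hits += predict_order(table, k, seq[:i]) == seq[i]
    if total == 0:
        return 0.0
    err = min(max(1.0 - hits / total, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - err) / err)


def fused_predict(tables, weights, prefix):
    """Weighted vote over the predictions of all orders."""
    votes = Counter()
    for k, table in tables.items():
        guess = predict_order(table, k, prefix)
        if guess is not None:
            votes[guess] += weights[k]
    return votes.most_common(1)[0][0] if votes else None


# Toy trajectories over hypothetical regions of interest A..E.
train = [list("ABCD"), list("ABCD"), list("EBCA")]
tables = {k: fit_order(train, k) for k in (1, 2)}
weights = {k: model_weight(tables[k], k, train) for k in (1, 2)}
print(fused_predict(tables, weights, list("ABC")))
```

Replacing `weights` with equal values gives the averaged-weight baseline that the abstract compares against.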

13.
Algorithms based on Markov chains are ubiquitous across scientific disciplines as they provide a method for extracting statistical information about large, complicated systems. For some self-assembly models, Markov chains can be used to predict both equilibrium and non-equilibrium dynamics. In fact, the efficiency of these self-assembly algorithms can be related to the rate at which simple chains converge to their stationary distribution. We give an overview of the theory of Markov chains and show how many natural chains, including some relevant in the context of self-assembly, undergo a phase transition as a parameter representing temperature is varied in the model. We illustrate this behavior for the non-saturated Ising model in which there are two types of tiles that prefer to be next to other tiles of the same type. Unlike the standard Ising model, we also allow empty spaces that are not occupied by either type of tile. We prove that for a local Markov chain that allows tiles to attach and detach from the lattice, the rate of convergence is fast at high temperature and slow at low temperature.

14.
Consider a buffer whose input is a superposition of L independent identical sources, and which is served at rate sL. Recent work has shown that, under very general circumstances, the stationary tail probabilities for the queue of unfinished work Q in the buffer have the asymptotics $P[Q > Lb] \approx e^{-L I(b)}$ for large L. Here the shape function, I, is obtained from a variational expression involving the transient log cumulant generating function of the arrival process.

In this paper, we extend this analysis to cover time-dependent asymptotics for Markov arrival processes subject to conditioning at some instant. In applications we envisage that such conditioning would arise due to knowledge of the queue at a coarse-grained level, for example of the number of current active sources. We show how such partial knowledge can be used to predict future tail probabilities by use of a time dependent, conditioned shape function. We develop some heuristics to describe the time-dependent shape function in terms of a reduced set of quantities associated with the underlying arrivals process and show how to calculate them for renewal arrivals and a class of ON-OFF arrivals. This bypasses the full variational calculation of the shape function for such models.


15.
Data augmentation and parameter expansion can lead to improved iterative sampling algorithms for Markov chain Monte Carlo (MCMC). Data augmentation allows for simpler and more feasible simulation from a posterior distribution. Parameter expansion accelerates convergence of iterative sampling algorithms by increasing the parameter space. Data augmentation and parameter-expanded data augmentation MCMC algorithms are proposed for fitting probit models for independent ordinal response data. The algorithms are extended for fitting probit linear mixed models for spatially correlated ordinal data. The effectiveness of data augmentation and parameter-expanded data augmentation is illustrated using the probit model and ordinal response data; however, the approach can be used broadly across model and data types.

16.
This paper deals with a comparison of recent statistical models based on fuzzy Markov random fields and chains for multispectral image segmentation. The fuzzy scheme takes into account discrete and continuous classes which model the imprecision of the hidden data. In this framework, we assume the dependence between bands and we express the general model for the covariance matrix. A fuzzy Markov chain model is developed in an unsupervised way. This method is compared with the fuzzy Markovian field model previously proposed by one of the authors. The segmentation task is processed with Bayesian tools, such as the well-known MPM (mode of posterior marginals) criterion. Our goal is to compare the robustness and speed of both methods (fuzzy Markov fields versus fuzzy Markov chains). Indeed, such fuzzy-based procedures seem to be a good answer, e.g., for astronomical observations when the patterns present diffuse structures. Moreover, these approaches allow us to process missing data in one or several spectral bands which correspond to specific situations in astronomy. To validate both models, we perform and compare the segmentation on synthetic images and raw multispectral astronomical data.

17.
18.
A Markov chain model for statistical software testing   (total citations: 2; self-citations: 0; citations by others: 2)
Statistical testing of software establishes a basis for statistical inference about a software system's expected field quality. This paper describes a method for statistical testing based on a Markov chain model of software usage. The significance of the Markov chain is twofold. First, it allows test input sequences to be generated from multiple probability distributions, making it more general than many existing techniques. Analytical results associated with Markov chains facilitate informative analysis of the sequences before they are generated, indicating how the test is likely to unfold. Second, the test input sequences generated from the chain and applied to the software are themselves a stochastic model and are used to create a second Markov chain to encapsulate the history of the test, including any observed failure information. The influence of the failures is assessed through analytical computations on this chain. We also derive a stopping criterion for the testing process based on a comparison of the sequence generating properties of the two chains.
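As a rough illustration of generating test input sequences from a usage chain, the sketch below walks a small Markov chain from a start state to a terminating state. The usage model, its state names, its probabilities, and the function name `generate_test` are all invented for the example and are not taken from the paper.

```python
import random

# Hypothetical usage model: state -> {next state: transition probability}.
USAGE = {
    "Start": {"Menu": 1.0},
    "Menu": {"Edit": 0.6, "Exit": 0.4},
    "Edit": {"Menu": 1.0},
}


def generate_test(usage, start, end, rng, max_steps=1000):
    """Random walk over the usage chain; the visited states form one test case."""
    sequence = [start]
    state = start
    while state != end and len(sequence) < max_steps:
        successors, probs = zip(*usage[state].items())
        state = rng.choices(successors, weights=probs, k=1)[0]
        sequence.append(state)
    return sequence


rng = random.Random(0)  # seeded so that a test suite can be regenerated
for _ in range(3):
    print(" -> ".join(generate_test(USAGE, "Start", "Exit", rng)))
```

Recording which generated transitions fail when the sequences are applied to the software would supply the data for the second, testing chain that the abstract mentions; that step is left out here.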

19.
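To analyze the experimental data sequences produced by an online experiment system, a first-order Markov chain is introduced. The experimental data are manually classified into two categories, diligent learning and lazy cheating, and a Markov chain model is built for each. The output probabilities are used to decide which model the test data more likely come from. Finally, the stationary distribution of the states is discussed. Experimental results show that the classification model based on Markov chains achieves high accuracy.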
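The classification scheme described above can be sketched as follows, under stated assumptions: the two-symbol state alphabet, the toy training sequences, the add-one smoothing, and all function names are hypothetical, since the abstract does not give the actual states or data.

```python
import math


def fit_chain(sequences, states, alpha=1.0):
    """Estimate first-order transition probabilities with add-alpha smoothing."""
    counts = {s: {t: alpha for t in states} for s in states}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {
        s: {t: counts[s][t] / sum(counts[s].values()) for t in states}
        for s in states
    }


def log_likelihood(seq, trans):
    """Log-probability of the transitions in seq under one chain."""
    return sum(math.log(trans[a][b]) for a, b in zip(seq, seq[1:]))


def classify(seq, models):
    """Pick the class whose chain gives the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(seq, models[name]))


def stationary(trans, states, iters=1000):
    """Approximate the stationary distribution by repeated multiplication."""
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        pi = {t: sum(pi[s] * trans[s][t] for s in states) for t in states}
    return pi


STATES = ["A", "B"]  # hypothetical action codes logged by the experiment system
models = {
    "diligent": fit_chain(["ABABAB", "ABAB"], STATES),
    "lazy": fit_chain(["AAAA", "AAAB"], STATES),
}
print(classify("ABAB", models))
print(stationary(models["diligent"], STATES))
```

Smoothing keeps every transition probability positive, so the logarithm is always defined and the power iteration converges for these chains.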

20.
Three automatic test case generation algorithms intended to test the resource allocation mechanisms of telecommunications software systems are introduced. Although these techniques were specifically designed for testing telecommunications software, they can be used to generate test cases for any software system that is modelable by a Markov chain provided operational profile data can either be collected or estimated. These algorithms have been used successfully to perform load testing for several real industrial software systems. Experience generating test suites for five such systems is presented. Early experience with the algorithms indicates that they are highly effective at detecting subtle faults that would have been likely to be missed if load testing had been done in the more traditional way, using hand-crafted test cases. A domain-based reliability measure is applied to systems after the load testing algorithms have been used to generate test data. Data are presented for the same five industrial telecommunications systems in order to track the reliability as a function of the degree of system degradation experienced.

