首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
This paper is a sequel to the analysis of finite-step approximations in solving controlled Markov set-chains for infinite horizon discounted reward by the author. For average-reward-controlled Markov set-chains with finite state and action spaces, we develop a value-iteration-type algorithm and analyze an error bound relative to the optimal average reward that satisfies an optimality equation from the successive approximation under an ergodicity condition. We further analyze an error bound of the rolling horizon control policy defined from a finite-step approximate value by applying the value-iteration-type algorithm.   相似文献   

2.
In this paper, we propose to use a new optimization method, i.e., semidefinite programming (SDP), to solve the large-margin estimation (LME) problem of continuous-density hidden Markov model (CDHMM) in speech recognition. First, we introduce a new constraint for LME to guarantee the boundedness of the margin of CDHMM. Second, we show that the LME problem subject to this new constraint can be formulated as an SDP problem under some relaxation conditions. Therefore, it can be solved using many efficient optimization algorithms specially designed for SDP. The new LME/SDP method has been evaluated on a speaker independent E-set speech recognition task using the ISOLET database and a connected digit string recognition task using the TIDIGITS database. Experimental results clearly demonstrate that large-margin estimation via semidefinite programing (LME/SDP) can significantly reduce word error rate (WER) over other existing CDHMM training methods, such as MLE and MCE. It has also been shown that the new SDP-based method largely outperforms the previously proposed LME optimization methods using gradient descent search.  相似文献   

3.
The standard neoclassical growth model with quasi-geometric discounting is shown elsewhere (Krusell, P. and Smith, A., CEPR Discussion Paper No. 2651, 2000) to have multiple solutions. As a result, value-iterative methods fail to converge. The set of equilibria is however reduced if we restrict our attention to the interior (satisfying the Euler equation) solution. We study the performance of a grid-based Euler-equation methods in the given context. We find that such a method converges to an interior solution in a wide range of parameter values, not only in the “test” model with the closed-form solution but also in more general settings, including those with uncertainty. JEL Classification: C73, D90, E21  相似文献   

4.
Markov decision processes (MDPs) and their variants are widely studied in the theory of controls for stochastic discreteevent systems driven by Markov chains. Much of the literature focusses on the risk-neutral criterion in which the expected rewards, either average or discounted, are maximized. There exists some literature on MDPs that takes risks into account. Much of this addresses the exponential utility (EU) function and mechanisms to penalize different forms of variance of the rewards. EU functions have some numerical deficiencies, while variance measures variability both above and below the mean rewards; the variability above mean rewards is usually beneficial and should not be penalized/avoided. As such, risk metrics that account for pre-specified targets (thresholds) for rewards have been considered in the literature, where the goal is to penalize the risks of revenues falling below those targets. Existing work on MDPs that takes targets into account seeks to minimize risks of this nature. Minimizing risks can lead to poor solutions where the risk is zero or near zero, but the average rewards are also rather low. In this paper, hence, we study a risk-averse criterion, in particular the so-called downside risk, which equals the probability of the revenues falling below a given target, where, in contrast to minimizing such risks, we only reduce this risk at the cost of slightly lowered average rewards. A solution where the risk is low and the average reward is quite high, although not at its maximum attainable value, is very attractive in practice. To be more specific, in our formulation, the objective function is the expected value of the rewards minus a scalar times the downside risk. In this setting, we analyze the infinite horizon MDP, the finite horizon MDP, and the infinite horizon semi-MDP (SMDP). We develop dynamic programming and reinforcement learning algorithms for the finite and infinite horizon. The algorithms are tested in numerical studies and show encouraging performance.  相似文献   

5.
In this note, we introduce and study the notion of safety control of stochastic discrete-event systems (DESs), modeled as controlled Markov chains. For nonstochastic DESs modeled by state machines or automata, safety is specified as a set of forbidden states, or equivalently by a binary valued vector that imposes an upper bound on the set of states permitted to be visited. We generalize this notion of safety to the setting of stochastic DESs by specifying it as an unit-interval valued vector that imposes an upper bound on the state probability distribution vector. Under the assumption of complete state observation, we identify: 1) the set of all state feedback controllers that satisfy the safety requirement for any given safe initial state probability distribution, and 2) the set of all safe initial state probability distributions for a given state feedback controller.  相似文献   

6.
基于简化的受控Markov链软件自适应测试模型大多是研究如何以最小的期望成本检测并移除所有的缺陷,并在构建模型时对部分条件进行特殊化和理想化处理.针对受控Markov链软件测试模型适用范围小、效率低的缺陷,在软件控制论思想基础上,对制约条件进行了一系列新的转换,提出一种改进的、资源约束的受控Markov链模型,该模型能够在高效性、复杂性和适用性3方面达到一个平衡.根据该模型设计一种新的软件缺陷优化测试策略,再通过参数估计对优化测试策略进行在线调整的方法,以构造软件自适应测试策略.为了证明其有效,利用该模型得到的新的软件自适应测试策略进行仿真实验,进一步得到了有效结果.  相似文献   

7.
Boltzmann machines (BMs) are proposed as a computational model for the solution of the satisfiability (SAT) problem in the propositional calculus setting. Conditions that guarantee consensus function maxima for configurations of the BM associated with solutions to the satisfaction problem are given. Experimental results that show a linear behavior of BMs solving the satisfiability problem are presented and discussed  相似文献   

8.
Plume mapping via hidden Markov methods   总被引:3,自引:0,他引:3  
This paper addresses the problem of mapping likely locations of a chemical source using an autonomous vehicle operating in a fluid flow. The paper reviews biological plume-tracing concepts, reviews previous strategies for vehicle-based plume tracing, and presents a new plume mapping approach based on hidden Markov methods (HMM). HMM provide efficient algorithms for predicting the likelihood of odor detection versus position, the likelihood of source location versus position, the most likely path taken by the odor to a given location, and the path between two points most likely to result in odor detection. All four are useful for solving the odor source localization problem using an autonomous vehicle. The vehicle is assumed to be capable of detecting above threshold chemical concentration and sensing the fluid flow velocity at the vehicle location. The fluid flow is assumed to vary with space and time, and to have a high Reynolds number (Re>10).  相似文献   

9.
基于区间—遗传算法求解非线性方程组   总被引:1,自引:1,他引:0       下载免费PDF全文
将非线性方程组的求解转化为函数优化问题,结合遗传算法的群体搜索、全局收敛的优点,及区间算法特有的解的存在性检验准则,提出了一种区间—遗传算法。在迭代计算过程中,区间算法为遗传算法搜索提供可靠区域,同时遗传算法为区间算法提供安全的初始区域。数值实验表明,该算法能够在较大范围的初始区间内快速,可靠地迭代得到高精度的区间解,是求解非线性方程组的一种有效的算法。  相似文献   

10.
A new method for solving integral equations based on Walsh functions is established. The solution of a convolution integral via the Walsh transform is studied first. The extension of the technique to the integral equation of the first kind is naturally obtained. In literature, the integral equation of the second kind is considered to be much more difficult to solve than the first. In this new approach, these two kinds are solved with nearly the same ease. Several numerical examples are included for illustration.  相似文献   

11.
Games that model realistic systems can have very large state-spaces, making their direct solution difficult. We present a symbolic abstraction-refinement approach to the solution of two-player games with reachability or safety goals. Given a reachability or safety property, an initial set of states, and a game representation, our approach starts by constructing a simple abstraction of the game, guided by the predicates present in the property and in the initial set. The abstraction is then refined, until it is possible to either prove, or disprove, the property over the initial states. Specifically, we evaluate the property on the abstract game in three-valued fashion, computing an over-approximation (the may states), and an under-approximation (the must states), of the states that satisfy the property. If this computation fails to yield a certain yes/no answer to the validity of the property on the initial states, our algorithm refines the abstraction by splitting uncertain abstract states (states that are may-states, but not must-states). The approach lends itself to an efficient symbolic implementation. We discuss the property required of the abstraction scheme in order to achieve convergence and termination of our technique.  相似文献   

12.
To successfully interact with and learn from humans in cooperative modes, robots need a mechanism for recognizing, characterizing, and emulating human skills. In particular, it is our interest to develop the mechanism for recognizing and emulating simple human actions, i.e., a simple activity in a manual operation where no sensory feedback is available. To this end, we have developed a method to model such actions using a hidden Markov model (HMM) representation. We proposed an approach to address two critical problems in action modeling: classifying human action-intent, and learning human skill, for which we elaborated on the method, procedure, and implementation issues in this paper. This work provides a framework for modeling and learning human actions from observations. The approach can be applied to intelligent recognition of manual actions and high-level programming of control input within a supervisory control paradigm, as well as automatic transfer of human skills to robotic systems  相似文献   

13.
包晓安  姚澜  张晓文  曹建文 《计算机科学》2012,39(5):117-119,136
目前许多文献都讨论的受控马尔科夫链软件测试模型,是通过对部分假设条件进行特殊化处理后得到的,这将导致模型的适用范围较小且偏离实际应用。依据软件控制论思想,通过一系列新的制约条件的转换,提出一种改善的、测试资源约束下的受控马尔科夫链模型来消除已有模型的缺陷。同时,该模型能够在高效性、复杂性和适用性3方面达到一个平衡点。为了证明其有效,根据该模型设计了一种新的软件缺陷优化测试策略,并对该策略进行了仿真实验,将其与传统的随机测试策略进行了比较。实验结果表明,该模型具有较高的实用性和有效性。  相似文献   

14.
In this paper, the average case of finite state controlled Markov set-chains with R p -set-valued rewards are considered. The optimization is done by a pseudo-order relation on the set of all convex and compact subsets of R p induced by a closed convex cone. We introduce a v -step contractive property (minorization condition) for the average case and by use of this method the average expected reward set from a periodic policy is characterized. And, applying the scalarization technique, a Pareto optimal policy is obtained. Also, a numerical example is given to comprehend our results.  相似文献   

15.
In this paper,the distributed stochastic model predictive control(MPC) is proposed for the noncooperative game problem of the discrete-time multi-player systems(MPSs) with the undirected Markov jump graph.To reflect the reality,the state and input constraints have been considered along with the external disturbances.An iterative algorithm is designed such that model predictive noncooperative game could converge to the socalled ε-Nash equilibrium in a distributed manner.Sufficient conditions are ...  相似文献   

16.
We address the sequence classification problem using a probabilistic model based on hidden Markov models (HMMs). In contrast to commonly-used likelihood-based learning methods such as the joint/conditional maximum likelihood estimator, we introduce a discriminative learning algorithm that focuses on class margin maximization. Our approach has two main advantages: (i) As an extension of support vector machines (SVMs) to sequential, non-Euclidean data, the approach inherits benefits of margin-based classifiers, such as the provable generalization error bounds. (ii) Unlike many algorithms based on non-parametric estimation of similarity measures that enforce weak constraints on the data domain, our approach utilizes the HMM’s latent Markov structure to regularize the model in the high-dimensional sequence space. We demonstrate significant improvements in classification performance of the proposed method in an extensive set of evaluations on time-series sequence data that frequently appear in data mining and computer vision domains.  相似文献   

17.
Learning Probabilistic Automata and Markov Chains via Queries   总被引:1,自引:1,他引:0  
Tzeng  Wen-Guey 《Machine Learning》1992,8(2):151-166
We investigate the problem of learning probabilistic automata and Markov chains via queries in the teacher-student learning model. Probabilistic automata and Markov chains are probabilistic extensions of finite state automata and have similar structures. We discuss some natural oracles associated with probabilistic automata and Markov chains. We present polynomial-time algorithms for learning probabilistic automata and Markov Chains using these oracles.  相似文献   

18.
基于马尔科夫逻辑网络的实体解析改进算法   总被引:3,自引:0,他引:3  
实体解析(Entity Resolution,ER)是数据挖掘过程中关键而又费时的一个步骤.华盛顿大学的Domingos和Singla提出了基于马尔科夫逻辑网络(Markov Logic Networks,MLNs)的ER算法.基于此算法,在原有的MLNs体系中,引入了一个可变权重的规则,试图解决原有系统无法处理的实体二义性问题.实验证明,新算法能够有效缓解数据记录的二义性问题,并且在一定程度上提高了原始算法的精度.  相似文献   

19.
李庆华 《微型电脑应用》2000,16(9):62-62,42
Excel97通过窗口操作,提供了超强的数学运算、统计分析等程序。本文通过例子介绍了其中的线性规则程序的运用。用Excel工具栏的下拉菜单中的“规划求解”即可求解线性规划问题,并进行灵敏度分析。  相似文献   

20.
In this paper, the resilient model approximation problem for a class of discrete-time Markov jump time-delay systems with input sector-bounded nonlinearities is investigated. A linearised reduced-order model is determined with mode changes subject to domination by a hierarchical Markov chain containing two different nonhomogeneous Markov chains. Hence, the reduced-order model obtained not only reflects the dependence of the original systems but also model external influence that is related to the mode changes of the original system. Sufficient conditions formulated in terms of bilinear matrix inequalities for the existence of such models are established, such that the resulting error system is stochastically stable and has a guaranteed l2-l error performance. A linear matrix inequalities optimisation coupled with line search is exploited to solve for the corresponding reduced-order systems. The potential and effectiveness of the developed theoretical results are demonstrated via a numerical example.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号