首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 156 毫秒
1.
A two-phase optimization neural network   总被引:4,自引:0,他引:4  
A novel two-phase neural network that is suitable for solving a large class of constrained or unconstrained optimization problem is presented. For both types of problems with solutions lying in the interior of the feasible regions, the phase-one structure of the network alone is sufficient. When the solutions of constrained problems are on the boundary of the feasible regions, the proposed two-phase network is capable of achieving the exact solutions, in contrast to existing optimization neural networks which can obtain only approximate solutions. Furthermore, the network automatically provides the corresponding Lagrange multiplier associated with each constraint. Thus, for linear programming, the network solves both the primal problems and their dual problems simultaneously.  相似文献   

2.
We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. We use randomly generated test problems as well as a real-life health-care problem to empirically show that, unlike previously reported, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action size, and the LP solution method.  相似文献   

3.
In this paper, we consider the combined distribution and assignment (CDA) problem with link capacity constraints modeled as a hierarchical logit choice problem based on random utility theory. The destination and route choices are calculated based on the multi-nominal logit probability function, which forms the basis for constructing the side constrained CDA (SC-CDA) problem as an equivalent mathematical programming (MP) formulation. A dual MP formulation of the SC-CDA problem is developed as a solution algorithm, which consists of an iterative balancing scheme and a column generation scheme, for solving the SC-CDA problem. Due to the entropy-type objective function, the dual formulation has a simple nonlinear constrained optimization structure, where the feasible set only consists of nonnegative orthants. The iterative balancing scheme explicitly makes use of the optimality conditions of the dual formulation to analytically adjust the dual variables and update the primal variables, while a column generation scheme is used to iteratively generate routes to the working route set as needed to satisfy the side constraints. Two numerical experiments are conducted to demonstrate the features of the SC-CDA model and the computational performance of the solution algorithm. The results reveal that imposing link capacity constraints can have a significant impact on the network equilibrium flow allocations, and the dual approach is a practical solution algorithm for solving the complex SC-CDA problem.  相似文献   

4.
The solution of Markov Decision Processes (MDPs) often relies on special properties of the processes. For two-level MDPs, the difference in the rates of state changes of the upper and lower levels has led to limiting or approximate solutions of such problems. In this paper, we solve a two-level MDP without making any assumption on the rates of state changes of the two levels. We first show that such a two-level MDP is a non-standard one where the optimal actions of different states can be related to each other. Then we give assumptions (conditions) under which such a specially constrained MDP can be solved by policy iteration. We further show that the computational effort can be reduced by decomposing the MDP. A two-level MDP with M upper-level states can be decomposed into one MDP for the upper level and M to M(M-1) MDPs for the lower level, depending on the structure of the two-level MDP. The upper-level MDP is solved by time aggregation, a technique introduced in a recent paper [Cao, X.-R., Ren, Z. Y., Bhatnagar, S., Fu, M., & Marcus, S. (2002). A time aggregation approach to Markov decision processes. Automatica, 38(6), 929-943.], and the lower-level MDPs are solved by embedded Markov chains.  相似文献   

5.
Markov Decision Processes (MDPs) are a formulation for optimization problems in sequential decision making. Solving MDPs often requires implementing a simulator for optimization algorithms to invoke when updating decision making rules known as policies. The combination of simulator and optimizer are subject to failures of specification, implementation, integration, and optimization that may produce invalid policies. We present these failures as queries for a visual analytic system (MDPVIS). MDPVIS addresses three visualization research gaps. First, the data acquisition gap is addressed through a general simulator-visualization interface. Second, the data analysis gap is addressed through a generalized MDP information visualization. Finally, the cognition gap is addressed by exposing model components to the user. MDPVIS generalizes a visualization for wildfire management. We use that problem to illustrate MDPVIS and show the visualization's generality by connecting it to two reinforcement learning frameworks that implement many different MDPs of interest in the research community.  相似文献   

6.
马尔可夫决策过程两种抽象模式   总被引:1,自引:1,他引:1  
抽象层次上马尔可夫决策过程的引入,使得人们可简洁地、陈述地表达复杂的马尔可夫决策过程,解决常规马尔可夫决策过程(MDPs)在实际中所遇到的大型状态空间的表达问题.介绍了结构型和概括型两种不同类型抽象马尔可夫决策过程基本概念以及在各种典型抽象MDPs中的最优策略的精确或近似算法,其中包括与常规MDPs根本不同的一个算法:把Bellman方程推广到抽象状态空间的方法,并且对它们的研究历史进行总结和对它们的发展做一些展望,使得人们对它们有一个透彻的、全面而又重点的理解.  相似文献   

7.
A new neural network for solving linear programming problems with bounded variables is presented. The network is shown to be completely stable and globally convergent to the solutions to the linear programming problems. The proposed new network is capable of achieving the exact solutions, in contrast to existing optimization neural networks which need a suitable choice of the network parameters and thus can obtain only approximate solutions. Furthermore, both the primal problems and their dual problems are solved simultaneously by the new network.  相似文献   

8.
We present in this paper a prox-dual regularization algorithm for solving generalized fractional programming problems. The algorithm combines the dual method of centres for generalized fractional programs and the proximal point algorithm and can handle nondifferentiable convex problems with possibly unbounded feasible constraints set. The proposed procedure generates two sequences of dual and primal values that approximate the optimal value of the considered problem respectively from below and from above at each step. It also generates a sequence of dual solutions that converges to a solution of the dual problem, and a sequence of primal solutions whose every accumulation point is a solution of the primal problem. For a class of problems, including linear fractional programs, the algorithm converges linearly.  相似文献   

9.
In this paper, we describe a new approach to increase the possibility of finding integer feasible columns to a set partitioning problem (SPP) directly in solving the linear programming (LP) relaxation using column generation. Traditionally, column generation is aimed to solve the LP‐relaxation as quickly as possible without any concern for the integer properties of the columns formed. In our approach, we aim to generate columns forming an optimal integer solution while simultaneously solving the LP‐relaxation. Using this approach, we can improve the possibility of finding integer solutions by heuristics at each node in the branch‐and‐bound search. In addition, we improve the possibility of finding high‐quality integer solutions in cases where only the columns in the root node are used to solve the problem. The basis of our approach is a subgradient technique applied to a Lagrangian dual formulation of the SPP extended with an additional surrogate constraint. This extra constraint is not relaxed and is used to better control the subgradient evaluations and how the multiplier values are computed. The column generation is then directed, via the multipliers, to construct columns that form feasible integer solutions. Computational experiments show that we can generate optimal integer columns in a large set of well‐known test problems as compared to both standard and stabilized column generation, and simultaneously keep the number of columns smaller than standard column generation. This is also supported by tests on a case study with work‐shift generation.  相似文献   

10.
本文提出了一种求解非线性约束优化的全局最优的新方法—它是基于利用非线性互补函数和不断增加新的约束来重复解库恩-塔克条件的非线性方程组的新方法。因为库恩-塔克条件是非线性约束优化的必要条件,得到的解未必是非线性约束优化的全局最优解,为此,本文首次给出了通过利用该优化问题的先验知识,不断地增加约束来限制全局最优解范围的方法,一些仿真例子表明提出的方法和理论有效的,并且可行的。  相似文献   

11.
This communique provides an exact iterative search algorithm for the NP-hard problem of obtaining an optimal feasible stationary Markovian pure policy that achieves the maximum value averaged over an initial state distribution in finite constrained Markov decision processes. It is based on a novel characterization of the entire feasible policy space and takes the spirit of policy iteration (PI) in that a sequence of monotonically improving feasible policies is generated and converges to an optimal policy in iterations of the size of the policy space at the worst case. Unlike PI, an unconstrained MDP needs to be solved at iterations involved with feasible policies and the current best policy improves all feasible policies included in the union of the policy spaces associated with the unconstrained MDPs.  相似文献   

12.
In this paper we provide a detailed analysis of the iteration complexity of dual first-order methods for solving conic convex problems. When it is difficult to project on the primal feasible set described by conic and convex constraints, we use the Lagrangian relaxation to handle the conic constraints and then, we apply dual first-order algorithms for solving the corresponding dual problem. We give convergence analysis for dual first-order algorithms (dual gradient and fast gradient algorithms): we provide sublinear or linear estimates on the primal suboptimality and feasibility violation of the generated approximate primal solutions. Our analysis relies on the Lipschitz property of the gradient of the dual function or an error bound property of the dual. Furthermore, the iteration complexity analysis is based on two types of approximate primal solutions: the last primal iterate or an average primal sequence.  相似文献   

13.
The admission control problem can be modelled as a Markov decision process (MDP) under the average cost criterion and formulated as a linear programming (LP) problem. The LP formulation is attractive in the present and future communication networks, which support an increasing number of classes of service, since it can be used to explicitly control class-level requirements, such as class blocking probabilities. On the other hand, the LP formulation suffers from scalability problems as the number C of classes increases. This article proposes a new LP formulation, which, even if it does not introduce any approximation, is much more scalable: the problem size reduction with respect to the standard LP formulation is O((C?+?1)2/2 C ). Theoretical and numerical simulation results prove the effectiveness of the proposed approach.  相似文献   

14.
戴帅  殷苌茗  张欣 《计算机工程》2009,35(13):190-192
提出一种新的基于因素法方法的TD(λ)算法。其基本思想是状态因素化表示,通过动态贝叶斯网络表示Markov决策过程(MDP)中的状态转移概率函数,结合决策树表示TD(λ)算法中的状态值函数,降低状态空间的搜索与计算复杂度,因而适用于求解大状态空间的MDPs问题,实验证明该表示方法是有效的。  相似文献   

15.
A parametrical representation of a linear problem of Pareto optimization is considered. The notion of a generalized solution is introduced for an appropriate pair of dual problems. The structure of a parametrical generalized solution is investigated. A relation between generalized and efficient solutions of problems is established.  相似文献   

16.
17.
Self-adaptive fitness formulation for constrained optimization   总被引:8,自引:0,他引:8  
A self-adaptive fitness formulation is presented for solving constrained optimization problems. In this method, the dimensionality of the problem is reduced by representing the constraint violations by a single infeasibility measure. The infeasibility measure is used to form a two-stage penalty that is applied to the infeasible solutions. The performance of the method has been examined by its application to a set of eleven test cases from the specialized literature. The results have been compared with previously published results from the literature. It is shown that the method is able to find the optimum solutions. The proposed method requires no parameter tuning and can be used as a fitness evaluator with any evolutionary algorithm. The approach is also robust in its handling of both linear and nonlinear equality and inequality constraint functions. Furthermore, the method does not require an initial feasible solution.  相似文献   

18.
This paper proposes a reinforcement learning-based lexicographic approach to the call admission control problem in communication networks. The admission control problem is modelled as a multi-constrained Markov decision process. To overcome the problems of the standard approaches to the solution of constrained Markov decision processes, based on the linear programming formulation or on a Lagrangian approach, a multi-constraint lexicographic approach is defined, and an online implementation based on reinforcement learning techniques is proposed. Simulations validate the proposed approach.  相似文献   

19.
We assess the potentials of the approximate dynamic programming (ADP) approach for process control, especially as a method to complement the model predictive control (MPC) approach. In the artificial intelligence (AI) and operations research (OR) research communities, ADP has recently seen significant activities as an effective method for solving Markov decision processes (MDPs), which represent a type of multi-stage decision problems under uncertainty. Process control problems are similar to MDPs with the key difference being the continuous state and action spaces as opposed to discrete ones. In addition, unlike in other popular ADP application areas like robotics or games, in process control applications first and foremost concern should be on the safety and economics of the on-going operation rather than on efficient learning. We explore different options within ADP design, such as the pre-decision state vs. post-decision state value function, parametric vs. nonparametric value function approximator, batch-mode vs. continuous-mode learning, and exploration vs. robustness. We argue that ADP possesses great potentials, especially for obtaining effective control policies for stochastic constrained nonlinear or linear systems and continually improving them towards optimality.  相似文献   

20.
The cyclic job-shop problem with transportation can be used to describe optimization problems in fully automated manufacturing systems or assembly lines. We study the problem where the machines have no buffers, which rapidly decreases the number of feasible solutions and, therefore, makes it a lot harder to find those feasible solutions. After formulating the problem, we will characterize feasible solutions based on the route of the robot and their properties. With the aim of minimizing the cycle time, we have developed a tree search method to construct feasible solutions and combined it with a bounding procedure. Computational results are reported and compared to those gained by solving the problem with an LP solver.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号