Similar articles
20 similar articles found.
1.
Consider the following sequential sampling problem: at each time, a choice must be made between obtaining an independent sample from one of a set of random reward variables or stopping the sampling. Sampling a random variable incurs a random cost at each time. The objective is to maximize the expected net difference between the largest sample reward obtained before stopping and the accumulated costs incurred while sampling. In this paper, the authors prove that the optimal feedback strategies for this problem are index policies and provide an explicit expression for the optimal expected reward from any state. The problem is motivated by search methods for global optimization problems where the cost of computation is explicitly incorporated into the objective.
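The idea of an index policy can be illustrated with a minimal sketch. It assumes the classical reservation-value form of the index, that is, the value z solving E[(X - z)^+] = E[cost] for a discrete reward variable X; the abstract does not give the paper's own expression, which may differ. The function names `expected_excess`, `reservation_index`, and `choose_action` are hypothetical.

```python
from typing import Optional, Sequence


def expected_excess(values: Sequence[float], probs: Sequence[float], z: float) -> float:
    """E[(X - z)^+] for a discrete reward X with the given support and probabilities."""
    return sum(p * max(v - z, 0.0) for v, p in zip(values, probs))


def reservation_index(values: Sequence[float], probs: Sequence[float],
                      mean_cost: float, tol: float = 1e-10) -> float:
    """Solve E[(X - z)^+] = mean_cost for z by bisection (assumed index form)."""
    if mean_cost <= 0:
        raise ValueError("mean_cost must be positive")
    # The expected excess is continuous and non-increasing in z.
    # At z = min(values) - mean_cost it is at least mean_cost; at z = max(values) it is 0.
    lo, hi = min(values) - mean_cost, max(values)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_excess(values, probs, mid) > mean_cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def choose_action(best_so_far: float, indices: Sequence[float]) -> Optional[int]:
    """Sample the variable with the largest index if it beats the best reward so far, else stop (None)."""
    best_arm = max(range(len(indices)), key=lambda i: indices[i])
    return best_arm if indices[best_arm] > best_so_far else None


# Example: X is 0 or 1 with equal probability and the mean sampling cost is 0.25,
# so 0.5 * (1 - z) = 0.25 gives an index of about 0.5.
index = reservation_index([0.0, 1.0], [0.5, 0.5], 0.25)
```

Under this assumed rule, sampling continues only while some variable's index exceeds the largest reward seen so far.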

2.
We study properties of separably quasimonotone functions for which computing the minimum and maximum values over an n-dimensional partially integer parallelepiped reduces to solving simple problems. Operators and iterative processes are proposed for identifying domains that contain no admissible or optimal solutions under nonconvex constraints described by systems of inequalities, together with systems of efficient boundary estimates of optimal solutions. This makes it possible to reduce the search domain and the number of options considered and to improve the stopping rule of the solution process. For this class of functions, modifications and strategies for branch-and-bound and global random search methods that have not been addressed in earlier publications are developed.

3.
We discuss the expected utility of rewards generated by Markov decision processes. This is applied to the optimal stopping problem with a utility treatment. A combined model of the decision processes and the stopping problem, called a stopped Markov decision process, is also considered under the utility.

4.
We consider the operation planning problem for autonomous tractors serving spatially distributed objects (farms). The working time distribution and routing are optimized in the sense of minimizing non-working time. The task controller performs control within the framework of the information system on a farm. Two heuristic algorithms based on the continuous-time knapsack problem are described. A constraint on route duration is taken into account. Operation of the algorithms is illustrated by numerical examples.

5.
This paper deals with the optimal stopping problem for multiarmed bandit processes. Under the assumption that the arms are independent, we show that optimal strategies and stopping times are expressed by the dynamic allocation indices for each arm. The problem is reduced to several independent one-parameter optimal stopping problems. On the basis of these results, we characterize optimal strategies and stopping times and extend them to the case with time constraints. When the arms' states evolve according to finite-state Markov chains, the computation of optimal strategies and stopping times by linear programming is discussed.

6.
In this paper, we construct fuzzy renewal processes involving fuzzy random variables. We first extend renewal processes to fuzzy renewal processes in which interarrival times, rewards, and stopping times are all fuzzy random variables. We then extend several theorems of renewal processes to fuzzy renewal processes: the elementary renewal theorem, the asymptotic expected average reward, and Wald's equation. In each part, we also give examples of applications. © 2010 Wiley Periodicals, Inc.

7.
The usual (non-stochastic stopping) control problem is extended to the case of random terminal time. The more general model presented here should be particularly useful when a system will change while being controlled. Systems which are otherwise deterministic, and systems with additive noise in the dynamic equation, are considered.

Results concerning the relevant aspects of reliability and Markov process theories are presented. We show that stochastically stopped control optimality conditions are simply extensions of the usual conditions and that the limits of the criteria and optimal control, as the variance of the stopping probability distribution approaches zero, are the corresponding quantities for the non-stochastic problem.

8.
Xun Li, Jie Shen, Qingshuo Song. Automatica, 2012, 48(8): 1898-1903
We study sufficient conditions for the existence of a saddle point of a time-dependent discrete Markov zero-sum game up to a given stopping time. The stopping time is allowed to be either a finite or an infinite non-negative random variable, provided its associated objective function is well-defined. The result enables us to show the existence of saddle points of discrete games constructed by Markov chain approximation of a class of stochastic differential games.

9.
This paper deals with approximation techniques for the optimal stopping of a piecewise-deterministic Markov process (P.D.P.). Such processes consist of a mixture of deterministic motion and random jumps. In the first part of the paper (Section 3) we study the optimal stopping problem with lower semianalytic gain function; our main result is the construction of ε-optimal stopping times. In the second part (Section 4) we consider a P.D.P. satisfying some smoothness conditions, and for N integer we construct a discretized P.D.P. which retains the main characteristics of the original process. By iterations of the single jump operator from ℝ^N to ℝ^N, each iteration consisting of N one-dimensional minimizations, we can calculate the payoff function of the discretized process. We demonstrate the convergence of the payoff functions, and for the case when the state space is compact we construct ε-optimal stopping times for the original problem using the payoff function of the discretized problem. A numerical example is presented.

10.
In this paper it is demonstrated how the probabilistic concept of a stopping time in a random process may be used to generate an iterative method for solving a system of linear equations. In fact, all known iterative approximation methods for solving linear equations are generated by various choices of a stopping time; for example, the point and block Jacobi methods, the point and block Gauss-Seidel methods, and overrelaxation methods are covered. The probabilistic approach offers, in a natural way, the possibility of adapting the solution technique to the special structure of the problem. Moreover, posterior bounds for the solution are constructed, which lead to faster convergence of the approximations than with the usual prior bounds.
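For orientation, here is a minimal sketch of two of the classical schemes named in the abstract, the point Jacobi and point Gauss-Seidel iterations. It is the textbook formulation, not the paper's stopping-time construction; the function names and the fixed iteration count are illustrative assumptions, and convergence is assumed (for example, a strictly diagonally dominant matrix).

```python
from typing import List, Sequence


def jacobi(A: Sequence[Sequence[float]], b: Sequence[float], iterations: int = 100) -> List[float]:
    """Point Jacobi: every component is updated from the previous iterate only."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


def gauss_seidel(A: Sequence[Sequence[float]], b: Sequence[float], iterations: int = 100) -> List[float]:
    """Point Gauss-Seidel: each component uses the newest values already computed in the sweep."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


# Example system: 4x + y = 1, 2x + 3y = 2, with exact solution x = 0.1, y = 0.6.
A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
x_jacobi = jacobi(A, b)
x_gs = gauss_seidel(A, b)
```

The only difference between the two loops is whether a sweep reads the old iterate or the partially updated one.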

11.
The problem of transforming between continuous-time state variable feedback gains and equivalent discrete gains suitable for digital implementation is considered. The concepts of state and control equivalence yield two simple transformation rules, a pseudo-inverse method and an average gain method, respectively. As the sampling interval δ→0, these methods are contrasted with existing Taylor series based approaches. The new transformation rules are also studied numerically using a ship course-keeping example. Transformed optimal continuous gains are compared with optimal discrete gains over a wide range of sampling intervals.

12.
The problem of optimal filtering in stochastic differential systems with random structure whose switches are generated by a special class of Markov jump processes is considered. The equations of the conditional expectation and conditional probability density function of the signal process are obtained. Numerical methods for solving the corresponding analogues of the Fokker-Planck-Kolmogorov and Zakai equations are proposed.

13.
This paper considers the problem of robust disturbance attenuation for a class of uncertain nonlinear networked control systems. Takagi-Sugeno fuzzy models are first employed to describe the nonlinear plant. Markov processes are used to model the random network-induced delays and data packet dropouts. The Lyapunov-Razumikhin method is used to derive a controller for this class of nonlinear systems such that the closed-loop system is stochastically stable with a prescribed disturbance attenuation level. Sufficient conditions for the existence of such a controller are derived in terms of the solvability of bilinear matrix inequalities. An iterative algorithm is proposed to change this non-convex problem into quasi-convex optimization problems, which can be solved effectively by available mathematical tools. The effectiveness of the proposed design methodology is verified by a numerical example.

14.
This note investigates the stabilization problem for a class of linear uncertain networked control systems with random communication time delays. Both sensor-to-controller and controller-to-actuator random network-induced delays are considered. Markov processes are used to model these delays. Based on the Lyapunov-Razumikhin method, a mode-dependent state feedback controller is proposed to stabilize this class of systems. The existence of such a controller is given in terms of the solvability of bilinear matrix inequalities, which are to be solved by a newly proposed algorithm. A numerical example is used to illustrate the validity of the design methodology.

15.
We consider the class of random processes having linear shift operators. This class is an extension of the class of wide-sense stationary processes (which have unitary shift operators). Conditions for a process to have linear shifts are formulated in terms of the covariance function of the process. Sufficient conditions for a purely nondeterministic process to have linear shifts are given. Theorems concerning superposition, products, and linear transformations of such processes are proved, and applications are indicated. A comparison with the class of locally stationary processes is made. The essential concepts involved are also extended to generalized random processes.

16.
We consider quadratic stabilization for a class of switched systems composed of a finite set of continuous-time linear subsystems with norm-bounded uncertainties. Under the assumption that no single subsystem is quadratically stable, if a convex combination of subsystems is quadratically stable, then we propose a state-dependent switching law, based on the convex combination of subsystems, such that the entire switched linear system is quadratically stable. When the state information is not available, we extend the discussion to designing an output-dependent switching law by constructing a robust Luenberger observer for each subsystem.

17.
In machine learning, positive-unlabelled (PU) learning is a special case of semi-supervised learning. In positive-unlabelled learning, the training set contains some positive examples and a set of unlabelled examples from both the positive and negative classes. Positive-unlabelled learning has gained attention in many domains, especially for time-series data, where obtaining labelled data is challenging. Examples which originate from the negative class are especially difficult to acquire. Self-learning is a semi-supervised method capable of PU learning in time-series data. In the self-learning approach, observations are individually added from the unlabelled data into the positive class until a stopping criterion is reached. The model is retrained after each addition with the existing labels. The main problem in self-learning is knowing when to stop the learning. There are multiple different stopping criteria in the literature, but they tend to be inaccurate or challenging to apply. This publication proposes a novel stopping criterion, called Peak evaluation using perceptually important points, to address this problem for time-series data. Peak evaluation using perceptually important points is exceptional in that it has no tunable hyperparameters, which makes it easily applicable to an unsupervised setting. At the same time, it is flexible, as it makes no assumptions about the balance of the dataset between the positive and the negative class.

18.
This paper investigates the problem of robust fault estimation for a class of uncertain networked control systems (NCSs) with random communication network-induced delays, which are described by Markov processes. Based on the Lyapunov-Razumikhin method, the existence of a delay-dependent fault estimator is given in terms of the solvability of bilinear matrix inequalities (BMIs). An iterative algorithm is proposed to change this non-convex BMI problem into quasi-convex optimization problems, which can be solved effectively by available mathematical tools. The effectiveness of the proposed design methodology is verified by an internet-based DC motor test rig.

19.
Schemes for system identification based on closed-loop experiments have attracted considerable interest lately. However, most of the existing approaches have been developed for discrete-time models. In this paper, the problem of continuous-time model identification is considered. A bias correction method without noise modelling, associated with the Poisson moment functionals approach, is presented for indirect identification of closed-loop systems. To illustrate the performance of the proposed method, the bias-eliminated least-squares algorithm is applied to the parameter estimation of a simulated system via Monte Carlo simulations.

20.
We consider the class of differential games with random duration. We show that a problem with random game duration can be reduced to a standard problem with an infinite time horizon. A Hamilton-Jacobi-Bellman-type equation is derived for finding optimal solutions in differential games with random duration. Results are illustrated by an example of a game-theoretic model of nonrenewable resource extraction. The problem is analyzed under the assumption of Weibull-distributed random terminal time of the game.
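The reduction to an infinite horizon can be sketched with a short worked identity. The notation is assumed for illustration and does not come from the abstract: T is the random terminal time with distribution function F on [0, ∞), h(t) is a player's instantaneous payoff along a trajectory, and the integrals are assumed finite so that the order of integration may be exchanged. In the Weibull line, λ and k denote scale and shape parameters.

```latex
\begin{align*}
\mathbb{E}\left[\int_0^{T} h(t)\,dt\right]
  &= \int_0^{\infty}\!\left(\int_0^{\tau} h(t)\,dt\right) dF(\tau)
   = \int_0^{\infty} h(t)\left(\int_t^{\infty} dF(\tau)\right) dt \\
  &= \int_0^{\infty} \bigl(1 - F(t)\bigr)\,h(t)\,dt, \\
\text{Weibull case:}\quad 1 - F(t) &= e^{-(t/\lambda)^{k}}, \qquad t \ge 0 .
\end{align*}
```

The survival function 1 - F(t) therefore plays the role of a discount factor in an equivalent infinite-horizon problem.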
