Similar literature
 Found 20 similar documents (search time: 656 ms)
1.
AUTOMATIC COMPLEXITY REDUCTION IN REINFORCEMENT LEARNING   Total citations: 1, self-citations: 0, citations by others: 1
High dimensionality of state representation is a major limitation for scale-up in reinforcement learning (RL). This work derives the knowledge of complexity reduction from partial solutions and provides algorithms for automated dimension reduction in RL. We propose the cascading decomposition algorithm, based on spectral analysis of a normalized graph Laplacian, to decompose a problem into several subproblems, and then conduct parameter relevance analysis on each subproblem to perform dynamic state abstraction. The elimination of irrelevant parameters projects the original state space into one of lower dimension, in which some subtasks are projected onto the same shared subtasks. The framework can identify irrelevant parameters based on performed action sequences and thus relieves the problem of high dimensionality in the learning process. We evaluate the framework experimentally and show that the dimension reduction approach can indeed make some otherwise infeasible problems learnable.
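The abstract above mentions spectral analysis of a normalized graph Laplacian as the basis for decomposing a problem. The following is a minimal sketch of that general idea only, not the paper's cascading decomposition algorithm: it builds the symmetric normalized Laplacian of a small undirected graph and splits the nodes by the sign of the Fiedler vector. The example graph `ADJ`, the function names, and the use of power iteration are all assumptions made for illustration, and the code assumes a connected graph with no isolated nodes.

```python
import math


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^(-1/2) A D^(-1/2)."""
    n = len(adj)
    deg = [sum(row) for row in adj]  # assumes every node has degree > 0
    return [
        [(1.0 if i == j else 0.0) - adj[i][j] / math.sqrt(deg[i] * deg[j])
         for j in range(n)]
        for i in range(n)
    ]


def fiedler_vector(adj, iters=500):
    """Eigenvector of the second-smallest eigenvalue of L.

    Uses power iteration on 2I - L, deflating the trivial eigenvector
    D^(1/2) * 1 (eigenvalue 0 of L) at every step.
    """
    n = len(adj)
    lap = normalized_laplacian(adj)
    v0 = [math.sqrt(sum(row)) for row in adj]
    norm0 = math.sqrt(sum(x * x for x in v0))
    v0 = [x / norm0 for x in v0]
    x = [float(i) for i in range(n)]  # arbitrary deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v0))
        x = [a - dot * b for a, b in zip(x, v0)]  # remove trivial component
        x = [2.0 * x[i] - sum(lap[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    return x


def spectral_bipartition(adj):
    """Split nodes into two groups by the sign of the Fiedler vector."""
    vec = fiedler_vector(adj)
    side_a = [i for i, v in enumerate(vec) if v < 0]
    side_b = [i for i, v in enumerate(vec) if v >= 0]
    return tuple(sorted([side_a, side_b]))


# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
PARTS = spectral_bipartition(ADJ)
```

On this toy graph the split separates the two triangles at the bridging edge, which is the kind of weakly connected boundary a decomposition into subproblems would look for.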

2.
Exposing inconsistencies can uncover many defects in software specifications. One approach to exposing inconsistencies analyzes two redundant specifications, one operational and the other property-based, and reports discrepancies. This paper describes a “practical” formal method, based on this approach and the SCR (software cost reduction) tabular notation, that can expose inconsistencies in software requirements specifications. Because users of the method do not need advanced mathematical training or theorem-proving skills, most software developers should be able to apply the method without extraordinary effort. This paper also describes an application of the method which exposed a safety violation in the contractor-produced software requirements specification of a sizable, safety-critical control system. Because the enormous state space of specifications of practical software usually renders direct analysis impractical, a common approach is to apply abstraction to the specification. To reduce the state space of the control system specification, two “pushbutton” abstraction methods were applied, one which automatically removes irrelevant variables and a second which replaces the large, possibly infinite, type sets of certain variables with smaller type sets. Analyzing the reduced specification with the model checker Spin uncovered a possible safety violation. Simulation demonstrated that the safety violation was not spurious but an actual defect in the original specification.
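The abstract above refers to an abstraction that automatically removes irrelevant variables. One common way to realize that idea is a cone-of-influence style reduction: keep only the variables that the property under analysis depends on, directly or transitively. The sketch below illustrates this generic technique and is not taken from the SCR method itself; the dependency table `DEPENDS_ON` and all variable names are hypothetical.

```python
from collections import deque


def relevant_variables(depends_on, property_vars):
    """Return every variable the property depends on, transitively.

    depends_on maps a variable to the variables its value is computed from.
    """
    relevant = set(property_vars)
    queue = deque(property_vars)
    while queue:
        var = queue.popleft()
        for dep in depends_on.get(var, ()):
            if dep not in relevant:
                relevant.add(dep)
                queue.append(dep)
    return relevant


# Hypothetical specification: which variables feed into which.
DEPENDS_ON = {
    "alarm": ["pressure", "override"],
    "pressure": ["sensor"],
    "display": ["log_counter"],
    "log_counter": ["clock"],
}
ALL_VARS = {"alarm", "pressure", "override", "sensor",
            "display", "log_counter", "clock"}

KEPT = relevant_variables(DEPENDS_ON, ["alarm"])
REMOVED = ALL_VARS - KEPT  # candidates for elimination before model checking
```

For a property that mentions only `alarm`, the display and logging variables fall outside the dependency closure and can be dropped from the model before analysis.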

3.
In a model-based testing approach as well as for the verification of properties, B models provide an interesting modeling solution. However, for industrial applications, the size of their state space often makes them hard to handle. To reduce the number of states, an abstraction function can be used. The abstraction is often a domain abstraction of the state variables that requires many proof obligations to be discharged, which can be very time-consuming for real applications. This paper presents a contribution to this problem that complements an approach based on domain abstraction for test generation, by adding a preliminary syntactic abstraction phase, based on variable elimination. We define a syntactic transformation that suppresses some variables from a B event model, in addition to three methods that choose relevant variables according to a test purpose. In this way, we propose a method that computes an abstraction of a source model ${\mathsf{M}}$ according to a set of selected relevant variables. Depending on the method used, the abstraction can be computed as a simulation or as a bisimulation of ${\mathsf{M}}$. With this approach, the abstraction process produces a finite state system. We apply this abstraction computation to a model-based testing process. We evaluate experimentally the impact of the model simplification by variable elimination on the size of the models, on the number of proof obligations to discharge, on the precision of the abstraction and on the coverage achieved by the test generation.

4.
A new methodology for learning the topology of a functional network from data, based on the ANOVA decomposition technique, is presented. The method determines sensitivity (importance) indices that allow a decision to be made as to which set of interactions among variables is relevant and which is irrelevant to the problem under study. This immediately suggests the network topology to be used in a given problem. Moreover, local sensitivities to small changes in the data can be easily calculated. In this way, the dual optimization problem gives the local sensitivities. The methods are illustrated by their application to artificial and real examples.

5.
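Identification of Separable Nonlinear Dynamic Systems Based on Support Vector Machines   Total citations: 3, self-citations: 0, citations by others: 3
Zhang Li, Xi Yugeng. Acta Automatica Sinica, 2005, 31(6): 965-969
For nonlinear dynamic system models in which the state variables and control variables are separable, the regression estimation model of the standard support vector machine is redesigned by introducing two nonlinear kernel functions, making it suitable for the identification of nonlinear dynamic systems. The model contains two nonlinear functions, one of the state variables and one of the control variables, which are used to identify the two nonlinear functions in a separable-variable nonlinear dynamic system. Simulation experiments verify the effectiveness of the algorithm for nonlinear dynamic system identification.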

6.
In many applications, the system dynamics allows the decomposition into lower dimensional subsystems with interconnections among them. This decomposition is motivated by the ease and flexibility of the controller design for each subsystem. In this paper, a decentralized model reference adaptive iterative learning control scheme is developed for interconnected systems with model uncertainties. The interconnections in the dynamic equations of each subsystem are considered with unknown boundaries. The proposed controller of each subsystem depends only on local state variables without any information exchange with other subsystems. The adaptive parameters are updated along the iteration axis to compensate for the interconnections among subsystems. It is shown that by using the proposed decentralized controller, the states of the subsystems can track the desired reference model states iteratively. Simulation results demonstrate that, utilizing the proposed adaptive controller, the tracking error for each subsystem converges along the iteration axis.

7.
In this work we present a verification methodology for real-time distributed systems, based on their modular decomposition into processes. Given a distributed system, each of its components is reduced by abstracting away from details that are irrelevant for the required specification. The abstract components are then composed to form an abstract system to which a model checking procedure is applied. The abstraction relation and the specification language guarantee that if the abstract system satisfies a specification, then the original system satisfies it as well. The specification language RTL is a branching-time version of the real-time temporal logic TPTL presented in Alur and Henzinger [1]. Its model checking is linear in the size of the system and exponential in the size of the formula. Two notions of abstraction for real-time systems are introduced, each preserving a sublanguage of RTL.

8.
Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and in turn estimate quickly a very expressive model. Our experimental results in several different data sets show that this boosting method results in efficient learning of RDNs when compared to state-of-the-art statistical relational learning approaches.

9.
Bayesian Models for Keyhole Plan Recognition in an Adventure Game   Total citations: 3, self-citations: 1, citations by others: 3
We present an approach to keyhole plan recognition which uses a dynamic belief (Bayesian) network to represent features of the domain that are needed to identify users' plans and goals. The application domain is a Multi-User Dungeon adventure game with thousands of possible actions and locations. We propose several network structures which represent the relations in the domain to varying extents, and compare their predictive power for predicting a user's current goal, next action and next location. The conditional probability distributions for each network are learned during a training phase, which dynamically builds these probabilities from observations of user behaviour. This approach allows the use of incomplete, sparse and noisy data during both training and testing. We then apply simple abstraction and learning techniques in order to speed up the performance of the most promising dynamic belief networks without a significant change in the accuracy of goal predictions. Our experimental results in the application domain show a high degree of predictive accuracy. This indicates that dynamic belief networks in general show promise for predicting a variety of behaviours in domains which have similar features to those of our domain, while reduced models, obtained by means of learning and abstraction, show promise for efficient goal prediction in such domains.

10.
As we move toward developing object‐oriented (OO) programs, the complexity traditionally found in functions and procedures is moving to the connections among components. Different faults occur when components are integrated to form higher‐level structures that aggregate the behavior and state. Consequently, we need to place more effort on testing the connections among components. Although OO technologies provide abstraction mechanisms for building components that can then be integrated to form applications, they also add new compositional relations that can contain faults. This paper describes techniques for analyzing and testing the polymorphic relationships that occur in OO software. The techniques adapt traditional data flow coverage criteria to consider definitions and uses among state variables of classes, particularly in the presence of inheritance, dynamic binding, and polymorphic overriding of state variables and methods. The application of these techniques can result in an increased ability to find faults and in higher overall software quality. Copyright © 2010 John Wiley & Sons, Ltd.

11.
Sophisticated agents operating in open environments must make decisions that efficiently trade off the use of their limited resources between dynamic deliberative actions and domain actions. This is the meta-level control problem for agents operating in resource-bounded multi-agent environments. Control activities involve decisions on when to invoke, and the amount of effort to put into, scheduling and coordination of domain activities. The focus of this paper is how to make effective meta-level control decisions. We show that meta-level control with bounded computational overhead allows complex agents to solve problems more efficiently than current approaches in dynamic open multi-agent environments. The meta-level control approach that we present is based on the decision-theoretic use of an abstract representation of the agent state. This abstraction concisely captures critical information necessary for decision making while bounding the cost of meta-level control and is appropriate for use in automatically learning the meta-level control policies.

12.
13.
In this paper, we propose a position-based control strategy for eliminating the vibration at the end of deformable linear objects (DLOs) during their manipulation. Using Schur decomposition of matrices and linear transform of variables, actuated and underactuated parts of the DLO dynamic model are separated. Based on the decoupled dynamic model of a DLO system, a sliding mode control with exponential approach law is designed to force the state variables to converge to an equilibrium and to allow vibration at the end of the DLO to be damped quickly. The DLO system, subjected to control input saturation, is further studied to solve the input saturation problem. An adaptive sliding mode control law is designed to suppress the damping at the end of the DLO. The proposed control strategies are verified by numerical simulations. The simulation results show that the proposed methods can effectively damp the vibration at the end of the DLO.

14.
When modelling a complex system, such as one with distributed functionality, we need to choose an appropriate level of abstraction. When analysing quantitative properties of the system, this abstraction is typically probabilistic, since we introduce uncertainty about its state and therefore its behaviour. In particular, when we aggregate several concrete states into a single abstract state we would like to know the distribution over these states. In reality, any probability distribution may be possible, but this leads to an intractable analysis. Therefore, we must find a way to approximate these distributions in a safe manner. We present an abstract interpretation for a simple imperative language with message passing, where truncated multivariate normal distributions are used as the abstraction. This allows the probabilities of transient properties to be bounded, without needing to calculate the exact distribution. We describe the semantics of programs in terms of automata, whose transitions are linear operators on measures. Given an input measure, we generate a probabilistic trace whose states are labelled by measures, describing the distribution of the values of variables at that point. By the use of appropriate widening operators, we are able to abstract the behaviour of loops to various degrees of precision.

15.
This article addresses reinforcement learning problems based on factored Markov decision processes (MDPs) in which the agent must choose among a set of candidate abstractions, each built up from a different combination of state components. We present and evaluate a new approach that can perform effective abstraction selection that is more resource‐efficient and/or more general than existing approaches. The core of the approach is to make selection of an abstraction part of the learning agent's decision‐making process by augmenting the agent's action space with internal actions that select the abstraction it uses. We prove that under certain conditions this approach results in a derived MDP whose solution yields both the optimal abstraction for the original MDP and the optimal policy under that abstraction. We examine our approach in three domains of increasing complexity: contextual bandit problems, episodic MDPs, and general MDPs with context‐specific structure. © 2013 Wiley Periodicals, Inc.

16.
This article presents a theory for the bi-decomposition of functions in multi-valued logic (MVL). MVL functions are applied in logic design of multi-valued circuits and machine learning applications. Bi-decomposition is a method to decompose a function into two decomposition functions that are connected by a two-input operator called a gate. Each of the decomposition functions depends on fewer variables than the original function. Recursive bi-decomposition represents a function as a structure of interconnected gates. For logic synthesis, the type of the gate can be chosen so that it has an efficient hardware representation. For machine learning, gates are selected to represent simple and understandable classification rules. Algorithms are presented for non-disjoint bi-decomposition, where the decomposition functions may share variables with each other. Bi-decomposition is discussed for the min- and max-operators. To describe the MVL bi-decomposition theory, the notion of incompletely specified functions is generalized to function intervals. The application of MVL differential calculus leads to particularly efficient algorithms. To ensure complete recursive decomposition, separation is introduced as a new concept to simplify non-decomposable functions. Multi-decomposition is presented as an example of separation. The decomposition algorithms are implemented in a decomposition system called YADE. MVL test functions from logic synthesis and machine learning applications are decomposed. The results are compared to other decomposers. It is verified that YADE finds decompositions of superior quality by bi-decomposition of MVL function sets.

17.
Cascade Design of State Observers   Total citations: 1, self-citations: 0, citations by others: 1
A block approach to designing state observers for nonlinear multivariate systems is developed. A block-observable form of nonlinear systems is elaborated, in which the design of dynamic observation devices is subdivided into sequentially and independently solved elementary subproblems of reduced dimension. Stepwise procedures for choosing state observer controls from high-gain feedback systems are developed. Lower estimates for the finite coefficients of state observers are used in estimating the state vector components with given accuracy via synthesis decomposition. The designed algorithms ensure the invariance of the control operator to parametric uncertainties. By way of application, the state variables of an asynchronous sensorless drive motor are estimated from stator current measurements.

18.
Explaining how engineering devices work is important to students, engineers, and operators. In general, machine generated explanations have been produced from a particular perspective. This paper introduces a system called automatic generation of explanations (AGE) capable of generating causal, behavioral, and functional explanations of physical devices in natural language. AGE explanations can involve different user selected state variables at different abstraction levels. AGE uses a library of engineering components as building blocks. Each component is associated with a qualitative model, information about the meaning of state variables and their possible values, information about substances, and information about the different functions each component can perform. AGE uses: (i) a compositional modeling approach to construct large qualitative models, (ii) causal analysis to build a causal dependency graph, (iii) a novel qualitative simulation approach to efficiently obtain the system's behavior on large systems, and (iv) decomposition analysis to automatically divide large devices into smaller subsystems. The effectiveness of AGE is demonstrated with different devices that range from a simple water tank to an industrial chemical plant.

19.
Many real‐world visual tracking applications have a high dimensionality, i.e. the system state is defined by a large number of variables. This kind of problem can be modelled as a dynamic optimization problem, which involves dynamic variables whose values change in time. Most applied research on optimization methods has focused on static optimization problems, but these static methods often lack explicit adaptive methodologies. Heuristics are specific methods for solving problems in the absence of an algorithm for formal proof. Metaheuristics are approximate optimization methods which have been applied to more general problems with significant success. Particle filters, in turn, are Monte Carlo algorithms which solve the sequential estimation problem by approximating the theoretical distributions in the state space by simulated random measures called particles. However, particle filters lack efficient search strategies. In this paper, we propose a general framework to hybridize heuristics/metaheuristics with particle filters properly. The aim of this framework is to devise effective hybrid visual tracking algorithms naturally, guided by the use of abstraction techniques. The resulting algorithms exploit the benefits of both complementary approaches. As a particular example, a memetic algorithm particle filter is derived from the proposed hybridization framework. Finally, we show the performance of the memetic algorithm particle filter when it is applied to a multiple object tracking problem.
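The abstract above describes particle filters as Monte Carlo algorithms that approximate state distributions with simulated samples. As a point of reference, the following is a minimal sketch of one predict, weight, and resample step of a plain bootstrap particle filter for a one-dimensional state. It does not include the heuristic or memetic hybridization the paper proposes. The random-walk motion model, the Gaussian observation model, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, observation, process_std, obs_std, rng):
    """One bootstrap particle filter step for a 1-D random-walk state."""
    # Predict: move each particle with Gaussian process noise.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: unnormalized Gaussian likelihood of the observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the state: the mean of the particle set."""
    return sum(particles) / len(particles)


rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
for _ in range(10):
    # Hypothetical scenario: the sensor keeps reporting a target at 3.0.
    particles = particle_filter_step(particles, 3.0, 0.2, 0.5, rng)
```

The resampling step only reuses particles that already exist, which is one way to see the "lack of efficient search strategies" the abstract mentions: the filter cannot actively move particles toward better regions of the state space.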

20.
We present a model based approach to diagnosability analysis for interacting finite state systems where fault isolation is deferred until the system comes to a standstill. Local abstractions of the system model are used to alleviate the state space explosion. Pairs of closely coupled automata are merged and replaced by a single automaton with equivalent behavior as seen from the rest of the system; interaction between the merged automata is internalized and the new equivalent automaton is subsequently abstracted from internal behavior irrelevant to fault isolation. In moderately concurrent systems these steps can often be iterated until the system consists of a single automaton providing a compact encoding of all possible fault scenarios of the original model. We illustrate how the resulting abstraction can be used as a basis for post mortem diagnosability analysis.
