Similar literature
 18 similar documents found (search time: 109 ms)
1.
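Proposes the concept of state exploration density: by measuring how a state affects the agent's ability to explore the environment, the algorithm discovers subgoals for learning and constructs the corresponding Options. A reinforcement learning algorithm that creates Options with this algorithm effectively improves learning speed. The algorithm is task-independent and requires no prior knowledge, and the Options it constructs can be shared directly across different tasks in the same environment.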

2.
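A new hierarchical reinforcement learning method   Total citations: 1, self-citations: 0, citations by others: 1
沈晶, 顾国昌, 刘海波. 《计算机应用》 (Journal of Computer Applications), 2006, 26(8): 1938-1939
Proposes OMQ, a new hierarchical reinforcement learning method that integrates Option and MAXQ. The method takes MAXQ as its basic framework, using prior knowledge to decompose the task manually and to learn online, and integrates the Option method to decompose automatically those subtasks that are hard to subdivide in advance. The OMQ learning algorithm was simulated and compared on the taxi problem. The experimental results show that when the task environment is not fully known, OMQ is more suitable than either Option or MAXQ.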

3.
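An Option-oriented k-clustering Subgoal discovery algorithm   Total citations: 3, self-citations: 0, citations by others: 3
Automatically discovering useful Subgoals and creating Options during learning is important for improving the performance of reinforcement learning. Proposes an automatic Subgoal discovery algorithm based on k-clustering, which extracts Subgoals by clustering a small amount of path data acquired online. Experiments show that the algorithm effectively discovers all Subgoals that meet the requirements. Compared with Q-learning and with the diversity-density-based reinforcement learning algorithm, a reinforcement learning algorithm that uses it to discover Subgoals and create Options effectively improves the Agent's learning speed.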

4.
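A multi-agent-based automatic Option generation algorithm   Total citations: 2, self-citations: 0, citations by others: 2
At present, automatic task decomposition in hierarchical reinforcement learning always relies on serial learning algorithms based on a single agent. To address the slow learning speed of serial algorithms, a multi-agent-based automatic Option generation algorithm is proposed, with Sutton's Option hierarchical reinforcement learning method as the base framework. Multiple agents cooperate to explore the state space in parallel, aiNet is applied centrally to perform immune clustering and produce state subspaces, the internal policies on each subspace are then learned in parallel, and the Options are finally generated. The algorithm is presented, simulated, and analyzed on the task of shortest-path planning between two points in a two-dimensional grid space with obstacles. The results show that the multi-agent-based automatic Option generation algorithm is markedly faster than the single-agent-based algorithm.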

5.
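Existing reinforcement learning methods do not handle learning in dynamic environments well: when the environment changes, the optimal policy must be relearned, and if the interval between environment changes is shorter than the policy convergence time, the learning algorithm cannot converge. Building on the Option hierarchical reinforcement learning method, this paper proposes a hierarchical reinforcement learning method adapted to dynamic environments. Exploiting the hierarchical nature of learning, the method attends only to changes in the subgoal states of the hierarchical task and in the environment states inside the current Option, which confines policy updates to a small local space or a lower-dimensional high-level space and thereby speeds up learning. Simulation experiments on shortest-path planning between two points in a two-dimensional dynamic grid space show that the method learns policies markedly faster than previous methods and that the convergence of the learning algorithm depends less on how often the environment changes.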

6.
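For large-scale or complex stochastic dynamic programming systems, one can exploit their hierarchical structure or introduce hierarchical control, and use Hierarchical Reinforcement Learning (HRL) to address the "curse of dimensionality" and "modeling difficulty" problems. HRL is a sample-data-driven optimization method that can effectively accelerate policy learning through spatial/temporal abstraction mechanisms. Among HRL methods, the Option method decomposes the system's target task into multiple subgoal tasks to be learned and executed; with its clear hierarchical structure, it is one of the representative HRL methods. Traditional Option algorithms are built mainly on discrete-time Semi-Markov Decision Processes (SMDP) and the discounted performance criterion, and cannot be applied directly to continuous-time infinite-horizon tasks. Therefore, within the continuous-time SMDP framework and its performance potential theory, and drawing on the ideas of existing Option algorithms, this paper uses the relevant continuous-time SMDP learning formulas to build a unified continuous-time Option hierarchical reinforcement learning model applicable to either the average or the discounted performance criterion, and gives the corresponding online learning optimization algorithm. Finally, a robot garbage collection system is used as a simulation example to demonstrate the effectiveness of this HRL algorithm for continuous-time infinite-horizon optimal control problems, and to show that, compared with continuous-time simulated annealing Q-learning, it saves storage space and achieves higher optimization accuracy and faster optimization.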
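As a point of reference for the "traditional" setting this abstract contrasts itself with, the following is a minimal sketch of an Option and a discrete-time, discounted SMDP Q-learning update. It is not the continuous-time model proposed in the paper. The corridor environment, the names `Option`, `step`, `run_option`, and `smdp_q_update`, and all parameter values are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

State = int


@dataclass(frozen=True)
class Option:
    """An Option as a triple: initiation set, internal policy, termination condition."""
    name: str
    initiation_set: FrozenSet[State]   # states in which the option may be started
    policy: Callable[[State], int]     # maps a state to a primitive action (-1 or +1)
    beta: Callable[[State], float]     # probability of terminating in a given state


def step(state: State, action: int, n_states: int = 6) -> Tuple[State, float]:
    """Hypothetical corridor: move left or right; reward 1.0 on reaching the last state."""
    nxt = min(max(state + action, 0), n_states - 1)
    return nxt, (1.0 if nxt == n_states - 1 else 0.0)


def run_option(option: Option, state: State, gamma: float,
               rng: random.Random) -> Tuple[State, float, int]:
    """Follow the option's policy until it terminates.

    Returns the final state, the discounted reward accumulated on the way,
    and the number of primitive steps k the option lasted.
    """
    total, discount, k = 0.0, 1.0, 0
    while True:
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        k += 1
        if rng.random() < option.beta(state):
            return state, total, k


def smdp_q_update(Q: Dict[Tuple[State, str], float], s: State, o: Option,
                  r: float, s_next: State, k: int, options: List[Option],
                  alpha: float, gamma: float) -> None:
    """Q(s,o) <- Q(s,o) + alpha * (r + gamma**k * max_o' Q(s',o') - Q(s,o))."""
    best_next = max((Q.get((s_next, o2.name), 0.0)
                     for o2 in options if s_next in o2.initiation_set),
                    default=0.0)
    key = (s, o.name)
    old = Q.get(key, 0.0)
    Q[key] = old + alpha * (r + gamma ** k * best_next - old)


# Usage: one option that walks right and terminates at the goal state 5.
go_right = Option("go_right", frozenset(range(5)),
                  policy=lambda s: +1,
                  beta=lambda s: 1.0 if s == 5 else 0.0)
Q: Dict[Tuple[State, str], float] = {}
s_end, ret, k = run_option(go_right, 2, gamma=0.9, rng=random.Random(0))
smdp_q_update(Q, 2, go_right, ret, s_end, k, [go_right], alpha=0.5, gamma=0.9)
```

The discount factor is raised to the option's duration k, which is what distinguishes the SMDP update from one-step Q-learning; a continuous-time variant would replace that term with a discount over the elapsed sojourn time.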

7.
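Reported on December 16: CA announced the delivery of Unicenter TNG Microsoft (MS) Exchange Option, ARCserve, and InocuLAN with support for Microsoft Exchange 5.5. CA's Unicenter TNG Microsoft (MS) Exchange Option is an extension of its enterprise management solution. By automating the functions needed to keep Microsoft Exchange servers running normally, it improves the reliability and availability of Microsoft Exchange 5.5 environments.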

8.
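Note: the question numbers listed in this article are the corresponding numbers on the written examination paper. I. Multiple-choice questions. (13) Two option buttons named Option1 and Option2, captioned "宋体" (SimSun) and "黑体" (SimHei), are drawn on a form, together with a check box named Check1 captioned "粗体" (Bold) and a text box named Text1 whose Text property is "改变文字字体" (Change the text font). When the program runs, the "宋体" option button and the "粗体" check box must be selected (the form's appearance is shown in the figure below). The statement sequence that meets this requirement is: A) Option1.Value=True, Check1.Value=False; B) Option1.Value=True, Check1.Value=True; C) Option2.Value=False, Check1.Value=True; D) Option1.Value=True, Check1.Value=1. Correct answer: D) 试…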

9.
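To speed up the automatic generation of task hierarchies in hierarchical reinforcement learning, proposes a parallel automatic hierarchy-construction method based on a multi-agent system. With the Option hierarchical reinforcement learning method proposed by Sutton as the theoretical framework, multiple agents first cooperate to explore the state space in parallel and cluster the results centrally to produce state subspaces; the agents then learn in parallel to generate the internal policies on each subspace, and the Options are finally generated. The algorithm is presented, simulated, and analyzed on the task of shortest-path planning between two points in a two-dimensional grid space with obstacles. The results show that the parallel method generates the task hierarchy markedly faster than previous serial automatic methods. The method is suited to problem domains such as space exploration, path planning, and pursuit-evasion.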

10.
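Traditional glass hydrometers cannot meet online requirements for data processing and remote measurement. The new density measurement systems developed in recent years in China and abroad (such as radioisotope density meters, ultrasonic density meters, and vibrating density meters) are expensive and demanding to use, which makes them hard to deploy widely in industrial production. We therefore explored the use of resistance strain gauge sensors for measuring the density of media. Practice has shown that a sinker-method density sensor offers advantages in measurement accuracy, resolution, and adaptability to harsh environments. This paper introduces the relevant issues.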

11.
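Research on dynamic hierarchy methods in hierarchical reinforcement learning   Total citations: 1, self-citations: 0, citations by others: 1
Existing automatic hierarchy methods in hierarchical reinforcement learning all generate the hierarchy in one pass after exploring the state space to some degree. Insufficient exploration cannot guarantee solution quality, while excessive exploration slows learning. To overcome this strong dependence of learning performance on the degree of state-space exploration, this paper proposes a dynamic hierarchy method. The method incorporates immune clustering and a secondary response mechanism into the Option hierarchical reinforcement learning framework proposed by Sutton; it can adjust the Option state spaces dynamically and generate the Options' internal policies dynamically along the learning trajectory. Simulation experiments were carried out on the learning task of shortest-path planning between two points in a two-dimensional grid space with obstacles. The results show that the dynamic hierarchy method depends very little on the degree of state-space exploration and is better suited to large-scale reinforcement learning problems.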

12.
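Economic load dispatch of power systems based on an improved genetic algorithm   Total citations: 6, self-citations: 1, citations by others: 6
For the economic load dispatch problem in power systems, analyzes the differing strengths and characteristics of genetic algorithms and traditional mathematical optimization methods, and proposes an improved genetic algorithm for solving the problem. Maximum entropy theory is used to convert the economic load dispatch problem into a differentiable one, and the BFGS method is introduced into the genetic algorithm as a BFGS operator to improve the algorithm's search speed and local search capability. In addition, a simplex crossover operator is applied to guide the population step by step toward the optimum, so that the algorithm finds the optimum quickly. Case study results verify the effectiveness of the proposed method.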

13.
When designing with young children, designers usually select user centred design methods based on the children's required level of engagement and the inspiration the designer expects the method to create. User centred design methods should be selected for their suitability for children and for the quality of the output of the design method. To understand the suitability of design methods, a framework was developed to describe design methods in terms of required design skills as identified by the Theory of Multiple Intelligences. The proposed framework could provide the basis for a tool to compare design methods and to generate hypotheses about what design method would work optimally with children in a specific school grade. The initial examination of the viability of the framework is a comparison of design methods by the number of skills involved; earlier work showed that the involvement of more skills (as with, e.g., low-fi prototyping) could result in more options for a design problem than the involvement of fewer skills (as with, e.g., brainstorming). Options and Criteria were counted to understand the quality of the method in terms of the amount of design information. The results of the current paper indicate that 8-to-10-year-old children generate significantly more options in prototyping sessions than when they are involved in sessions applying a Nominal Group Technique. The paper indicates that (a) with the framework we can generate hypotheses to compare design methods with children and (b) the outcome of various design methods, which might lead to very different representations, can be compared in terms of Options and Criteria. Further usage of the framework is expected to result in empirical support for selecting a design method to be applied with young children.

14.
Emergency department (ED) crowding is a common challenge for hospitals across the globe. The efficiency and effectiveness of ED services can be improved by identifying the factors causing ED crowding and by modeling the prediction of ED crowding. ED crowding involves complex dynamics of intertwined processes and workflows among the different departments within a hospital; thus, the problem cannot be tackled by examining the ED alone. It is important to build a model which can identify the factors causing ED crowding and validate the coping strategies of hospitals. This study proposes an intelligence model which first introduces the well-known decision tree method to fit an accommodated nonlinear association and obtain intelligent grading rules of ED crowding; it then integrates the intelligent grading rules and indexes of coping strategies to construct a hierarchical linear model. The results simultaneously solve the traditional modeling issues of high correlation among independent variables and non-convergence. The model also provides a better illustration of ED crowding phenomena with more accurate model fitting, as well as a clear linkage between coping strategies and the factors causing ED crowding. Furthermore, the proposed model can give a better understanding of the nature of the problem and guide better bed management for decision makers. It can also detect intelligently whether hospitals have drawn up active or passive bed management strategies to cope with ED crowding.

15.
Knowledge-based organization evaluation   Total citations: 1, self-citations: 0, citations by others: 1
Knowledge has become the main value driver for modern organizations. In particular, knowledge-based organizations (KBOs) allocate resources to intangible assets (e.g., R&D) in the rapidly changing and highly competitive business environment in order to gain competitive advantages. Therefore, how to evaluate knowledge-based organizations has become one of the most important issues in knowledge management. The purpose of this paper is to provide a framework for the evaluation of KBOs under uncertainty, using the state-of-the-art methodology of Real Options. We define the unique features of KBOs and explain their value drivers. The present study's contribution is threefold: (1) it bridges the gaps in knowledge management literature related to evaluating knowledge capital; (2) it provides a systematic application of Real Options models in the context of knowledge-based organization evaluation; and (3) it uses a real-world case to demonstrate the implications of the main findings for management.

16.
The growing costs of fuel and operation of power generating units warrant improvement of optimization methodologies for economic dispatch (ED) problems. Practical ED problems have non-convex objective functions with equality and inequality constraints, which make it much harder to find the global optimum using any mathematical algorithm. Modern optimization algorithms are often meta-heuristic, and they are very promising in solving nonlinear programming problems. This paper presents a novel approach to determining the feasible optimal solution of ED problems using the recently developed Firefly Algorithm (FA). Many nonlinear characteristics of power generators and their operational constraints, such as generation limitations, prohibited operating zones, ramp rate limits, transmission loss, and nonlinear cost functions, are all taken into account for practical operation. To demonstrate the efficiency and applicability of the proposed method, we study four ED test systems with non-convex solution spaces and compare the results with some of the most recently published ED solution methods. The results of this study show that the proposed FA is able to find more economical loads than those determined by other methods. This algorithm is considered to be a promising alternative for solving ED problems in practical power systems.

17.
This paper presents a method to solve the economic dispatch (ED) problem for thermal unit systems involving combined cycle (CC) units. The ED problem finds the optimal generation of each unit in order to minimize the total generation cost while satisfying the total demand and generating-capacity constraints. A CC unit presents multiple configurations or states, each state having its own unique cost curve. Therefore, in performing ED, we need to be able to shift between these cost curves. Moreover, the cost curve is not convex for some of these states. Hence, ED becomes a non-convex optimization problem, which is difficult to solve by conventional methods. In this paper we present a new technique, developed to find the global solution, that is based on the calculation of the infimal convolution. The paper includes the results for a test case, and we compare our solution with other techniques.
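To make the ED problem statement concrete, here is a minimal sketch of the conventional convex case only: every unit is assumed to have a convex quadratic cost curve, and the dispatch is found by bisection on the common marginal cost (often called lambda iteration). It is not the infimal-convolution technique of the paper and does not handle the non-convex CC-unit curves the abstract describes. The `Unit` class, the cost coefficients, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Unit:
    """Hypothetical thermal unit with cost C(P) = a + b*P + c*P**2 and output limits."""
    a: float
    b: float
    c: float       # assumed > 0 so that the cost curve is convex
    p_min: float
    p_max: float

    def output_at(self, lam: float) -> float:
        """Output at which marginal cost b + 2*c*P equals lam, clipped to the limits."""
        unclipped = (lam - self.b) / (2.0 * self.c)
        return min(max(unclipped, self.p_min), self.p_max)


def economic_dispatch(units: List[Unit], demand: float,
                      tol: float = 1e-6) -> Tuple[float, List[float]]:
    """Bisect on the marginal cost lam until total output matches demand."""
    if not sum(u.p_min for u in units) <= demand <= sum(u.p_max for u in units):
        raise ValueError("demand is outside the combined capacity limits")
    # Total output is non-decreasing in lam, so these bounds bracket the solution.
    lo = min(u.b + 2.0 * u.c * u.p_min for u in units)
    hi = max(u.b + 2.0 * u.c * u.p_max for u in units)
    lam, outputs = lo, [u.p_min for u in units]
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        outputs = [u.output_at(lam) for u in units]
        mismatch = sum(outputs) - demand
        if abs(mismatch) < tol:
            break
        if mismatch > 0:
            hi = lam   # generating too much: lower the marginal cost
        else:
            lo = lam   # generating too little: raise the marginal cost
    return lam, outputs


# Usage with two made-up units and a demand of 100 (units of power are arbitrary).
units = [Unit(0.0, 2.0, 0.01, 0.0, 100.0), Unit(0.0, 3.0, 0.02, 0.0, 100.0)]
lam, outputs = economic_dispatch(units, 100.0)
```

Bisection works here only because each convex cost curve gives a monotone marginal cost; once a unit's curve is non-convex, the equal-marginal-cost condition no longer identifies the global optimum, which is the difficulty the abstract points to.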

18.
Real Options Theory is often applied when evaluating IT investments. The application of Real Options Theory is generally accompanied by a monetary valuation of real options through option pricing models, which in turn are based on restrictive assumptions and are thus subject to criticism. Therefore, this paper analyzes the application of option pricing models to the valuation of IT investments. A structured literature review reveals the types of IT investments which are valued with Real Options Theory in scientific literature. These types of IT investments are further investigated and their main characteristics are compared to the restrictive assumptions of traditional option pricing models. This analysis serves as a basis for further discussion on how the identified papers address these assumptions. The results show that a great many papers do not account for critical assumptions, although it is known that the assumptions are not fulfilled. Moreover, the type of IT investment determines the criticality of the assumptions. Additionally, several extensions or adaptations of traditional option pricing models can be found which make it possible to relax critical assumptions. Researchers can profit from the results derived in this paper in two ways: first, it is demonstrated which assumptions can be critical for which type of IT investment; second, extensions of option pricing models that relax critical assumptions are introduced.
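The abstract does not name a specific pricing model. As an illustration of what a "traditional option pricing model" with restrictive assumptions looks like, the sketch below uses the Black-Scholes formula for a European call, which is an assumption on our part and not taken from the paper. In a real-options reading, `spot` would stand for the present value of the project's expected cash flows and `strike` for the investment cost; that mapping and the function names are likewise illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       vol: float, maturity: float) -> float:
    """Black-Scholes value of a European call option.

    Relies on assumptions such as constant volatility and interest rate,
    a lognormally distributed underlying, and exercise only at maturity.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


# Usage: underlying worth 100, exercise cost 100, 5% rate, 20% volatility, 1 year.
value = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
```

Each input encodes one of the assumptions such models make, for example a single constant volatility and a fixed exercise date, which is the kind of assumption the abstract says may not hold for IT investments.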
