Similar Documents
20 similar documents found (search time: 15 ms)
1.
To achieve autonomous flight of multiple UAVs in complex mission environments, this paper adopts an improved reinforcement learning algorithm to design an intelligent multi-UAV path-planning strategy with collision and obstacle avoidance. By improving the search strategy, introducing a neural network for function approximation, and constructing a well-designed immediate reward function, the method increases the flexibility of the algorithm and reduces the computational burden on the UAVs, so that multiple UAVs can account for random factors such as wind speed, as well as static and dynamic threats, in a complex mission environment and autonomously plan safe, feasible paths from their initial positions to designated target points. To explore the feasibility of the proposed algorithm in actual flight, quadrotor UAVs are used as the experimental platform, and the feasibility and effectiveness of the algorithm are verified in a ROS-based simulation environment.
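The abstract describes the improved algorithm only at a high level. As a rough, hedged illustration of the underlying idea — an immediate-reward function that penalizes collisions inside an ordinary Q-learning loop — the sketch below uses a toy 2-D grid; the grid size, obstacle set, reward values, and hyperparameters are invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch: Q-learning path planning on a grid with static obstacles.
# GRID, OBSTACLES, GOAL and all reward/hyperparameter values are illustrative
# assumptions, not the paper's design.
import random

GRID = 10
OBSTACLES = {(3, 4), (5, 5), (6, 2)}
GOAL = (9, 9)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def reward(state):
    """Immediate reward: large penalty on collision, bonus at goal, small step cost."""
    if state in OBSTACLES:
        return -100.0
    if state == GOAL:
        return 100.0
    return -1.0

def step(state, action):
    x, y = state
    dx, dy = action
    nxt = (min(max(x + dx, 0), GRID - 1), min(max(y + dy, 0), GRID - 1))
    return nxt, reward(nxt)

Q = {}                                  # Q[(state, action)] -> value
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(2000):
    s = (0, 0)
    for _ in range(200):
        a = random.choice(ACTIONS) if random.random() < eps else \
            max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))
        s2, r = step(s, a)
        best_next = max(Q.get((s2, a2), 0.0) for a2 in ACTIONS)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
        s = s2
        if s == GOAL or s in OBSTACLES:
            break

print("greedy first action from (0, 0):",
      max(ACTIONS, key=lambda a: Q.get(((0, 0), a), 0.0)))
```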

2.
ERP implementations remain problematic despite the fact that many of the issues are by now quite well known. In this paper, we take a different perspective from the critical success factors and risks approaches that are common in the information systems discipline to explain why ERP implementations fail. Specifically, we adapt Sitkin's theory of intelligent failure to ERP implementations resulting in a theory that we call learning from failure. We then examine from the viewpoint of this theory the details of two SAP R/3 implementations, one of which failed while the other succeeded. Although it is impossible to state, unequivocally, that the implementation that failed did so because it did not use the approach that was derived from the theory, the analysis reveals that the company that followed many of the tenets of the theory succeeded while the other did not.

3.
Lundqvist, Thomas; Stenström, Per. Real-Time Systems, 1999, 17(2-3): 183-207
Previously published methods for estimation of the worst-case execution time on high-performance processors with complex pipelines and multi-level memory hierarchies result in overestimations owing to insufficient path and/or timing analysis. This does not only give rise to poor utilization of processing resources but also reduces the schedulability in real-time systems. This paper presents a method that integrates path and timing analysis to accurately predict the worst-case execution time for real-time programs on high-performance processors. The unique feature of the method is that it extends cycle-level architectural simulation techniques to enable symbolic execution with unknown input data values; it uses alternative instruction semantics to handle unknown operands. We show that the method can exclude many infeasible (or non-executable) program paths and can calculate path information, such as bounds on number of loop iterations, without the need for manual annotations of programs. Moreover, the method is shown to accurately analyze timing properties of complex features in high-performance processors using multiple-issue pipelines and instruction and data caches. The combined path and timing analysis capability is shown to derive exact estimates of the worst-case execution time for six out of seven programs in our benchmark suite.
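As a rough illustration of the "alternative instruction semantics" idea mentioned above — operands whose values depend on program input are represented by a special unknown marker that propagates through the simulation — the hedged sketch below uses an invented toy instruction set; it is not the paper's method.

```python
# Hedged sketch: unknown-value propagation during symbolic simulation.
# The mini instruction set and its semantics are illustrative assumptions.
UNKNOWN = object()

def alu(op, a, b):
    """Evaluate an ALU operation; any unknown operand makes the result unknown."""
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return {"add": a + b, "sub": a - b, "mul": a * b}[op]

def branch_targets(cond):
    """An unknown condition forces the analysis to follow both outcomes."""
    if cond is UNKNOWN:
        return [True, False]        # both paths must be simulated; worst case kept
    return [bool(cond)]

# Example: r1 comes from program input (unknown), r2 is a known constant.
r1, r2 = UNKNOWN, 4
r3 = alu("add", r1, r2)                       # unknown + known -> unknown
print(r3 is UNKNOWN)                          # True
print(branch_targets(alu("sub", r2, r2)))     # known condition 0 -> [False]
print(branch_targets(r3))                     # unknown condition -> [True, False]
```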

4.
Reinforcement Learning (RL) is learning through direct experimentation. It does not assume the existence of a teacher that provides examples upon which learning of a task takes place. Instead, in RL experience is the only teacher. With historical roots in the study of biological conditioned reflexes, RL attracts the interest of Engineers and Computer Scientists because of its theoretical relevance and potential applications in fields as diverse as Operational Research and Intelligent Robotics. Computationally, RL is intended to operate in a learning environment composed of two subjects: the learner and a dynamic process. At successive time steps, the learner makes an observation of the process state, selects an action and applies it back to the process. Its goal is to find an action policy that controls the behavior of the dynamic process, guided by signals (reinforcements) that indicate how badly or well it has been performing the required task. These signals are usually associated with a dramatic condition – e.g., accomplishment of a subtask (reward) or complete failure (punishment) – and the learner tries to optimize its behavior by using a performance measure (a function of the received reinforcements). The crucial point is that in order to do that, the learner must evaluate the conditions (associations between observed states and chosen actions) that led to rewards or punishments. Starting from basic concepts, this tutorial presents the many flavors of RL algorithms, develops the corresponding mathematical tools, assesses their practical limitations and discusses alternatives that have been proposed for applying RL to realistic tasks.
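As a rough illustration of the interaction loop described above (observe the process state, select an action, apply it, receive a reinforcement signal), the hedged sketch below shows the generic structure; the toy process, the fixed policy, and the threshold parameter are placeholder assumptions, not part of the tutorial.

```python
# Minimal sketch of the learner/process interaction loop described in the abstract.
# The environment and policy are illustrative placeholders.
import random

class DynamicProcess:
    """Toy process: the state drifts; the reinforcement is higher near zero."""
    def __init__(self):
        self.state = random.uniform(-5, 5)
    def apply(self, action):                   # action in {-1, 0, +1}
        self.state += action + random.gauss(0, 0.1)
        return self.state, -abs(self.state)    # next observation, reinforcement

def policy(state, theta):
    """Simple parameterized policy: push the state toward zero."""
    return -1 if state > theta else (1 if state < -theta else 0)

proc, theta = DynamicProcess(), 0.5
total = 0.0
for t in range(100):                           # successive time steps
    obs = proc.state                           # learner observes the process state
    action = policy(obs, theta)                # selects an action
    obs, reinforcement = proc.apply(action)    # applies it, receives a signal
    total += reinforcement                     # performance measure: sum of reinforcements
print("cumulative reinforcement:", round(total, 2))
```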

5.
Learning, interaction, and their combination are essential capabilities for building robust, autonomous agents. Reinforcement learning is an important part of agent learning, and agent reinforcement learning includes both single-agent and multi-agent reinforcement learning. This article presents a comparative study of single-agent and multi-agent reinforcement learning, contrasting them in terms of basic concepts, environment frameworks, learning goals, and learning algorithms; it points out their differences and connections and discusses some open problems they face.

6.
Personalized and intelligent instruction is a key focus and difficulty in research on intelligent tutoring systems. This article uses intelligent-agent technology to model the intelligence and behavior of the students in the system, applies reinforcement learning theory to the multi-agent setting, and designs a reinforcement learning algorithm incorporating eligibility traces, which is used to generate and adjust teaching content and teaching strategies suited to each individual student. The multi-agent technique realizes personalized instruction, while the reinforcement learning algorithm makes the teaching strategy intelligent. Experimental results show that the new algorithm is more effective than the original one.
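The abstract names eligibility traces but gives no algorithmic detail. The hedged sketch below shows a generic tabular Sarsa(lambda) update with eligibility traces, the general technique referred to; the toy environment, state/action sets, and parameters are illustrative assumptions, not the paper's tutoring-system design.

```python
# Hedged sketch of tabular Sarsa(lambda) with eligibility traces.
from collections import defaultdict
import random

states, actions = range(5), range(3)
Q = defaultdict(float)        # Q[(s, a)] action values
E = defaultdict(float)        # e[(s, a)] eligibility traces
alpha, gamma, lam, eps = 0.1, 0.9, 0.8, 0.1

def choose(s):
    if random.random() < eps:
        return random.choice(list(actions))
    return max(actions, key=lambda a: Q[(s, a)])

def env_step(s, a):
    """Toy environment: reward 1 when the action matches the state modulo 3."""
    r = 1.0 if a == s % 3 else 0.0
    return random.choice(list(states)), r

s, a = random.choice(list(states)), 0
for t in range(1000):
    s2, r = env_step(s, a)
    a2 = choose(s2)
    delta = r + gamma * Q[(s2, a2)] - Q[(s, a)]
    E[(s, a)] += 1.0                          # accumulate trace for the visited pair
    for key in list(E):
        Q[key] += alpha * delta * E[key]      # credit recently visited state-action pairs
        E[key] *= gamma * lam                 # decay all traces
    s, a = s2, a2
```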

7.
This paper briefly describes the significance, goals, and implementation of research on Personifying Intelligent Decision-making for Vessel Collision Avoidance (PIDVCA), proposes the key techniques for building a dynamic collision-avoidance knowledge base through machine learning and the machine learning mechanism of PIDVCA theory, and, using collision-avoidance simulation examples, focuses on the integrated machine learning strategy of PIDVCA theory and the mechanism for acquiring dynamic collision-avoidance knowledge.

8.
Long-Ji Lin. Machine Learning, 1992, 8(3-4): 293-321
To date, reinforcement learning has mostly been studied solving simple learning tasks. Reinforcement learning methods that have been studied so far typically converge slowly. The purpose of this work is thus two-fold: 1) to investigate the utility of reinforcement learning in solving much more complicated learning tasks than previously studied, and 2) to investigate methods that will speed up reinforcement learning. This paper compares eight reinforcement learning frameworks: adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning. The three extensions are experience replay, learning action models for planning, and teaching. The frameworks were investigated using connectionism as an approach to generalization. To evaluate the performance of different frameworks, a dynamic environment was used as a testbed. The environment is moderately complex and nondeterministic. This paper describes these frameworks and algorithms in detail and presents empirical evaluation of the frameworks.
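As a hedged illustration of the experience-replay extension mentioned in the abstract — store transitions and replay random minibatches to re-use past experience — the sketch below uses a plain tabular Q-update; buffer size, batch size, and the toy usage data are illustrative assumptions, not the paper's connectionist setup.

```python
# Hedged sketch of experience replay with a tabular Q-learning update.
import random
from collections import deque

replay_buffer = deque(maxlen=10000)     # stores (state, action, reward, next_state)

def remember(s, a, r, s2):
    replay_buffer.append((s, a, r, s2))

def replay(Q, actions, alpha=0.1, gamma=0.9, batch_size=32):
    """Re-apply the Q-learning update on a random sample of stored experiences."""
    if len(replay_buffer) < batch_size:
        return
    for s, a, r, s2 in random.sample(replay_buffer, batch_size):
        best_next = max(Q.get((s2, a2), 0.0) for a2 in actions)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))

# Illustrative usage: record some dummy transitions, then learn from the replayed batch.
Q, actions = {}, [0, 1]
for i in range(100):
    remember(i % 5, random.choice(actions), random.random(), (i + 1) % 5)
replay(Q, actions)
print(len(Q), "state-action values updated from replayed experience")
```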

9.
With the continuous development of the smart grid, the number of substations keeps increasing. To address the heavy inspection workload in substations and the low level of visualization in manual inspection, this paper proposes a path-planning method for substation inspection robots based on improved deep reinforcement learning. The action and state spaces of the deep reinforcement learning problem are designed according to the motion model of the inspection robot. The deep reinforcement learning network is combined with an artificial potential field to reconstruct the reward function, and the convolutional neural network structure is optimized. Validation in a real substation scenario shows that the proposed improved deep reinforcement learning algorithm requires less computation time and is more efficient than traditional algorithms, which supports more precise planning of the inspection paths of substation inspection robots and raises the level of automation of substations.
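The abstract says the reward function is reconstructed by combining deep reinforcement learning with an artificial potential field. The hedged sketch below shows the general technique (an attractive potential toward the goal plus a repulsive potential near obstacles used as a shaping reward); the gains, influence radius, and example coordinates are illustrative assumptions, not the paper's design.

```python
# Hedged sketch of reward shaping with an artificial potential field.
import math

K_ATT, K_REP, D0 = 1.0, 5.0, 2.0    # attractive gain, repulsive gain, influence radius

def potential(pos, goal, obstacles):
    d_goal = math.dist(pos, goal)
    u = 0.5 * K_ATT * d_goal ** 2                       # attractive term toward the goal
    for obs in obstacles:
        d = math.dist(pos, obs)
        if 0 < d < D0:                                   # repulsive term inside the radius
            u += 0.5 * K_REP * (1.0 / d - 1.0 / D0) ** 2
    return u

def shaped_reward(pos, next_pos, goal, obstacles):
    """Reward the decrease in potential caused by a step (potential-based shaping)."""
    return potential(pos, goal, obstacles) - potential(next_pos, goal, obstacles)

# Example: a step toward the goal that stays clear of the obstacle earns a positive reward.
print(shaped_reward((0, 0), (1, 0), goal=(5, 0), obstacles=[(2, 2)]))
```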

10.
To cope with the issue of "brain drain" in today's competitive industrial environment, it is important to capture relevant experience and knowledge in order to sustain the continual growth of company business. In this respect, the study in the domain of knowledge learning is of paramount importance in terms of capturing and reuse of tacit and explicit knowledge. To support the process of knowledge learning, a methodology to establish an intelligent system, which consists of both On-Line Analytical Processing (OLAP) and fuzzy logic principles, is suggested. This paper attempts to propose this approach for integrating OLAP and fuzzy logic to form an intelligent system, capitalizing on the merits and at the same time offsetting the drawbacks of the involved technologies. In this system, the values and positions of related fuzzy sets are modified to suit the industrial environment, supporting smoother operation with less error. To validate the feasibility of the proposed system, a case study related to the monitoring of chemical concentration of the PCB electroplating process is covered in the paper.

11.
Machine Learning for Intelligent Processing of Printed Documents (total citations: 1; self-citations: 0; citations by others: 1)
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information capture process is based on knowledge of the specific layout and logical structures of the documents. This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals. Knowledge is represented by means of decision trees and first-order rules automatically generated from a set of training documents. In particular, an incremental decision tree learning system is applied for the acquisition of decision trees used for the classification of segmented blocks, while a first-order learning system is applied for the induction of rules used for the layout-based classification and understanding of documents. Issues concerning the incremental induction of decision trees and the handling of both numeric and symbolic data in first-order rule learning are discussed, and the validity of the proposed solutions is empirically evaluated by processing a set of real printed documents.

12.
To create autonomous robots, both hardware and software are needed. If enormous progress has already been made in the field of equipment, then robot software depends on the development of artificial intelligence. This article proposes a solution for creating "logical" brains for autonomous robots, namely, an approach for creating an intelligent robot action planner based on Mivar expert systems. The application of this approach provides opportunities to reduce the computational complexity of solving planning problems and the requirements for the computational characteristics of hardware platforms on which intelligent planning systems are deployed. To theoretically and practically justify the expediency of using logically solving systems, in particular Mivar expert systems, to create intelligent planners, the MIPRA (Mivar-based Intelligent Planning of Robot Actions) planner was created to solve problems such as STRIPS for permutation of cubes in the Blocks World domain. The planner is based on the platform for creating expert systems of the Razumator. As a result, the Mivar planner can process information about the state of the subject area based on the analysis of cause-effect relationships and an algorithm for automatically constructing logical inference (finding a solution from "Given" to "Find"). Moreover, an important feature of the MIPRA is that the system is built on the principles of a "white box", due to which the system can explain any of its decisions and provide justification for the actions performed in the form of a retrospective of the stages of the decision-making process. When preparing a set of robot actions aimed at changing control objects, expert knowledge is used, which is the basis for the functioning algorithms of the planner. This approach makes it possible to include an expert in the process of organizing the work of the intelligent planner and use existing knowledge about the subject area. Practical experiments of this study have shown that instead of many hours and powerful multiprocessor servers, the MIPRA on a personal computer solves the planning problems with the following number of cubes: 10 cubes can be rearranged in 0.028 seconds, 100 cubes in 0.938 seconds, and 1,000 cubes in 84.188 seconds. The results of this study can be used to reduce the computational complexity of solving tasks of planning the actions of robots, as well as their groups, multilevel heterogeneous robotic systems, and cyber-physical systems of various bases and purposes. Practical demonstration of MIPRA: https://mivar.org/en/about/contacts/

13.
In this paper, we investigate Reinforcement learning (RL) in multi-agent systems (MAS) from an evolutionary dynamical perspective. Typical for a MAS is that the environment is not stationary and the Markov property is not valid. This requires agents to be adaptive. RL is a natural approach to model the learning of individual agents. These learning algorithms are, however, known to be sensitive to the correct choice of parameter settings for single agent systems. This issue is more prevalent in the MAS case due to the changing interactions amongst the agents. It is largely an open question for a developer of MAS how to design the individual agents such that, through learning, the agents as a collective arrive at good solutions. We will show that modeling RL in MAS, by taking an evolutionary game theoretic point of view, is a new and potentially successful way to guide learning agents to the most suitable solution for their task at hand. We show how evolutionary dynamics (ED) from Evolutionary Game Theory can help the developer of a MAS in good choices of parameter settings of the used RL algorithms. The ED essentially predict the equilibrium outcomes of the MAS where the agents use individual RL algorithms. More specifically, we show how the ED predict the learning trajectories of Q-Learners for iterated games. Moreover, we apply our results to (an extension of) the COllective INtelligence framework (COIN). COIN is a proved engineering approach for learning of cooperative tasks in MASs. The utilities of the agents are re-engineered to contribute to the global utility. We show how the improved results for MAS RL in COIN, and a developed extension, are predicted by the ED. Author funded by a doctoral grant of the institute for advancement of scientific technological research in Flanders (IWT).
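As a hedged illustration of the evolutionary-dynamics idea referred to above — iterating the replicator dynamics of a 2x2 game to predict the equilibrium that independent learners tend to settle on — the sketch below uses a Prisoner's Dilemma payoff matrix and an Euler step size chosen for illustration; it is not the paper's COIN experiments.

```python
# Hedged sketch: two-population replicator dynamics for a symmetric 2x2 game.
# Row player's payoff matrix A[i][j]: strategies 0 = cooperate, 1 = defect.
A = [[3.0, 0.0],
     [5.0, 1.0]]

def replicator_step(x, y, dt=0.01):
    """One Euler step of the replicator dynamics for both players' mixed strategies."""
    def update(p, opponent):
        fitness = [sum(A[i][j] * opponent[j] for j in range(2)) for i in range(2)]
        avg = sum(p[i] * fitness[i] for i in range(2))
        return [p[i] + dt * p[i] * (fitness[i] - avg) for i in range(2)]
    return update(x, y), update(y, x)

x, y = [0.6, 0.4], [0.5, 0.5]      # initial mixed strategies of the two learners
for _ in range(20000):
    x, y = replicator_step(x, y)
print([round(v, 3) for v in x], [round(v, 3) for v in y])  # both converge toward defection
```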

14.
Agents in a competitive interaction can greatly benefit from adapting to a particular adversary, rather than using the same general strategy against all opponents. One method of such adaptation is Opponent Modeling, in which a model of an opponent is acquired and utilized as part of the agent's decision procedure in future interactions with this opponent. However, acquiring an accurate model of a complex opponent strategy may be computationally infeasible. In addition, if the learned model is not accurate, then using it to predict the opponent's actions may potentially harm the agent's strategy rather than improving it. We thus define the concept of opponent weakness, and present a method for learning a model of this simpler concept. We analyze examples of past behavior of an opponent in a particular domain, judging its actions using a trusted judge. We then infer a weakness model based on the opponent's actions relative to the domain state, and incorporate this model into our agent's decision procedure. We also make use of a similar self-weakness model, allowing the agent to prefer states in which the opponent is weak and our agent strong, where we have a relative advantage over the opponent. Experimental results spanning two different test domains demonstrate the agent's improved performance when making use of the weakness models.

15.
Kazakov, Dimitar; Manandhar, Suresh. Machine Learning, 2001, 43(1-2): 121-162
This article presents a combination of unsupervised and supervised learning techniques for the generation of word segmentation rules from a raw list of words. First, a language bias for word segmentation is introduced and a simple genetic algorithm is used in the search for a segmentation that corresponds to the best bias value. In the second phase, the words segmented by the genetic algorithm are used as an input for the first order decision list learner CLOG. The result is a set of first order rules which can be used for segmentation of unseen words. When applied on either the training data or unseen data, these rules produce segmentations which are linguistically meaningful, and to a large degree conforming to the annotation provided.

16.
Tecuci, Gheorghe. Machine Learning, 1993, 11(2-3): 237-261
This article describes a framework for the deep and dynamic integration of learning strategies. The framework is based on the idea that each single-strategy learning method is ultimately the result of certain elementary inferences (like deduction, analogy, abduction, generalization, specialization, abstraction, concretion, etc.). Consequently, instead of integrating learning strategies at a macro level, we propose to integrate the different inference types that generate individual learning strategies. The article presents a concept-learning and theory-revision method that was developed in this framework. It allows the system to learn from one or from several (positive and/or negative) examples, and to both generalize and specialize its knowledge base. The method integrates deeply and dynamically different learning strategies, depending on the relationship between the input information and the knowledge base. It also behaves as a single-strategy learning method whenever the applicability conditions of such a method are satisfied.

17.
While people compare images using semantic concepts, computers compare images using low-level visual features that sometimes have little to do with these semantics. To reduce the gap between the high-level semantics of visual objects and the low-level features extracted from them, in this paper we develop a framework of learning pseudo metrics (LPM) using neural networks for semantic image classification and retrieval. Performance analysis and comparative studies, by experimenting on an image database, show that the LPM has potential application to multimedia information processing.

18.
The Multi-Agent Distributed Goal Satisfaction (MADGS) system facilitates distributed mission planning and execution in complex dynamic environments with a focus on distributed goal planning and satisfaction and mixed-initiative interactions with the human user. By understanding the fundamental technical challenges faced by our commanders on and off the battlefield, we can help ease the burden of decision-making. MADGS lays the foundations for retrieving, analyzing, synthesizing, and disseminating information to commanders. In this paper, we present an overview of the MADGS architecture and discuss the key components that formed our initial prototype and testbed. Eugene Santos, Jr. received the B.S. degree in mathematics and Computer science and the M.S. degree in mathematics (specializing in numerical analysis) from Youngstown State University, Youngstown, OH, in 1985 and 1986, respectively, and the Sc.M. and Ph.D. degrees in computer science from Brown University, Providence, RI, in 1988 and 1992, respectively. He is currently a Professor of Engineering at the Thayer School of Engineering, Dartmouth College, Hanover, NH, and Director of the Distributed Information and Intelligence Analysis Group (DI2AG). Previously, he was faculty at the Air Force Institute of Technology, Wright-Patterson AFB and the University of Connecticut, Storrs, CT. He has over 130 refereed technical publications and specializes in modern statistical and probabilistic methods with applications to intelligent systems, multi-agent systems, uncertain reasoning, planning and optimization, and decision science. Most recently, he has pioneered new research on user and adversarial behavioral modeling. He is an Associate Editor for the IEEE Transactions on Systems, Man, and Cybernetics: Part B and the International Journal of Image and Graphics. Scott DeLoach is currently an Associate Professor in the Department of Computing and Information Sciences at Kansas State University. His current research interests include autonomous cooperative robotics, adaptive multiagent systems, and agent-oriented software engineering. Prior to coming to Kansas State, Dr. DeLoach spent 20 years in the US Air Force, with his last assignment being as an Assistant Professor of Computer Science and Engineering at the Air Force Institute of Technology. Dr. DeLoach received his BS in Computer Engineering from Iowa State University in 1982 and his MS and PhD in Computer Engineering from the Air Force Institute of Technology in 1987 and 1996. Michael T. Cox is a senior scientist in the Intelligent Distributing Computing Department of BBN Technologies, Cambridge, MA. Previous to this position, Dr. Cox was an assistant professor in the Department of Computer Science & Engineering at Wright State University, Dayton, Ohio, where he was the director of Wright State’s Collaboration and Cognition Laboratory. He received his Ph.D. in Computer Science from the Georgia Institute of Technology, Atlanta, in 1996 and his undergraduate from the same in 1986. From 1996 to 1998, he was a postdoctoral fellow in the Computer Science Department at Carnegie Mellon University in Pittsburgh working on the PRODIGY project. His research interests include case-based reasoning, collaborative mixed-initiative planning, intelligent agents, understanding (situation assessment), introspection, and learning. More specifically, he is interested in how goals interact with and influence these broader cognitive processes. 
His approach to research follows both artificial intelligence and cognitive science directions.

19.
We have verified the FM9801, a microprocessor design whose features include speculative execution, out-of-order issue and completion of instructions using Tomasulo's algorithm, and precise exceptions and interrupts. As a correctness criterion, we used a commutative diagram that compares the result of the pipelined execution from a flushed state to another flushed state with that of the sequential execution. Like many pipelined microprocessors, the FM9801 may not operate correctly if the executed program modifies itself. We discuss the condition under which the processor is guaranteed to operate correctly. In order to show that the correctness criterion is satisfied, we introduce an intermediate abstraction that records the history of executed instructions. Using this abstraction, we define a number of invariant properties that must hold during the operation of the FM9801. We verify these invariant properties, and then derive the proof of the commutative diagram from them. The proof has been mechanically checked by the ACL2 theorem prover.

20.
A new approach is presented to deal with the problem of modelling and simulating the control mechanisms underlying planned-arm-movements. We adopt a synergetic view in which we assume that the movement patterns are not explicitly programmed but rather are emergent properties of a dynamic system constrained by physical laws in space and time. The model automatically translates a high-level command specification into a complete movement trajectory. This is an inverse problem, since the dynamic variables controlling the current state of the system have to be calculated from movement outcomes such as the position of the arm endpoint. The proposed method is based on an optimization strategy: the dynamic system evolves towards a stable equilibrium position according to the minimization of a potential function. This system, which could well be described as a feedback control loop, obeys a set of non-linear differential equations. The gradient descent provides a solution to the problem which proves to be both numerically stable and computationally efficient. Moreover, the addition into the control loop of elements whose structure and parameters have a pertinent biological meaning allows for the synthesis of gestural signals whose global patterns keep the main invariants of human gestures. The model can be exploited to handle more complex gestures involving planning strategies of movement. Finally, the extension of the approach to the learning and control of non-linear biological systems is discussed.
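As a rough, hedged illustration of the optimization idea described above — the system evolves toward a stable equilibrium by descending a potential function — the sketch below uses a simple quadratic potential and a fixed step size; these are illustrative assumptions, not the paper's biomechanical model.

```python
# Hedged sketch: endpoint trajectory generated by gradient descent on a potential.
def grad_potential(pos, target):
    """Gradient of U(p) = 1/2 * ||p - target||^2 with respect to p."""
    return [p - t for p, t in zip(pos, target)]

def plan_trajectory(start, target, rate=0.1, steps=60):
    """Integrate the gradient-descent dynamics; returns the endpoint trajectory."""
    pos, traj = list(start), [tuple(start)]
    for _ in range(steps):
        g = grad_potential(pos, target)
        pos = [p - rate * gi for p, gi in zip(pos, g)]
        traj.append(tuple(pos))
    return traj

traj = plan_trajectory(start=(0.0, 0.0), target=(0.3, 0.5))
print(traj[0], traj[-1])   # the endpoint converges smoothly toward the target position
```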
