20 similar articles found (search time: 15 ms)
One of the most important steps in designing a model predictive control strategy is selecting appropriate parameters for the relative weights of the objective function. Typically, these are selected through trial and error to meet the desired performance. In this paper, a reinforcement learning technique called learning automata is used to select appropriate parameters for the controller of a differential drive robot through a simulation process. Results of the simulation show that the parameters always converge, although to different values. A controller chosen by the learning process is then ported to a real platform. The selected controller is shown to control the robot better than a standard model predictive controller.
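
A minimal sketch of how a finite-action learning automaton might choose among candidate weight values, assuming a linear reward-inaction update; the abstract does not say which automaton scheme the authors use. The candidate list, the `reward_prob` callback (a stand-in for running the closed-loop simulation and judging the outcome), and all numeric values are hypothetical.

```python
import random

def lri_automaton(candidates, reward_prob, steps=3000, rate=0.05, seed=0):
    """Linear reward-inaction automaton over a finite list of candidates.

    reward_prob(c) is a hypothetical stand-in for "simulate the controller
    with weight c and report success with this probability".
    Returns the final action-probability vector.
    """
    rng = random.Random(seed)
    n = len(candidates)
    probs = [1.0 / n] * n  # start with no preference
    for _ in range(steps):
        i = rng.choices(range(n), weights=probs)[0]
        if rng.random() < reward_prob(candidates[i]):
            # Reward: shift probability mass toward the chosen action.
            probs = [(1.0 - rate) * p for p in probs]
            probs[i] += rate
        # Penalty: leave the probabilities unchanged ("inaction").
    return probs

# Hypothetical candidate weights and success rates, for illustration only.
weights = [0.1, 1.0, 10.0]
success = {0.1: 0.3, 1.0: 0.8, 10.0: 0.5}
final_probs = lri_automaton(weights, lambda c: success[c])
print(final_probs)
```

Reward-inaction schemes have absorbing states, so different random seeds may settle on different candidates; this is one possible reading of why the reported parameters converge to different values.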

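This paper outlines the functional characteristics of virtual reality technology and related technical concepts, and discusses the application of the technology to power system simulation training. Taking a virtual substation control panel as an example, it describes the development process of a power system virtual environment and introduces a method for building the virtual panel with 3D-driven control technology. Virtual reality greatly facilitates power system simulation training and offers significant economic and social benefits.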

In this paper, a learning algorithm combining memory-less learning and memory-based learning is proposed for agents operating under POMDP. In the first stage of the proposed algorithm, memory-less learning is applied, with the stochastic gradient method employed as the memory-less learning algorithm. During this stage, a state-action set series that accomplishes the task is stored in memory. In the second stage, memory-based learning is applied. Only the series obtained in the first stage is used in this process, so the method significantly reduces the amount of required memory. The proposed algorithm is applied to three simulations for comparison with the memory-less learning algorithm. The computer simulations show that the proposed algorithm works more effectively in POMDP than ordinary memory-less learning. © 2010 Wiley Periodicals, Inc. Electr Eng Jpn, 173(1): 32–40, 2010; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20963

A mobile robot must move without unacceptably rapid motion. To address this issue, we propose a preview controller using the time-based spline approach. With this approach, it is also important to plan an adequate trajectory. Here, we propose a trajectory planning approach that determines the trajectory via a virtual manipulator. Numerical and experimental results are shown to confirm the proposed algorithm. © 2005 Wiley Periodicals, Inc. Electr Eng Jpn, 151(4): 65–71, 2005; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.10349

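Teleoperation is a key technology for robots performing tasks in special environments such as space, the deep sea, and remote locations. A teleoperated robot system platform based on a PHANTOM hand controller and a MOTOMAN robot is built. It supports several grasping tasks (master-slave grasping, autonomous grasping, and a combination of the two) and includes a user-friendly interface. To address the time delay between the master and slave sides, a working mode is designed that combines an autonomous mode based on task-space detection with the master-slave mode; it is more time-saving and efficient than the working modes used in traditional teleoperated robot systems. Experimental results show that the proposed working mode meets the needs of a teleoperated robot completing tasks under communication delay.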

This paper develops a multi-timescale coordinated operation method for microgrids based on modern deep reinforcement learning. Considering the complementary characteristics of different storage devices, the proposed approach achieves multi-timescale coordination of battery and supercapacitor by introducing a hierarchical two-stage dispatch model. The first stage makes an initial decision irrespective of the uncertainties, using the hourly predicted data to minimize the operational cost. The second stage generates corrective actions for the first-stage decisions to compensate for real-time renewable generation fluctuations. The first stage is formulated as a non-convex deterministic optimization problem, while the second stage is modeled as a Markov decision process solved by an entropy-regularized deep reinforcement learning method, namely the Soft Actor-Critic. The Soft Actor-Critic method can efficiently address the exploration–exploitation dilemma and suppress variations, which improves the robustness of decisions. Simulation results demonstrate that different types of energy storage devices can be used at the two stages to achieve multi-timescale coordinated operation, which supports the effectiveness of the proposed method.
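
For reference, the entropy-regularized objective usually associated with Soft Actor-Critic can be written as follows. The symbols (policy \pi, reward r, discount factor \gamma, temperature \alpha) are standard notation assumed here, not taken from the paper.

```latex
J(\pi) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}
  \Big( r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big)
  = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[ \log \pi(a \mid s) \big]
```

A larger \alpha rewards more random policies and so encourages exploration; as \alpha approaches zero the expression reduces to the ordinary discounted return.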

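The "double-high" trend of new-type power systems (high shares of renewable energy and of power electronic equipment) has changed the classical stability characteristics of power systems, making stability mechanisms more complex and stability modes more diverse, so online stability control strategies based on typical operating modes face challenges. To address rotor angle stability in new-type power systems, a method for intelligently generating stability control strategies based on safe reinforcement learning is proposed. First, a constrained Markov model of the power system stability control problem is established, and the safety constraints involved in emergency generator-tripping actions are summarized and formulated. Second, to improve the extraction of spatiotemporal features from the grid's transient response, a feature perception network based on graph convolutional layers and long short-term memory units is built. Then, to improve the training efficiency of the stability control agent, a training framework based on the proximal policy optimization algorithm with embedded domain-knowledge constraints is proposed. Finally, the method is tested on the IEEE 39-bus system and on a real power grid. The results show that the proposed method adaptively generates generator-tripping stability control strategies according to the system operating state and fault response, and that its decision quality and efficiency are both better than those of existing stability control strategies.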

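Based on the engineering needs of remote distributed measurement and control, this paper studies the network communication functions of virtual instrument systems and the application of various communication protocols. By combining the reliability of TCP with the flexibility of UDP, a practical network communication technique suited to complex and harsh network environments is developed.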

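The secondary circuit designs of smart substations are largely similar, but because virtual terminal descriptions are not yet standardized, virtual terminal connections cannot be reused for newly built substations and are still made manually, point to point, which is inefficient. This paper proposes automatically creating a virtual circuit template library and using keyword matching to automate virtual circuit design and to check its completeness and correctness. The core idea is a learning template library that learns from a large number of existing substation configuration description (SCD) files: Chinese word segmentation is used to extract keywords, and the classical RKR-GST algorithm is introduced to compute the similarity of virtual terminal descriptions, so that keywords can be matched and consolidated and the virtual circuit template library can be created, organized, and refined. Tests show that the method considerably improves both the efficiency and the accuracy of virtual terminal connection in newly built smart substations.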

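A microgrid inverter control strategy based on the virtual synchronous generator (VSG) algorithm is studied, and the VSG algorithm, virtual prime mover regulation, and virtual excitation current regulation modules are designed. Through dual-loop voltage and current control, the amplitude and frequency of the inverter output voltage exhibit good droop characteristics with respect to reactive power and active power, respectively, and track load changes quickly, which improves system stability and reliability. In addition, the virtual prime mover regulation module is improved to achieve real-time frequency control with zero steady-state error, improving the accuracy and response speed of frequency control. Simulation results show that the inverter output voltage closely emulates the external characteristics of a synchronous generator, that the phase voltage distortion rate is only 0.2%, and that the output frequency does not drift with load changes and remains stable at 49.98 Hz, verifying the correctness and feasibility of the theoretical analysis.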
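
The difference between plain droop and zero-error frequency regulation can be illustrated with a per-unit swing equation. This is only a sketch under stated assumptions: the abstract does not describe how the prime mover module is modified, so the integral term `ki` below is one common option assumed for illustration, and every parameter value is hypothetical.

```python
def simulate_vsg(p_load, p_ref=0.5, m=0.2, d=10.0, kp=20.0, ki=0.0,
                 w0=1.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of a per-unit swing equation with a governor.

    m * dw/dt = p_mech - p_load - d * (w - w0)
    p_mech    = p_ref + kp * (w0 - w) + ki * integral(w0 - w)
    All parameter values are hypothetical and chosen only for illustration.
    """
    w = w0
    integ = 0.0
    for _ in range(int(round(t_end / dt))):
        err = w0 - w
        integ += err * dt
        p_mech = p_ref + kp * err + ki * integ
        w += dt * (p_mech - p_load - d * (w - w0)) / m
    return w

droop_only = simulate_vsg(p_load=0.8)               # settles below w0
with_integral = simulate_vsg(p_load=0.8, ki=50.0)   # returns to w0
print(droop_only, with_integral)
```

With droop alone the frequency settles at w0 - (p_load - p_ref) / (kp + d), which is 0.99 per unit for these values; adding the assumed integral term drives the steady-state deviation to zero.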

This work deals with the leader-follower and the leaderless consensus problems in networks of multiple robot manipulators. The robots are non-identical, kinematically different (heterogeneous), and their physical parameters are uncertain. The main contribution of this work is a novel controller that solves the two consensus problems, in the task space, with the following features: it estimates the kinematic and the dynamic physical parameters; it is robust to variable time delays in the interconnections; it employs the singularity-free unit quaternions to represent the orientation; and, using energy-like functions, the controller synthesis follows a constructive procedure. Simulations using a network with four heterogeneous manipulators illustrate the performance of the proposed controller. Copyright © 2016 John Wiley & Sons, Ltd.

This paper presents an online learning algorithm based on integral reinforcement learning (IRL) to design an output-feedback (OPFB) H∞ tracking controller for partially unknown linear continuous-time systems. Although reinforcement learning techniques have been successfully applied to find optimal state-feedback controllers, in most control applications it is not practical to measure the full system state, so OPFB controllers are desirable. To this end, a general bounded L2-gain tracking problem with a discounted performance function is used for the OPFB H∞ tracking. A tracking game algebraic Riccati equation is then developed that gives a Nash equilibrium solution to the associated min-max optimization problem. An IRL algorithm is then developed to solve the game algebraic Riccati equation online without requiring complete knowledge of the system dynamics. In each iteration, the proposed IRL-based algorithm solves an IRL Bellman equation online in real time to evaluate an OPFB policy, and updates the OPFB gain using the information given by the evaluated policy. An adaptive observer is used to provide knowledge of the full state for the IRL Bellman equation during learning; the observer is not needed after the learning process is finished. A simulation example is provided to verify the convergence of the proposed algorithm to a suboptimal OPFB solution and the performance of the proposed method.

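For rigid-arm robot systems, two adaptive neural control algorithms based on the extreme learning machine (ELM) are proposed. The ELM randomly selects the hidden nodes and their parameters for a single-hidden-layer feedforward network (SLFN) and adjusts only the output weights, achieving good generalization at very high learning speed. In the adaptive control algorithms, the ELM approximates the unknown nonlinear functions of the system, and an additional robust control term compensates for the approximation error. The parameter adaptation laws of the ELM neural controller and the robust control term are derived from Lyapunov stability theory. Neither of the two control algorithms depends on constraints on the initial conditions, and both relax the requirement that the parameters be bounded, while guaranteeing that the closed-loop tracking error is globally stable and converges asymptotically to zero. The proposed ELM controllers are applied to a tracking control example of a two-link rigid-arm robot and compared with an existing radial basis function (RBF) neural network adaptive control algorithm. Simulation results show that, under the same conditions, the ELM controllers achieve good tracking performance, demonstrating their effectiveness and application potential.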
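
The ELM idea of fixing a random hidden layer and fitting only the output weights can be sketched in plain Python. This illustrates the function approximator only, using batch ridge-regularized least squares as a stand-in; it is not the Lyapunov-based online adaptation law of the paper. The target function, layer size, weight ranges, and ridge value are all assumptions.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def train_elm(xs, ys, hidden=15, ridge=1e-6, seed=0):
    """Fit a scalar-input ELM: random tanh hidden layer, least-squares output."""
    rng = random.Random(seed)
    # Hidden weights and biases are drawn once and never trained.
    w = [rng.uniform(-2.0, 2.0) for _ in range(hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(hidden)]
    h = [[math.tanh(w[j] * x + b[j]) for j in range(hidden)] for x in xs]
    # Ridge-regularized normal equations: (H^T H + ridge*I) beta = H^T y.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(hidden)] for i in range(hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(hidden)]
    beta = solve_linear(hth, hty)
    return lambda x: sum(beta[j] * math.tanh(w[j] * x + b[j])
                         for j in range(hidden))

# Hypothetical target: approximate sin(x) on [-3, 3].
xs = [-3.0 + 6.0 * k / 59 for k in range(60)]
ys = [math.sin(x) for x in xs]
model = train_elm(xs, ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(mse)
```

Because only `beta` is solved for, training reduces to a single linear solve, which is the source of the fast learning speed mentioned above.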

Recently, two-degrees-of-freedom (2DOF) control has been widely recognized as efficient. The major merit of 2DOF control is that tracking performance and feedback performance are independent of each other. However, there is a limitation on tracking performance in the 2DOF control system. In this paper, we propose a new control system that consists of a conventional 2DOF controller and a learning controller. The role of the learning controller is to realize high tracking performance, which cannot be realized by the 2DOF controller alone. The learning controller can be designed by using only information specifying a 2DOF controller, and it does not need information about the controlled plant. We show some experimental results to verify the effectiveness of the proposed system. © 1999 Scripta Technica, Electr Eng Jpn, 128(4): 102–110, 1999

A virtual cathode oscillator with a stainless-steel mesh anode of various transparencies and wire diameters was studied experimentally for the enhancement of microwave power and its repetitive operation. The maximum microwave power observed was about 20 MW at 12 GHz for a diode voltage of 250 kV and an electron beam current of 39 kA, using an anode mesh with a wire diameter of 0.22 mm and a transparency of 67%. The microwave emission was enhanced by decreasing the mean angle of beam scattering when a mesh of smaller wire diameter was used in the anode. The increased transparency of the fine mesh also contributed to the enhancement of the microwave emission. Use of the mesh anode allowed operation over several repetitive shots. © 2003 Wiley Periodicals, Inc. Electr Eng Jpn, 146(2): 1–10, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.10259

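Reinforcement learning theory is an important branch of machine learning within artificial intelligence and an important class of methods for Markov decision processes. In reinforcement learning, an intelligent system learns a mapping from environment states to actions so as to maximize the value of the reward (reinforcement) signal function. In recent years, the theory and applications of reinforcement learning have received growing attention from the international machine learning and intelligent control communities. This paper systematically introduces the basic ideas and algorithms of reinforcement learning; reviews the main results and methods of current research applying reinforcement learning to security and stability control, automatic generation control, voltage and reactive power control, and electricity markets; discusses the great potential of the topic in power system operation and control, as well as its combination with intelligent control techniques such as classical control, neural networks, fuzzy theory, and multi-agent systems; and finally gives an outlook on the application prospects of reinforcement learning in power science.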
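
The idea of learning a state-to-action mapping that maximizes reward can be illustrated with tabular Q-learning, one of the basic reinforcement learning algorithms. The toy chain environment, its reward, and the hyperparameters below are hypothetical and unrelated to the power system applications surveyed.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a deterministic chain; reward 1 at the right end.

    Actions: 0 = left, 1 = right. The behaviour policy is uniformly random,
    which Q-learning tolerates because it is off-policy.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.randrange(2)
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # The terminal state has no future value.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

q_table = q_learning_chain()
# Greedy action in each non-terminal state (1 means "move right").
greedy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy)
```

The greedy policy read from the learned table moves right in every state, which is the reward-maximizing behaviour for this chain.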

This paper focuses on the results of applying the dispersed autonomous voltage control system previously proposed by the authors to a real distribution network. This system is effective for regulating the supply voltage of an entire HV line within an optimum range. In the system configuration, an SCC is installed together with an SC and/or ShR on the line. Individual SCCs autonomously control the operation of SCs and/or ShRs based on the voltage measured where the SCs and/or ShRs are located on the line. A field test on a real high-voltage distribution network found that the proposed system could maintain high fault tolerance and also be cost-effective in regulating line voltage. © 2003 Wiley Periodicals, Inc. Electr Eng Jpn, 146(1): 27–36, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.10252

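As an integrator of multiple types of distributed energy resources, the microgrid has great potential for promoting clean, low-carbon energy. However, the uncertainty of renewable energy output poses challenges for microgrid management and also passes this uncertainty on to the external grid. Based on the real-time market, this paper builds a microgrid environment that includes renewable generation units, conventional units, and demand response resources, and adopts the deep deterministic policy gradient algorithm, which can exploit environment information; this model-free...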

In this paper, iterative learning control (ILC) of a class of non-affine-in-input processes is considered in Hilbert space, where the plant operators are quite general in the sense that they could be static or dynamic, differentiable or non-differentiable, continuous-time or discrete-time, and so forth. The control problem is first transformed into a problem of solving a global implicit function, to ensure the uniqueness of the desired control input. Then, two contraction mapping-based ILC schemes are proposed according to the continuous differentiability of the process model, and the learning convergence condition is derived through rigorous analysis. The proposed ILC schemes make full use of the process repetition, deal with system uncertainties easily, and are effective for infinite-dimensional or distributed parameter systems. Finally, the learning controller is applied to the boundary output control of a class of anaerobic digestion processes for wastewater treatment. The control efficacy is verified by simulation. Copyright © 2013 John Wiley & Sons, Ltd.
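
A contraction-mapping learning update can be illustrated on a toy static plant that is non-affine in its input. This is a sketch under stated assumptions, not either of the paper's schemes: the plant `plant(u) = u + 0.5 sin(u)`, the learning gain, and the reference trajectory are all hypothetical.

```python
import math

def plant(u):
    """Hypothetical static plant that is non-affine in the input u."""
    return u + 0.5 * math.sin(u)  # slope stays within [0.5, 1.5]

def run_ilc(y_desired, gain=0.8, iterations=60):
    """Repeat u_{k+1}(t) = u_k(t) + gain * e_k(t) over the whole trajectory.

    With plant slope g' in [0.5, 1.5] and gain 0.8, |1 - gain * g'| <= 0.6,
    so the error map is a contraction and the error shrinks every trial.
    """
    u = [0.0] * len(y_desired)
    history = []
    for _ in range(iterations):
        errors = [yd - plant(ui) for yd, ui in zip(y_desired, u)]
        history.append(max(abs(e) for e in errors))
        u = [ui + gain * e for ui, e in zip(u, errors)]
    return u, history

# Hypothetical reference trajectory sampled at 50 points.
y_d = [1.0 + math.sin(2.0 * math.pi * k / 50) for k in range(50)]
u_final, max_errors = run_ilc(y_d)
print(max_errors[0], max_errors[-1])
```

By the mean value theorem, each trial multiplies the pointwise error by 1 - gain * g'(ξ) for some intermediate input ξ, so the gain only needs a bound on the plant slope rather than an exact model.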

In this work, we present a novel iterative learning control (ILC) scheme for a class of joint position constrained robot manipulator systems with both multiplicative and additive actuator faults. Unlike most ILC literature, which requires an identical reference trajectory from trial to trial, in this work the reference trajectory can be non-repetitive over the iteration domain without assuming an identical initial condition. A tan-type Barrier Lyapunov Function is proposed to deal with the constraint requirements, which can be both time and iteration varying, with ILC update laws adopted to learn the iteration-invariant system uncertainties, and robust methods used to compensate for the iteration- and time-varying actuator faults and disturbances. We show that under the proposed ILC scheme, uniform convergence of the full state tracking error beyond a small time interval in each iteration can be guaranteed over the iteration domain, while the constraint requirements on the joint position vector will not be violated during operation. An illustrative example on a two-degree-of-freedom robotic manipulator is presented to demonstrate the effectiveness of the proposed control scheme. Copyright © 2016 John Wiley & Sons, Ltd.
