  Fee-based full text   217
  Open access   47
  Domestic open access   60
Electrical engineering   24
General   30
Machinery and instruments   9
Architecture   1
Mining engineering   1
Energy and power   4
Water conservancy   1
Weapons industry   2
Radio and electronics   33
General industrial technology   8
Metallurgy   1
Automation   210
  2024   5
  2023   11
  2022   28
  2021   25
  2020   25
  2019   11
  2018   7
  2017   11
  2016   8
  2015   10
  2014   16
  2013   13
  2012   15
  2011   21
  2010   15
  2009   17
  2008   19
  2007   12
  2006   11
  2005   7
  2004   4
  2003   6
  2002   7
  2001   4
  2000   1
  1999   4
  1998   5
  1997   2
  1996   2
  1994   2
324 results found, search time 158 ms
1.
With the construction of integrated energy systems and the advance of electricity market reform, integrated energy service providers are expected to become new market participants. To address the problem that limited decision-reference information at the bidding stage constrains the formulation of bidding strategies, this paper proposes a Q-learning-based spot-market bidding strategy for integrated energy service providers that improves bidding quality. The main feature of the method is that it makes full use of massive historical operating data, trains a bidding-strategy agent with an artificial-intelligence algorithm, and establishes the intrinsic relationship between the limited reference information available to the service provider and the optimal bidding strategy. Taking public market information, public social information, and the provider's private information as environment variables, the agent can generate and intelligently improve bidding strategies automatically. Finally, a case study built on real data from a provincial power grid shows that the method fits the bidding strategy under cooperative-game conditions well, featuring fast convergence, high solution quality, and high computational efficiency, and that it better meets the decision-making needs of integrated energy service providers.
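As a rough illustration of the tabular Q-learning loop behind such a bidding agent, the sketch below trains on a toy market: the demand states, bid-price actions, clearing rule, and cost figure are all invented placeholders, not the paper's market model.

```python
import random
from collections import defaultdict

ACTIONS = [30, 35, 40, 45, 50]          # candidate bid prices (assumed units)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1       # learning rate, discount, exploration

Q = defaultdict(float)                   # Q[(state, action)] -> value

def choose_bid(state):
    """Epsilon-greedy selection over bid-price actions."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def simulate_market(state, bid):
    """Toy clearing rule: the bid wins if below a noisy clearing price."""
    clearing = 38 + 4 * state + random.gauss(0, 2)
    reward = (clearing - 25) if bid <= clearing else 0.0   # 25 = assumed cost
    next_state = random.choice([0, 1, 2])                  # next demand level
    return reward, next_state

state = 1
for episode in range(5000):
    bid = choose_bid(state)
    reward, nxt = simulate_market(state, bid)
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, bid)] += ALPHA * (reward + GAMMA * best_next - Q[(state, bid)])
    state = nxt

print({a: round(Q[(1, a)], 2) for a in ACTIONS})
```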
2.
The stochastic dynamic optimization problem of an integrated energy system is modeled as a Markov decision process, and the Q-learning algorithm is introduced to solve this complex problem. To address the drawbacks of Q-learning, two improvements are made to the conventional algorithm: the Q-table initialization method is improved, and the upper confidence bound (UCB) algorithm is adopted for action selection. Simulation results show that Q-learning solves the problem while maintaining good convergence; the improved initialization and the UCB action selection significantly raise computational efficiency and allow the results to converge to a better solution. Compared with a conventional mixed-integer linear programming model, the Q-learning algorithm yields better optimization results.
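The UCB action-selection improvement translates most directly into code. Below is a minimal sketch on a small invented MDP; the optimistic initial Q-value (one plausible form of the improved Q-table initialization), the toy dynamics, and all constants are assumptions, while the UCB1 bonus term itself is standard.

```python
import math
import random
from collections import defaultdict

ACTIONS = range(4)
ALPHA, GAMMA, C = 0.1, 0.95, 2.0

Q = defaultdict(lambda: 10.0)   # optimistic initialization (assumed form)
N = defaultdict(int)            # visit counts per (state, action)
T = defaultdict(int)            # total visits per state

def ucb_action(state):
    """UCB1: value estimate plus an exploration bonus that shrinks with visits."""
    T[state] += 1
    def score(a):
        n = N[(state, a)]
        if n == 0:
            return float("inf")              # try every action at least once
        return Q[(state, a)] + C * math.sqrt(math.log(T[state]) / n)
    return max(ACTIONS, key=score)

def step(state, action):
    """Placeholder dynamics standing in for the energy-dispatch MDP."""
    reward = -abs(action - state % 4) + random.gauss(0, 0.1)
    return reward, (state + action) % 8

state = 0
for _ in range(10000):
    a = ucb_action(state)
    r, nxt = step(state, a)
    N[(state, a)] += 1
    target = r + GAMMA * max(Q[(nxt, b)] for b in ACTIONS)
    Q[(state, a)] += ALPHA * (target - Q[(state, a)])
    state = nxt
```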
3.
The BDI model handles agent reasoning and decision-making well in specific environments, but lacks decision-making and learning ability in dynamic, uncertain environments. Reinforcement learning solves the agent's decision problem in unknown environments, but lacks the rule descriptions and logical reasoning of the BDI model. For BDI strategy planning in unknown and dynamic environments, this paper proposes a method that uses the Q-learning algorithm to give a BDI agent learning and planning abilities, and improves the decision mechanism of ASL, a BDI implementation model. Finally, a maze simulation is built on Jason, the ASL simulation platform. The experiments show that, in the new ASL system augmented with the Q-learning mechanism, the agent can still complete its task in an uncertain environment.
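A minimal sketch of the maze Q-learning component (the layer the paper grafts onto ASL plan selection in Jason) might look like the following; the grid layout, rewards, and hyperparameters are invented for illustration.

```python
import random

GRID_W, GRID_H = 5, 5
WALLS = {(1, 1), (2, 3), (3, 1)}            # assumed maze layout
GOAL = (4, 4)
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
ALPHA, GAMMA, EPS = 0.2, 0.9, 0.15

Q = {((x, y), m): 0.0 for x in range(GRID_W) for y in range(GRID_H) for m in MOVES}

def step(pos, move):
    """Move if the target cell is inside the grid and not a wall."""
    dx, dy = MOVES[move]
    nxt = (pos[0] + dx, pos[1] + dy)
    if not (0 <= nxt[0] < GRID_W and 0 <= nxt[1] < GRID_H) or nxt in WALLS:
        return pos, -1.0        # bumped a wall: stay, small penalty
    return nxt, (10.0 if nxt == GOAL else -0.1)

for episode in range(2000):
    pos = (0, 0)
    while pos != GOAL:
        move = (random.choice(list(MOVES)) if random.random() < EPS
                else max(MOVES, key=lambda m: Q[(pos, m)]))
        nxt, r = step(pos, move)
        Q[(pos, move)] += ALPHA * (r + GAMMA * max(Q[(nxt, m)] for m in MOVES)
                                   - Q[(pos, move)])
        pos = nxt
```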
4.
In this paper, a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems. The system state is forced to track the reference signal by minimizing a performance function. First, the problem is transformed into solving the corresponding Bellman optimality equation in terms of the Q-function (also known as the action-value function). Then, an iterative algorithm based on adaptive dynamic programming (ADP) is developed to find the optimal solution entirely from sampled data. A linear-in-parameter (LIP) neural network serves as the value-function approximator. Accounting for the approximation error at each iteration step, the generated sequence of approximate value functions is proved to remain bounded around the exact optimal solution under some verifiable assumptions. Moreover, the effect of terminating the learning process after a finite number of iterations is investigated. A sufficient condition for asymptotic stability of the tracking error is derived. Finally, the effectiveness of the algorithm is demonstrated with three simulation examples.
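As a data-based flavor of the iterative Q-function algorithm, the sketch below runs fitted Q-iteration with a linear-in-parameter basis on a toy switched scalar system, where the "action" selects the active mode; the modes, reference, and feature vector are assumptions, not the paper's system or approximator.

```python
import numpy as np

A = {0: 0.9, 1: 1.1}       # two autonomous modes; the action picks the mode
REF, GAMMA = 1.0, 0.95     # constant reference to track, discount factor

def features(x, a):
    """Linear-in-parameter basis over the tracking error and the mode."""
    e = x - REF
    return np.array([e * e, e * e * a, a, 1.0])

rng = np.random.default_rng(0)
X = rng.uniform(-2, 3, 500)                       # sampled states (offline data)
data = [(x, a, A[a] * x) for x in X for a in (0, 1)]

w = np.zeros(4)
for _ in range(50):                               # value-iteration sweeps
    Phi, y = [], []
    for x, a, xn in data:
        cost = (x - REF) ** 2                     # stage cost to minimize
        q_next = min(features(xn, b) @ w for b in (0, 1))
        Phi.append(features(x, a))
        y.append(cost + GAMMA * q_next)
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

print("greedy mode at x=2:", min((0, 1), key=lambda b: features(2.0, b) @ w))
```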
5.
Multi-agent cooperative pursuit is a typical problem in research on multi-agent coordination and cooperation. For the pursuit of a single evader with learning ability, a multi-agent cooperative pursuit algorithm based on game theory and Q-learning is proposed. First, a cooperative pursuit team is formed and a game model of cooperative pursuit is built. Second, by learning the evader's strategy choices, the evader's motion trajectory with finite Step-T cumulative reward is constructed and incorporated into the pursuers' strategy sets. Finally, solving the cooperative pursuit game yields a Nash equilibrium, and each agent executes its equilibrium strategy to complete the pursuit. Because the solution may contain multiple equilibria, a fictitious-play action-selection algorithm is added to select the optimal equilibrium strategy. C# simulation experiments show that the algorithm effectively solves the pursuit of a single learning evader in an obstacle environment, and comparative analysis of the experimental data shows that its pursuit efficiency under the same conditions is better than that of purely game-theoretic or purely learning-based pursuit algorithms.
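For the equilibrium-selection step, a fictitious-play loop (each player best-responding to the empirical frequency of the opponent's past actions) can be sketched on a small matrix game; the payoff matrices here are invented, and the actual algorithm applies this over the pursuit game's strategy sets.

```python
import numpy as np

A = np.array([[3.0, 0.0], [1.0, 2.0]])   # pursuer-1 payoffs (assumed)
B = np.array([[3.0, 1.0], [0.0, 2.0]])   # pursuer-2 payoffs (assumed)

counts1, counts2 = np.ones(2), np.ones(2)  # action-frequency beliefs
for t in range(5000):
    p2 = counts2 / counts2.sum()           # belief about player 2
    a1 = int(np.argmax(A @ p2))            # best response of player 1
    p1 = counts1 / counts1.sum()
    a2 = int(np.argmax(B.T @ p1))          # best response of player 2
    counts1[a1] += 1
    counts2[a2] += 1

# Empirical frequencies concentrate on one of the pure equilibria.
print("empirical strategies:", counts1 / counts1.sum(), counts2 / counts2.sum())
```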
6.
A self-learning energy management strategy is proposed for a plug-in hybrid electric bus by combining the Q-learning (QL) and Pontryagin's minimum principle algorithms. Unlike existing strategies, the proposed one focuses on expert experience and generalization performance. The expert experience is designed as approximately optimal reference state-of-charge (SOC) trajectories, and generalization is enhanced by a multiple-driving-cycle training method. Specifically, an efficient SOC zone is first designed based on the approximately optimal reference SOC trajectories. Then, the QL agent is trained off-line, taking the expert experience as the reference SOC trajectories. Finally, an adaptive strategy is built on the well-trained agent. Notably, two different reward functions are defined: the reward in off-line training mainly measures the tracking error between the expert experience and the SOC, while the reward in the adaptive strategy mainly applies a punishment term. Simulation results show that the proposed strategy generalizes well and improves fuel economy by 22.49% compared with a charge-depleting-charge-sustaining (CDCS) strategy.
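The two reward functions are concrete enough to sketch directly. In the placeholder code below, the efficient SOC zone bounds, weights, and fuel term are assumptions; only the structure (tracking error off-line, zone penalty online) follows the abstract.

```python
SOC_LOW, SOC_HIGH = 0.35, 0.65          # assumed efficient zone of SOC

def reward_offline(soc, soc_ref, fuel_rate, w_track=10.0, w_fuel=1.0):
    """Off-line training: mainly track the expert reference SOC trajectory."""
    return -w_track * (soc - soc_ref) ** 2 - w_fuel * fuel_rate

def reward_adaptive(soc, fuel_rate, w_pen=20.0, w_fuel=1.0):
    """Adaptive strategy: mainly punish leaving the efficient SOC zone."""
    violation = max(0.0, SOC_LOW - soc, soc - SOC_HIGH)
    return -w_pen * violation - w_fuel * fuel_rate

print(reward_offline(0.50, 0.52, fuel_rate=0.8))   # small tracking penalty
print(reward_adaptive(0.30, fuel_rate=0.8))        # zone-violation penalty
```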
7.
Suitable rescue-path selection is very important for saving lives and reducing disaster losses, and it has been a key issue in disaster response management. In this paper, we present a path-selection algorithm based on Q-learning for disaster response applications. We model a rescue team as an agent operating in a dynamic and dangerous environment that must find a safe, short path in the least time. We first propose a path-selection model for disaster response management and show that path selection under this model is a Markov decision process. Then, we introduce Q-learning and design strategies for action selection and for avoiding cyclic paths. Finally, experimental results show that our algorithm finds a safe and short path in a dynamic, dangerous environment, providing a concrete and useful reference for practical disaster response management.
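A minimal version of such a path learner, with one possible cycle-avoidance rule (never revisit a cell within an episode), is sketched below; the grid, danger cells, and rewards are invented stand-ins for the paper's disaster environment.

```python
import random

W, H = 6, 6
DANGER = {(2, 2), (3, 4), (4, 1)}       # hazardous cells with heavy penalty
GOAL = (5, 5)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1
Q = {}

def legal(pos, visited):
    """Neighbors inside the grid that were not visited this episode."""
    out = []
    for dx, dy in MOVES:
        nxt = (pos[0] + dx, pos[1] + dy)
        if 0 <= nxt[0] < W and 0 <= nxt[1] < H and nxt not in visited:
            out.append(nxt)
    return out

for episode in range(3000):
    pos, visited = (0, 0), {(0, 0)}
    while pos != GOAL:
        options = legal(pos, visited)
        if not options:
            break                                    # dead end: restart episode
        nxt = (random.choice(options) if random.random() < EPS
               else max(options, key=lambda n: Q.get((pos, n), 0.0)))
        r = 100.0 if nxt == GOAL else (-50.0 if nxt in DANGER else -1.0)
        future = max((Q.get((nxt, n), 0.0) for n in legal(nxt, visited | {nxt})),
                     default=0.0)
        Q[(pos, nxt)] = Q.get((pos, nxt), 0.0) + ALPHA * (
            r + GAMMA * future - Q.get((pos, nxt), 0.0))
        visited.add(nxt)
        pos = nxt
```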
8.
To improve nodes' learning strategies and raise their application performance, a task model is built for data collection, and a sensor-node task-scheduling algorithm based on Q-learning and planning is proposed, defining basic elements such as the state space, delayed reward, and exploration-exploitation policy. According to the characteristics of wireless sensor networks (WSNs), a planning process based on a priority mechanism and an expiration mechanism is established so that nodes can use empirical knowledge effectively and improve their learning strategies. Experiments show that the algorithm can schedule tasks dynamically according to the current WSN environment; compared with other task-scheduling algorithms, it consumes energy reasonably and achieves better application performance.
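The "learning plus planning" structure resembles Dyna-Q; below is a sketch adding the two mechanisms the abstract names, a priority queue over TD errors and an expiration time on stored experience. The task set, toy dynamics, and TTL are placeholders, not the paper's WSN model.

```python
import heapq
import random
import time

TASKS = ["sense", "transmit", "sleep"]
ALPHA, GAMMA, EPS, TTL, PLAN_STEPS = 0.1, 0.9, 0.1, 60.0, 10

Q = {(s, a): 0.0 for s in range(4) for a in TASKS}
model = {}                      # (s, a) -> (reward, next_state, timestamp)
pqueue = []                     # max-priority via negated TD error

def env_step(s, a):
    """Toy node dynamics: reward trades data yield against energy cost."""
    r = {"sense": 1.0, "transmit": 2.0, "sleep": 0.2}[a] - 0.1 * s
    return r + random.gauss(0, 0.05), (s + 1) % 4

s = 0
for t in range(5000):
    a = (random.choice(TASKS) if random.random() < EPS
         else max(TASKS, key=lambda x: Q[(s, x)]))
    r, s2 = env_step(s, a)
    err = abs(r + GAMMA * max(Q[(s2, b)] for b in TASKS) - Q[(s, a)])
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in TASKS) - Q[(s, a)])
    model[(s, a)] = (r, s2, time.time())
    heapq.heappush(pqueue, (-err, (s, a)))          # priority mechanism
    for _ in range(PLAN_STEPS):                     # planning from the model
        if not pqueue:
            break
        _, (ps, pa) = heapq.heappop(pqueue)
        pr, ps2, stamp = model.get((ps, pa), (None, None, None))
        if pr is None or time.time() - stamp > TTL:  # expiration mechanism
            continue
        Q[(ps, pa)] += ALPHA * (pr + GAMMA * max(Q[(ps2, b)] for b in TASKS)
                                - Q[(ps, pa)])
    s = s2
```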
9.
In this paper, a learning mechanism is proposed for designing the reactive fuzzy controller of a mobile robot navigating in unknown environments. The fuzzy logic controller is constructed from the kinematic model of a real robot. The approach to learning the fuzzy rule base with relatively simple, computationally light Q-learning is described in detail. After analyzing the credit-assignment problem caused by rule collisions, a remedy is presented. Furthermore, time-varying parameters are used to increase the learning speed. Simulation results show that the mechanism can learn fuzzy navigation rules successfully using only a scalar reinforcement signal, and the learned rule base proves correct and feasible on real robot platforms.
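One common way to wire Q-learning into a fuzzy rule base is sketched below, in the spirit of fuzzy Q-learning: each rule keeps Q-values over candidate consequents, the control output blends the chosen consequents by firing strength, and the TD error is credited to rules in proportion to firing strength (one plausible reading of the collision remedy). The memberships, consequents, and toy plant are invented.

```python
import random

CONSEQUENTS = [-1.0, 0.0, 1.0]          # candidate steering commands per rule
RULES = 3                               # e.g. obstacle left / front / right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = [[0.0] * len(CONSEQUENTS) for _ in range(RULES)]

def firing_strengths(sensor):
    """Toy triangular memberships over a scalar sensor reading in [0, 1]."""
    return [max(0.0, 1 - abs(sensor - c) * 3) for c in (0.0, 0.5, 1.0)]

def control(sensor):
    """Pick a consequent per rule (epsilon-greedy), blend by firing strength."""
    w = firing_strengths(sensor)
    picks = [random.randrange(len(CONSEQUENTS)) if random.random() < EPS
             else max(range(len(CONSEQUENTS)), key=lambda j: Q[i][j])
             for i in range(RULES)]
    total = sum(w) or 1.0
    u = sum(w[i] * CONSEQUENTS[picks[i]] for i in range(RULES)) / total
    return u, w, picks

sensor = 0.4
for step in range(2000):
    u, w, picks = control(sensor)
    reward = -abs(u - (1 - sensor))          # toy target: steer away from obstacle
    nxt = min(1.0, max(0.0, sensor + 0.1 * u + random.gauss(0, 0.02)))
    v_next = sum(max(Q[i]) for i in range(RULES)) / RULES
    total = sum(w) or 1.0
    q_now = sum(w[i] * Q[i][picks[i]] for i in range(RULES)) / total
    delta = reward + GAMMA * v_next - q_now
    for i in range(RULES):                   # credit shared by firing strength
        Q[i][picks[i]] += ALPHA * delta * w[i] / total
    sensor = nxt
```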
10.
A multi-agent cooperative learning algorithm based on quantum computing   (cited 1 time: 0 self-citations, 1 by others)
To address the curse of dimensionality in the actions and states of multi-agent cooperative reinforcement learning, as well as the problems that action selection involves multiple equilibria, so that converging to the best equilibrium requires searching the strategy space and coordinating strategy selection, a novel multi-agent cooperative learning algorithm based on quantum theory is proposed. Drawing on quantum computing theory, the new algorithm represents the agents' action and state spaces by quantum superposition states, uses quantum entangled states to coordinate strategy selection, represents action-selection probabilities by probability amplitudes, and accelerates multi-agent learning with a quantum search algorithm. Simulation results demonstrate the effectiveness of the new algorithm.
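A purely classical simulation of the amplitude idea can be sketched as follows: the selection probability of an action equals its squared amplitude, and rewarded actions are amplified and then renormalized, loosely echoing Grover-style amplification. Everything here, the task, the update rule, and the constants, is an illustrative assumption.

```python
import math
import random

ACTIONS = 4
K = 0.05                                  # amplification step (assumed)

amps = {}                                 # state -> list of amplitudes

def amplitudes(state):
    if state not in amps:
        amps[state] = [1.0 / math.sqrt(ACTIONS)] * ACTIONS   # uniform superposition
    return amps[state]

def select(state):
    """Measure: sample an action with probability |amplitude|^2."""
    probs = [a * a for a in amplitudes(state)]
    return random.choices(range(ACTIONS), weights=probs)[0]

def amplify(state, action, reward):
    """Shift amplitude toward (or away from) the chosen action, renormalize."""
    a = amplitudes(state)
    a[action] += K * reward
    a[action] = max(a[action], 1e-3)                 # keep some exploration
    norm = math.sqrt(sum(x * x for x in a))
    amps[state] = [x / norm for x in a]

state = 0
for t in range(3000):
    act = select(state)
    reward = 1.0 if act == state % ACTIONS else -0.2   # toy cooperative task
    amplify(state, act, reward)
    state = (state + 1) % 3

print([[round(x, 2) for x in amplitudes(s)] for s in range(3)])
```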