131.

Q-Learning Algorithm for Dynamic Optimal Allocation of CPS Regulation Commands in Interconnected Power Grids

余涛王宇名刘前进《中国电机工程学报》2010,(7)

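Under the control performance standard (CPS), dynamically and optimally allocating the automatic generation control (AGC) commands issued by the dispatch center of an interconnected power grid (CPS commands for short) to the various types of AGC units is a stochastic optimization problem. The continuous control process of CPS command allocation is discretized and treated as a discrete-time Markov decision process, and a dynamic control method based on Q-learning is proposed. Different reward functions are designed according to the optimization objective and incorporated into the algorithm. The method combines the regulation characteristics of hydro and thermal units and accounts for the regulation margin of hydro units, which improves the regulation capability of the AGC system. Simulation studies involving a genetic algorithm and a practical engineering method on the standard two-area model and the China Southern Power Grid model show that Q-learning effectively improves the system's adaptability, robustness, and CPS compliance rate.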
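The abstract above does not state the update rule or the reward functions it uses. As a minimal sketch, assuming the standard one-step tabular Q-learning update over a discretized state and action space, the core step might look like the following. The names `q_update` and `epsilon_greedy`, the list-of-lists table layout, and the default `alpha` and `gamma` values are illustrative assumptions, not taken from the paper.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    q is a list of rows, one row per discrete state, one entry per action.
    The reward is supplied by the caller; the paper's own reward functions
    are not given in the abstract, so none is assumed here.
    """
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


# Hypothetical usage: 2 discrete states, 2 actions.
q_table = [[0.0, 0.0], [1.0, 2.0]]
rng = random.Random(0)
chosen = epsilon_greedy(q_table, 1, 0.0, rng)  # greedy choice in state 1
q_update(q_table, 0, chosen, 1.0, 1, alpha=0.5, gamma=0.5)
```

In a CPS allocation setting, a state would presumably encode a discretized command level and an action a discretized split among unit types, but that mapping is an assumption here.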

132.

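Research on Multi-Agent Automated Negotiation Based on Reinforcement Learning  Total citations: 2, self-citations: 1, citations by others: 2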

杨明嘉莉邱玉辉《计算机工程与应用》2004,40(33):98-100,117

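By introducing a negotiation protocol, analyzing the proposal format and the negotiation flow, and drawing on multi-attribute utility theory and sequential decision processes, this paper proposes an open, dynamic, formal model of multi-issue automated negotiation that supports a learning mechanism. Based on this model, the negotiation steps of evaluating proposals, updating beliefs, and generating proposals are described in detail. Traditional Q-learning is extended into a dynamic Q-learning algorithm based on the agent's current beliefs and a recent-exploration bonus.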

133.

Research on an Emotion Automaton Model Based on a Dynamic Q-Learning Algorithm

于冬梅方建安《计算机科学》2008,35(5):172-173

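Giving computers the ability to recognize and express emotions of their own, and training them to respond intelligently to human emotions, is a current focus of information science research. This paper builds an emotion automaton model based on a dynamic Q-learning algorithm. The model defines the concept of an emotion element, which applies dynamic Q-learning to predict and perceive changes in the environment and adjusts its own emotion to adapt to its surroundings.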

134.

Autonomic discovery of subgoals in hierarchical reinforcement learning

XIAO Ding LI Yi-tong SHI Chuan 《中国邮电高校学报(英文版)》2014,21(5):94-104

Options are a promising method for discovering hierarchical structure in reinforcement learning (RL) to accelerate learning. The key to option discovery is how an agent can autonomically find useful subgoals among the trails it has passed. Analyzing the agent's actions in the trails yields useful heuristics: the agent passes subgoals more frequently, and its effective actions are restricted at subgoals. As a consequence, the subgoals can be deemed the best-matching action-restricted states in the paths. In the grid-world environment, the concept of the unique-direction value, which reflects the action-restricted property, was introduced to find the best-matching action-restricted states. The unique-direction-value (UDV) approach is chosen to form options offline and online autonomically. Experiments show that the approach can find subgoals correctly, and Q-learning with options found by both the offline and online processes accelerates learning significantly.

135.

A Q-learning-based multi-agent system for data classification

《Applied Soft Computing》2017

In this paper, a multi-agent classifier system with Q-learning is proposed for tackling data classification problems. A trust measurement using a combination of Q-learning and Bayesian formalism is formulated. Specifically, a number of learning agents comprising hybrid neural networks with Q-learning, which we have formulated in our previous work, are devised to form the proposed Q-learning Multi-Agent Classifier System (QMACS). The time complexity of QMACS is analyzed using big-O notation. In addition, a number of benchmark problems are employed to evaluate the effectiveness of QMACS, which include small and large data sets with and without noise. To analyze the QMACS performance statistically, the bootstrap method with a 95% confidence interval is used. The results from QMACS are compared with those from its constituents and other models reported in the literature. The outcome indicates the effectiveness of QMACS in combining the predictions from its learning agents to improve the overall classification performance.

136.

Intelligent Service Function Chain Mapping Framework for Cloud-and-Edge-Collaborative IoT

Yang Chao Li Yimin Li Tong Xu Siya Qi Jun Zhang Yu 《中国邮电高校学报(英文版)》2022,29(3):54-68

With the rapid development of Internet of Things (IoT) technology, it has become a challenge to deal with the increasing number and diverse requirements of IoT services. By combining burgeoning network function virtualization (NFV) technology with cloud computing and mobile edge computing (MEC), an NFV-enabled cloud-and-edge-collaborative IoT (CECIoT) architecture can efficiently provide flexible service for IoT traffic in the form of a service function chain (SFC) by jointly utilizing edge and cloud resources. In this promising architecture, a difficult issue is how to balance the consumption of resources and energy in SFC mapping. To overcome this challenge, an intelligent energy-and-resource-balanced SFC mapping scheme is designed in this paper. It takes the comprehensive deployment consumption as the optimization goal, and applies a deep Q-learning (DQL)-based SFC mapping (DQLBM) algorithm as well as an energy-based topology adjustment (EBTA) strategy to make efficient use of the limited network resources while satisfying the delay requirements of users. Simulation results show that the proposed scheme can decrease service delay, as well as energy and resource consumption.

137.

Research on a Q-Learning Algorithm with Experience Sharing During Learning

乔林罗杰《计算机科学》2012,39(5):213-216

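With the goal of improving the learning efficiency of Q-learning in multi-agent systems, and using the pursuit problem as the test platform, this paper proposes a Q-learning algorithm based on shared experience. The algorithm imitates human team learning: all agents share a common final goal, namely surrounding and capturing the prey, while each agent obtains its own stage goal through negotiation. Learning is divided into stages. After each stage the agents carry out a stage review and share their good learning experience with one another to support the next stage. In this way the agents that learn quickly and well pull along those that learn slowly and poorly, raising overall learning performance. Simulation experiments show that Q-learning with experience sharing during learning improves the performance of the learning system and converges efficiently to the optimal policy.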
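The abstract above does not specify how experience is exchanged at the end of a stage. One possible reading, shown purely as a hedged sketch, is that each agent blends its Q-table toward the table of whichever agent scored best in the stage just finished. The function `share_experience`, the `stage_returns` score, and the `mix` weight are all hypothetical choices made for illustration, not the paper's rule.

```python
def share_experience(q_tables, stage_returns, mix=0.5):
    """Return new Q-tables after one hypothetical end-of-stage sharing step.

    q_tables:      one table per agent, each a list of rows (state x action).
    stage_returns: one score per agent for the stage just finished.
    mix:           weight given to the best agent's values (assumed parameter).
    """
    best = max(range(len(q_tables)), key=lambda i: stage_returns[i])
    best_q = q_tables[best]
    shared = []
    for i, table in enumerate(q_tables):
        if i == best:
            # The best agent keeps its own values (copied, not aliased).
            shared.append([row[:] for row in table])
        else:
            # Other agents move part of the way toward the best agent.
            shared.append([
                [(1 - mix) * own + mix * top for own, top in zip(row, best_row)]
                for row, best_row in zip(table, best_q)
            ])
    return shared


# Hypothetical usage: two agents, one state, two actions.
tables = [[[0.0, 0.0]], [[2.0, 4.0]]]
tables = share_experience(tables, stage_returns=[1.0, 5.0], mix=0.5)
```

Between sharing steps each agent would keep running its own Q-learning updates. A blend rather than a full copy is used here so that slower agents retain part of what they learned themselves.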

138.

Research on a Q-Learning-Based Self-Organization Method for Wireless Sensor Networks

章韵王静玉陈志鲍贵城周峰扈罗全《传感技术学报》2010,23(11):1623-1626

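To jointly optimize data transmission efficiency and energy saving in wireless sensor networks, a Q-learning-based self-organizing protocol is proposed. Each node of the network is mapped to an agent, and through learning and training each agent can select a better forwarding direction, which makes the wireless sensor network self-organizing. A case analysis shows that a self-organizing sensor network built with Q-learning improves the energy efficiency of data transmission and extends network lifetime.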

139.

A Two-Layer Heuristic Reinforcement Learning Method Based on BP Neural Networks

刘智斌曾晓勤刘惠义储荣《计算机研究与发展》2015,(3):579-587

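Reinforcement learning learns by interacting with the environment, but its learning efficiency is low in large state spaces. Injecting prior knowledge can speed up learning, yet inappropriate prior knowledge can mislead the learning process and hurt performance. This paper proposes NNH-QL, a two-layer heuristic reinforcement learning method based on a BP neural network, which removes the blindness of the conventional reinforcement learning process. The upper layer, serving as the qualitative layer, is a BP neural network that needs no externally supplied background knowledge. Using the shaping technique, it applies dynamically acquired online knowledge as a trend-level heuristic for the lower-layer table-based Q-learning process. The algorithm trains the neural network with eligibility traces to improve learning efficiency. NNH-QL retains the flexibility of standard Q-learning while exploiting the generalization ability of neural networks, offering a feasible approach to reinforcement learning in large state spaces. Experimental results show that the method improves reinforcement learning performance and yields a clear speed-up.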
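The abstract above names shaping as the link between the two layers but does not give its exact form. As a sketch under that caveat, the widely used potential-based shaping term `gamma * potential(s_next) - potential(s)` can be added to the reward of a tabular Q-learning update. In the paper the heuristic comes from a BP neural network trained online; here `potential` is just a hypothetical callable standing in for it, and `shaped_q_update` is an illustrative name.

```python
def shaped_q_update(q, state, action, reward, next_state, potential,
                    alpha=0.1, gamma=0.9):
    """Tabular Q-learning update with a potential-based shaping bonus.

    potential: callable mapping a state index to a float. It is an assumed
               stand-in for the upper-layer heuristic, not the paper's network.
    """
    shaping = gamma * potential(next_state) - potential(state)
    target = reward + shaping + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical usage: the potential grows with the state index, nudging the
# learner toward higher-numbered states even when the raw reward is zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
shaped_q_update(q_table, 0, 0, 0.0, 1, potential=float, alpha=0.5, gamma=1.0)
```

Potential-based shaping is chosen for the sketch because it is known to leave the optimal policy unchanged. Whether NNH-QL uses this form is not stated in the abstract.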

140.

Heuristic Q-Learning Based on Laplacian Eigenmaps

朱美强李明程玉虎张倩王雪松《控制与决策》2014,29(3):425-430

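In goal-based reinforcement learning tasks, Euclidean distance is often used as the heuristic function for policy selection, but it performs poorly on tasks whose state space is not continuous in Euclidean space. To address this, the Laplacian eigenmap method from manifold learning, which has relatively low computational complexity, is introduced, and a heuristic policy selection method based on spectral graph theory is proposed. The method applies to tasks in which the state space is continuous on a manifold whose intrinsic dimension is easy to estimate and in which the connectivity between adjacent states forms an undirected graph. Grid-world simulation results verify the effectiveness of the proposed method.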
