期刊界 (All Journals): journal and academic literature search

54 results found; page 3 of 6 (results 21-30).

21.

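A robot navigation system based on the SAC algorithm under a safety barrier mechanism

Ma Lixin, Liu Lei, Liu Chen (马丽新, 刘磊, 刘晨) 《南京信息工程大学学报》 (Journal of Nanjing University of Information Science & Technology) 2023,15(2):201-209

To raise the intelligence and safety of autonomous navigation systems for mobile robots, an autonomous navigation system based on the SAC (Soft Actor-Critic) algorithm under a safety barrier mechanism is designed, and a reward function is constructed that depends on the distance between the robot and the nearest obstacle, the distance to the goal point, and the yaw angle. A mobile robot carrying a lidar, together with its surrounding environment, is built on the Gazebo simulation platform. Experimental results show that the safety barrier mechanism reduces, to some extent, the probability of the robot colliding with obstacles, raises the navigation success rate, and gives the SAC-based autonomous navigation system better generalization. The system can still navigate autonomously when the start and goal points are changed, and even when the static environment is replaced by a dynamic one.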
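The abstract names the three inputs of the reward (distance to the nearest obstacle, distance to the goal, yaw angle) but not its formula. The following is a minimal sketch of what such a reward and a simple action-overriding barrier might look like. Every threshold, weight, and function name here is a hypothetical choice for illustration and is not taken from the paper.

```python
import math


def navigation_reward(d_obstacle, d_goal, yaw_error,
                      collision_dist=0.2, goal_dist=0.3):
    """Hypothetical reward built from the three quantities named in the abstract."""
    if d_obstacle < collision_dist:      # collision: large penalty
        return -100.0
    if d_goal < goal_dist:               # goal reached: large bonus
        return 100.0
    r_goal = -d_goal                     # closer to the goal is better
    r_yaw = -abs(yaw_error) / math.pi    # heading toward the goal is better
    r_obstacle = -1.0 / d_obstacle       # penalty grows near obstacles
    return r_goal + 0.5 * r_yaw + 0.1 * r_obstacle


def safety_barrier(action, d_obstacle, safe_dist=0.5):
    """Hypothetical barrier: override the policy's action when too close."""
    linear, angular = action
    if d_obstacle < safe_dist:
        # Stop forward motion and keep turning; the default turn
        # direction (1.0) is an arbitrary placeholder.
        return (0.0, angular if angular != 0.0 else 1.0)
    return action
```

In this sketch the barrier sits between the SAC policy output and the robot, so an unsafe forward command is replaced before it is executed; the actual mechanism in the paper may differ.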

22.

Adaptive variable structure control of MIMO nonlinear systems with time-varying delays and unknown dead-zones (cited by: 1; self-citations: 0; cited by others: 1)

Tian-Ping Zhang, Cai-Ying Zhou, Qing Zhu 《国际自动化与计算杂志》 (International Journal of Automation and Computing) 2009,6(2):124-136

In this paper, adaptive variable structure neural control is presented for a class of uncertain multi-input multi-output (MIMO) nonlinear systems with state time-varying delays and unknown nonlinear dead-zones. The unknown time-varying delay uncertainties are compensated for using appropriate Lyapunov-Krasovskii functionals in the design. As an added contribution, the approach removes the assumption of a linear function outside the deadband without necessarily constructing a dead-zone inverse. By utilizing the integral-type Lyapunov function and introducing an adaptive compensation term for the upper bound of the residual and optimal approximation error as well as the dead-zone disturbance, the closed-loop control system is proved to be semi-globally uniformly ultimately bounded. In addition, a modified adaptive control algorithm is given in order to avoid the high-frequency chattering phenomenon. Simulation results demonstrate the effectiveness of the approach.

23.

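Coordinated optimal source-storage dispatch strategy for microgrids based on the soft actor-critic network

Liu Linpeng, Zhu Jianquan, Chen Jiajun, Ye Hanfang (刘林鹏, 朱建全, 陈嘉俊, 叶汉芳) 《电力自动化设备》 (Electric Power Automation Equipment) 2022,42(1):79-85

In recent years, the share of renewable energy and energy storage in microgrids has kept growing, which poses new challenges for their optimal dispatch. To address the difficulty of solving the coordinated source-storage dispatch problem caused by non-convex, nonlinear constraints, a deep reinforcement learning algorithm is used to build a data-based policy function that searches for the optimal policy by repeatedly interacting with the environment, avoiding a direct solution of the original non-convex nonlinear problem. Because the policy function may violate safety constraints during training, a safe policy learning method for coordinated source-storage dispatch that uses partial model information is further proposed, yielding an optimized policy that satisfies the network security constraints. In addition, because the agent's interaction with the environment during training is time-consuming, a neural network is used to model the environment and improve learning efficiency.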

24.

LIDAR: learning from imperfect demonstrations with advantage rectification

Xiaoqin ZHANG Huimin MA Xiong LUO Jian YUAN 《Frontiers of Computer Science》2022,16(1):161312

In actor-critic reinforcement learning (RL) algorithms, function estimation errors are known to cause ineffective random exploration at the beginning of training, and lead to overestimated value estimates and suboptimal policies. In this paper, we address the problem by executing advantage rectification with imperfect demonstrations, thus reducing the function estimation errors. Pretraining with expert demonstrations has been widely adopted to accelerate the learning process of deep reinforcement learning when simulations are expensive to obtain. However, existing methods, such as behavior cloning, often assume the demonstrations contain other information or labels with regard to performances, such as optimal assumption, which is usually incorrect and useless in the real world. In this paper, we explicitly handle imperfect demonstrations within the actor-critic RL frameworks, and propose a new method called learning from imperfect demonstrations with advantage rectification (LIDAR). LIDAR utilizes a rectified loss function to merely learn from selective demonstrations, which is derived from a minimal assumption that the demonstrating policies have better performances than our current policy. LIDAR learns from contradictions caused by estimation errors, and in turn reduces estimation errors. We apply LIDAR to three popular actor-critic algorithms, DDPG, TD3 and SAC, and experiments show that our method can observably reduce the function estimation errors, effectively leverage demonstrations far from the optimal, and outperform state-of-the-art baselines consistently in all the scenarios.

25.

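Adaptive neural network tracking control for stochastic nonlinear systems with time-varying delays

Yu Zhaoxu, Du Hongbin (余昭旭, 杜红彬) 《控制理论与应用》 (Control Theory & Applications) 2011,28(12):1808-1812

For the adaptive tracking problem of a class of uncertain stochastic nonlinear strict-feedback systems with time-varying delays, a new adaptive neural network tracking controller is proposed using the Razumikhin lemma and the backstepping method. The controller guarantees that all error variables of the closed-loop system are semi-globally uniformly ultimately bounded in the fourth moment, and that the tracking error settles in a neighborhood of the origin. A simulation example shows the effectiveness of the proposed control scheme.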

26.

Neural network language models for off-line handwriting recognition

F. Zamora-Martínez, V. Frinken, S. España-Boquera, M.J. Castro-Bleda, A. Fischer, H. Bunke 《Pattern Recognition》2014

Unconstrained off-line continuous handwritten text recognition is a very challenging task which has been recently addressed by different promising techniques. This work presents our latest contribution to this task, integrating neural network language models in the decoding process of three state-of-the-art systems: one based on bidirectional recurrent neural networks, another based on hybrid hidden Markov models and, finally, a combination of both. Experimental results obtained on the IAM off-line database demonstrate that consistent word error rate reductions can be achieved with neural network language models when compared with statistical N-gram language models on the three tested systems. The best word error rate, 16.1%, reported with ROVER combination of systems using neural network language models significantly outperforms current benchmark results for the IAM database.

27.

An ADDHP-based Q-learning algorithm for optimal tracking control of linear discrete-time systems with unknown dynamics

《Applied Soft Computing》2019

This paper investigates a novel Q-learning algorithm based on action dependent dual heuristic programming (ADDHP) to solve the infinite-time domain linear quadratic tracker (LQT) for unknown linear discrete-time systems. The proposed method relies only on system data and does not require knowledge of the system matrices. After the reference system is determined, an augmented system composed of the original system and the reference system is constructed, and it is proved that the value function of the LQT is quadratic in the state of the augmented system. Using the quadratic value function, the augmented algebraic Riccati equation (ARE) is derived to solve the LQT. Because the augmented ARE is difficult to solve directly, a Q-learning algorithm based on the ADDHP structure is used to solve this problem. With unknown system matrices, a model neural network is developed to reconstruct the system dynamics, with stability analysis. The estimated system matrices are employed in the proposed algorithm to calculate the optimal control by policy iteration. Moreover, the convergence of the algorithm is proved. Two simulation examples validate the performance of the method, and all results demonstrate the effectiveness of the proposed ADDHP-based Q-learning method for the LQT without a priori knowledge of the system matrices.
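To make the augmented-system idea in the abstract concrete, here is a small sketch for a scalar plant and a scalar reference generator. It is not the paper's ADDHP Q-learning method: it assumes the system matrices are known and solves the augmented Riccati equation by plain value iteration. The discount factor `gamma`, the function names, and all numeric values are assumptions introduced for illustration.

```python
def lqt_value_iteration(a, b, c, f, q, r, gamma, iters=500):
    """Value iteration on P for the augmented state X = [x, ref].

    Plant:     x_next   = a * x + b * u,  output y = c * x
    Reference: ref_next = f * ref
    Cost:      sum of gamma**k * (q * (y - ref)**2 + r * u**2)
    """
    T = [[a, 0.0], [0.0, f]]                  # augmented dynamics
    B = [b, 0.0]                              # augmented input vector
    Q1 = [[q * c * c, -q * c], [-q * c, q]]   # weights (c * x - ref)**2
    P = [[0.0, 0.0], [0.0, 0.0]]

    def terms(P):
        PT = [[sum(P[i][k] * T[k][j] for k in range(2)) for j in range(2)]
              for i in range(2)]
        TtPT = [[sum(T[k][i] * PT[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
        BtPT = [sum(B[k] * PT[k][j] for k in range(2)) for j in range(2)]
        s = r + gamma * sum(B[i] * P[i][j] * B[j]
                            for i in range(2) for j in range(2))
        return TtPT, BtPT, s

    for _ in range(iters):
        TtPT, BtPT, s = terms(P)
        # P <- Q1 + gamma T'PT - gamma^2 T'PB (r + gamma B'PB)^-1 B'PT
        P = [[Q1[i][j] + gamma * TtPT[i][j]
              - gamma ** 2 * BtPT[i] * BtPT[j] / s
              for j in range(2)] for i in range(2)]

    _, BtPT, s = terms(P)
    K = [gamma * BtPT[j] / s for j in range(2)]   # control law u = -K X
    return P, K


def simulate(a, b, f, K, x0, ref0, steps):
    """Run the closed loop and return the final (x, ref)."""
    x, ref = x0, ref0
    for _ in range(steps):
        u = -(K[0] * x + K[1] * ref)
        x, ref = a * x + b * u, f * ref
    return x, ref
```

With, for example, `a=0.9, b=1.0, c=1.0, f=1.0, q=1.0, r=0.1, gamma=0.8`, the resulting gain drives `x` close to a constant reference; a small steady-state offset remains because the control effort is penalized and the cost is discounted.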

28.

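Multi-node intelligent connection strategy for mine cyber-physical systems

Ma Yangjin, Fu Maoquan, Xu Zhi, Li Jingzhao (马洋锦付茂全许志李敬兆) 《工矿自动化》 (Industry and Mine Automation) 2020,46(3):38-42,48

Communication nodes in current mine cyber-physical systems (CPS) cannot connect intelligently with sensing nodes that use different wireless communication protocols. To address this, multiple communication modules are integrated on a communication node to form a multi-modal communication node, and a multi-node intelligent connection strategy for mine CPS based on a progressive neural network is proposed. The progressive neural network controls the multi-modal communication node so that it switches working modes accurately, allowing heterogeneous wireless communication networks to be established autonomously. The asynchronous advantage actor-critic algorithm is used to train the progressive neural network, improving its convergence speed and training accuracy. Experimental results show that the strategy achieves accurate and reliable communication between multi-modal communication nodes and multiple types of sensing nodes.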

29.

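Personalized adjustment method for smart lamps based on deep reinforcement learning

Deng Xin, Na Jun, Zhang Handuo, Wang Yulin, Zhang Bin (邓心那俊张瀚铎王昱林张斌) 《计算机工程与应用》 (Computer Engineering and Applications) 2022,58(6):264-270

A personalized brightness adjustment method for smart lamps based on deep reinforcement learning is proposed. It takes into account how natural light and the user's position affect the brightness the user actually perceives, and dynamically computes and sets the lamp brightness to match the user's personal habits. After each automatic brightness adjustment, positive or negative feedback is assigned according to whether the user adjusts the lamp manually again, and the reinforcement learning model is trained to gradually fit the user's habits. The experiments implemented three algorithms, DQN, DDQN, and A3C, based on...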
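The feedback rule in this abstract (reward the lamp when the user leaves its setting alone, penalize it when the user adjusts it by hand) can be illustrated with a tiny tabular learner. This is a deliberately simplified stand-in, not the DQN, DDQN, or A3C models used in the paper: there is no neural network and no next-state bootstrapping, and the simulated user, the state encoding, and all names below are hypothetical.

```python
from itertools import product

N_LEVELS = 5  # hypothetical number of discrete brightness levels


def user_preference(ambient, zone):
    """Hypothetical simulated user: wants more light when it is dark or when nearby."""
    return 3 - ambient + zone


def feedback_reward(chosen_level, ambient, zone):
    """-1 if the user would re-adjust the lamp manually, +1 otherwise."""
    manually_adjusted = chosen_level != user_preference(ambient, zone)
    return -1.0 if manually_adjusted else 1.0


def train(rounds=20, alpha=0.5):
    # State = (ambient light bucket 0..2, user zone 0..1); one value per level.
    q = {state: [0.0] * N_LEVELS
         for state in product(range(3), range(2))}
    for _ in range(rounds):
        for state, values in q.items():
            # Greedy choice; ties go to the lowest untried level.
            action = max(range(N_LEVELS), key=values.__getitem__)
            reward = feedback_reward(action, *state)
            values[action] += alpha * (reward - values[action])
    return q


def best_level(q, ambient, zone):
    values = q[(ambient, zone)]
    return max(range(N_LEVELS), key=values.__getitem__)
```

Because rejected levels end up with negative values while untried levels stay at zero, the greedy choice works through the levels until it reaches the one the simulated user accepts, and then keeps it.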

30.

Design of K-means clustering-based polynomial radial basis function neural networks (pRBF NNs) realized with the aid of particle swarm optimization and differential evolution

Sung-Kwun Oh, Wook-Dong Kim, Witold Pedrycz, Su-Chong Joo 《Neurocomputing》2012,78(1):121-132

In this paper, we introduce an advanced architecture of K-means clustering-based polynomial Radial Basis Function Neural Networks (p-RBF NNs) designed with the aid of Particle Swarm Optimization (PSO) and Differential Evolution (DE) and develop a comprehensive design methodology supporting their construction. The architecture of the p-RBF NNs comes as a result of a synergistic usage of the evolutionary optimization-driven hybrid tools. The connections (weights) of the proposed p-RBF NNs are of a certain functional character and are realized by considering four types of polynomials. In order to design the optimized p-RBF NNs, a prototype (center value) of each receptive field is determined by running the K-means clustering algorithm, and then the prototype and the spread of the corresponding receptive field are further optimized by running Particle Swarm Optimization (PSO) and Differential Evolution (DE). The Weighted Least Square Estimation (WLSE) is used to estimate the coefficients of the polynomials (which serve as functional connections of the network). The performance of the proposed model and a comparative analysis involving models designed with the aid of PSO and DE are presented for a nonlinear function and two Machine Learning (ML) datasets.
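The pipeline described above (K-means for the receptive-field centers, polynomial connections estimated by weighted least squares) can be sketched in one dimension as follows. The sketch makes several assumptions that go beyond the abstract: Gaussian receptive fields, linear polynomials only, a fixed spread in place of the PSO/DE tuning step, and one locally weighted fit per receptive field. All function names are hypothetical.

```python
import math


def kmeans_1d(xs, k, iters=20):
    """Plain K-means on scalars with a deterministic, evenly spread init."""
    xs_sorted = sorted(xs)
    centers = [xs_sorted[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers


def activation(x, center, spread):
    """Gaussian receptive field (an assumed form)."""
    return math.exp(-((x - center) ** 2) / (2 * spread ** 2))


def fit_local_lines(xs, ys, centers, spread):
    """Weighted least squares: one line a0 + a1 * x per receptive field."""
    coeffs = []
    for c in centers:
        w = [activation(x, c, spread) for x in xs]
        sw = sum(w)
        sx = sum(wi * x for wi, x in zip(w, xs))
        sy = sum(wi * y for wi, y in zip(w, ys))
        sxx = sum(wi * x * x for wi, x in zip(w, xs))
        sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        det = sw * sxx - sx * sx          # 2x2 normal-equation determinant
        a1 = (sw * sxy - sx * sy) / det
        a0 = (sy - a1 * sx) / sw
        coeffs.append((a0, a1))
    return coeffs


def predict(x, centers, spread, coeffs):
    """Activation-weighted average of the local polynomial outputs."""
    w = [activation(x, c, spread) for c in centers]
    total = sum(w)
    return sum(wi * (a0 + a1 * x) for wi, (a0, a1) in zip(w, coeffs)) / total
```

In the paper the centers and spreads are further tuned by PSO and DE; here the spread is passed in as a fixed value, so the sketch covers only the clustering and coefficient-estimation steps.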
