Similar Literature
20 similar documents found (search time: 15 ms)
1.
Meeting a deep learning model's training-accuracy requirement within a limited time while minimizing resource cost is a major challenge for distributed deep learning systems. Configuring the resource and batch-size hyperparameters is the main way to optimize both training accuracy and resource cost. Existing work configures resources and batch size independently, from the perspectives of computational efficiency and training accuracy respectively. However, the two kinds of configuration affect training accuracy and resource cost through complex interdependencies, which existing independent-configuration approaches can hardly...

2.
This paper studies a reinforcement learning (RL) control method based on a CMAC neural network, aimed at controlling highly nonlinear systems. The work focuses on simplifying the algorithm and on learning functions with continuous outputs. The control strategy consists of two parts: an RL controller, which learns the system's nonlinearity, and a fixed-gain conventional controller, which stabilizes the system. Simulation results show that the proposed strategy is not only effective but also achieves high control accuracy.

3.
In a multi-core embedded operating system, the CPU's scheduling of the shared last-level cache (LLC) determines each user process's instructions per cycle (IPC) as well as the system's robustness against denial-of-service (DoS) attacks. Existing cache-scheduling schemes, however, depend on specific LLC scheduling models and DoS attack models, making it difficult for the CPU to obtain each user process's runtime information in every scheduling period across different scheduling environments. This paper therefore proposes a reinforcement-learning-based LLC scheduling technique for embedded systems to resist DoS attacks. Based on the start and end positions of each user process's LLC occupancy, combined with feedback such as IPC, load miss rate, and store miss rate, the technique optimizes where and how much LLC each process occupies. In a dynamic LLC scheduling environment, the CPU can raise IPC and lower the success rate of a malicious process's DoS attacks without knowing the attack model in advance. Simulations on a multi-core embedded OS shared by multiple tenant virtual machines show that the proposed technique significantly improves IPC while reducing the DoS attack success rate.

4.
Research on machine learning in intelligent control systems
This paper surveys the state of machine learning research in the manufacturing domain. To cope with the dynamically changing external environment and internal technological evolution that intelligent control systems face, it proposes a learning-model framework realized cooperatively by multiple agents and integrating multiple learning mechanisms. Finally, it analyzes and outlines future directions for machine-learning research in intelligent manufacturing systems.

5.
Average-reward reinforcement learning for scheduling reentrant production systems
In reentrant production systems, an important problem is optimizing the scheduling policy to raise the system's average output rate. This paper applies an average-reward reinforcement learning algorithm to the problem: starting directly from the system qualities of interest, it automatically obtains an adaptive dynamic scheduling policy. Simulation results show that its performance exceeds that of two well-known priority scheduling policies.

6.
A comparative analysis of material requirements planning and just-in-time production
This paper compares two production management and control approaches, the Western material requirements planning (MRP) and the Japanese just-in-time (JIT) system, in terms of their origins, components, operating mechanisms, and performance. It surveys research on hybrid control strategies that combine the two, explores how different control strategies could be integrated with the manufacturing process in computer-integrated manufacturing systems, and, in light of China's conditions, analyzes the prospects for applying hybrid strategies.

7.
For the multi-objective task-scheduling problem in cloud computing, this paper proposes a new Q-learning-based multi-objective task-scheduling algorithm (QMTS). The main idea is as follows. First, in the task-ordering phase, the self-learning process of Q-learning yields a more reasonable task sequence. Then, in the virtual-machine-assignment phase, linear weighting jointly considers each task's earliest finish time and each compute node's cost, optimizing both objectives at once. Finally, tasks are scheduled with smaller makespan and total cost as the objective function. Experiments show that after ordering tasks with Q-learning, QMTS achieves a smaller makespan than the HEFT algorithm, and that its multi-objective scheduling policy reduces both makespan and total cost during task execution, making it an effective multi-objective task-scheduling algorithm.
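The virtual-machine-assignment phase described above scores candidate nodes by a weighted sum of finish time and cost. A minimal sketch of that scoring step, with node attributes and weights that are illustrative assumptions rather than the paper's actual parameters:

```python
def pick_node(task_len, nodes, w_time=0.5, w_cost=0.5):
    """Pick the node minimizing a weighted sum of finish time and cost.

    task_len: task length in instructions (MI, assumed unit)
    nodes: list of (mips, cost_per_sec, ready_time) tuples
    Returns the index of the best node.
    """
    def score(node):
        mips, cost_per_sec, ready = node
        run_time = task_len / mips
        finish = ready + run_time            # estimated finish time
        cost = cost_per_sec * run_time       # compute cost on this node
        return w_time * finish + w_cost * cost
    return min(range(len(nodes)), key=lambda i: score(nodes[i]))
```

Shifting the weights trades makespan against total cost: a cost-heavy weighting can prefer a slower but cheaper node.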

8.
In federated learning systems, training local models on resource-constrained edge devices poses real challenges: limits on computation, storage, and energy constantly constrain model size and quality. Traditional federated pruning methods trim the model during federated training, but they cannot adapt the pruning to the model's environment and may remove important parameters, degrading performance. This paper proposes a distributed model-pruning method based on federated reinforcement learning to address this problem. First, the pruning process is abstracted as a Markov decision process, and the DQN algorithm is used to build a general reinforcement-learning pruning model that dynamically adjusts the pruning rate, improving generalization. Second, an aggregation method tailored to sparse models is designed to assist the pruning method, better optimize the model structure, and reduce model complexity. Finally, the method is compared with various baselines on multiple public datasets. Experiments show that it reduces model complexity while preserving model accuracy.
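In schemes like the one above, the RL agent's action is typically a pruning rate, and applying that rate usually means zeroing the smallest-magnitude weights. The sketch below shows only that application step (the DQN that chooses the rate is omitted); the flat weight list is an illustrative simplification of real layer tensors:

```python
def magnitude_prune(weights, rate):
    """Zero out the fraction `rate` of weights with smallest magnitude.

    weights: flat list of floats (stand-in for a layer's parameters)
    rate: pruning rate in [0, 1], e.g. chosen by an RL agent
    """
    k = int(len(weights) * rate)
    if k == 0:
        return list(weights)
    # indices of the k smallest-magnitude weights
    drop = set(sorted(range(len(weights)),
                      key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

A sparse-model aggregation step would then average only the surviving (nonzero) positions across clients.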

9.
乔伟豪, 朱凤增, 彭力. 《控制工程》(Control Engineering of China), 2022, 29(1): 167-174
This paper studies distributed filter-type iterative learning control for a class of discrete systems subject to deception attacks, where the attacks are modeled by a Bernoulli-distributed random sequence. By constructing a Lyapunov function, sufficient conditions are derived for the filtering-error system to be exponentially stable in mean square and to satisfy an l2-l∞ performance index. A distributed l2-l∞ filter under deception attacks is then designed, with the filter parameters solved via linear matrix inequalities. Sufficient conditions for the convergence of the distributed filter-type iterative control are given, and a simulation example illustrates the effectiveness of the design method.

10.
An iterative learning control problem for a class of uncertain linear parabolic distributed parameter systems is discussed, covering many processes such as heat and mass transfer, convection-diffusion, and transport. Under the condition that the system state is allowed to have an initial error in the iterative process, a closed-loop P-type iterative learning algorithm is presented, and a sufficient condition for tracking-error convergence in the L2 norm is given. Next, convergence of the tracking error in the L2 and W1,2 spaces is proved using the Gronwall-Bellman inequality and the Sobolev inequality. In the end, a numerical example is given to illustrate the effectiveness of the proposed method.
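The core of any P-type ILC scheme, including the one above, is the update u_{k+1}(t) = u_k(t) + Γ e_k(t) applied across repeated trials. The sketch below runs that update on a toy scalar discrete-time plant standing in for the parabolic PDE; the plant parameters and learning gain are illustrative assumptions chosen to satisfy the usual contraction condition |1 - Γ| < 1:

```python
def ilc_ptype(y_ref, iterations=50, gamma=0.8, a=0.5):
    """P-type ILC on a toy plant x(t+1) = a*x(t) + u(t), y(t) = x(t) + u(t).

    Returns the learned input and the tracking error of the last trial.
    """
    T = len(y_ref)
    u = [0.0] * T
    e = [0.0] * T
    for _ in range(iterations):
        # one trial: simulate the plant from zero initial state
        x, y = 0.0, []
        for t in range(T):
            y.append(x + u[t])      # output with direct feedthrough
            x = a * x + u[t]        # first-order state update
        e = [r - yt for r, yt in zip(y_ref, y)]
        # P-type learning update: u_{k+1}(t) = u_k(t) + gamma * e_k(t)
        u = [ut + gamma * et for ut, et in zip(u, e)]
    return u, e
```

Because |1 - gamma| = 0.2 < 1, the trial-to-trial error operator is a contraction and the tracking error vanishes over iterations.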

11.
Distributed control systems (DCS) are a focal point and mainstream of industrial control technology. This paper briefly reviews how system architecture, time delays, the control network, and communication scheduling affect overall control-system performance, along with the related research. It argues that systematic, appropriate methods for analyzing DCS behavior would let system integrators quickly evaluate many alternative designs and select the best one without extensive simulation, and on that basis optimize the system architecture; this remains a topic requiring further DCS research.

12.
An increasing number of DRTS (Distributed Real-Time Systems) are employing an end-to-end aperiodic task model. The key challenges of such DRTS are guaranteeing utilization on multiple processors to achieve overload protection, and meeting the end-to-end deadlines of aperiodic tasks. This paper proposes an end-to-end utilization control architecture and an IC-EAT (Integration Control for End-to-End Aperiodic Tasks) algorithm, which features a distributed feedback loop that dynamically enforces the desired utilization bound on multiple processors. IC-EAT integrates admission control with feedback control, which is able to dynamically determine the QoS (Quality of Service) of incoming tasks and guarantee the end-to-end deadlines of admitted tasks. Then an LQOCM (Linear Quadratic Optimal Control Model) is presented. Finally, experiments demonstrate that, for the end-to-end DRTS whose control matrix G falls into the stable region, the IC-EAT is convergent and stable. Moreover, it is capable of providing better QoS guarantees for end-to-end aperiodic tasks and improving the system throughput.

13.
A multi-agent reinforcement learning algorithm with fuzzy policy is addressed in this paper. This algorithm is used to deal with some control problems in cooperative multi-robot systems. Specifically, a leader-follower robotic system and a flocking system are investigated. In the leader-follower robotic system, the leader robot tries to track a desired trajectory, while the follower robot tries to follow the leader to keep a formation. Two different fuzzy policies are developed for the leader and follower, respectively. In the flocking system, multiple robots adopt the same fuzzy policy to flock. Initial fuzzy policies are manually crafted for these cooperative behaviors. The proposed learning algorithm finely tunes the parameters of the fuzzy policies through the policy gradient approach to improve control performance. Our simulation results demonstrate that the control performance can be improved after the learning.

14.
In this paper, an open-loop PD-type iterative learning control (ILC) scheme is first proposed for two kinds of distributed parameter systems (DPSs) which are described by parabolic partial differential equations using non-collocated sensors and actuators. Then, a closed-loop PD-type ILC algorithm is extended to a class of distributed parameter systems with a non-collocated single sensor and m actuators when the initial states of the system contain errors. Under some given assumptions, the convergence conditions of the output errors for these systems can be obtained. Finally, a numerical example for a distributed parameter system with a single sensor and two actuators is presented to illustrate the effectiveness of the proposed ILC schemes.

15.
This paper concerns a second-order P-type iterative learning control (ILC) scheme for a class of fractional order linear distributed parameter systems. First, by analyzing the control and learning processes, a discrete system for P-type ILC is established, and the ILC design problem is then converted into a stability problem for that discrete system. Next, a sufficient condition for the convergence of the control input and the tracking errors is obtained by using the generalized Gronwall inequality, which is less conservative than the existing one. By incorporating the obtained convergence condition into the original system, the ILC scheme is derived. Finally, the validity of the proposed method is verified by a numerical example.

16.
A multi-agent coordination method based on distributed reinforcement learning
范波, 潘泉, 张洪才. 《计算机仿真》(Computer Simulation), 2005, 22(6): 115-118
Research on multi-agent systems centers on having functionally independent agents complete complex control tasks or solve complex problems through negotiation, coordination, and cooperation. Building on a study and analysis of distributed reinforcement learning algorithms, this paper proposes a multi-agent coordination method: the coordination level decomposes the overall system task, a coordinating agent uses central reinforcement learning to allocate the subtasks, and the task agents at the behavior level each receive a subtask and use independent reinforcement learning to select effective behaviors, cooperating to complete the system task. Application and experiments in Robot Soccer simulation matches show that this coordination method based on distributed reinforcement learning outperforms conventional reinforcement learning.

17.
As one of the latest advances in machine learning, deep reinforcement learning has begun to show promise in many application areas, producing a number of classic algorithms and typical application domains. Applied to intelligent manufacturing, it can achieve high-level control in complex environments. This paper surveys deep reinforcement learning research. It first introduces the fundamentals, including deep learning and reinforcement learning. It then presents the theory behind deep reinforcement learning algorithms and classifies them into value-function-based and policy-gradient-based methods, listing the main achievements of each class along with other related work. On this basis, it categorizes and analyzes typical applications of deep reinforcement learning in intelligent manufacturing, and finally discusses open problems and future research directions.

18.
Research on architectures for distributed reinforcement learning systems
Reinforcement learning is an important machine learning method. With the rapid development of computer networks and distributed processing, distributed reinforcement learning in multi-agent systems is attracting increasing attention. This paper classifies existing distributed reinforcement learning methods into four types: central, independent, group, and social reinforcement learning. It then discusses an architectural framework for each of the four types and gives their formal definitions.

19.
The complexity in planning and control of robot compliance tasks mainly results from simultaneous control of both position and force and inevitable contact with environments. It is quite difficult to achieve accurate modeling of the interaction between the robot and the environment during contact. In addition, the interaction with the environment varies even for compliance tasks of the same kind. To deal with these phenomena, in this paper, we propose a reinforcement learning and robust control scheme for robot compliance tasks. A reinforcement learning mechanism is used to tackle variations among compliance tasks of the same kind. A robust compliance controller that guarantees system stability in the presence of modeling uncertainties and external disturbances is used to execute control commands sent from the reinforcement learning mechanism. Simulations based on deburring compliance tasks demonstrate the effectiveness of the proposed scheme.

20.
McGovern, Amy; Moss, Eliot; Barto, Andrew G. Machine Learning, 2002, 49(2-3): 141-160
The execution order of a block of computer instructions on a pipelined machine can make a difference in running time by a factor of two or more. Compilers use heuristic schedulers appropriate to each specific architecture implementation to achieve the best possible program speed. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results using both rollouts and reinforcement learning to construct heuristics for scheduling basic blocks. In simulation, the rollout scheduler outperformed a commercial scheduler on all benchmarks tested, and the reinforcement learning scheduler outperformed the commercial scheduler on several benchmarks and performed well on the others. The combined reinforcement learning and rollout approach was also very successful. We present results of running the schedules on Compaq Alpha machines and show that the results from the simulator correspond well to the actual run-time results.
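The rollout idea the abstract describes can be sketched compactly: at each step, try every remaining candidate, complete the rest of the schedule with a simple base heuristic, and commit to the candidate whose completed schedule simulates fastest. The toy cost model below (one cycle per instruction plus a stall for conflicting adjacent pairs) is an illustrative assumption, not the paper's pipeline simulator:

```python
def sim_cost(order, conflicts):
    """One cycle per item, plus a stall whenever adjacent items conflict."""
    cost = len(order)
    for a, b in zip(order, order[1:]):
        if (a, b) in conflicts:
            cost += 1
    return cost

def rollout_schedule(items, conflicts):
    """Greedy rollout scheduling over a list of instruction identifiers."""
    remaining, order = list(items), []
    while remaining:
        def completed(cand):
            # base heuristic: take the candidate, then the rest in given order
            rest = [x for x in remaining if x != cand]
            return order + [cand] + rest
        # commit to the candidate whose rollout simulates fastest
        best = min(remaining, key=lambda c: sim_cost(completed(c), conflicts))
        order.append(best)
        remaining.remove(best)
    return order
```

With a single conflicting pair, the rollout scheduler reorders the block to avoid the stall that the naive given order would incur.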
