1.

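Research on a Multi-Agent Reinforcement Learning Method Based on Case-Based Reasoning (total citations: 3; self-citations: 0; citations by others: 3)

Li Jun (李珺), Pan Qishu (潘启树), Hong Bingdan (洪炳殚) 《机器人》 (Robot) 2009,31(4):1

This paper proposes a multi-agent reinforcement learning method based on case-based reasoning. A case base of system strategies is constructed, and the relevant subset of the case base is selected by judging the cooperative relationships among the agents. Simulated annealing is used to search that subset for the most suitable reusable case strategy, and the agents select actions under the guidance of that case. When no usable case is available, the agents perform joint action learning (JAL). The case base is updated in real time on the basis of the learning results. Simulation results on the pursuit problem show that the proposed method markedly improves learning speed and convergence.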
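The case-retrieval step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `select_case`, the `score` callback, the `threshold` cutoff, and the cooling schedule are all hypothetical choices made only to show how simulated annealing could pick a reusable case and signal when none is usable.

```python
import math
import random

def select_case(cases, score, threshold, steps=200, t0=1.0, cooling=0.97, seed=0):
    """Search `cases` for a high-scoring case with simulated annealing.

    `score` maps a case to a number (higher is better); it is a stand-in
    for whatever suitability measure the real system would use.
    Returns the best case found, or None if its score is below `threshold`.
    """
    if not cases:
        return None
    rng = random.Random(seed)
    current = rng.randrange(len(cases))
    best = current
    temp = t0
    for _ in range(steps):
        candidate = rng.randrange(len(cases))
        delta = score(cases[candidate]) - score(cases[current])
        # Always accept improvements; accept a worse case with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
        if score(cases[current]) > score(cases[best]):
            best = current
        temp *= cooling
    return cases[best] if score(cases[best]) >= threshold else None

# Hypothetical case base: each case carries a precomputed suitability value.
case_base = [
    {"name": "flank", "suitability": 0.4},
    {"name": "surround", "suitability": 0.9},
    {"name": "chase", "suitability": 0.6},
]
chosen = select_case(case_base, lambda c: c["suitability"], threshold=0.5)
```

A `None` result corresponds to the situation in which no usable case exists; in the method described above, that is when the agents would fall back to joint action learning.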

2.

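A Survey of Research on Machine-Learned Database Systems

Meng Xiaofeng (孟小峰), Ma Chaohong (马超红), Yang Chen (杨晨) 《计算机研究与发展》 (Journal of Computer Research and Development) 2019,56(9):1803-1820

After nearly 50 years of development, database systems are in widespread commercial use, but the arrival of the big-data era poses challenges on two fronts. First, data volumes keep growing while each individual query is expected to run faster. Second, query workloads change rapidly and are highly diverse, so database configurations and query-optimization preferences based on DBA experience cannot be adjusted in real time to the best runtime state. Meanwhile, performance optimization of database systems has hit a bottleneck: the room for optimization is narrowing, further gains can come only from new hardware accelerators, and traditional database systems cannot use modern hardware accelerators effectively. Database systems also have hundreds of tunable parameters; with workloads changing frequently, the large amount of tedious parameter configuration exceeds what a DBA can handle, leaving database systems unable to respond in real time to rapid and diverse change. Machine learning happens to meet both requirements: it exploits modern accelerators, and it learns from extensive parameter-tuning experience. Machine-learned database systems bring machine learning techniques into database system design. On the one hand, they turn sequential scans into computational models that can exploit modern hardware acceleration platforms; on the other, they turn DBA experience into predictive models, so that the database system adapts more intelligently and dynamically to rapid, diverse workload changes. This paper summarizes current research on machine-learned database systems, mainly covering machine learning for storage management and query optimization, as well as automated database management systems. Based on an analysis of existing techniques, it points out future research directions for machine-learned database systems and the problems and challenges they may face.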

3.

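A Survey of Machine-Learning-Based Medical Decision Support Systems

Liang Shutong (梁书彤), Guo Maozu (郭茂祖), Zhao Lingling (赵玲玲) 《计算机工程与应用》 (Computer Engineering and Applications) 2019,55(19):1-11

Using the massive clinical data available in healthcare to support medical decision making is a core technology of smart healthcare and an inevitable trend. Medical decision support mainly comprises disease risk prediction and intelligent disease diagnosis; drawing on multiple data sources, both accumulated clinically and acquired in real time, it applies a variety of machine learning algorithms to classify a patient's disease type or to predict disease risk. Starting from the concept and methodological framework of medical decision support, this survey summarizes, by disease category, the machine learning methods currently used for diagnosis and prediction, focuses on their characteristics and differences, and analyzes existing challenges and future developments.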

4.

Reinforcement Learning Agents

C. Ribeiro 《Artificial Intelligence Review》2002,17(3):223-250

Reinforcement Learning (RL) is learning through direct experimentation. It does not assume the existence of a teacher that provides examples upon which learning of a task takes place. Instead, in RL experience is the only teacher. With historical roots in the study of biological conditioned reflexes, RL attracts the interest of Engineers and Computer Scientists because of its theoretical relevance and potential applications in fields as diverse as Operational Research and Intelligent Robotics. Computationally, RL is intended to operate in a learning environment composed of two subjects: the learner and a dynamic process. At successive time steps, the learner makes an observation of the process state, selects an action and applies it back to the process. Its goal is to find an action policy that controls the behavior of the dynamic process, guided by signals (reinforcements) that indicate how badly or well it has been performing the required task. These signals are usually associated with a dramatic condition, e.g., accomplishment of a subtask (reward) or complete failure (punishment), and the learner tries to optimize its behavior by using a performance measure (a function of the received reinforcements). The crucial point is that in order to do that, the learner must evaluate the conditions (associations between observed states and chosen actions) that led to rewards or punishments. Starting from basic concepts, this tutorial presents the many flavors of RL algorithms, develops the corresponding mathematical tools, assesses their practical limitations and discusses alternatives that have been proposed for applying RL to realistic tasks.
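The learner/process loop in this abstract (observe the state, select an action, receive a reinforcement, update the evaluation of state-action pairs) can be sketched with tabular Q-learning, one common RL algorithm. This is an illustrative sketch, not code from the tutorial: the five-state chain process in `step`, the reward of 1.0 at the final state, and all hyperparameter values are assumptions chosen to keep the example small.

```python
import random

def step(state, action, n_states=5):
    """Hypothetical dynamic process: a chain of states.

    Action 1 moves right, action 0 moves left (staying put at state 0).
    Reaching the last state yields reinforcement 1.0 and ends the episode.
    """
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[state][action] of estimated discounted returns."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy selection, breaking ties between equal values at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(n_actions) if q[state][a] == best])
            nxt, reward, done = step(state, action, n_states)
            # Temporal-difference update toward reward + discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]

q_table = q_learning()
policy = greedy_policy(q_table)
```

In this toy process the learned greedy policy moves right in every non-terminal state, because the only reinforcement is attached to reaching the final state.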

5.

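A Survey of Privacy and Security Issues in Machine Learning Systems

He Yingzhe (何英哲), Hu Xingbo (胡兴波), He Jinwen (何锦雯), Meng Guozhu (孟国柱), Chen Kai (陈恺) 《计算机研究与发展》 (Journal of Computer Research and Development) 2019,56(10):2049-2070

Artificial intelligence has permeated every corner of daily life and brought great convenience. In recent years especially, with the rapid growth of deep learning as a branch of machine learning, related applications have multiplied. Unfortunately, machine learning systems also face many security risks, and their widespread adoption amplifies those risks further. To expose these risks and build robust machine learning systems, this paper surveys mainstream deep learning systems. It first designs an analytical model for dissecting deep learning systems and defines the scope of the survey. The surveyed systems span four domains (image classification, audio and speech recognition, malware detection, and natural language processing); four corresponding types of security risk are extracted and then characterized and measured along several dimensions, including complexity, attack success rate, and damage. Next, defense techniques for deep learning systems and their characteristics are reviewed. Finally, based on observations of these systems, recommendations are offered for building robust deep learning systems.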

6.

Training Multiagent Systems by Q‐Learning: Approaches and Empirical Results

Jose Manuel Lopez‐Guede, Borja Fernandez‐Gauna, Manuel Graña, Ekaitz Zulueta 《Computational Intelligence》2015,31(3):498-512

Multiagent systems are increasingly present in computational environments. However, the problem of agent design or control is an open research field. Reinforcement learning approaches offer solutions that allow autonomous learning with minimal supervision. The Q‐learning algorithm is a model‐free reinforcement learning solution that has proven its usefulness in single‐agent domains; however, it suffers from the curse of dimensionality when applied to multiagent systems. In this article, we discuss two approaches, namely TRQ‐learning and distributed Q‐learning, that overcome the limitations of Q‐learning and offer feasible solutions. We test these approaches in two separate domains. The first is the control of a hose by a team of robots. The second is the trash disposal problem. Computational results show the effectiveness of Q‐learning solutions for the control of multiagent systems.
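The curse of dimensionality mentioned in this abstract can be made concrete with a small counting sketch. It does not implement TRQ‐learning or distributed Q‐learning; it only compares the number of table entries for one centralized Q-table over joint actions with the total for per-agent tables. The function names and the example sizes (4 agents, 100 states, 5 actions each) are assumptions, and the state space is held fixed for simplicity even though in practice it usually grows with the number of agents too.

```python
def joint_table_entries(n_agents, n_states, n_actions):
    """Entries in one centralized Q-table over joint actions: |S| * |A|**n."""
    return n_states * n_actions ** n_agents

def independent_table_entries(n_agents, n_states, n_actions):
    """Total entries when each agent keeps a table over its own actions: n * |S| * |A|."""
    return n_agents * n_states * n_actions

# Hypothetical sizes, chosen only for illustration.
joint = joint_table_entries(n_agents=4, n_states=100, n_actions=5)
separate = independent_table_entries(n_agents=4, n_states=100, n_actions=5)
```

With these sizes the centralized table needs 62,500 entries against 2,000 for the per-agent tables, and each additional agent multiplies the centralized count by the number of actions.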

7.

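Research on Machine Learning in Intelligent Control Systems (total citations: 1; self-citations: 0; citations by others: 1)

Gu Qiang (谷强), Wang Shuchun (汪叔淳) 《计算机工程与科学》 (Computer Engineering & Science) 2000,22(1):59-62

This paper summarizes the state of research on machine learning methods in manufacturing. To address the dynamically changing external environment and the internal technological evolution that intelligent control systems face, it proposes a learning-model framework that is realized cooperatively by multiple agents and integrates several learning mechanisms. Finally, it analyzes and looks ahead to directions for machine learning research in intelligent manufacturing systems.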

8.

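Repetitive Learning Control for Time-Varying Robotic Systems: A Hybrid Learning Scheme

Sun Mingxuan (孙明轩), He Xiongxiong (何熊熊), Chen Bingyu (陈冰玉) 《自动化学报》 (Acta Automatica Sinica) 2007,33(11):1189-1195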

Repetitive learning control is presented for finite-time-trajectory tracking of uncertain time-varying robotic systems. A hybrid learning scheme is given to cope with the constant and time-varying unknowns in system dynamics, where the time functions are learned in an iterative learning way, without the aid of Taylor expression, while the conventional differential learning method is suggested for estimating the constant ones. It is distinct that the presented repetitive learning control avoids the requirement for initial repositioning at the beginning of each cycle, and the time-varying unknowns are not necessary to be periodic. It is shown that with the adoption of hybrid learning, the boundedness of state variables of the closed-loop system is guaranteed and the tracking error is ensured to converge to zero as iteration increases. The effectiveness of the proposed scheme is demonstrated through numerical simulation.

9.

Repetitive Learning Control for Time-Varying Robotic Systems: A Hybrid Learning Scheme (in English) (total citations: 1; self-citations: 0; citations by others: 1)

Sun Mingxuan (孙明轩), He Xiongxiong (何熊熊), Chen Bingyu (陈冰玉) 《自动化学报》 (Acta Automatica Sinica) 2007,(11)

Repetitive learning control is presented for finite-time-trajectory tracking of uncertain time-varying robotic systems. A hybrid learning scheme is given to cope with the constant and time-varying unknowns in system dynamics, where the time functions are learned in an iterative learning way, without the aid of Taylor expression, while the conventional differential learning method is suggested for estimating the constant ones. It is distinct that the presented repetitive learning control avoids the requirement for initial repositioning at the beginning of each cycle, and the time-varying unknowns are not necessary to be periodic. It is shown that with the adoption of hybrid learning, the boundedness of state variables of the closed-loop system is guaranteed and the tracking error is ensured to converge to zero as iteration increases. The effectiveness of the proposed scheme is demonstrated through numerical simulation.

10.

Parallel Learning: a Perspective and a Framework

Li Li, Yilun Lin, Nanning Zheng, Fei-Yue Wang 《IEEE/CAA Journal of Automatica Sinica》2017,4(3):389-395
