Similar literature
 Found 20 similar documents; search took 375 ms
1.
Multiagent learning provides a promising paradigm for studying how autonomous agents learn to achieve coordinated behavior in multiagent systems. In multiagent learning, the concurrency of multiple distributed learning processes makes the environment nonstationary for each individual learner. Developing an efficient learning approach to coordinate agents’ behavior in this dynamic environment is a difficult problem, especially when agents do not know the domain structure and have only local observability of the environment. In this paper, a coordinated learning approach is proposed to enable agents to learn where and how to coordinate their behavior in loosely coupled multiagent systems, where the sparse interactions of agents constrain coordination to some specific parts of the environment. In the proposed approach, an agent first collects statistical information to detect the states where coordination is most necessary, considering not only the potential contributions from all the domain states but also the direct causes of miscoordination in a conflicting state. The agent then learns to coordinate its behavior with others through its local observability of the environment according to different scenarios of state transitions. To handle the uncertainties caused by agents’ local observability, an optimistic estimation mechanism is introduced to guide the learning process of the agents. Empirical studies show that the proposed approach achieves better performance, improving the average agent reward compared with an uncoordinated learning approach and significantly reducing the computational complexity compared with a centralized learning approach. Copyright © 2012 John Wiley & Sons, Ltd.

2.
A Reinforcement Learning (RL) algorithm based on the eXtended Classifier System (XCS) is used to navigate a spherical robot. Traditional motion planning strategies rely on pre-planned optimal trajectories and feedback control techniques. The proposed learning agent approach uses a direct model-free methodology that enables the robot to function in dynamic and/or partially observable environments. The agent uses a set of guard-action rules that determine the motion inputs at each step. Using a number of control inputs (actions) and the developed RL scheme, the agent learns to make near-optimal moves in response to the incoming position/orientation signals. The proposed method employs an improved variant of the XCS as its learning agent. Results of several simulated experiments for the spherical robot show that this approach is capable of planning a near-optimal path to a predefined target from any given position/orientation.
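
The guard-action rules mentioned in this abstract can be illustrated with a small sketch. It assumes interval guards over two input signals and the standard XCS fitness-weighted prediction for choosing an action; the `Rule` class, the signal ranges, and the action names are hypothetical and are not taken from the paper, whose improved XCS variant is not described in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical guard-action rule: one (low, high) interval per input signal."""
    guard: tuple
    action: str
    prediction: float  # estimated payoff if the action is taken
    fitness: float     # accuracy-based weight, assumed to be > 0

    def matches(self, state):
        # The guard fires when every signal lies inside its interval.
        return all(lo <= s <= hi for (lo, hi), s in zip(self.guard, state))


def select_action(rules, state):
    """Pick the action with the highest fitness-weighted mean prediction."""
    num, den = {}, {}
    for rule in rules:
        if rule.matches(state):
            num[rule.action] = num.get(rule.action, 0.0) + rule.prediction * rule.fitness
            den[rule.action] = den.get(rule.action, 0.0) + rule.fitness
    if not num:
        return None  # no rule matches; a full XCS would trigger covering here
    return max(num, key=lambda a: num[a] / den[a])


# Assumed state layout: (distance to target, heading error in degrees).
rules = [
    Rule(((0.0, 5.0), (0.0, 90.0)), "forward", 10.0, 0.5),
    Rule(((0.0, 5.0), (0.0, 90.0)), "left", 20.0, 0.5),
    Rule(((6.0, 10.0), (0.0, 90.0)), "right", 100.0, 1.0),
]
chosen = select_action(rules, (3.0, 45.0))
```

The sketch covers only rule matching and greedy action selection; reinforcement updates and rule discovery are omitted.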

3.
徐春荞  张冰冰  李培华 《计算机应用研究》2021,38(10):3040-3043,3048
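Domain adversarial learning is a mainstream domain adaptation approach that uses a classifier and a domain discriminator to learn discriminative, domain-invariant features. However, most existing domain adversarial methods learn domain-invariant features from first-order features and ignore second-order features, which have stronger representational power. This paper proposes a conditional adversarial domain adaptation network that jointly models the second-order representation of an image and the cross-covariance between features and classifier predictions, in order to learn discriminative domain-invariant features more effectively. In addition, an entropy condition is introduced to balance the uncertainty of the classifier predictions and thereby ensure the transferability of the features. The proposed method is validated on two commonly used domain adaptation datasets, Office-31 and ImageCLEF-DA; the experimental results show that it outperforms comparable methods and achieves leading performance.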
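
The entropy condition in this abstract can be made concrete with a short sketch. It assumes the weighting commonly used in conditional adversarial domain adaptation, w = 1 + exp(-H(p)), where H(p) is the Shannon entropy of the classifier prediction p; the abstract does not state the exact form the authors use, so both function names and the formula are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weight(probs):
    """Assumed weighting: confident (low-entropy) predictions get a larger weight."""
    return 1.0 + math.exp(-entropy(probs))


confident = entropy_weight([1.0, 0.0])   # zero entropy, largest weight
uncertain = entropy_weight([0.5, 0.5])   # maximum entropy for two classes
```

Under this assumption, samples with uncertain predictions contribute less to the adversarial loss than confidently classified ones.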

4.
This paper proposes a path planning technique for autonomous agent(s) located in an unstructured, networked, distributed environment, where each agent has limited, incomplete knowledge of the environment. Each agent has only the knowledge available in the distributed memory of the computing node it is running on, and the agents share some learned information over a distributed network. In particular, the environment is divided into several sectors, with each sector located on a single separate distributed computing node. We consider hybrid reactive-cognitive agent(s) whose motion planning is based on a potential field model accompanied by reinforcement learning and boundary detection algorithms. Potential fields are used for fast convergence toward a path in a distributed environment, while reinforcement learning is used to guarantee a variety of behavior and consistent convergence in a distributed environment. We show how the agent decision-making process is enhanced by the combination of the two techniques in a distributed environment. Furthermore, path retracing is a challenging problem in a distributed environment, since the agent does not have complete knowledge of the environment. We propose a backtracking technique to keep the distributed agent informed at all times of its path information and step count, including when migrating from one node to another. Note that no node has knowledge of the entire global path from a source to a goal when the goal resides on a separate node. Each agent knows only a partial path (internal to a node) and the number of steps corresponding to the portion of the path it traversed while running on that node.
In particular, we show how each agent, starting in one of the many sectors with no initial knowledge of the environment and using the proposed distributed technique, develops its intelligence based on its experience and seamlessly discovers the shortest global path to the target, which is located in a different node, while avoiding any obstacle(s) it encounters on its way, including when transitioning and migrating from one distributed computing node to another. The agent(s) use a multiple-token-ring message passing interface (MPI) to perform internode communication. Finally, the experimental results of the proposed method show that single agents and multiple agents sharing the same goal and running on the same or different nodes successfully coordinate the sharing of their respective environment states/information to collaboratively perform their respective tasks. The results also show that distributed multiagent information sharing increases the speed of convergence to the optimal shortest path to the goal by an order of magnitude in comparison with the single-agent case or the multiagent case without information sharing.

5.
6.
Joint Embedding Multi-Label Classification Algorithm   Cited by: 1 in total (self-citations: 0, by others: 1)
刘慧婷  冷新杨  王利利  赵鹏 《自动化学报》2019,45(10):1969-1982
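Some existing multi-label classification algorithms become infeasible because multi-label data contain high-dimensional feature or label information. To address this problem, this paper proposes Deep AE-MF, a joint embedding multi-label classification algorithm based on a denoising autoencoder and matrix factorization. The algorithm has two parts: the feature embedding part uses a denoising autoencoder to learn a nonlinear representation of the feature space, and the label embedding part uses matrix factorization to directly learn the latent representation and the decoding matrix corresponding to the label space. Deep AE-MF combines the feature embedding and label embedding stages and jointly learns a latent space for model prediction, yielding an effective multi-label classification model. To further improve model performance, Deep AE-MF also exploits negative correlation information between labels. Experiments on different datasets demonstrate the effectiveness and robustness of the proposed Deep AE-MF method.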

7.
In this paper, we consider the problem of leader synchronization in systems with interacting agents in large networks while simultaneously satisfying energy-related user-defined distributed optimization criteria. Because modeling large networks is very difficult, we derive a model-free formulation that is based on a separate distributed Q-learning function for every agent. Every Q-function is a parametrization of each agent's control, of the neighborhood controls, and of the neighborhood tracking error. None of the agents has any information on where the leader is connected or from where she spreads the desired information. The proposed algorithm uses an integral reinforcement learning approach with a separate distributed actor/critic network for each agent: a critic approximator to approximate each value function and an actor approximator to approximate each optimal control law. The tuning laws for each actor and critic approximator are designed using gradient descent. We provide rigorous stability and convergence proofs to show that the closed-loop system has an asymptotically stable equilibrium point and that the control policies form a graphical Nash equilibrium. We demonstrate the effectiveness of the proposed method on a network consisting of 10 agents. Copyright © 2016 John Wiley & Sons, Ltd.

8.
Recent research places more emphasis on analyzing multiple features to improve face recognition (FR) performance. One popular scheme is to extend the sparse representation based classification framework with various sparse constraints. Although these methods jointly study multiple features through the constraints, they process each feature individually and therefore overlook the possible high-level relationship among different features. It is reasonable to assume that the low-level features of facial images, such as edge information and the smoothed/low-frequency image, can be fused into a more compact and more discriminative representation based on the latent high-level relationship. FR on the fused features is anticipated to produce better performance than FR on the original features, since the fused features provide more favorable properties. Focusing on this, we propose two different strategies that start from fusing multiple features and then exploit the dictionary learning (DL) framework for better FR performance. The first strategy is a simple and efficient two-step model, which learns a fusion matrix from training face images to fuse multiple features and then learns class-specific dictionaries based on the fused features. The second is a more effective model, requiring more computational time, that learns the fusion matrix and the class-specific dictionaries simultaneously within an iterative optimization procedure. In addition, the second model separates the shared common components from the class-specific dictionaries to enhance the discrimination power of the dictionaries.
The proposed strategies, which integrate the multi-feature fusion process and the dictionary learning framework for FR, realize the following goals: (1) exploiting multiple features of face images for better FR performance; (2) learning a fusion matrix to merge the features into a more compact and more discriminative representation; (3) learning class-specific dictionaries with consideration of the common patterns for better classification performance. We perform a series of experiments on publicly available databases to evaluate our methods, and the experimental results demonstrate the effectiveness of the proposed models.

9.

This article presents the STROBE model: both an agent representation and an agent communication model, based on a social approach, that is, an interaction-centered one. This model represents how agents may realize the interactive, dynamic generation of services on the Grid. Dynamically generated services embody a new concept of service implying a collaborative creation of knowledge, i.e., learning; services are constructed interactively between agents depending on a conversation. The approach consists of integrating selected features from multi-agent systems and agent communication, language interpretation in applicative/functional programming, and e-learning/human-learning into a unique, original, and simple view that privileges interactions, including control. The main characteristic of STROBE agents is that they develop a language (environment + interpreter) for each of their interlocutors. The model is inscribed within a global approach, defending a shift from the classical algorithmic (control-based) view of problem solving in computing to an interaction-based view of social informatics, where artificial as well as human agents operate by communicating as well as by computing. The paper shows how the model may not only account for the classical communicating agent approaches, but also represent a fundamental advance in modeling societies of agents, in particular in dynamic service generation scenarios such as those necessary today on the Web and proposed tomorrow for the Grid. Preliminary concrete experiments illustrate the potential of the model; they are significant examples for a very wide class of computational and learning situations.

10.
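The explosive growth of Android malware has driven the development of automatic Android malware detection. Some work analyzes Android malware from the perspective of interpretability, identifying the most influential features by analyzing the model and thereby providing a degree of interpretability for deep learning models. These methods rest on the strong assumption that features are mutually independent and consider only the influence of each feature on the model individually. In practice, features are always coupled; considering only the influence of single features fails to reflect this coupling and cannot characterize the combination patterns of sensitive APIs in different types of software. To address this problem, Android software is modeled as a graph, and a graph embedding method that combines the structural information of the graph with the information inside its nodes is proposed to detect Android malware. The method learns low-dimensional dense embeddings of Android software through an attention mechanism. Experimental results show that malware detection using the learned embeddings not only achieves high classification accuracy, but also makes it possible, by analyzing paths with high attention scores, to find the patterns that influence the model's decisions and to locate the sensitive API sequences involved in malicious behavior.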

11.
In supervised classification, data representation is usually considered at the dataset level: one looks for the “best” representation of data, assuming it to be the same for all the data in the data space. We propose a different approach where the representations used for classification are tailored to each datum in the data space. One immediate goal is to obtain sparse datum-wise representations: our approach learns to build a representation specific to each datum that contains only a small subset of the features, thus allowing classification to be fast and efficient. This representation is obtained by way of a sequential decision process that chooses which features to acquire before classifying a particular point; this process is learned through algorithms based on Reinforcement Learning. The proposed method performs well on an ensemble of medium-sized sparse classification problems. It offers an alternative to global sparsity approaches and is a natural framework for sequential classification problems. The method extends easily to a whole family of sparsity-related problems that would otherwise require developing specific solutions. This is the case in particular for cost-sensitive and limited-budget classification, where feature acquisition is costly and is often performed sequentially. Finally, our approach can handle non-differentiable loss functions or combinatorial optimization encountered in more complex feature selection problems.

12.
郭方洪  何通  吴祥  董辉  刘冰 《控制理论与应用》2022,39(10):1881-1889
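As massive amounts of renewable energy are connected to microgrids, the parameter space of the microgrid system model grows multiplicatively, and the computational difficulty of optimal energy dispatch keeps rising. Meanwhile, the uncertainty of renewable power output poses great challenges to optimal microgrid dispatch. To address these problems, this paper proposes a real-time optimal dispatch strategy for microgrids based on distributed deep reinforcement learning. First, under a distributed architecture, the main grid and each distributed generator are treated as independent agents. Second, each agent has a local learning model, builds its state and action spaces from local data, and is given a reward function, with associated constraints, that covers multiple objectives including generation cost, trading price, and power source service life. Finally, each agent seeks a locally optimal policy by interacting with the environment, while the agents learn value-network parameters from one another to optimize local action selection, ultimately minimizing the operating cost of the microgrid system. Simulation results show that, compared with the Deep Deterministic Policy Gradient (DDPG) algorithm, the proposed method increases training speed by 17.6% and reduces the cost function value by 67% while maintaining system stability and solution accuracy, achieving real-time optimal dispatch of the microgrid.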

13.
侯坤池  王楠  张可佳  宋蕾  袁琪  苗凤娟 《计算机应用研究》2022,39(4):1071-1074+1104
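Federated learning is a new distributed machine learning method that allows clients to jointly build a shared model without sharing private data. However, existing federated learning frameworks apply only to supervised learning; that is, they assume that all client data are labeled. Because labeled data are hard to obtain in practice, this assumption often does not hold. To solve this problem, the original federated learning framework is extended and a semi-supervised federated learning model based on an autoencoder neural network, ANN-SSFL, is proposed, which allows unlabeled clients to participate in federated learning. An autoencoder neural network learns classifiable latent features from the unlabeled data, so that unlabeled clients contribute feature information to federated learning. Experiments on the MNIST dataset show that the proposed ANN-SSFL model is feasible in practice: with the number of supervised clients held fixed, adding unsupervised clients improves the accuracy of the original federated learning.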
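
The idea that clients build a shared model without sharing data can be sketched with a weighted parameter average. The abstract does not specify the aggregation rule used by ANN-SSFL, so the FedAvg-style averaging below, the function name `federated_average`, and the toy parameter vectors are all assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    Only parameters leave each client; the raw data never do.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical parameters from one labeled and one unlabeled client.
# In a semi-supervised setup these could be shared encoder weights.
labeled_client = [1.0, 2.0]
unlabeled_client = [3.0, 4.0]
global_weights = federated_average([labeled_client, unlabeled_client], [1, 3])
```

Weighting by dataset size is one possible design choice; a server could equally weight supervised and unsupervised clients differently.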

14.
An efficient model for communications between CAD, CAPP, and CAM applications in a distributed manufacturing planning environment has been seen as a key ingredient for CIM. Integration of the design model with process and scheduling information in real time is necessary in order to increase product quality, reduce cost, and shorten the product manufacturing cycle. This paper describes an approach to integrating key product realization activities using a neutral data representation. The representation is based on established standards for product data exchange and serves as a prototype implementation of these standards. The product and process models are based on an object-oriented representation of geometry, features, and the resulting manufacturing processes. Relationships between objects are explicitly represented in the model (for example, feature precedence relations, process sequences, etc.). The product model is developed using an XML-based representation of the product data required for process planning, and the process model also uses an XML representation of the data required for scheduling and FMS control. The procedures for writing and parsing XML representations have been developed with an object-oriented approach, in such a way that each object from the object-oriented model is responsible for storing its own data in XML format. A similar approach is adopted for reading and parsing the XML model. Parsing is performed by a stack of XML handlers, each corresponding to a particular object in the hierarchical XML model. This approach allows for a very flexible representation, in that only a portion of the model (for example, only feature data, or only the part of the process plan for a single machine) may be stored and successfully parsed into another application. This is a very useful approach for direct distributed applications, in which data are passed in the form of XML streams to allow real-time online communication.
The feasibility of the proposed model is verified in a couple of scenarios for distributed manufacturing planning that involve feature mapping from a CAD file, process selection for several part designs integrated with scheduling, and simulation of the FMS model using alternative routings.
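
The principle that each object stores and restores its own XML data can be sketched as follows. The `Feature` and `Part` classes and the element and attribute names are hypothetical, and the sketch uses Python's `xml.etree.ElementTree` rather than the stack of XML handlers described in the abstract.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical machining feature; field names are illustrative only."""
    name: str
    kind: str

    def to_xml(self):
        # Each object is responsible for writing its own data.
        return ET.Element("feature", {"name": self.name, "kind": self.kind})

    @classmethod
    def from_xml(cls, elem):
        return cls(name=elem.get("name"), kind=elem.get("kind"))


@dataclass
class Part:
    part_id: str
    features: list = field(default_factory=list)

    def to_xml(self):
        root = ET.Element("part", {"id": self.part_id})
        for feature in self.features:
            root.append(feature.to_xml())  # delegate to the child object
        return root

    @classmethod
    def from_xml(cls, elem):
        children = [Feature.from_xml(e) for e in elem.findall("feature")]
        return cls(part_id=elem.get("id"), features=children)


part = Part("P1", [Feature("hole1", "hole"), Feature("slot1", "slot")])
xml_text = ET.tostring(part.to_xml(), encoding="unicode")
restored = Part.from_xml(ET.fromstring(xml_text))

# Only a portion of the model (a single feature) can be exchanged as well.
fragment = ET.tostring(part.features[0].to_xml(), encoding="unicode")
single = Feature.from_xml(ET.fromstring(fragment))
```

Because serialization is delegated to each object, a fragment such as one feature can be sent to another application without the enclosing part.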

15.
Toward a Model of Intelligence as an Economy of Agents   Cited by: 2 in total (self-citations: 0, by others: 2)
Baum  Eric B. 《Machine Learning》1999,35(2):155-185
A market-based algorithm is presented that autonomously apportions complex tasks to multiple cooperating agents, giving each agent the motivation to improve the performance of the whole system. A specific model, called The Hayek Machine, is proposed and tested on a simulated Blocks World (BW) planning problem. Hayek learns to solve more complex BW problems than any previous learning algorithm. Given intermediate reward and simple features, it has learned to efficiently solve arbitrary BW problems. The Hayek Machine can also be seen as a model of evolutionary economics.

16.
任迎春  王志成  陈宇飞  赵卫东  彭磊 《计算机科学》2016,43(8):277-281, 296
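To address the lack of supervision in the feature extraction process of the sparsity preserving projections algorithm and the high computational cost of its L1-norm optimization, a fast feature extraction algorithm based on manifold learning and sparsity constraints is proposed. First, a concatenated dictionary is constructed by class-wise PCA, and the sparsity preserving structure is learned quickly on this dictionary by least squares. Second, a local between-class scatter function is constructed to describe the distances between different sub-manifolds. Then the learned sparse representation information and the local between-class scatter information are integrated, so as to account for discriminative efficiency while preserving the sparse representation structure. The proposed algorithm is finally reduced to solving a generalized eigenvalue problem. Test results on public face databases (Yale, ORL, and Extended Yale B) verify the feasibility and effectiveness of the method.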

17.
In this paper, we propose an efficient agent for competing in Cliff-Edge (CE) and simultaneous Cliff-Edge (SCE) situations. In CE interactions, which include common interactions such as sealed-bid auctions, dynamic pricing, and the ultimatum game (UG), the probability of success decreases monotonically as the reward for success increases. This trade-off also exists in SCE interactions, which include simultaneous auctions and various multi-player ultimatum games, where the agent has to decide on more than one offer or bid simultaneously. Our agent competes repeatedly in one-shot interactions, each time against different human opponents. The agent learns the general pattern of the population’s behavior, and its performance is evaluated based on all of the interactions in which it participates. We propose a generic approach that may help the agent compete against unknown opponents in different environments where CE and SCE interactions exist, where the agent has a relatively large number of alternatives, and where its achievements in the first several dozen interactions are important. The underlying mechanism we propose for CE interactions is a new meta-algorithm, deviated virtual learning (DVL), which extends existing methods to cope efficiently with environments comprising a large number of alternative decisions at each decision point. Another competitive approach is the Bayesian approach, which learns the opponents’ statistical distribution, given prior knowledge about the type of distribution. For the SCE, we propose the simultaneous deviated virtual reinforcement learning algorithm (SDVRL), the segmentation meta-algorithm as a method for extending different basic algorithms, and a heuristic called fixed success probabilities (FSP).
Experiments comparing the performance of the proposed algorithms with algorithms taken from the literature, as well as with other intuitive meta-algorithms, reveal the superiority of the proposed algorithms in average payoff and stability as well as in accuracy in converging to the optimal action, in both CE and SCE problems.
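
The Cliff-Edge trade-off described in this abstract, where a larger reward for success comes with a lower probability of success, can be illustrated on a toy ultimatum game. This is not the DVL or SDVRL algorithm; it assumes, for illustration only, that the responders' minimum acceptable offers are known, whereas the agent in the paper must learn the population's behavior from accept/reject outcomes. All names and numbers are hypothetical.

```python
def acceptance_rate(offer, thresholds):
    """Fraction of responders whose minimum acceptable offer is <= offer."""
    return sum(t <= offer for t in thresholds) / len(thresholds)


def best_offer(offers, thresholds, pie=100.0):
    """Offer that maximizes the proposer's expected payoff.

    Expected payoff = (pie - offer) * P(offer is accepted): a smaller offer
    keeps more of the pie but is accepted less often.
    """
    return max(offers, key=lambda o: (pie - o) * acceptance_rate(o, thresholds))


# Hypothetical population of responder thresholds and candidate offers.
thresholds = [10, 20, 30, 40, 90]
offers = range(0, 101, 10)
choice = best_offer(offers, thresholds)
```

With these numbers the best offer lies in the interior of the range: offering less loses too many acceptances, and offering enough to satisfy every responder leaves too little of the pie.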

18.
A new multi-objective non-Darwinian-type evolutionary computation approach based on the learnable evolution model (LEM) is proposed for solving the robot path planning problem. The multi-objective property of this approach is governed by a robust strength Pareto evolutionary algorithm (SPEA) incorporated in the LEM algorithm presented here. The learnable evolution model includes a machine learning method, such as decision trees, that can detect the right directions of evolution and leads to large improvements in the fitness of the individuals. Several new refiner operators are proposed to improve the objectives of the individuals in the evolutionary process. These objectives are the path length, the path safety, and the path smoothness. A modified integer coding path representation scheme is proposed in which the edge-fixing and top-row fixing procedures are performed implicitly. The proposed approach is assessed on eight realistic scenarios to verify its performance. Computer simulations reveal that it exhibits much higher hypervolume and set coverage than other similar approaches. The experimental results confirm that the proposed approach performs significantly well in workspaces with a dense set of obstacles.
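
Pareto-based methods such as SPEA rank candidate solutions by dominance across all objectives. The sketch below shows only that dominance test on hypothetical candidate paths scored as (length, risk, roughness), all to be minimized; expressing safety and smoothness as costs is an assumption, and the SPEA strength assignment and the LEM learning step are not shown.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical paths: (path length, collision risk, roughness).
paths = [
    (10.0, 0.3, 2.0),
    (12.0, 0.1, 2.5),
    (11.0, 0.4, 2.5),  # worse than the first path in every objective
    (14.0, 0.1, 2.5),  # longer than the second path, otherwise equal
]
front = pareto_front(paths)
```

The two surviving paths trade length against risk, so neither can be preferred without additional weighting of the objectives.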

19.
Autonomous Agents that Learn to Better Coordinate   Cited by: 1 in total (self-citations: 1, by others: 1)
A fundamental difficulty faced by groups of agents that work together is how to efficiently coordinate their efforts. This coordination problem is both ubiquitous and challenging, especially in environments where autonomous agents are motivated by personal goals. Previous AI research on coordination has developed techniques that allow agents to act efficiently from the outset based on common built-in knowledge or to learn to act efficiently when the agents are not autonomous. The research described in this paper builds on those efforts by developing distributed learning techniques that improve coordination among autonomous agents. The techniques presented in this work encompass agents who are heterogeneous, who do not have complete built-in common knowledge, and who cannot coordinate solely by observation. An agent learns from her experiences so that her future behavior more accurately reflects what works (or does not work) in practice. Each agent stores past successes (both planned and unplanned) in an individual casebase. Entries in a casebase are represented as coordinated procedures and are organized around learned expectations about other agents. It is a novel approach for individuals to learn procedures as a means for the group to coordinate more efficiently. Empirical results validate the utility of this approach. Whether or not the agents have initial expertise in solving coordination problems, the distributed learning of the individual agents significantly improves the overall performance of the community, including reducing planning and communication costs.

20.
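To meet the demand for matching and intelligent retrieval of massive, heterogeneous 3D shapes, a partial 3D shape matching method based on deep feature fusion with a cascaded convolutional neural network (F-PointCNN) is proposed. First, using the bag-of-features model, a geometric image representation is proposed; the geometric image not only effectively distinguishes non-rigid 3D models that belong to the same class but differ in structure, but also reveals the structural similarity of large-scale incomplete 3D models. Second, the cascaded convolutional neural network learning framework F-PointCNN is constructed, in which BoF-CNN learns deep global features from the geometric images and builds a point feature representation that fuses local and global features; Point-CNN then refines and purifies the point features to generate information-rich deep fused features, effectively improving the discriminability and robustness of the shape features. Finally, partial shape matching of non-rigid 3D models is performed efficiently using a cross-matrix metric. Experimental results on public non-rigid 3D model databases show that the features extracted by this method have stronger discriminative power and higher matching accuracy in shape classification under large-scale transformations and in partial shape matching.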
