Similar literature
 Found 20 similar documents; search took 15 ms
1.
In this paper, a multi-agent reinforcement learning method based on predicting the actions of other agents is proposed. In a multi-agent system, the action selection of the learning agent is unavoidably affected by other agents' actions. Therefore, joint states and joint actions are involved in the multi-agent reinforcement learning system. A novel agent action prediction method based on the probabilistic neural network (PNN) is proposed, in which the PNN is used to predict the actions of other agents. Furthermore, a policy-sharing mechanism is used to exchange the learned policies of multiple agents in order to speed up learning. Finally, the application of the presented method to robot soccer is studied. Through learning, robot players can master the mapping policy from state information to the action space. Moreover, coordination and cooperation among multiple robots are well realized.
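The action-prediction idea in the abstract above can be illustrated with a minimal probabilistic neural network, i.e. a Parzen-window classifier with Gaussian kernels. This is only a sketch under stated assumptions: the function name `pnn_predict`, the two-dimensional state features, the action labels, and the kernel width `sigma` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def pnn_predict(samples, query, sigma=0.5):
    """Return the action label with the highest Parzen-window density at `query`.

    samples: list of (feature_vector, action_label) pairs observed so far.
    sigma: Gaussian kernel width (a hypothetical smoothing parameter).
    """
    sums = defaultdict(float)   # summation layer: kernel mass per action
    counts = defaultdict(int)   # number of stored patterns per action
    for features, action in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(features, query))
        sums[action] += math.exp(-sq_dist / (2.0 * sigma ** 2))
        counts[action] += 1
    # Decision layer: pick the action with the largest average kernel response.
    return max(sums, key=lambda a: sums[a] / counts[a])

# Toy observations: two hypothetical state features -> action taken by another agent.
observed = [
    ((0.1, 0.9), "dribble"),
    ((0.2, 0.8), "dribble"),
    ((0.9, 0.2), "shoot"),
    ((0.8, 0.1), "shoot"),
]
predicted = pnn_predict(observed, (0.85, 0.15))
```

In a learning agent, the predicted action would be combined with the agent's own candidate action to index a joint-action value table; that part is not shown here.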

2.
3.
Protecting client privacy with trusted computing at the server   Total citations: 2 (self-citations: 0, citations by others: 2)
Current trusted-computing initiatives usually involve large organizations putting physically secure hardware on user machines, potentially violating user privacy. Yet it is possible to exploit robust server-side secure hardware to enhance user privacy. Two case studies demonstrate the use of secure coprocessors at the server.

4.
5.
6.
It is reported that client/server technology can make it easier for users to access information, to use and develop applications, and to manage a distributed computing environment. This paper describes the application of client/server technology to an inventory management system. The methodology developed and the barriers encountered in this study are useful for practitioners building client/server applications.

7.
8.
With rapid advances in mobile computing, multi-core processors and expanded memory resources are being made available in new mobile devices. This trend will allow a wider range of existing applications to be migrated to mobile devices, for example, running desktop applications in IA-32 (x86) binaries on ARM-based mobile devices transparently using dynamic binary translation (DBT). However, the overall performance could significantly affect the energy consumption of the mobile devices because it is directly linked to the number of instructions executed and the overall execution time of the translated code. Hence, even though the capability of today's mobile devices will continue to grow, the concern over translation efficiency and energy consumption will put more constraints on a DBT for mobile devices, in particular for thin mobile clients, than on one for servers. Increasing network accessibility and bandwidth in various environments make many network servers highly accessible to thin mobile clients. Those network servers are usually equipped with a substantial amount of resources. This provides an opportunity for DBT on thin clients to leverage such powerful servers. However, designing such a DBT for a client/server environment requires many critical considerations. In this work, we looked at those design issues and developed a distributed DBT system based on a client/server model. It consists of two dynamic binary translators: an aggressive dynamic binary translator/optimizer on the server that services the translation/optimization requests from thin clients, and a thin DBT on each thin client that performs lightweight binary translation and basic emulation functions on its own. With such a two-translator client/server approach, we successfully off-load the DBT overhead of the thin client to the server and achieve a significant performance improvement over the non-client/server model.
Experimental results show that the DBT of the client/server model could achieve 37% and 17% improvement over that of the non-client/server model for x86/32-to-ARM emulation using the MiBench and SPEC CINT2006 benchmarks with test inputs, respectively, and 84% improvement using SPLASH-2 benchmarks running two emulation threads.

9.
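Zhang Nan (张楠), Zhang Zhenguo (张振国). 《微计算机信息》 (Microcomputer Information), 2006, 22(27): 200-202
The choice of transport-layer protocol affects important network performance metrics such as bandwidth utilization and response time. This paper analyzes and compares three standard transport-layer protocols and suggests that T/TCP may become the main trend in the future.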

10.
This paper proposes a trusted decentralized access control (TDAC) framework for the client/server architecture. As its fundamental principle, TDAC enforces access control policies at the client side and protects sensitive objects at the server side by leveraging trusted computing technologies. Compared with the previous work of Sandhu and Zhang (2005), TDAC imposes fewer requirements on trusted components. To implement TDAC, we design a private trusted reference monitor that runs at the client side, evaluates an access control request, and trustworthily signs a temporary access control credential for a client application; we also design a master reference monitor that runs at the server side and evaluates the request from the client application solely according to the temporary access control credential. As a typical application, TDAC can protect a client's private context data in subject-context aware access control.

11.
The robot soccer game has been proposed as a benchmark problem for artificial intelligence and robotics research. The decision-making system is the most important part of a robot soccer system. As the environment is dynamic and complex, a reinforcement learning (RL) method named FNN-RL is employed to learn the decision-making strategy. The FNN-RL system consists of a fuzzy neural network (FNN) and RL. RL is used for structure identification and parameter tuning of the FNN. Conversely, the curse of dimensionality in RL can be addressed by the function approximation capability of the FNN. Furthermore, the residual algorithm is used to calculate the gradient of the FNN-RL method in order to guarantee convergence and fast learning. The complex decision-making task is divided into multiple learning subtasks that include dynamic role assignment, action selection, and action implementation. They constitute a hierarchical learning system. We apply the proposed FNN-RL method to soccer agents that attempt to learn each subtask at the various layers. The effectiveness of the proposed method is demonstrated by simulation and real experiments.

12.
We deal with a special class of games against nature which correspond to subsymbolic learning problems where we know a local descent direction in the error landscape but not the amount gained at each step of the learning procedure. Namely, Alice and Bob play a game where the probability of victory grows monotonically by unknown amounts with the resources each employs. For a fixed effort on Alice's part, Bob increases his resources on the basis of the results of the individual contests (victory, tie or defeat). Quite unlike the usual aims in game theory, his aim is to stop as soon as the defeat probability goes under a given threshold with high confidence. We adopt such a game policy as an archetypal remedy to the general overtraining threat of learning algorithms. Namely, we deal with the original game in a computational learning framework analogous to the Probably Approximately Correct formulation. Therein, a wise use of a special inferential mechanism (known as twisting argument) highlights relevant statistics for managing different trade-offs between observability and controllability of the defeat probability. With similar statistics we discuss an analogous trade-off at the basis of the stopping criterion of subsymbolic learning procedures. As a conclusion, we propose a principled stopping rule based solely on the behavior of the training session, hence without diverting examples into a test set.

13.
This paper presents a distributed operating system modeled as an abstract machine that provides all the distributed processes with the same set of services. The kernel of our operating system supports services that are invoked by remote procedure calls on request from parallel processes. Therefore, a scheme for handling the client-server relationship is required. In our system there is more than one client and, at least, one receive would be required for each. Similarly, there is more than one server, so a send in a client should produce a message that can be received by every server. Consequently, a mechanism well suited for programming multiple-clients/single-server and single-client/multiple-servers interactions is proposed.

14.
Punitha V., Mala C. Neural Computing & Applications, 2021, 33(4): 1279-1296
Neural Computing and Applications - Server farms used in web hosting and commercial applications connect multiple servers. Edge computing being a realm of cloud technology is orchestrated with...

15.
Ke, Minlong, Fernanda L., Xin. Neurocomputing, 2009, 72(13-15): 2796
Negative correlation learning (NCL) is a successful approach to constructing neural network ensembles. In batch learning mode, NCL outperforms many other ensemble learning approaches. Recently, NCL has also been shown to be a potentially powerful approach to incremental learning, although its advantages have not yet been fully exploited. In this paper, we propose a selective NCL (SNCL) algorithm for incremental learning. Concretely, every time a new training data set is presented, the previously trained neural network ensemble is cloned. Then the cloned ensemble is trained on the new data set. After that, the new ensemble is combined with the previous ensemble and a selection process is applied to prune the whole ensemble to a fixed size. This paper is an extended version of our preliminary paper on SNCL. Compared to the previous work, this paper presents a deeper investigation into SNCL, considering different objective functions for the selection process and comparing SNCL to other NCL-based incremental learning algorithms on two more real-world bioinformatics data sets. Experimental results demonstrate the advantage of SNCL. Further, comparisons between SNCL and other existing incremental learning algorithms, such as Learn++ and ARTMAP, are also presented.
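The clone, train, combine, and prune cycle described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: `train` and `score` are hypothetical caller-supplied callbacks, the NCL correlation penalty is not modeled, and the selection step simply keeps the best-scoring members on the new data, whereas the paper studies several objective functions for selection.

```python
import copy

def selective_incremental_step(ensemble, new_data, train, score, max_size):
    """One incremental step: clone, train the clone, merge, prune to max_size."""
    cloned = copy.deepcopy(ensemble)      # previous members stay untouched
    for member in cloned:
        train(member, new_data)           # only the clones see the new data
    combined = ensemble + cloned
    # Assumed selection rule: keep the max_size best members on the new data.
    ranked = sorted(combined, key=lambda m: score(m, new_data), reverse=True)
    return ranked[:max_size]

# Toy stand-ins for networks: each "model" predicts a single constant w.
def train_mean(member, data, rate=0.5):
    target = sum(data) / len(data)
    member["w"] += rate * (target - member["w"])

def neg_error(member, data):
    return -sum((member["w"] - y) ** 2 for y in data)

old = [{"w": 0.0}, {"w": 4.0}]
new = selective_incremental_step(old, [2.0, 2.0], train_mean, neg_error, max_size=2)
```

With these toy models the two retrained clones fit the new data better than the originals, so the pruned ensemble keeps the clones while the original ensemble is left unchanged.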

16.
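Multi-agent reinforcement learning and its application to role assignment in robot soccer   Total citations: 2 (self-citations: 0, citations by others: 2)
A robot soccer system is a typical multi-agent system. The action each robot player selects depends not only on its own state but also on the other players, so learning a robot soccer decision policy by reinforcement learning requires joint states and joint actions. This paper studies a multi-agent reinforcement learning algorithm based on agent action prediction, using a naive Bayes classifier to predict the actions of other agents. A policy-sharing mechanism is introduced to exchange the policies learned by the agents and thereby speed up multi-agent reinforcement learning. Finally, the application of the proposed method to dynamic role assignment in robot soccer is studied, achieving division of labor and cooperation among multiple robots.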

17.
Soccer is a competitive and collective sport in which teammates try to combine the execution of basic actions (cooperative behavior) to lead their team to more advantageous situations. The ability to recognize, extract and reproduce such behaviors can prove useful for improving the performance of a team in future matches. This work describes a methodology for achieving just that, which makes use of a plan definition language to abstract the representation of relevant behaviors in order to promote their reuse. Experiments were conducted based on a set of game log files generated by the Soccer Server simulator, which supports the RoboCup 2D simulated robotic soccer league. The effectiveness of the proposed approach was verified by focusing primarily on the analysis of behaviors that started from set-pieces and led to the scoring of goals while ball possession was kept. One of the results obtained showed that a significant part of the total goals scored was based on this type of behavior, demonstrating the potential of conducting this analysis. Other results allowed us to assess the complexity of these behaviors and infer meaningful guidelines to consider when defining plans from scratch. Some possible extensions to this work include assessing which plans have the ability to maximize the creation of goal opportunities by countering the opposing team's strategy and how the effectiveness of plans can be improved using optimization techniques.

18.
Berrar Daniel, Lopes Philippe, Dubitzky Werner. Machine Learning, 2019, 108(1): 97-126

The task of the 2017 Soccer Prediction Challenge was to use machine learning to predict the outcome of future soccer matches based on a data set describing the match outcomes of 216,743 past soccer matches. One of the goals of the Challenge was to gauge where the limits of predictability lie with this type of commonly available data. Another goal was to pose a real-world machine learning challenge with a fixed time line, involving the prediction of real future events. Here, we present two novel ideas for integrating soccer domain knowledge into the modeling process. Based on these ideas, we developed two new feature engineering methods for match outcome prediction, which we denote as recency feature extraction and rating feature learning. Using these methods, we constructed two learning sets from the Challenge data. The top-ranking model of the 2017 Soccer Prediction Challenge was our k-nearest neighbor model trained on the rating feature learning set. In further experiments, we could slightly improve on this performance with an ensemble of extreme gradient boosted trees (XGBoost). Our study suggests that a key factor in soccer match outcome prediction lies in the successful incorporation of domain knowledge into the machine learning modeling process.

19.
In this article, a generalisation of the vertex colouring problem known as the bandwidth multicolouring problem (BMCP) is considered, in which a set of colours is assigned to each vertex such that the difference between the colours assigned to each vertex and those of its neighbours is never less than a predefined threshold. It is shown that the proposed method can also be applied to solve the bandwidth colouring problem (BCP). BMCP is known to be NP-hard, and so a large number of approximation solutions, as well as exact algorithms, have been proposed to solve it. In this article, two learning automata-based approximation algorithms are proposed for estimating a near-optimal solution to the BMCP. We show, for the first proposed algorithm, that by choosing a proper learning rate, the algorithm finds the optimal solution with a probability close enough to unity. Moreover, we compute the worst-case time complexity of the first algorithm for finding a 1/(1−ε) optimal solution to the given problem. The main advantage of this method is that a trade-off between the running time of the algorithm and the colour set size (colouring optimality) can be made by a proper choice of the learning rate. Finally, it is shown that the running time of the proposed algorithm is independent of the graph size, and so it is scalable to large graphs. The second proposed algorithm is compared with some well-known colouring algorithms, and the results show its efficiency in terms of colour set size and running time.
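The colouring constraint stated in the abstract above can be made concrete with a small feasibility check. This is a sketch under stated assumptions: the function name and data layout are hypothetical, colours are taken to be integers, and the separate per-vertex gap `self_gap` follows the common textbook formulation of BMCP rather than anything stated in the abstract. It checks a given colouring only and is unrelated to the learning automata algorithms the article proposes.

```python
def is_valid_multicolouring(colours, edge_gap, self_gap):
    """Check a bandwidth multicolouring.

    colours: dict vertex -> set of integer colours assigned to that vertex.
    edge_gap: dict (u, v) -> minimum colour difference required across edge u-v.
    self_gap: dict vertex -> minimum difference between colours on the same vertex.
    """
    # Colours on the same vertex must be far enough apart (assumed constraint).
    for v, assigned in colours.items():
        ordered = sorted(assigned)
        for a, b in zip(ordered, ordered[1:]):
            if b - a < self_gap[v]:
                return False
    # Every colour pair across an edge must respect that edge's threshold.
    for (u, v), gap in edge_gap.items():
        for cu in colours[u]:
            for cv in colours[v]:
                if abs(cu - cv) < gap:
                    return False
    return True

edge_gap = {("a", "b"): 3}
self_gap = {"a": 3, "b": 1}
good = is_valid_multicolouring({"a": {1, 4}, "b": {7}}, edge_gap, self_gap)
bad = is_valid_multicolouring({"a": {1, 4}, "b": {6}}, edge_gap, self_gap)
```

The second colouring fails because colour 6 on vertex `b` is only 2 away from colour 4 on its neighbour `a`, below the edge threshold of 3.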

20.
In this paper, we bring into the scheduling field a new model of the learning effect that generalizes the existing approach in two ways. First, we relax one of the rigorous constraints, so that in our model each job can provide different experience to the processor. Second, we formulate the job processing time as a non-increasing k-stepwise function that, in general, is not restricted to a certain learning curve and can therefore accurately fit every possible shape of a learning function. Furthermore, we prove that the problem of makespan minimization with the considered model is polynomially solvable if every job provides the same experience to the processor, and that it becomes NP-hard if the experiences are diversified. The most essential result is a pseudopolynomial time algorithm that optimally solves the makespan minimization problem with any function of an experience-based learning model reduced to the form of the k-stepwise function.
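A small sketch may help picture the k-stepwise processing time and why the job order matters when jobs contribute different experience. The assumptions here are mine, not the paper's: a single processor, experience that accumulates as the sum of the gains of previously processed jobs, and the hypothetical names `step_time` and `makespan`. The code only evaluates a given sequence; it is not the pseudopolynomial algorithm mentioned in the abstract.

```python
def step_time(base_times, thresholds, experience):
    """Non-increasing k-stepwise processing time.

    base_times: k processing times, ordered from slowest to fastest.
    thresholds: k-1 increasing experience levels at which the time drops.
    """
    level = sum(1 for t in thresholds if experience >= t)
    return base_times[level]

def makespan(sequence):
    """Completion time of the last job when jobs run in the given order.

    Each job is a tuple (base_times, thresholds, experience_gain).
    """
    experience = 0.0
    total = 0.0
    for base_times, thresholds, gain in sequence:
        total += step_time(base_times, thresholds, experience)
        experience += gain   # each job may contribute a different amount
    return total

job_a = ([5, 3], [2], 2)   # slow at first, but yields 2 units of experience
job_b = ([4, 1], [2], 0)   # becomes fast after 2 units, yields no experience
a_then_b = makespan([job_a, job_b])
b_then_a = makespan([job_b, job_a])
```

Running the experience-rich job first shortens the second job (5 + 1), while the reverse order gains nothing (4 + 5), which illustrates why sequencing becomes harder once experiences are diversified.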
