
1.

Pricing in Agent Economies Using Multi-Agent Q-Learning

Gerald Tesauro, Jeffrey O. Kephart. Autonomous Agents and Multi-Agent Systems, 2002, 5(3): 289-304

This paper investigates how adaptive software agents may utilize reinforcement learning algorithms such as Q-learning to make economic decisions such as setting prices in a competitive marketplace. For a single adaptive agent facing fixed-strategy opponents, ordinary Q-learning is guaranteed to find the optimal policy. However, for a population of agents each trying to adapt in the presence of other adaptive agents, the problem becomes non-stationary and history dependent, and it is not known whether any global convergence will be obtained, and if so, whether such solutions will be optimal. In this paper, we study simultaneous Q-learning by two competing seller agents in three moderately realistic economic models. This is the simplest case in which interesting multi-agent phenomena can occur, and the state space is small enough so that lookup tables can be used to represent the Q-functions. We find that, despite the lack of theoretical guarantees, simultaneous convergence to self-consistent optimal solutions is obtained in each model, at least for small values of the discount parameter. In some cases, exact or approximate convergence is also found even at large discount parameters. We show how the Q-derived policies increase profitability and damp out or eliminate cyclic price wars compared to simpler policies based on zero lookahead or short-term lookahead. In one of the models (the Shopbot model) where the sellers' profit functions are symmetric, we find that Q-learning can produce either symmetric or broken-symmetry policies, depending on the discount parameter and on initial conditions.
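The lookup-table Q-learning that this abstract refers to can be illustrated with a minimal sketch of a single update step. This is the generic textbook update, not the paper's pricing models: the choice of state (a competitor's price), the action set `prices`, the reward, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.5):
    """One tabular Q-learning step.

    Moves Q[(state, action)] toward reward + gamma * max over a' of Q[(next_state, a')].
    """
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical setup: the state is the competitor's current price and the
# action is the seller's own price, both drawn from three discrete levels.
Q = defaultdict(float)  # lookup table, all entries start at 0.0
prices = [1, 2, 3]
new_value = q_update(Q, state=2, action=3, reward=1.0, next_state=3, actions=prices)
```

In the two-seller setting described above, each seller would hold its own table and apply this update while the other seller is also changing its policy, which is why the usual single-agent convergence guarantee no longer applies.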

2.

Continuous-Action Q-Learning (Total citations: 1; self-citations: 0; citations by others: 1)

José del R. Millán, Daniele Posenato, Eric Dedieu. Machine Learning, 2002, 49(2-3): 247-265

This paper presents a Q-learning method that works in continuous domains. Other characteristics of our approach are the use of an incremental topology preserving map (ITPM) to partition the input space, and the incorporation of bias to initialize the learning process. A unit of the ITPM represents a limited region of the input space and maps it onto the Q-values of M possible discrete actions. The resulting continuous action is an average of the discrete actions of the winning unit weighted by their Q-values. Then, TD($\lambda$) updates the Q-values of the discrete actions according to their contribution. Units are created incrementally and their associated Q-values are initialized by means of domain knowledge. Experimental results in robotics domains show the superiority of the proposed continuous-action Q-learning over the standard discrete-action version in terms of both asymptotic performance and speed of learning. The paper also reports a comparison of discounted-reward against average-reward Q-learning in an infinite horizon robotics task.
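The action-selection step described in this abstract (a continuous action formed as the Q-weighted average of the winning unit's discrete actions) can be sketched as follows. The function name, the example action values, and the restriction to nonnegative Q-values are assumptions of this sketch; the abstract does not say how negative or all-zero Q-values are handled.

```python
def continuous_action(discrete_actions, q_values):
    """Return the average of the discrete actions weighted by their Q-values.

    Assumption of this sketch: Q-values are nonnegative with a positive sum,
    so that they can be used directly as averaging weights.
    """
    if any(q < 0 for q in q_values) or sum(q_values) == 0:
        raise ValueError("this sketch assumes nonnegative Q-values with a positive sum")
    total = sum(q_values)
    return sum(a * q for a, q in zip(discrete_actions, q_values)) / total


# Hypothetical winning unit with three discrete steering actions.
steering = continuous_action([-1.0, 0.0, 1.0], [1.0, 1.0, 2.0])
```

Here the highest-valued discrete action pulls the result toward itself, giving a steering command between the discrete levels.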

3.

Incremental Multi-Step Q-Learning (Total citations: 23; self-citations: 0; citations by others: 23)

Jing Peng, Ronald J. Williams. Machine Learning, 1996, 22(1-3): 283-290

This paper presents a novel incremental algorithm that combines Q-learning, a well-known dynamic-programming based reinforcement learning method, with the TD($\lambda$) return estimation process, which is typically used in actor-critic learning, another well-known dynamic-programming based reinforcement learning method. The parameter $\lambda$ is used to distribute credit throughout sequences of actions, leading to faster learning and also helping to alleviate the non-Markovian effect of coarse state-space quantization. The resulting algorithm, Q($\lambda$)-learning, thus combines some of the best features of the Q-learning and actor-critic learning paradigms. The behavior of this algorithm has been demonstrated through computer simulations.

4.

Enhancing the performance of an agent-based manufacturing system through learning and forecasting (Total citations: 3; self-citations: 0; citations by others: 3)

Weiming Shen, Francisco Maturana, Douglas H. Norrie. Journal of Intelligent Manufacturing, 2000, 11(4): 365-380

Agent-based technology has been identified as an important approach for developing next generation manufacturing systems. One of the key techniques needed for implementing such advanced systems will be learning. This paper first discusses learning issues in agent-based manufacturing systems and reviews related approaches, then describes how to enhance the performance of an agent-based manufacturing system through learning from history (based on distributed case-based learning and reasoning) and learning from the future (through system forecasting simulation). Learning from history is used to enhance coordination capabilities by minimizing communication and processing overheads. Learning from the future is used to adjust promissory schedules through forecasting simulation, by taking into account the shop floor interactions, production and transportation time. Detailed learning and reasoning mechanisms are described and partial experimental results are presented.

5.

Hardness Results for Learning First-Order Representations and Programming by Demonstration

William W. Cohen. Machine Learning, 1998, 30(1): 57-87


6.

The role of “craft language” in learning “Waza”

Kumiko Ikuta. AI & Society, 1990, 4(2): 137-146

The role of craft language in the process of teaching (learning) Waza (skill) will be discussed from the perspective of human intelligence. It may be said that the ultimate goal of learning Waza in any Japanese traditional performance is not the perfect reproduction of the teaching (learning) process of Waza. In fact, a special metaphorical language (craft language) is used, which has the effect of encouraging the learner to activate his creative imagination. It is through this activity that he learns his own habitus (Kata). It is suggested that, in considering the difference of function between natural human intelligence and artificial intelligence, attention should be paid to the imaginative activity of the learner as being an essential factor for mastering Kata. This article is a modified English version of Chapter 5 of my book Waza kara shiru (Learning from Skill), Tokyo University Press, 1987, pp. 93–105.

7.

Rapid Concept Learning for Mobile Robots

Sridhar Mahadevan, Georgios Theocharous, Nikfar Khaleeli. Autonomous Robots, 1998, 5(3-4): 239-251

Concept learning in robotics is an extremely challenging problem: sensory data is often high dimensional, and noisy due to specularities and other irregularities. In this paper, we investigate two general strategies to speed up learning, based on spatial decomposition of the sensory representation, and simultaneous learning of multiple classes using a shared structure. We study two concept learning scenarios. The first is a hallway navigation problem, where the robot has to induce features such as opening or wall. The second is recycling, where the robot has to learn to recognize objects, such as a trash can. We use a common underlying function approximator in both studies in the form of a feedforward neural network, with several hundred input units and multiple output units. Despite the high degree of freedom afforded by such an approximator, we show the two strategies provide sufficient bias to achieve rapid learning. We provide detailed experimental studies on an actual mobile robot called PAVLOV to illustrate the effectiveness of this approach.

8.

On the Learnability of Disjunctive Normal Form Formulas

Howard Aizenstein, Leonard Pitt. Machine Learning, 1995, 19(3): 183-208

We present two related results about the learnability of disjunctive normal form (DNF) formulas. First we show that a common approach for learning arbitrary DNF formulas requires exponential time. We then contrast this with a polynomial time algorithm for learning most (rather than all) DNF formulas. A natural approach for learning boolean functions involves greedily collecting the prime implicants of the hidden function. In a seminal paper of learning theory, Valiant demonstrated the efficacy of this approach for learning monotone DNF, and suggested this approach for learning DNF. Here we show that no algorithm using such an approach can learn DNF in polynomial time. We show this by constructing a counterexample DNF formula which would force such an algorithm to take exponential time. This counterexample seems to capture much of what makes DNF hard to learn, and thus is useful to consider when evaluating the run-time of a proposed DNF learning algorithm. This hardness result, as well as other hardness results for learning DNF, relies on the construction of particular hard-to-learn formulas, formulas that appear to be relatively rare. This raises the question of whether most DNF formulas are learnable. For certain natural definitions of most DNF formulas, we answer this question affirmatively.

9.

Bridging the educational divide

M. Pieper, H. Morasch, G. Piéla. Universal Access in the Information Society, 2003, 2(3): 243-254

The sharpest visible divide in Internet utilisation, which has deepened in recent years, is an educational one. Especially with regard to the learning disabled, the educational digital divide requires the improvement of inclusive didactical measures to promote media competence. A major prerequisite in this respect, which as a basic architectural principle determines systems design, is support for evolutionary learning by tutorial learning systems designed as guidance systems that accord closely with the individual pupil's evolutionary process.

10.

Improving Generalization with Active Learning (Total citations: 29; self-citations: 0; citations by others: 29)

David Cohn, Les Atlas, Richard Ladner. Machine Learning, 1994, 15(2): 201-221

Active learning differs from learning from examples in that the learning algorithm assumes at least some control over what part of the input domain it receives information about. In some situations, active learning is provably more powerful than learning from examples alone, giving better generalization for a fixed number of training examples. In this article, we consider the problem of learning a binary concept in the absence of noise. We describe a formalism for active concept learning called selective sampling and show how it may be approximately implemented by a neural network. In selective sampling, a learner receives distribution information from the environment and queries an oracle on parts of the domain it considers useful. We test our implementation, called an SG-network, on three domains and observe significant improvement in generalization. A preliminary version of this article appears as Cohn et al. (1990).

11.

Learning to Recognize Volcanoes on Venus (Total citations: 1; self-citations: 0; citations by others: 1)

Michael C. Burl, Lars Asker, Padhraic Smyth, Usama Fayyad, Pietro Perona, Larry Crumpler, Jayne Aubele. Machine Learning, 1998, 30(2-3): 165-194

Dramatic improvements in sensor and image acquisition technology have created a demand for automated tools that can aid in the analysis of large image databases. We describe the development of JARtool, a trainable software system that learns to recognize volcanoes in a large data set of Venusian imagery. A machine learning approach is used because it is much easier for geologists to identify examples of volcanoes in the imagery than it is to specify domain knowledge as a set of pixel-level constraints. This approach can also provide portability to other domains without the need for explicit reprogramming; the user simply supplies the system with a new set of training examples. We show how the development of such a system requires a completely different set of skills than are required for applying machine learning to toy world domains. This paper discusses important aspects of the application process not commonly encountered in the toy world, including obtaining labeled training data, the difficulties of working with pixel data, and the automatic extraction of higher-level features.

12.

On the Complexity of Function Learning

Peter Auer, Philip M. Long, Wolfgang Maass, Gerhard J. Woeginger. Machine Learning, 1995, 18(2-3): 187-230

The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as $\mathbb{N}$ or $\mathbb{R}$. In particular relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models. We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any learning model). Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes. Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from $\mathbb{R}^d$ into $\mathbb{R}$. Previously, the class of linear functions from $\mathbb{R}^d$ into $\mathbb{R}$ was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning. Finally we give a sufficient condition for an arbitrary class $\mathcal{F}$ of functions from $\mathbb{R}$ into $\mathbb{R}$ that allows us to learn the class of all functions that can be written as the pointwise maximum of $k$ functions from $\mathcal{F}$. This allows us to exhibit a number of further nontrivial classes of functions from $\mathbb{R}$ into $\mathbb{R}$ for which there exist efficient learning algorithms.

13.

Toward Third Generation Threaded Discussions for Mobile Learning: Opportunities and Challenges for Ubiquitous Collaborative Environments

Timothy R. Hill, Malu Roldan. Information Systems Frontiers, 2005, 7(1): 55-70

The mobile communication revolution has led to pervasive connectedness, as evidenced by the explosive growth of instant messaging in the home and, more recently, the enterprise, and, together with the convergence of mobile computing, provides a basis for extending collaborative environments toward truly ubiquitous immersion. Leveraging the true anytime/anywhere access afforded by mobile computing, it becomes possible to develop applications that not only are capable of responding to users whenever/wherever, on demand, but that also may actively seek out and engage users when the need arises. Thus, immersive environments need no longer be thought of strictly in terms of physical immersion with clearly discernible enter and exit events, but rather they may be extended, through mobile-enabled computing, toward ubiquity in terms of both time and space. Based on Media Synchronicity Theory, potential benefits are envisioned, particularly in the case of collaborative learning environments, from shortened response cycles and increased real time interaction opportunities. At the same time, a number of challenging issues must be addressed in designing such an environment to ensure user acceptance and to maximize realization of the potential. Third Generation (3G) Threaded Discussion has been conceptualized as an environment, well suited to mobile learning (m-learning), that could leverage mobile-enabled ubiquity to achieve a degree of extended immersion and thereby accrue the associated collaboration benefits. Exploring this conceptualization serves to help surface both the opportunities and the challenges associated with such environments and to identify promising design approaches, such as the use of intelligent agents. This revised version was published online in March 2005 with corrections to the cover date.

14.

U.S. Cybercrime Law: Defining Offenses

Susan W. Brenner. Information Systems Frontiers, 2004, 6(2): 115-132

In recent years, a new term has arisen, cybercrime, which essentially denotes the use of computer technology to commit or to facilitate the commission of unlawful acts, or crimes. This article explains why we treat cybercrime as a special class of crime and why we need special statutes to define cybercrime offenses. It explains the relationship between state and federal law, notes the various types of cybercrimes and surveys the offenses that are created by state and federal law in the United States.

15.

Networked Collaborative Learning in the Study of Modern History and Literature

Guglielmo Trentin. Computers and the Humanities, 2004, 38(3): 299-315

Many teachers adopt networked collaborative learning strategies even though these approaches systematically increase the time needed to deal with a given subject. But who's making them do it? Probably there has to be a return on investment, in terms of time and obviously in terms of educational results, which justifies that commitment. After surveying the particular features of two experimental projects based on networked collaborative learning, the paper will then offer a series of thoughts triggered by observation of the results and the dynamics generated by this specific approach. The purpose of these thoughts is to identify some key factors that make it possible to measure the real added value produced by network collaboration in terms of the acquisition of skills, knowledge, methods and attitudes that go beyond the mere learning of contents (however fundamental this may be). And it is precisely on the basis of these considerations that teachers usually answer the above question, explaining who (or what) made them do it!

16.

Algorithms and Lower Bounds for On-Line Learning of Geometrical Concepts

Wolfgang Maass, György Turán. Machine Learning, 1994, 14(3): 251-269

The complexity of on-line learning is investigated for the basic classes of geometrical objects over a discrete (digitized) domain. In particular, upper and lower bounds are derived for the complexity of learning algorithms for axis-parallel rectangles, rectangles in general position, balls, half-spaces, intersections of half-spaces, and semi-algebraic sets. The learning model considered is the standard model for on-line learning from counterexamples.

17.

A Typology of Translation Problems for Eurotra Translation Machines

Andrew Way, Ian Crookston, Jane Shelton. Machine Translation, 1997, 12(4): 323-374

This paper presents a detailed study of Eurotra Machine Translation engines, namely the mainstream Eurotra software known as the E-Framework, and two unofficial spin-offs – the C,A,T and Relaxed Compositionality translator notations – with regard to how these systems handle hard cases, and in particular their ability to handle combinations of such problems. In the C,A,T translator notation, some cases of complex transfer are wild, meaning roughly that they interact badly when presented with other complex cases in the same sentence. The effect of this is that each combination of a wild case and another complex case needs ad hoc treatment. The E-Framework is the same as the C,A,T notation in this respect. In general, the E-Framework is equivalent to the C,A,T notation for the task of transfer. The Relaxed Compositionality translator notation is able to handle each wild case (bar one exception) with a single rule even where it appears in the same sentence as other complex cases.

18.

Minimization of norms and logarithmic norms by diagonal similarities

Dr. T. Ström. Computing, 1972, 10(1-2): 1-7

It is a commonly occurring problem to find good norms $\|\cdot\|$ or logarithmic norms $\mu(\cdot)$ for a given matrix in the sense that they should be close to respectively the spectral radius $\rho(A)$ and the spectral abscissa $\alpha(A)$. Examples may be the certification that $A$ is convergent, i.e. $\rho(A) \le \|A\| < 1$, or stable, i.e. $\alpha(A) \le \mu(A) < 0$. Often the ordinary norms do not suffice and one would like to try simple modifications of them such as using an ordinary norm for a diagonally transformed matrix. This paper treats this problem for some of the ordinary norms.

Minimisierung von Normen und Logarithmischen Normen durch Diagonale Transformationen

Zusammenfassung (translated from German): A frequently occurring practical problem is the construction of good norms $\|\cdot\|$ and logarithmic norms $\mu(\cdot)$ for a given matrix $A$. Good here means that $\|A\|$ approximates the spectral radius $\rho(A) = \max_i |\lambda_i|$ well and that $\mu(A)$ approximates the spectral abscissa $\alpha(A) = \max_i \operatorname{Re} \lambda_i$ well. Examples arise for convergent matrices, where $\rho(A) \le \|A\| < 1$ is desired, and for stable matrices, where $\alpha(A) \le \mu(A) < 0$ is to be shown. We investigate how far one can get with diagonal transformations and the most common norms.
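The idea of applying an ordinary norm to a diagonally transformed matrix can be illustrated with a small numeric sketch. It only evaluates the maximum row sum norm of $D^{-1} A D$ for one hand-picked positive diagonal $D$; it does not implement the paper's minimization. The matrix `A`, the diagonal `[1.0, 0.001]`, and the function names are assumptions chosen for illustration.

```python
def inf_norm(matrix):
    """Maximum absolute row sum (the matrix norm induced by the max vector norm)."""
    return max(sum(abs(x) for x in row) for row in matrix)


def diagonal_similarity(matrix, d):
    """Return D^{-1} A D for D = diag(d), where every entry of d is positive."""
    n = len(matrix)
    return [[matrix[i][j] * d[j] / d[i] for j in range(n)] for i in range(n)]


# Hypothetical upper triangular example: both eigenvalues are 0.5, so the
# spectral radius is 0.5, yet the unscaled norm is far above 1.
A = [[0.5, 100.0],
     [0.0, 0.5]]

plain = inf_norm(A)
scaled = inf_norm(diagonal_similarity(A, [1.0, 0.001]))
```

With this scaling the norm drops below 1, which certifies convergence of `A` even though the unscaled norm gives no such certificate.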


19.

Efficient Construction of Regression Trees with Range and Region Splitting

Yasuhiko Morimoto, Hiromu Ishii, Shinichi Morishita. Machine Learning, 2001, 45(3): 235-259

We propose a method for constructing regression trees with range and region splitting. We present an efficient algorithm for computing the optimal two-dimensional region that minimizes the mean squared error of an objective numeric attribute in a given database. As two-dimensional regions, we consider a class R of grid-regions, such as x-monotone, rectilinear-convex, and rectangular, in the plane associated with two numeric attributes. We compute the optimal region R. We propose to use a test that splits data into those that lie inside the region R and those that lie outside the region in the construction of regression trees. Experiments confirm that the use of region splitting gives compact and accurate regression trees in many domains.
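A one-dimensional analogue of the inside/outside test can be sketched as a range split: choose the interval of one numeric attribute that minimizes the total squared error when the data inside and outside the interval are each predicted by their own mean. This is a brute-force illustration over all intervals, not the efficient two-dimensional region algorithm of the paper; the function name and the sample data are assumptions.

```python
def best_range_split(values, targets):
    """Return (error, low, high) for the interval [low, high] of `values`
    minimizing the summed squared error of `targets` inside plus outside it."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    y = [targets[i] for i in order]
    n = len(y)

    # Prefix sums of targets and squared targets give each interval's error in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for t in y:
        prefix.append(prefix[-1] + t)
        prefix_sq.append(prefix_sq[-1] + t * t)

    def sse(total, total_sq, count):
        # Sum of squared deviations from the mean; an empty group contributes 0.
        return total_sq - total * total / count if count else 0.0

    best = None
    for i in range(n):
        for j in range(i + 1, n + 1):
            s, sq, c = prefix[j] - prefix[i], prefix_sq[j] - prefix_sq[i], j - i
            err = sse(s, sq, c) + sse(prefix[n] - s, prefix_sq[n] - sq, n - c)
            if best is None or err < best[0]:
                best = (err, values[order[i]], values[order[j - 1]])
    return best


# Hypothetical data: the target is high only when the attribute lies in [3, 4].
split = best_range_split([1, 2, 3, 4, 5], [0.0, 0.0, 10.0, 10.0, 0.0])
```

A single threshold split cannot isolate the middle band in this data, whereas one range test separates it exactly, which is the kind of compactness gain the abstract attributes to range and region splitting.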

20.

A Stylometric Analysis of Yaşar Kemal’s İnce Memed Tetralogy

Jon M. Patton, Fazli Can. Computers and the Humanities, 2004, 38(4): 457-467

We analyze four İnce Memed novels of Yaşar Kemal using six style markers: most frequent words, syllable counts, word type – or part of speech – information, sentence length in terms of words, word length in text, and word length in vocabulary. For analysis we divide each novel into five thousand word text blocks and count the frequencies of each style marker in these blocks. The style markers showing the best separation are most frequent words and sentence lengths. We use stepwise discriminant analysis to determine the best discriminators of each style marker. We then use these markers in cross validation based discriminant analysis. Further investigation based on multiple analysis of variance (MANOVA) reveals how the attributes of each style marker group distinguish among the volumes.
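Two of the style markers named in this abstract, most frequent words per text block and sentence length in words, can be computed with a short sketch. The tokenization rule, the sentence delimiter set, the function names, and the tiny English sample are assumptions of this illustration; the abstract does not describe the authors' preprocessing, and Turkish-specific case folding is ignored here.

```python
import re
from collections import Counter


def word_blocks(text, block_size):
    """Split the text into consecutive blocks of `block_size` lowercase words."""
    words = re.findall(r"\w+", text.lower())
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def top_word_frequencies(block, top_words):
    """Count how often each of the chosen frequent words occurs in one block."""
    counts = Counter(block)
    return [counts[w] for w in top_words]


def mean_sentence_length(text):
    """Average number of words per sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(re.findall(r"\w+", s)) for s in sentences) / len(sentences)


sample = "The cat sat. The dog ran fast."
blocks = word_blocks(sample, block_size=4)  # the study uses 5000-word blocks
freqs = top_word_frequencies(blocks[0], ["the", "cat"])
avg_len = mean_sentence_length(sample)
```

Each block then yields one feature vector per style marker, which is the form of input a discriminant analysis would need.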