首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
We consider a class of games with real-valued strategies and payoff information available only in the form of data from a given sample of strategy profiles. Solving such games with respect to the underlying strategy space requires generalizing from the data to a complete payoff-function representation. We address payoff-function learning as a standard regression problem, with provision for capturing known structure (e.g., symmetry) in the multiagent environment. To measure learning performance, we consider the relative utility of prescribed strategies, rather than the accuracy of payoff functions per se. We demonstrate our approach and evaluate its effectiveness on two examples: a two-player version of the first-price sealed-bid auction (with known analytical form), and a five-player market-based scheduling game (with no known solution). Additionally, we explore the efficacy of using relative utility of strategies as a target of supervised learning and as a learning model selector. Our experiments demonstrate its effectiveness in the former case, though not in the latter.  相似文献   

2.
A linear differential game whose payoff is determined by a convex function of the state vector at a given terminal time is investigated. An explicit expression for the greatest lower bound of the value of the game is obtained. A simple condition which guarantees the existence of the value of the game is derived by a geometric approach. Under this condition, instead of solving the Bellman equation, the value of the game can be calculated by maximizing a function on the unit sphere of the output space whose dimension is usually much less than that of the state space. The method of synthesizing the optimal strategies is also derived. Another game called a minimax energy game whose payoff is given by the difference between the consumed energies of both players is treated briefly, and an extension of the concept of controllability is discussed.  相似文献   

3.
Tropical polyhedra have been recently used to represent disjunctive invariants in static analysis. To handle larger instances, tropical analogues of classical linear programming results need to be developed. This motivation leads us to study the tropical analogue of the classical linear-fractional programming problem. We construct an associated parametric mean payoff game problem, and show that the optimality of a given point, or the unboundedness of the problem, can be certified by exhibiting a strategy for one of the players having certain infinitesimal properties (involving the value of the game and its derivative) that we characterize combinatorially. We use this idea to design a Newton-like algorithm to solve tropical linear-fractional programming problems, by reduction to a sequence of auxiliary mean payoff game problems.  相似文献   

4.
5.
We propose a numerical method for an approximate construction of the game value and optimal control laws in a linear-convex positional differential game with a quality index that evaluates the collection of deviations of the motion trajectory at given time moments from given targets. We show the algorithmic complexity of this method, details of software implementation, and results of numerical experiments.  相似文献   

6.
7.
We give a simple, direct, and constructive proof of memoryless determinacy for parity and mean payoff games. First, we prove by induction that the finite duration versions of these games, played until some vertex is repeated, are determined and both players have memoryless winning strategies. In contrast to the proof of Ehrenfeucht and Mycielski, Internat. J. Game Theory, 8 (1979) 109–113, our proof does not refer to the infinite-duration versions. Second, we show that memoryless determinacy straightforwardly generalizes to infinite duration versions of parity and mean payoff games.  相似文献   

8.
In game theory, an Evolutionarily Stable Set (ES set) is a set of Nash Equilibrium (NE) strategies that give the same payoffs. Similar to an Evolutionarily Stable Strategy (ES strategy), an ES set is also a strict NE. This work investigates the evolutionary stability of classical and quantum strategies in the quantum penny flip games. In particular, we developed an evolutionary game theory model to conduct a series of simulations where a population of mixed classical strategies from the ES set of the game were invaded by quantum strategies. We found that when only one of the two players’ mixed classical strategies were invaded, the results were different. In one case, due to the interference phenomenon of superposition, quantum strategies provided more payoff, hence successfully replaced the mixed classical strategies in the ES set. In the other case, the mixed classical strategies were able to sustain the invasion of quantum strategies and remained in the ES set. Moreover, when both players’ mixed classical strategies were invaded by quantum strategies, a new quantum ES set was emerged. The strategies in the quantum ES set give both players payoff 0, which is the same as the payoff of the strategies in the mixed classical ES set of this game.  相似文献   

9.
The puzzle of altruistic behaviours among multi-agent systems poses a dilemma, which has been an overlapping topic that covers many subjects. The public goods game can be regarded as a paradigm for modelling and exploring it. In the traditional definition of public goods game, the equally divided benefit among all participants leads to the dominance of defection. Much effort has been made to explain the evolution of cooperation, including the model in which the payoff ceilings for defectors are introduced. Further, we study a three-strategy evolutionary public goods game by providing the role of being loners. The payoff ceilings will take effect when the number of cooperators exceeds some threshold. Analysis results by following the replicator dynamics indicate that lower values of the payoff ceilings can better promote levels of public cooperation. Importantly, a remarkable cyclic route has been found: when receiving relative lower benefits, loners act as catalysts, helping the population to escape from mutual defection to cooperation. And, the stable equilibrium points from cooperation to isolation can be realised by improving the fixed payoffs of loners. Finally, broader ceilings also for cooperators provide us more hints about how to suppress the spreading of defectors under certain conditions.  相似文献   

10.
An optimal incentive strategy by which the leader suffers the least losses in punishing the follower's deviation from the decision desired by the leader is defined. Sufficient conditions for the existence of optimal incentive strategy are given. Static and dynamic leader-follower games with quadratic cost functionals are investigated. It is shown that leader-follower games with quadratic cost functionals admit optimal incentive strategies under the follower's decision variable being scalar, there exists a unique linear optimal incentive strategy. Such an incentive strategy can be explicitly determined  相似文献   

11.
12.
Real-time strategy games share many aspects with real situations in domains such as battle planning, air traffic control, and emergency response team management which makes them appealing test-beds for Artificial Intelligence (AI) and machine learning. End-user annotations could help to provide supplemental information for learning algorithms, especially when training data is sparse. This paper presents a formative study to uncover how experienced users explain game play in real-time strategy games. We report the results of our analysis of explanations and discuss their characteristics that could support the design of systems for use by experienced real-time strategy game users in specifying or annotating strategy-oriented behavior.  相似文献   

13.
A multilevel feedback Stackelberg strategy is formulated for systems withMplayers arranged in a linear hierarchy of decision making. For a linear-quadratic discrete-time game, sufficient conditions are given for existence and uniqueness of solution.  相似文献   

14.
Interactive texture-based volume rendering for large data sets   总被引:6,自引:0,他引:6  
To employ direct volume rendering, TRex uses parallel graphics hardware, software-based compositing, and high-performance I/O to provide near-interactive display rates for time-varying, terabyte-sized data sets. We present a scalable, pipelined approach for rendering data sets too large for a single graphics card. To do so, we take advantage of multiple hardware rendering units and parallel software compositing. The goals of TRex, our system for interactive volume rendering of large data sets, are to provide near-interactive display rates for time-varying, terabyte-sized uniformly sampled data sets and provide a low-latency platform for volume visualization in immersive environments. We consider 5 frames per second (fps) to be near-interactive rates for normal viewing environments and immersive environments to have a lower bound frame rate of l0 fps. Using TRex for virtual reality environments requires low latency - around 50 ms per frame or 100 ms per view update or stereo pair. To achieve lower latency renderings, we either render smaller portions of the volume on more graphics pipes or subsample the volume to render fewer samples per frame by each graphics pipe. Unstructured data sets must be resampled to appropriately leverage the 3D texture volume rendering method  相似文献   

15.
16.
In this paper, we present a robust and accurate algorithm for interactive image segmentation. The level set method is clearly advantageous for image objects with a complex topology and fragmented appearance. Our method integrates discriminative classification models and distance transforms with the level set method to avoid local minima and better snap to true object boundaries. The level set function approximates a transformed version of pixelwise posterior probabilities of being part of a target object. The evolution of its zero level set is driven by three force terms, region force, edge field force, and curvature force. These forces are based on a probabilistic classifier and an unsigned distance transform of salient edges. We further propose a technique that improves the performance of both the probabilistic classifier and the level set method over multiple passes. It makes the final object segmentation less sensitive to user interactions. Experiments and comparisons demonstrate the effectiveness of our method.  相似文献   

17.
We consider the problem of classification of an object given multiple observations that possibly include different transformations. The possible transformations of the object generally span a low-dimensional manifold in the original signal space. We propose to take advantage of this manifold structure for the effective classification of the object represented by the observation set. In particular, we design a low complexity solution that is able to exploit the properties of the data manifolds with a graph-based algorithm. Hence, we formulate the computation of the unknown label matrix as a smoothing process on the manifold under the constraint that all observations represent an object of one single class. It results into a discrete optimization problem, which can be solved by an efficient and simple, yet effective, algorithm. We demonstrate the performance of the proposed graph-based algorithm in the classification of sets of multiple images. Moreover, we show its high potential in video-based face recognition, where it outperforms state-of-the-art solutions that fall short of exploiting the manifold structure of the face image data sets.  相似文献   

18.
We describe and implement an algorithm for computing the set of reachable states of a continuous dynamic game. The algorithm is based on a proof that the reachable set is the zero sublevel set of the viscosity solution of a particular time-dependent Hamilton-Jacobi-Isaacs partial differential equation. While alternative techniques for computing the reachable set have been proposed, the differential game formulation allows treatment of nonlinear systems with inputs and uncertain parameters. Because the time-dependent equation's solution is continuous and defined throughout the state space, methods from the level set literature can be used to generate more accurate approximations than are possible for formulations with potentially discontinuous solutions. A numerical implementation of our formulation is described and has been released on the web. Its correctness is verified through a two vehicle, three dimensional collision avoidance example for which an analytic solution is available.  相似文献   

19.
In complex reasoning tasks it is often the case that there is no single, correct set of conclusions given some initial information. Instead, there may be several such conclusion sets, which we will call belief sets. In the present paper we introduce nonmonotonic belief set operators and selection operators to formalize and to analyze structural aspects of reasoning with multiple belief sets. We define and investigate formal properties of belief set operators as absorption, congruence, supradeductivity and weak belief monotony. Furthermore, it is shown that for each belief set operator satisfying strong belief cumulativity there exists a largest monotonic logic underlying it, thus generalizing a result for nonmonotonic inference operations. Finally, we study abstract properties of selection operators connected to belief set operators, which are used to choose some of the possible belief sets. This revised version was published online in June 2006 with corrections to the Cover Date.  相似文献   

20.
Many complex systems, whether biological, sociological, or physical ones, can be represented using networks. In these networks, a node represents an entity, and an arc represents a relationship/constraint between two entities. In discrete dynamics, one can construct a series of networks with each network representing a time snapshot of interaction among the different components in the system. Understanding these networks is a key to understand the dynamics of real and artificial systems. Network motifs are small graphs-usually three to four nodes-representing local structures. They have been widely used in studying complex systems and in characterizing features on the system level by analyzing locally how the substructures are formed. Frequencies of different network motifs have been shown in the literature to vary from one network to another, and conclusions hypothesized that these variations are due to the evolution/dynamics of the system. In this paper, we show for the first time that in strategy games, each game (i.e., type of dynamism) has its own signature of motifs and that this signature is maintained during the evolution of the game. We reveal that deterministic strategy games have unique footprints (motifs' count) that can be used to recognize and classify the game's type and that these footprints are consistent along the evolutionary path of the game. The findings of this paper have significance for a wide range of fields in cybernetics.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号