Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Lower Bound Methods and Separation Results for On-Line Learning Models   Cited by: 4 (self-citations: 4; citations by others: 0)
Maass, Wolfgang; Turán, György. Machine Learning, 1992, 9(2-3): 107-145
We consider the complexity of concept learning in various common models for on-line learning, focusing on methods for proving lower bounds on the learning complexity of a concept class. Among others, we consider the model for learning with equivalence and membership queries. For this model we give lower bounds on the number of queries needed to learn a concept class in terms of the Vapnik-Chervonenkis dimension of the class, and in terms of the complexity of learning with arbitrary equivalence queries. Furthermore, we survey other known lower bound methods and we exhibit all known relationships between learning complexities in the models considered and some relevant combinatorial parameters. As it turns out, the picture is almost complete. This paper has been written so that it can be read without previous knowledge of Computational Learning Theory.

2.
Long, Philip M. Machine Learning, 1999, 37(3): 337-354
We show that a bound on the rate of drift of the distribution generating the examples is sufficient for agnostic learning to relative accuracy , where c > 0 is a constant; this matches a known necessary condition to within a constant factor. We establish a sufficient condition for the realizable case, also matching a known necessary condition to within a constant factor. We provide a relatively simple proof of a bound of + on the sample complexity of agnostic learning in a fixed environment.

3.
The unit price seat reservation problem is investigated. The seat reservation problem is the problem of assigning seat numbers on-line to requests for reservations in a train traveling through k stations. We consider the version where all tickets have the same price and where requests are treated fairly, that is, a request which can be fulfilled must be granted. For fair deterministic algorithms, we provide an asymptotically matching upper bound to the existing lower bound which states that all fair algorithms for this problem are -competitive on accommodating sequences, when there are at least three seats. Additionally, we give an asymptotic upper bound of for fair randomized algorithms against oblivious adversaries. We also examine concrete on-line algorithms, First-Fit and Random, for the special case of two seats. Tight analyses of their performance are given.
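The First-Fit rule named in this abstract can be illustrated with a minimal sketch. It assumes stations are numbered 1 to k, that a request (start, end) occupies one seat on every leg from start up to end, and that a passenger keeps the same seat for the whole trip; the function name `first_fit` and its argument layout are hypothetical and not taken from the paper.

```python
def first_fit(num_seats, num_stations, requests):
    """Grant each request (start, end) the lowest-numbered seat free on all its legs.

    Returns one entry per request: the seat index granted, or None if no single
    seat is free for the whole trip (the only case in which a fair algorithm
    is allowed to reject).
    """
    # occupied[seat][leg] is True when the seat is taken on the leg between
    # station leg + 1 and station leg + 2 (stations are numbered from 1).
    occupied = [[False] * (num_stations - 1) for _ in range(num_seats)]
    assignment = []
    for start, end in requests:
        legs = range(start - 1, end - 1)
        chosen = None
        for seat in range(num_seats):
            if not any(occupied[seat][leg] for leg in legs):
                chosen = seat  # first (lowest-numbered) seat that fits
                break
        if chosen is not None:
            for leg in legs:
                occupied[chosen][leg] = True
        assignment.append(chosen)
    return assignment
```

Because the loop rejects a request only when no seat is free on all of its legs, this sketch is fair in the sense used above.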

4.
We present results of computational experiments with an extension of the Perceptron algorithm by a special type of simulated annealing. The simulated annealing procedure employs a logarithmic cooling schedule , where is a parameter that depends on the underlying configuration space. For sample sets S of n-dimensional vectors generated by randomly chosen polynomials , we try to approximate the positive and negative examples by linear threshold functions. The approximations are computed by both the classical Perceptron algorithm and our extension with logarithmic cooling schedules. For and , the extension outperforms the classical Perceptron algorithm by about 15% when the sample size is sufficiently large. The parameter was chosen according to estimates of the maximum escape depth from local minima of the associated energy landscape.

5.
A novel optimal-order, optimal-resource parallel multibody algorithm with general system applicability is derived directly from the sequential recursive methods and the most recent developments in recursive constraint treatments. This new Recursive Coordinate Reduction Parallelism (RCRP) is the first optimal order parallel direct method with a sequential implementation that is exactly the efficient algorithm. Consequently, the RCRP sets new benchmarks for performance over a wide range of problem size and parallel resources. Comparisons to existing methods also demonstrate that the RCRP is presently the best general parallel method.

6.
The Laplace matrix is the matrix $L = (\ell_{ij})_{n \times n}$ with nonpositive off-diagonal elements and zero row sums. A weighted orgraph corresponds to each Laplace matrix, its properties being closely related to the algebraic properties of the Laplace matrix. The normalized Laplace matrix is the Laplace matrix where $-1/n \le \ell_{ij} \le 0$ for all $i \ne j$. The paper was devoted to the spectrum of the Laplace matrices and to the relationship between the spectra of the Laplace and stochastic matrices. The normalized Laplace matrices were proved to be semiconvergent. It was established that the multiplicity of the eigenvalue 0 of the matrix is equal to the in-forest dimension of the corresponding orgraph, and the multiplicity of the eigenvalue 1 is one less than the in-forest dimension of the complementary orgraph. The spectra of the matrices belong to the intersection of two circles of radius 1 – 1/n centered at the points 1/n and 1 – 1/n, respectively. Additionally, the domain that comprises them is included in the intersection of two angles (defined in the paper) with vertices 0 and 1 and the band (at the limit |Im(z)| < 1/). A polygon with all points being the eigenvalues of the normalized n-order Laplace matrices was constructed. Translated from Avtomatika i Telemekhanika, No. 5, 2005, pp. 47–62. Original Russian Text Copyright © 2005 by Agaev, Chebotarev. This work was supported by the Russian Foundation for Basic Research, project no. 02-01-00614.

7.
New optimal control problems are considered for distributed systems described by elliptic equations with conjugate conditions and a quadratic minimized function. Highly accurate computational discretization schemes are constructed for the case where a feasible control set coincides with the full Hilbert space of controls.

8.
The present paper proposes a new learning model, called stochastic finite learning, and shows the whole class of pattern languages to be learnable within this model. This main result is achieved by providing a new and improved average-case analysis of the Lange–Wiehagen (New Generation Computing, 8, 361–370) algorithm learning the class of all pattern languages in the limit from positive data. The complexity measure chosen is the total learning time, i.e., the overall time taken by the algorithm until convergence. The expectation of the total learning time is carefully analyzed and exponentially shrinking tail bounds for it are established for a large class of probability distributions. For every pattern containing k different variables it is shown that Lange and Wiehagen's algorithm possesses an expected total learning time of , where and are two easily computable parameters arising naturally from the underlying probability distributions, and E[] is the expected example string length. Finally, assuming a bit of domain knowledge concerning the underlying class of probability distributions, it is shown how to convert learning in the limit into stochastic finite learning.

9.
Auer, Peter; Warmuth, Manfred K. Machine Learning, 1998, 32(2): 127-150
Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow) keeps one weight for each of the n variables and does multiplicative updates to its weights. We develop a randomized version of Winnow and prove bounds for an adaptation of the algorithm for the case when the disjunction may change over time. In this case a possible target disjunction schedule is a sequence of disjunctions (one per trial) and the shift size is the total number of literals that are added/removed from the disjunctions as one progresses through the sequence. We develop an algorithm that predicts nearly as well as the best disjunction schedule for an arbitrary sequence of examples. This algorithm that allows us to track the predictions of the best disjunction is hardly more complex than the original version. However, the amortized analysis needed for obtaining worst-case mistake bounds requires new techniques. In some cases our lower bounds show that the upper bounds of our algorithm have the right constant in front of the leading term in the mistake bound and almost the right constant in front of the second leading term. Computer experiments support our theoretical findings.
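To make "one weight per variable with multiplicative updates" concrete, here is a minimal sketch of a basic deterministic multiplicative-update learner for monotone disjunctions. It is not the randomized or shifting-target variant developed in the paper. The promotion/demotion factor `alpha = 2`, the default threshold equal to n, and the function name `winnow` are assumptions chosen for illustration.

```python
def winnow(examples, n, alpha=2.0, threshold=None):
    """Run a basic multiplicative-update learner over (x, label) pairs.

    x is a sequence of n values in {0, 1}; label is 0 or 1.
    Returns the final weights and the number of prediction mistakes.
    """
    theta = float(n) if threshold is None else threshold  # assumed default threshold
    weights = [1.0] * n  # one weight per variable
    mistakes = 0
    for x, label in examples:
        total = sum(w for w, xi in zip(weights, x) if xi)
        prediction = 1 if total >= theta else 0
        if prediction != label:
            mistakes += 1
            # Promote active weights after a false negative,
            # demote them after a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
    return weights, mistakes
```

Weights change only on mistakes and only for variables that are active in the misclassified example, which is what keeps the update multiplicative and local.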

10.
Inducing Multi-Level Association Rules from Multiple Relations   Cited by: 4 (self-citations: 0; citations by others: 4)

11.
Do, Dang-Khoa. Reliable Computing, 2004, 10(6): 489-500
The spigot approach used in the previous paper (Reliable Computing 7 (3) (2001), pp. 247–273) for root computation is now applied to natural logarithms. The logarithm ln Q with Q , Q > 1 is decomposed into a sum of two addends $k_1 \times \ln Q_1 + k_2 \times \ln Q_2$ with $k_1, k_2$ , then each of them is computed by the spigot algorithm and summation is carried out using integer arithmetic. The whole procedure is not literally a spigot algorithm, but advantages are the same: only integer arithmetic is needed whereas arbitrary accuracy is achieved and absolute reliability is guaranteed. The concrete procedure based on the decomposition with p, q ( – {0}), p < q is simple and ready for implementation. In addition to the mentioned paper, means for determining an upper bound for the biggest integer occurring in the process of spigot computing are now provided, which is essential for the reliability of machine computation.

12.
13.
This work is concerned with online learning from expert advice. Extensive work on this problem generated numerous expert advice algorithms whose total loss is provably bounded above in terms of the loss incurred by the best expert in hindsight. Such algorithms were devised for various problem variants corresponding to various loss functions. For some loss functions, such as the square, Hellinger and entropy losses, optimal algorithms are known. However, for two of the most widely used loss functions, namely the 0/1 and absolute loss, there are still gaps between the known lower and upper bounds. In this paper we present two new expert advice algorithms and prove for them the best known 0/1 and absolute loss bounds. Given an expert advice algorithm ALG, the goal is to form an upper bound on the regret $L_{ALG} - L^*$ of ALG, where $L_{ALG}$ is the loss of ALG and $L^*$ is the loss of the best expert in hindsight. Typically, regret bounds of a canonical form C · are sought where N is the number of experts and C is a constant. So far, the best known constant for the absolute loss function is C = 2.83, which is achieved by the recent IAWM algorithm of Auer et al. (2002). For the 0/1 loss function no bounds of this canonical form are known and the best known regret bound is , where $C_1$ = e – 2 and $C_2$ = 2 . This bound is achieved by a P-norm algorithm of Gentile and Littlestone (1999). Our first algorithm is a randomized extension of the guess and double algorithm of Cesa-Bianchi et al. (1997). While the guess and double algorithm achieves a canonical regret bound with C = 3.32, the expected regret of our randomized algorithm is canonically bounded with C = 2.49 for the absolute loss function. The algorithm utilizes one random choice at the start of the game. Like the deterministic guess and double algorithm, a deficiency of our algorithm is that it occasionally restarts itself and therefore forgets what it learned.
Our second algorithm does not forget and enjoys the best known asymptotic performance guarantees for both the absolute and 0/1 loss functions. Specifically, in the case of the absolute loss, our algorithm is canonically bounded with C approaching and in the case of the 0/1 loss, with C approaching 3/ . In the 0/1 loss case the algorithm is randomized and the bound is on the expected regret.
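The regret of an algorithm, defined above as its loss minus the loss of the best expert in hindsight, can be computed directly for a simple baseline. The sketch below uses a generic exponentially weighted average forecaster under the absolute loss. It is not either of the two algorithms proposed in the paper, and the learning rate `eta` and the function name are assumptions made for illustration.

```python
import math


def exponential_weights(expert_predictions, outcomes, eta=0.5):
    """Return (algorithm loss, best expert loss, regret) under absolute loss.

    expert_predictions[t][i] is expert i's prediction in [0, 1] at round t;
    outcomes[t] is the outcome in [0, 1] revealed after the forecast.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    alg_loss = 0.0
    expert_loss = [0.0] * n_experts
    for preds, y in zip(expert_predictions, outcomes):
        # Forecast with the weighted average of the experts' predictions.
        forecast = sum(w * p for w, p in zip(weights, preds)) / sum(weights)
        alg_loss += abs(forecast - y)
        for i, p in enumerate(preds):
            loss = abs(p - y)
            expert_loss[i] += loss
            weights[i] *= math.exp(-eta * loss)  # shrink experts that erred
    best = min(expert_loss)
    return alg_loss, best, alg_loss - best
```

With a fixed `eta` this forecaster never restarts, so it does not "forget" in the sense discussed above, but it also does not attain the constants quoted in the abstract.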

14.
The unit ball random geometric graph G has as its vertices n points distributed independently and uniformly in the unit ball in , with two vertices adjacent if and only if their ℓp-distance is at most λ. Like its cousin the Erdős–Rényi random graph, G has a connectivity threshold: an asymptotic value for λ in terms of n, above which G is connected and below which G is disconnected. In the connected zone we determine upper and lower bounds for the graph diameter of G. Specifically, almost always, , where is the ℓp-diameter of the unit ball B. We employ a combination of methods from probabilistic combinatorics and stochastic geometry.
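The construction described above can be explored empirically with a small sketch. It assumes a finite p ≥ 1, takes the unit ball to be the ℓp unit ball in d dimensions (the dimension and the exact ball are assumptions, since the symbols did not survive in the text above), and samples uniformly by rejection from the enclosing cube. All function names are hypothetical.

```python
import random
from collections import deque


def lp_norm(v, p):
    """The l_p norm of a vector, for finite p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def sample_unit_ball(n, d, p, rng):
    """Draw n points uniformly from the l_p unit ball by rejection sampling."""
    points = []
    while len(points) < n:
        candidate = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        if lp_norm(candidate, p) <= 1.0:
            points.append(candidate)
    return points


def geometric_graph(points, lam, p):
    """Adjacency lists: i and j are adjacent iff their l_p distance is at most lam."""
    adj = [[] for _ in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            diff = [a - b for a, b in zip(points[i], points[j])]
            if lp_norm(diff, p) <= lam:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def graph_diameter(adj):
    """Largest hop distance over all vertex pairs, or None if disconnected."""
    best = 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adj):
            return None
        best = max(best, max(dist.values()))
    return best
```

Sweeping `lam` for a fixed sample and recording where `graph_diameter` first stops returning None gives a rough empirical view of the connectivity threshold.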

15.
Adaptive algorithms for real-time and proactive detection of network/service anomalies, i.e., soft performance degradations, in transaction-oriented wide area networks (WANs) have been developed. These algorithms (i) adaptively sample and aggregate raw transaction records to compute service-class based traffic intensities, in which potential network anomalies are highlighted; (ii) construct dynamic and service-class based performance thresholds for detecting network and service anomalies; and (iii) perform service-class based and real-time network anomaly detection. These anomaly detection algorithms are implemented as a real-time software system called TRISTAN (TRansaction InSTantaneous Anomaly Notification), which is deployed in the AT&T Transaction Access Services (TAS) network. The TAS network is a commercially important, high volume (millions of transactions per day), multiple service classes (tens), hybrid telecom and data WAN that services transaction traffic such as credit card transactions in the US and neighboring countries. TRISTAN is demonstrated to be capable of automatically and adaptively detecting network/service anomalies and correctly identifying the corresponding "guilty" service classes in TAS. TRISTAN can detect network/service faults that elude detection by the traditional alarm-based network monitoring systems.

16.
This paper treats classes in the polynomial hierarchy of type two, , that were first developed by Townsend as a natural extension of the Meyer-Stockmeyer polynomial hierarchy in complexity theory. For these classes, it is discussed whether each of them has the extension property and the three recursion-theoretic properties: separation, reduction, and pre-wellordering. This paper shows that every , lacks the pre-wellordering property by using a probabilistic argument on constant-depth Boolean circuits. From the assumption NP = coNP it follows by a pruning argument that has the separation and extension properties.

17.
Binhai Zhu. GeoInformatica, 2000, 4(3): 317-334
This paper studies the idea of answering range searching queries using simple data structures. The only data structure we need is the Delaunay Triangulation of the input points. The idea is to first locate a vertex of the (arbitrary) query polygon and walk along the boundary of the polygon in the Delaunay Triangulation and report all the points enclosed by the query polygon. For a set of uniformly distributed random points in 2-D and a query polygon, the expected query time of this algorithm is $O(n^{1/3} + Q + \mathbf{E}K + L_r n^{1/2})$, where Q is the size of the query polygon, $\mathbf{E}K = O(n \cdot \text{area of the query polygon})$ is the expected number of output points, $L_r$ is a parameter related to the shape of the query polygon and n, and $L_r$ is always bounded by the sum of the edge lengths of the query polygon. Theoretically, when $L_r = O(1/n^{1/6})$ the expected query time is $O(n^{1/3} + Q + \mathbf{E}K)$, which improves the best known average query time for general range searching. Besides the theoretical meaning, the good property of this algorithm is that once the Delaunay Triangulation is given, no additional preprocessing is needed. In order to obtain empirical results, we design a new algorithm for generating random simple polygons within a given domain. Our empirical results show that the constant coefficient of the algorithm is small, at least for the special (practical) cases when the query polygon is either a triangle (simplex range searching) or an axis-parallel box (orthogonal range searching) and for the general case when the query polygons are generated by our new polygon-generating algorithms and their sizes are relatively small.

18.
A binary code is called $\mathbb{Z}_4$-linear if its quaternary Gray map preimage is linear. We show that the set of all quaternary linear Preparata codes of length $n = 2^m$, m odd, m ≥ 3, is nothing more than the set of codes of the form with
where $T_\lambda(\cdot)$ and $S_\psi(\cdot)$ are vector fields of a special form defined over the binary extended linear Hamming code H n of length n. An upper bound on the number of nonequivalent quaternary linear Preparata codes of length n is obtained, namely, . A representation for binary Preparata codes contained in perfect Vasil’ev codes is suggested. Translated from Problemy Peredachi Informatsii, No. 2, 2005, pp. 50–62. Original Russian Text Copyright © 2005 by Tokareva. Supported in part by the Ministry of Education of the Russian Federation program “Development of the Scientific Potential of the Higher School,” project no. 512.

19.
Any given n×n matrix A is shown to be a restriction, to the A-invariant subspace, of a nonnegative N×N matrix B of spectral radius $\rho(B)$ arbitrarily close to $\rho(A)$. A difference inclusion , where is a compact set of matrices, is asymptotically stable if and only if can be extended to a set of nonnegative matrices B with or . Similar results are derived for differential inclusions.

20.
The Convergence of TD(λ) for General λ   Cited by: 1 (self-citations: 0; citations by others: 1)
Peter Dayan. Machine Learning, 1992, 8(3-4): 341-362
The method of temporal differences (TD) is one way of making consistent predictions about the future. This paper uses some analysis of Watkins (1989) to extend a convergence theorem due to Sutton (1988) from the case which only uses information from adjacent time steps to that involving information from arbitrary ones. It also considers how this version of TD behaves in the face of linearly dependent representations for states, demonstrating that it still converges, but to a different answer from the least mean squares algorithm. Finally it adapts Watkins' theorem that Q-learning, his closely related prediction and action learning method, converges with probability one, to demonstrate this strong form of convergence for a slightly modified version of TD.
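The difference between using only adjacent time steps (λ = 0) and using information from arbitrary ones (λ > 0) can be seen in a small tabular sketch. It uses online updates with accumulating eligibility traces, which is one common formulation and not necessarily the exact update rule analysed in the paper. The step size `alpha`, the discount `gamma`, and the episode encoding are assumptions made for illustration.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.5):
    """Tabular TD(lambda) value prediction with accumulating eligibility traces.

    Each episode is a list of (state, reward, next_state) transitions;
    next_state is None when the transition ends the episode.
    """
    values = [0.0] * n_states
    for episode in episodes:
        traces = [0.0] * n_states  # eligibility of each state for credit
        for state, reward, next_state in episode:
            next_value = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * next_value - values[state]
            traces[state] += 1.0
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam  # decay credit for older states
    return values
```

On a two-step episode whose only reward arrives at the end, `lam=0.0` updates just the last state, while `lam=1.0` passes the same correction back to the first state as well.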
