Similar Documents
20 similar documents found.
1.
We generalize the recent relative loss bounds for on-line algorithms where the additional loss of the algorithm on the whole sequence of examples over the loss of the best expert is bounded. The generalization allows the sequence to be partitioned into segments, and the goal is to bound the additional loss of the algorithm over the sum of the losses of the best experts for each segment. This is to model situations in which the examples change and different experts are best for certain segments of the sequence of examples. In the single segment case, the additional loss is proportional to log n, where n is the number of experts and the constant of proportionality depends on the loss function. Our algorithms do not produce the best partition; however, the loss bound shows that our predictions are close to those of the best partition. When the number of segments is k+1 and the sequence is of length ℓ, we can bound the additional loss of our algorithm over the best partition by O(k log n + k log(ℓ/k)). For the case when the loss per trial is bounded by one, we obtain an algorithm whose additional loss over the loss of the best partition is independent of the length of the sequence. The additional loss becomes O(k log n + k log(L/k)), where L is the loss of the best partition with k+1 segments. Our algorithms for tracking the predictions of the best expert are simple adaptations of Vovk's original algorithm for the single best expert case. As in the original algorithms, we keep one weight per expert, and spend O(1) time per weight in each trial.
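The tracking idea can be illustrated with a small sketch (not the authors' exact algorithm): keep one multiplicatively updated weight per expert and, after each trial, mix a small fraction of the total weight back toward the uniform distribution, so that a previously poor expert can become dominant again when the best expert changes between segments. The learning rate `eta` and share rate `alpha` below are illustrative parameters, not values from the paper; with `alpha = 0` the sketch reduces to the single-best-expert exponential-weights update.

```python
import numpy as np

def tracking_forecaster(expert_preds, outcomes, eta=0.5, alpha=0.01):
    """Exponentially weighted forecaster with a 'share' step, so the weights
    can recover quickly when the best expert changes between segments.

    expert_preds: array of shape (T, n) with each expert's prediction in [0, 1]
    outcomes:     array of shape (T,) with the true outcomes in [0, 1]
    Returns the algorithm's predictions and its cumulative absolute loss.
    """
    T, n = expert_preds.shape
    w = np.full(n, 1.0 / n)                     # one weight per expert, O(1) work per weight
    preds, total_loss = [], 0.0
    for t in range(T):
        p = float(w @ expert_preds[t])          # weighted-average prediction
        preds.append(p)
        total_loss += abs(p - outcomes[t])      # absolute loss of the algorithm
        loss = np.abs(expert_preds[t] - outcomes[t])
        w *= np.exp(-eta * loss)                # multiplicative (Vovk-style) update
        w /= w.sum()
        w = (1 - alpha) * w + alpha / n         # share step: mix back toward uniform
    return np.array(preds), total_loss
```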

2.
This work is concerned with online learning from expert advice. Extensive work on this problem generated numerous expert advice algorithms whose total loss is provably bounded above in terms of the loss incurred by the best expert in hindsight. Such algorithms were devised for various problem variants corresponding to various loss functions. For some loss functions, such as the square, Hellinger and entropy losses, optimal algorithms are known. However, for two of the most widely used loss functions, namely the 0/1 and absolute loss, there are still gaps between the known lower and upper bounds. In this paper we present two new expert advice algorithms and prove for them the best known 0/1 and absolute loss bounds. Given an expert advice algorithm ALG, the goal is to form an upper bound on the regret L_ALG − L* of ALG, where L_ALG is the loss of ALG and L* is the loss of the best expert in hindsight. Typically, regret bounds of a canonical form C·√(L* ln N) are sought, where N is the number of experts and C is a constant. So far, the best known constant for the absolute loss function is C = 2.83, which is achieved by the recent IAWM algorithm of Auer et al. (2002). For the 0/1 loss function no bounds of this canonical form are known; the best known regret bound, achieved by a P-norm algorithm of Gentile and Littlestone (1999), is of a different form with constants C_1 = e − 2 and C_2 = 2. Our first algorithm is a randomized extension of the guess and double algorithm of Cesa-Bianchi et al. (1997). While the guess and double algorithm achieves a canonical regret bound with C = 3.32, the expected regret of our randomized algorithm is canonically bounded with C = 2.49 for the absolute loss function. The algorithm utilizes one random choice at the start of the game. Like the deterministic guess and double algorithm, a deficiency of our algorithm is that it occasionally restarts itself and therefore forgets what it learned. Our second algorithm does not forget and enjoys the best known asymptotic performance guarantees for both the absolute and 0/1 loss functions. Specifically, in the case of the absolute loss, our algorithm is canonically bounded with C approaching √2, and in the case of the 0/1 loss, with C approaching 3/√2. In the 0/1 loss case the algorithm is randomized and the bound is on the expected regret.

3.
Servedio, R. Machine Learning, 2002, 47(2–3): 133–151.
We describe a novel family of PAC model algorithms for learning linear threshold functions. The new algorithms work by boosting a simple weak learner and exhibit sample complexity bounds remarkably similar to those of known online algorithms such as Perceptron and Winnow, thus suggesting that these well-studied online algorithms in some sense correspond to instances of boosting. We show that the new algorithms can be viewed as natural PAC analogues of the online p-norm algorithms which have recently been studied by Grove, Littlestone, and Schuurmans (1997, Proceedings of the Tenth Annual Conference on Computational Learning Theory, pp. 171–183) and Gentile and Littlestone (1999, Proceedings of the Twelfth Annual Conference on Computational Learning Theory, pp. 1–11). As special cases of the algorithm, by taking p = 2 and p = ∞ we obtain natural boosting-based PAC analogues of Perceptron and Winnow respectively. The p = ∞ case of our algorithm can also be viewed as a generalization (with an improved sample complexity bound) of Jackson and Craven's PAC-model boosting-based algorithm for learning sparse perceptrons (Jackson & Craven, 1996, Advances in Neural Information Processing Systems 8, MIT Press). The analysis of the generalization error of the new algorithms relies on techniques from the theory of large margin classification.

4.
Auer, Peter; Warmuth, Manfred K. Machine Learning, 1998, 32(2): 127–150.
Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow) keeps one weight for each of the n variables and does multiplicative updates to its weights. We develop a randomized version of Winnow and prove bounds for an adaptation of the algorithm for the case when the disjunction may change over time. In this case a possible target disjunction schedule is a sequence of disjunctions (one per trial) and the shift size is the total number of literals that are added/removed from the disjunctions as one progresses through the sequence. We develop an algorithm that predicts nearly as well as the best disjunction schedule for an arbitrary sequence of examples. The algorithm that allows us to track the predictions of the best disjunction is hardly more complex than the original version. However, the amortized analysis needed for obtaining worst-case mistake bounds requires new techniques. In some cases our lower bounds show that the upper bounds of our algorithm have the right constant in front of the leading term in the mistake bound and almost the right constant in front of the second leading term. Computer experiments support our theoretical findings.
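As an illustration of the multiplicative-update scheme the paper builds on, here is a minimal sketch of Littlestone's Winnow for monotone disjunctions over n Boolean variables. The promotion factor 2 and threshold n are the textbook choices; the randomized and shifting variants analyzed in the paper are not shown.

```python
def winnow_predict(w, x, threshold):
    """Predict 1 iff the weighted sum of active variables reaches the threshold."""
    return 1 if sum(wi for wi, xi in zip(w, x) if xi) >= threshold else 0

def winnow_update(w, x, y, y_hat):
    """Multiplicative update on a mistake: promote active weights on false
    negatives, eliminate them on false positives. Returns the new weights."""
    if y_hat == y:
        return w
    factor = 2.0 if y == 1 else 0.0
    return [wi * factor if xi else wi for wi, xi in zip(w, x)]

def run_winnow(examples, n):
    """examples: iterable of (x, y) with x a 0/1 list of length n, y in {0, 1}."""
    w, threshold, mistakes = [1.0] * n, float(n), 0
    for x, y in examples:
        y_hat = winnow_predict(w, x, threshold)
        if y_hat != y:
            mistakes += 1
            w = winnow_update(w, x, y, y_hat)
    return w, mistakes
```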

5.
Efficient Algorithms for Interface Timing Verification
This paper presents algorithms for computing separations between events that are constrained to obey prespecified relationships in their relative time of occurrence. The algorithms are useful for interface timing verification, where event separations are checked against timing requirements. The first algorithm computes separations when only linear and max constraints exist. The algorithm converges to the correct maximum separation values in a finite number of steps, or reports an inconsistency in the constraints, irrespective of the existence of infinite constraint bounds or infinite event separations. Its running time is conjectured to be polynomial in V and E, where V is the number of events and E is the number of relationships between them. The other algorithms extend the first, and compute event separations in the NP-complete version of the problem where min constraints exist. Experiments demonstrate the algorithms are efficient in practice.
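For the purely linear special case (no max or min constraints), maximum event separations can be computed with a standard shortest-path argument: under difference constraints t_j − t_i ≤ c, the largest achievable t_b − t_a equals the shortest-path weight from a to b in the constraint graph. The sketch below shows only this textbook special case, not the paper's algorithm for max constraints; the function name and constraint encoding are illustrative.

```python
import math

def max_separation(num_events, constraints, a, b):
    """Maximum achievable t_b - t_a under difference constraints t_j - t_i <= c,
    given as (i, j, c) triples. Equals the shortest-path weight from a to b in the
    constraint graph (edge i -> j with weight c). Returns math.inf if the separation
    is unbounded, and raises if a negative cycle reachable from a makes the
    constraints inconsistent."""
    dist = [math.inf] * num_events
    dist[a] = 0.0
    for _ in range(num_events - 1):          # Bellman-Ford relaxation passes
        for i, j, c in constraints:
            if dist[i] + c < dist[j]:
                dist[j] = dist[i] + c
    for i, j, c in constraints:              # one extra pass detects negative cycles
        if dist[i] + c < dist[j]:
            raise ValueError("inconsistent constraints")
    return dist[b]
```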

6.
This paper addresses an algorithmic problem related to associative algebras. We show that the problem of deciding if the index of a given central simple algebra A over an algebraic number field is d, where d is a given natural number, belongs to the complexity class NP ∩ coNP. As consequences, we obtain that the problem of deciding if A is isomorphic to a full matrix algebra over the ground field and the problem of deciding if A is a skewfield both belong to NP ∩ coNP. These results answer two questions raised in [25]. The algorithms and proofs rely mostly on the theory of maximal orders over number fields, a noncommutative generalization of algebraic number theory. Our results include an extension to the noncommutative case of an algorithm given by Huang for computing the factorization of rational primes in number fields and of a method of Zassenhaus for testing local maximality of orders in number fields.

7.
This article studies self-directed learning, a variant of the on-line (or incremental) learning model in which the learner selects the presentation order for the instances. Alternatively, one can view this model as a variation of learning with membership queries in which the learner is only charged for membership queries for which it could not predict the outcome. We give tight bounds on the complexity of self-directed learning for the concept classes of monomials, monotone DNF formulas, and axis-parallel rectangles in {0, 1, …, n − 1}^d. These results demonstrate that the number of mistakes under self-directed learning can be surprisingly small. We then show that learning complexity in the model of self-directed learning is less than that of all other commonly studied on-line and query learning models. Next we explore the relationship between the complexity of self-directed learning and the Vapnik-Chervonenkis (VC) dimension. We show that, in general, the VC dimension and the self-directed learning complexity are incomparable. However, for some special cases, we show that the VC dimension gives a lower bound for the self-directed learning complexity. Finally, we explore a relationship between Mitchell's version space algorithm and the existence of self-directed learning algorithms that make few mistakes.

8.
Given an integer k and an arbitrary larger integer, we prove a tight bound on the time required to compute a certain function of these integers with the operations {+, −, ∗, /, ⌊·⌋} and the constants {0, 1}. In contrast, when the floor operation is not available this computation requires Ω(k) time. Using the upper bound, we give an algorithm for computing log log a, for all n-bit integers a, whose running time matches the lower bound for computing this function given by Mansour, Schieber, and Tiwari. To the best of our knowledge these are the first non-constant tight bounds for computations involving the floor operation.

9.
Lin, H. T.; Li, L. Neural Computation, 2012, 24(5): 1329–1367.
We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
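A minimal sketch of this style of reduction (illustrative only; the exact extended-example encoding, weights, and guarantees are as specified in the paper): each example with rank y in {1, ..., K} is turned into K−1 binary questions "is the rank greater than k?", a single binary classifier is trained on the augmented inputs, and the ranker counts the positive answers. scikit-learn's LogisticRegression is used here merely as a stand-in binary learner.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def to_extended(X, y, K):
    """Turn ordinal examples (X, y), y in {1..K}, into binary examples.
    Each input is augmented with a one-hot encoding of the threshold k,
    and the binary label answers 'is y > k?'."""
    rows, labels = [], []
    for x, yi in zip(X, y):
        for k in range(1, K):
            rows.append(np.concatenate([x, np.eye(K - 1)[k - 1]]))
            labels.append(1 if yi > k else 0)
    return np.array(rows), np.array(labels)

def rank(clf, x, K):
    """Construct the ranker: one plus the number of thresholds the classifier passes."""
    ext = np.array([np.concatenate([x, np.eye(K - 1)[k - 1]]) for k in range(1, K)])
    return 1 + int(clf.predict(ext).sum())

# Usage sketch on synthetic data
X = np.random.randn(200, 5)
y = np.clip(np.round(X[:, 0] + 3), 1, 5).astype(int)   # ordinal labels 1..5
Xe, ye = to_extended(X, y, K=5)
clf = LogisticRegression(max_iter=1000).fit(Xe, ye)
print(rank(clf, X[0], K=5), y[0])
```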

10.
Enlarging the Margins in Perceptron Decision Trees
Capacity control in perceptron decision trees is typically performed by controlling their size. We prove that other quantities can be as relevant to reduce their flexibility and combat overfitting. In particular, we provide an upper bound on the generalization error which depends both on the size of the tree and on the margin of the decision nodes. So enlarging the margin in perceptron decision trees will reduce the upper bound on generalization error. Based on this analysis, we introduce three new algorithms, which can induce large margin perceptron decision trees. To assess the effect of the large margin bias, OC1 (Murthy, Kasif, and Salzberg; Journal of Artificial Intelligence Research, 1994, 2, 1–32), a well-known system for inducing perceptron decision trees, is used as the baseline algorithm. An extensive experimental study on real world data showed that all three new algorithms perform better or at least not significantly worse than OC1 on almost every dataset, with only one exception. OC1 performed worse than the best margin-based method on every dataset.

11.
The algebraic immunity of a Boolean function is a parameter that characterizes the possibility to bound this function from above or below by a nonconstant Boolean function of a low algebraic degree. We obtain lower bounds on the algebraic immunity for a class of functions expressed through the inversion operation in the field GF(2^n), as well as for larger classes of functions defined by their trace forms. In particular, for n ≥ 5, the algebraic immunity of the function Tr_n(x^{-1}) has a lower bound ⌈2√(n+4)⌉ − 4, which is close enough to the previously obtained upper bound ⌈√n⌉ + ⌈n/⌈√n⌉⌉ − 2. We obtain a polynomial algorithm which, given the trace form of a Boolean function f, computes generating sets of functions of degree ≤ d for the following pair of spaces. Each function of the first (linear) space bounds f from below, and each function of the second (affine) space bounds f from above. Moreover, at the output of the algorithm, each function of a generating set is represented both as its trace form and as a polynomial of Boolean variables.
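For concreteness, algebraic immunity can be computed directly from a truth table by linear algebra over GF(2): a nonzero function g of degree ≤ d with g·f = 0 exists exactly when the evaluation matrix of the degree-≤d monomials on the support of f has a nontrivial kernel. The brute-force sketch below (exponential in n, illustrative only; it is not the paper's trace-form algorithm) uses this standard characterization.

```python
import itertools
import numpy as np

def gf2_rank(A):
    """Rank of a 0/1 matrix over GF(2) by Gaussian elimination."""
    A = A.copy()
    rank, rows, cols = 0, A.shape[0], A.shape[1]
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if A[r, c]), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]
        for r in range(rows):
            if r != rank and A[r, c]:
                A[r] ^= A[rank]
        rank += 1
    return rank

def has_low_degree_annihilator(support, n, d):
    """Is there a nonzero g with deg(g) <= d vanishing on every point of 'support'?"""
    monomials = [m for r in range(d + 1) for m in itertools.combinations(range(n), r)]
    if not support:
        return True                      # g = 1 already works
    A = np.array([[int(all(x[i] for i in m)) for m in monomials] for x in support],
                 dtype=np.uint8)
    return gf2_rank(A) < len(monomials)  # nontrivial kernel over GF(2)

def algebraic_immunity(truth_table):
    """Minimum degree of a nonzero annihilator of f or of f + 1."""
    n = len(truth_table).bit_length() - 1
    points = list(itertools.product((0, 1), repeat=n))
    ones = [p for p, v in zip(points, truth_table) if v]
    zeros = [p for p, v in zip(points, truth_table) if not v]
    for d in range(n + 1):
        if has_low_degree_annihilator(ones, n, d) or has_low_degree_annihilator(zeros, n, d):
            return d
    return n
```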

12.
The On-Line Multiprocessor Scheduling Problem with Known Sum of the Tasks
In this paper we investigate a semi on-line multiprocessor scheduling problem: the classical on-line multiprocessor problem in which the total sum of the tasks is known in advance. We show an asymptotic lower bound on the performance ratio of any algorithm (as the number of processors gets large), and present an algorithm whose performance ratio is bounded by a constant for any number of processors. When compared with known general lower bounds, this result indicates that the information on the sum of the tasks substantially improves the performance ratio of on-line algorithms.

13.
This paper presents an optimal parallel algorithm for triangulating an arbitrary set of n points in the plane. The algorithm runs in O(log n) time using O(n) space and O(n) processors on a Concurrent-Read, Exclusive-Write Parallel RAM model (CREW PRAM). The parallel lower bound on triangulation is Ω(log n) time, so the best possible linear speedup has been achieved. A parallel divide-and-conquer technique of subdividing a problem into subproblems is employed.

14.
Tracking the best hyperplane with a simple budget Perceptron
Shifting bounds for on-line classification algorithms ensure good performance on any sequence of examples that is well predicted by a sequence of changing classifiers. When proving shifting bounds for kernel-based classifiers, one also faces the problem of storing a number of support vectors that can grow unboundedly, unless an eviction policy is used to keep this number under control. In this paper, we show that shifting and on-line learning on a budget can be combined surprisingly well. First, we introduce and analyze a shifting Perceptron algorithm achieving the best known shifting bounds while using an unlimited budget. Second, we show that by applying to the Perceptron algorithm the simplest possible eviction policy, which discards a random support vector each time a new one comes in, we achieve a shifting bound close to the one we obtained with no budget restrictions. More importantly, we show that our randomized algorithm strikes the optimal trade-off $U = \Theta(\sqrt{B})$ between budget B and norm U of the largest classifier in the comparison sequence. Experiments are presented comparing several linear-threshold algorithms on chronologically-ordered textual datasets. These experiments support our theoretical findings in that they show to what extent randomized budget algorithms are more robust than deterministic ones when learning shifting target data streams.
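The eviction policy described above is simple enough to sketch directly: run a kernel Perceptron and, whenever storing a new support vector would exceed the budget B, discard one existing support vector chosen uniformly at random. The sketch below is illustrative (plain Perceptron update, Gaussian kernel as an arbitrary choice), not the exact algorithm analyzed in the paper.

```python
import random
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    return np.exp(-gamma * np.dot(a - b, a - b))

class RandomizedBudgetPerceptron:
    """Kernel Perceptron that keeps at most `budget` support vectors by evicting
    a uniformly random one whenever a new mistake would exceed the budget."""

    def __init__(self, budget, kernel=gaussian_kernel):
        self.budget, self.kernel = budget, kernel
        self.sv, self.alpha = [], []          # support vectors and their signs

    def predict(self, x):
        score = sum(a * self.kernel(s, x) for s, a in zip(self.sv, self.alpha))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        """One on-line round; y in {-1, +1}. Returns True if a mistake was made."""
        if self.predict(x) == y:
            return False
        if len(self.sv) >= self.budget:       # eviction: drop a random support vector
            i = random.randrange(len(self.sv))
            self.sv.pop(i); self.alpha.pop(i)
        self.sv.append(x); self.alpha.append(y)
        return True
```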

15.
An optimal O(log log n)-time CRCW-PRAM algorithm for computing all period lengths of a string is presented. Previous parallel algorithms compute the period only if it is shorter than half of the length of the string. The algorithm can be used to find all initial palindromes of a string in the same time and processor bounds. Both algorithms are the fastest possible over a general alphabet. We derive a lower bound for finding initial palindromes by modifying a known lower bound for finding the period length of a string [9]. When p processors are available the bounds become Θ(⌈n/p⌉ + log log_{⌈1+p/n⌉} 2p). This work was partially supported by NSF Grant CCR-90-14605. D. Breslauer was partially supported by an IBM Graduate Fellowship while studying at Columbia University and by a European Research Consortium for Informatics and Mathematics postdoctoral fellowship.
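As background on what "all period lengths" means, here is a simple sequential sketch (not the parallel CRCW-PRAM algorithm): p is a period of s exactly when the prefix of length |s| − p is also a suffix (a border), and all borders can be read off the Knuth-Morris-Pratt failure function.

```python
def all_periods(s):
    """All period lengths p of s, i.e. all p with s[i] == s[i + p] wherever both exist."""
    n = len(s)
    fail = [0] * (n + 1)                 # fail[i] = length of longest proper border of s[:i]
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = fail[k]
        if s[i] == s[k]:
            k += 1
        fail[i + 1] = k
    periods, b = [], fail[n]
    while b > 0:                         # every border of length b gives the period n - b
        periods.append(n - b)
        b = fail[b]
    periods.append(n)                    # the whole length is always a (trivial) period
    return sorted(periods)

print(all_periods("abaababaab"))         # -> [5, 8, 10]
```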

16.
In this paper we investigate the parallel complexity of backtrack and branch-and-bound search on the mesh-connected array. We present an Ω(dN/log(dN)) lower bound on the time needed by a randomized algorithm to perform backtrack and branch-and-bound search of a tree of depth d on the N × N mesh, even when the depth of the tree is known in advance. The lower bound also holds for algorithms that are allowed to move tree nodes and create multiple copies of the same tree node. For the upper bounds we give deterministic algorithms that are within a factor of O(log^{3/2} N) of our lower bound. Our algorithms do not make any assumption on the shape of the tree to be searched, do not know the depth of the tree in advance, and do not move tree nodes nor create multiple copies of the same node. The best previously known algorithm for backtrack search on the mesh was randomized and required Θ(dN/log N) time. Our algorithm for branch-and-bound is the first algorithm that performs branch-and-bound search on a sparse network. Both the lower and the upper bounds extend to meshes of higher dimensions. Part of this work was done while the authors were at Harvard University.

17.
We develop new linear program performance bounds for closed reentrant queueing networks based on an inequality relaxation of the average cost equation. The approach exploits the fact that the transition probabilities under certain policies of closed queueing networks are invariant within certain regions of the state space. This invariance suggests the use of a piecewise quadratic function as a surrogate for the differential cost function. The linear programming throughput bounds obtained are provably tighter than previously known bounds, at the cost of increased computational complexity. Functional throughput bounds parameterized by the fixed customer population N are obtained, along with a bound on the limiting throughput as N → +∞. We show that one may obtain reduced complexity bounds while still retaining superiority.

18.
We construct a predicate that is computable by a perceptron with linear size, order 1, and exponential weights, but which cannot be computed by any perceptron having subexponential size, subpolynomial (n^{o(1)}) order, and subexponential weights. A consequence is that there is an oracle relative to which P^NP is not contained in PP.

19.
New approximation algorithms for packing rectangles into a semi-infinite strip are introduced in this paper. Within a standard probability model, an asymptotic average-case analysis is given for the wasted space in the packings produced by these algorithms. An off-line algorithm is presented along with a proof that it wastes Θ(√n) space on the average, where n is the number of rectangles packed. This result is known to apply to optimal packings as well. Several on-line shelf algorithms are also analyzed. With n assumed known in advance, one such algorithm is described and shown to waste Θ(n^{2/3}) space on the average. It is proved that this result also characterizes optimal on-line shelf packings. For a very general class of linear-time algorithms, it is shown that a constant (nonzero) fraction of the space must be wasted on the average for all n, and a lower bound on this fraction in terms of algorithm parameters is given. Finally, the paper discusses the implications of the above results for dynamic packing and two-dimensional bin-packing problems.
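For intuition about what an on-line "shelf" algorithm does, here is a minimal next-fit-style shelf packer (illustrative only; the specific shelf algorithms analyzed in the paper, and their shelf-height rules, differ). Rectangles arrive one at a time and are placed left to right on the current shelf; when one does not fit, a new shelf is opened above.

```python
def shelf_pack(rects, strip_width=1.0):
    """rects: list of (width, height) with width <= strip_width.
    Returns a list of (x, y, width, height) placements and the total height used."""
    placements = []
    shelf_y = 0.0          # bottom of the current shelf
    shelf_h = 0.0          # height of the current shelf (tallest rectangle on it)
    cursor_x = 0.0         # next free x position on the current shelf
    for w, h in rects:
        if cursor_x + w > strip_width:     # rectangle does not fit: open a new shelf
            shelf_y += shelf_h
            shelf_h, cursor_x = 0.0, 0.0
        placements.append((cursor_x, shelf_y, w, h))
        cursor_x += w
        shelf_h = max(shelf_h, h)
    return placements, shelf_y + shelf_h

# Usage sketch
boxes = [(0.4, 0.3), (0.5, 0.6), (0.3, 0.2), (0.8, 0.5)]
layout, height = shelf_pack(boxes)
print(height)
```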

20.
Understanding the empirical success of boosting algorithms is an important theoretical problem in machine learning. One of the most influential works is the margin theory, which provides a series of upper bounds for the generalization error of any voting classifier in terms of the margins of the training data. Recently an equilibrium margin (Emargin) bound, which is sharper than previously well-known margin bounds, was proposed. In this paper, we conduct extensive experiments to test the Emargin theory. Specifically, we develop an efficient algorithm that, given a boosting classifier (or a voting classifier in general), learns a new voting classifier which usually has a smaller Emargin bound. We then compare the performances of the two classifiers and find that the new classifier often has smaller test errors, which agrees with what the Emargin theory predicts.
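The margin quantity the theory refers to is easy to compute for a given voting classifier: for binary ±1 labels it is y·Σ_t α_t h_t(x) / Σ_t |α_t|, the normalized weighted vote for the correct label. The sketch below computes the empirical margin distribution of a binary voting classifier; it illustrates the quantity that enters margin bounds, not the Emargin bound itself, and the random ensemble is purely for demonstration.

```python
import numpy as np

def margins(weak_preds, alphas, y):
    """Normalized margins of a binary voting classifier.

    weak_preds: array of shape (T, m), predictions in {-1, +1} of T weak classifiers
                on m training examples
    alphas:     array of shape (T,), nonnegative voting weights
    y:          array of shape (m,), true labels in {-1, +1}
    Returns an array of shape (m,) with values in [-1, 1]; larger is better.
    """
    vote = alphas @ weak_preds            # weighted vote for each example
    return y * vote / np.sum(np.abs(alphas))

# Usage sketch: margin distribution of a random 5-member ensemble
rng = np.random.default_rng(0)
H = rng.choice([-1, 1], size=(5, 100))
a = rng.random(5)
y = rng.choice([-1, 1], size=100)
print(np.quantile(margins(H, a, y), [0.05, 0.5, 0.95]))
```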
