Similar articles
 Found 20 similar articles (search time: 203 ms)
1.
Gold introduced the notion of learning in the limit, where a class S is learnable iff there is a recursive machine M which reads the course of values of a function f and converges to a program for f whenever f is in S. An important measure for the speed of convergence in this model is the number of mind changes before the onset of convergence. The oldest model considers a constant bound on the number of mind changes M makes on any input function; such a bound is referred to here as type 1. Later this was generalized to a bound of type 2, where a counter ranges over constructive ordinals and is counted down at every mind change. Although ordinal bounds permit the inference of richer concept classes than constant bounds, they are still a severe restriction. Therefore the present work introduces two more general approaches to bounding mind changes. These are based on counting by going down in a linearly ordered set (type 3) and on counting by going down in a partially ordered set (type 4). In both cases the set must not contain infinite descending recursive sequences. These four types of mind changes yield a hierarchy, and there are identifiable classes that cannot be learned with the most general mind change bound of type 4. It is shown that the existence of a type 2 bound is equivalent to the existence of a learning algorithm which converges on every (also nonrecursive) input function, and the existence of a type 4 bound is shown to be equivalent to the existence of a learning algorithm which converges on every recursive function. A partial characterization of type 3 yields a result of independent interest in recursion theory. The interplay between mind change complexity and choice of hypothesis space is investigated. It is established that, for certain concept classes, a more expressive hypothesis space can reduce the mind change complexity of learning these classes.
The notion of mind change bound for behaviourally correct learning is indirectly addressed by employing the above four types to restrict the number of predictive errors of commission in finite error next value learning (NV′′), a model equivalent to behaviourally correct learning. Again, natural characterizations for type 2 and type 4 bounds are derived. Their naturalness is further illustrated by characterizing them in terms of branches of uniformly recursive families of binary trees.

2.
We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects. We derive upper bounds on the number of data features (associated with non-zero spatial error) which provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypothesis class that one has to choose from. Our central tool is the VC-dimension, which is a well-studied parameter measuring the combinatorial complexity of families of sets. It turns out that these bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library. Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses but, as far as we know, their similarity has not been exploited previously. We draw close relations between recognition tasks and a certain learnability framework and then apply basic techniques of learnability theory to derive our sample size upper bounds. We believe that other relations between learning procedures and visual tasks exist and hope that this work will trigger further fruitful study along these lines.

3.
This paper proposes the use of constructive ordinals as mistake bounds in the on-line learning model. This approach elegantly generalizes the applicability of the on-line mistake bound model to learnability analysis of very expressive concept classes like pattern languages, unions of pattern languages, elementary formal systems, and minimal models of logic programs. The main result in the paper shows that the topological property of effective finite bounded thickness is a sufficient condition for on-line learnability with a certain ordinal mistake bound. An interesting characterization of the on-line learning model is shown in terms of the identification in the limit framework. It is established that the classes of languages learnable in the on-line model with a mistake bound of α are exactly the same as the classes of languages learnable in the limit from both positive and negative data by a Popperian, consistent learner with a mind change bound of α. This result nicely builds a bridge between the two models.

4.
We compare the use of price-based policies, or taxes, and quantity-based policies, or quotas, for controlling emissions in a dynamic setup when the regulator faces two sources of uncertainty: (i) market-related uncertainty; and (ii) ecological uncertainty. We assume that the regulator is a rational Bayesian learner and that the regulator and firms have asymmetric information. In our model the structure of Bayesian learning is general. Our results suggest that the expected level of emissions is the same under taxes and quotas. However, the comparison of the total benefits related to these policies suggests that taxes dominate quotas, that is, they provide higher social welfare. Even though taxes have some benefits over quotas, neither learning nor ecological uncertainty affects the choice of policy, i.e., the only factor having such an impact is uncertainty in the instantaneous net emissions benefits (market-related uncertainty). Moreover, the more volatile this uncertainty is, the greater the benefits of taxes over quotas. Ecological uncertainty leads to a difference between the emissions rule under the informed and the rational learning assumptions. However, the direction of this difference depends on the bias in beliefs with regard to ecological uncertainty. We also find that a change in the regulator’s beliefs toward more optimistic views will increase emissions.

5.
Boosting a Weak Learning Algorithm by Majority   (total citations: 11; self-citations: 0; citations by others: 11)
We present an algorithm for improving the accuracy of algorithms for learning binary concepts. The improvement is achieved by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples. Our algorithm is based on ideas presented by Schapire and represents an improvement over his results. The analysis of our algorithm provides general upper bounds on the resources required for learning in Valiant's polynomial PAC learning framework, which are the best general upper bounds known today. We show that the number of hypotheses that are combined by our algorithm is the smallest number possible. Other outcomes of our analysis are results regarding the representational power of threshold circuits, the relation between learnability and compression, and a method for parallelizing PAC learning algorithms. We provide extensions of our algorithms to cases in which the concepts are not binary and to the case where the accuracy of the learning algorithm depends on the distribution of the instances.
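
The final combination step suggested by the title can be pictured with a minimal sketch. This shows only an unweighted majority vote over binary hypotheses; it is not the boosting procedure itself, whose scheme for choosing training sets is not given in this abstract. The name `majority_vote` and the toy threshold hypotheses are assumptions made purely for illustration.

```python
from typing import Callable, Sequence

# A hypothesis maps an instance to a binary label in {0, 1}.
Hypothesis = Callable[[float], int]


def majority_vote(hypotheses: Sequence[Hypothesis], x: float) -> int:
    """Return 1 if a strict majority of hypotheses predict 1, else 0."""
    ones = sum(h(x) for h in hypotheses)
    return 1 if 2 * ones > len(hypotheses) else 0


# Hypothetical weak hypotheses: three slightly different threshold rules.
toy_hypotheses = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.6),
]
```

With an odd number of hypotheses there are no ties; with an even number, this sketch breaks ties in favour of label 0, which is an arbitrary choice.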

6.
The present paper motivates the study of mind change complexity for learning minimal models of length-bounded logic programs. It establishes ordinal mind change complexity bounds for learnability of these classes both from positive facts and from positive and negative facts. Building on Angluin's notion of finite thickness and Wright's work on finite elasticity, Shinohara defined the property of bounded finite thickness to give a sufficient condition for learnability of indexed families of computable languages from positive data. This paper shows that an effective version of Shinohara's notion of bounded finite thickness gives sufficient conditions for learnability with ordinal mind change bound, both in the context of learnability from positive data and for learnability from complete (both positive and negative) data. Let ω be a notation for the first limit ordinal. Then, it is shown that if a language-defining framework yields a uniformly decidable family of languages and has effective bounded finite thickness, then for each natural number m>0, the class of languages defined by formal systems of length ⩽m:
  • is identifiable in the limit from positive data with a mind change bound of ω^m;
  • is identifiable in the limit from both positive and negative data with an ordinal mind change bound of ω×m.
The above sufficient conditions are employed to give an ordinal mind change bound for learnability of minimal models of various classes of length-bounded Prolog programs, including Shapiro's linear programs, Arimura and Shinohara's depth-bounded linearly covering programs, and Krishna Rao's depth-bounded linearly moded programs. It is also noted that the bound for learning from positive data is tight for the example classes considered.

7.
We investigate the inferrability of E-pattern languages (also known as extended or erasing pattern languages) from positive data in Gold's learning model. As the main result, our analysis yields a negative outcome for the full class of E-pattern languages, and even for the subclass of terminal-free E-pattern languages, if the corresponding terminal alphabet consists of exactly two distinct letters. Furthermore, we present a positive result for a manifest subclass of terminal-free E-pattern languages. We point out that the considered problems are closely related to fundamental questions concerning the nondeterminism of E-pattern languages.

8.
Bayes’ rule specifies how to obtain a posterior from a class of hypotheses endowed with a prior and the observed data. There are three fundamental ways to use this posterior for predicting the future: marginalization (integration over the hypotheses w.r.t. the posterior), MAP (taking the a posteriori most probable hypothesis), and stochastic model selection (selecting a hypothesis at random according to the posterior distribution). If the hypothesis class is countable, and contains the data generating distribution (this is termed the “realizable case”), strong consistency theorems are known for the former two methods in a sequential prediction framework, asserting almost sure convergence of the predictions to the truth as well as loss bounds. We prove corresponding results for stochastic model selection, for both discrete and continuous observation spaces. As a main technical tool, we will use the concept of a potential: this quantity, which is always positive, measures the total possible amount of future prediction errors. Precisely, in each time step, the expected potential decrease upper bounds the expected error. We introduce the entropy potential of a hypothesis class as its worst-case entropy, with regard to the true distribution. Our results are proven within a general stochastic online prediction framework, that comprises both online classification and prediction of non-i.i.d. sequences.
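
The three ways of using the posterior named above can be contrasted in a small sketch. It assumes a finite class of Bernoulli hypotheses, each given by its probability of emitting a 1; that toy class, and every function name below, is an illustrative assumption and not taken from the paper.

```python
import random


def posterior(prior, thetas, data):
    """Bayes' rule: reweight each prior by the likelihood of the observed bits."""
    weights = []
    for p, theta in zip(prior, thetas):
        likelihood = 1.0
        for bit in data:
            likelihood *= theta if bit == 1 else 1.0 - theta
        weights.append(p * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def predict_marginal(post, thetas):
    """Marginalization: posterior-weighted average of P(next bit = 1)."""
    return sum(w * t for w, t in zip(post, thetas))


def predict_map(post, thetas):
    """MAP: predict with the single most probable hypothesis."""
    best = max(range(len(post)), key=post.__getitem__)
    return thetas[best]


def predict_stochastic(post, thetas, rng):
    """Stochastic model selection: sample one hypothesis from the posterior."""
    return rng.choices(thetas, weights=post, k=1)[0]


thetas = [0.2, 0.5, 0.8]          # hypothetical hypothesis class
prior = [1 / 3, 1 / 3, 1 / 3]     # uniform prior
post = posterior(prior, thetas, [1, 1, 1, 0])
```

Marginalization returns a mixture of all hypotheses, MAP commits to one deterministically, and stochastic model selection commits to one at random, so only the last gives predictions that vary between runs on the same data.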

9.
This paper analyzes the properties of a procedure for learning from examples. This “canonical learner” is based on a canonical error estimator developed in Part I. In learning problems one can observe data that consists of labeled sample points, and the goal is to find a model or “hypothesis” from a set of candidates that will accurately predict the labels of new sample points. The expected mismatch between a hypothesis prediction and the actual label of a new sample point is called the hypothesis “generalization error”. We compare the canonical learner with the traditional technique of finding hypotheses that minimize the relative frequency-based empirical error estimate. It is shown that for a broad class of learning problems, the set of cases for which such empirical error minimization works is a proper subset of the cases for which the canonical learner works. We derive bounds to show that the number of samples required by these two methods is comparable. We also address the issue of how to determine the appropriate complexity for the class of candidate hypotheses.

10.
We consider the problem of learning to predict as well as the best in a group of experts making continuous predictions. We assume the learning algorithm has prior knowledge of the maximum number of mistakes of the best expert. We propose a new master strategy that achieves the best known performance for on-line learning with continuous experts in the mistake bounded model. Our ideas are based on drifting games, a generalization of boosting and on-line learning algorithms. We prove new lower bounds based on the drifting games framework which, though not as tight as previous bounds, have simpler proofs and do not require an enormous number of experts. We also extend previous lower bounds to show that our upper bounds are exactly tight for sufficiently many experts. A surprising consequence of our work is that continuous experts are only as powerful as experts making binary or no prediction in each round.

11.
Langford, John; Blum, Avrim. Machine Learning, 2003, 51(2): 165-179
A major topic in machine learning is to determine good upper bounds on the true error rates of learned hypotheses based upon their empirical performance on training data. In this paper, we demonstrate new adaptive bounds designed for learning algorithms that operate by making a sequence of choices. These bounds, which we call Microchoice bounds, are similar to Occam-style bounds and can be used to make learning algorithms self-bounding in the style of Freund (1998). We then show how to combine these bounds with Freund's query-tree approach producing a version of Freund's query-tree structure that can be implemented with much more algorithmic efficiency.

12.
The measurement and management of positional accuracy and positional uncertainty are especially problematic in historical cartography and Historical GIS applications, for at least two reasons: first, historical sources, and especially historical maps, generally carry a higher degree of positional inaccuracy and uncertainty compared to contemporary geographic databases; second, it is always difficult and often impossible to reliably measure the positional accuracy and positional uncertainty of the spatial attribute of historical data. As an added complication, the terms “inaccuracy” and “uncertainty” are often used as synonyms in the literature, with relatively little attention given to issues of uncertainty. In this article we propose a methodology for detecting the positional inaccuracy and positional uncertainty of measurements of urban change using historical maps at a very high spatial resolution (the building). A widely accepted and routinely employed method for detecting urban change, and spatial change in general, consists in overlaying two or more maps created at different dates, but the technique can lead to the formation of spurious changes (typically, sliver polygons) that are the product of misclassification error or map misalignment rather than actual modifications in land cover. In this paper we develop an algorithm to detect such spurious changes. More generally, we extend the discussion to examine the effects of positional uncertainty and positional inaccuracy in feature change detection analysis. The case study is the city of Milan, Italy.

13.
We derive general bounds on the complexity of learning in the statistical query (SQ) model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the SQ model. This new model was introduced by Kearns to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient and is used to show our main result of the first general upper bounds on the complexity of strong SQ learning. Since all SQ algorithms can be simulated in the PAC model with classification noise, we also obtain general upper bounds on learning in the presence of classification noise for classes which can be learned in the SQ model.

14.
This paper concerns learning binary-valued functions defined on R, and investigates how a particular type of ‘regularity’ of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion analogous to that of sample margin for real-valued functions). This motivates learning algorithms that seek to maximize sample width.

15.
In this article, we critically examine the role of semantic technology in data-driven analysis. We explain why learning from data is more than just analyzing data, including also a number of essential synthetic parts that suggest a revision of George Box’s model of data analysis in statistics. We review arguments from statistical learning under uncertainty, workflow reproducibility, as well as from philosophy of science, and propose an alternative, synthetic learning model that takes into account semantic conflicts, observation, biased model and data selection, as well as interpretation into background knowledge. The model highlights and clarifies the different roles that semantic technology may have in fostering reproduction and reuse of data analysis across communities of practice under the conditions of informational uncertainty. We also investigate the role of semantic technology in current analysis and workflow tools, compare it with the requirements of our model, and conclude with a roadmap of 8 challenging research problems which currently seem largely unaddressed.

16.
AdaBoost is a popular and effective leveraging procedure for improving the hypotheses generated by weak learning algorithms. AdaBoost and many other leveraging algorithms can be viewed as performing a constrained gradient descent over a potential function. At each iteration the distribution over the sample given to the weak learner is proportional to the direction of steepest descent. We introduce a new leveraging algorithm based on a natural potential function. For this potential function, the direction of steepest descent can have negative components. Therefore, we provide two techniques for obtaining suitable distributions from these directions of steepest descent. The resulting algorithms have bounds that are incomparable to AdaBoost's. The analysis suggests that our algorithm is likely to perform better than AdaBoost on noisy data and with weak learners returning low confidence hypotheses. Modest experiments confirm that our algorithm can perform better than AdaBoost in these situations.
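
For AdaBoost specifically, the gradient-descent view can be sketched as follows. The sketch assumes the standard exponential potential, the sum over examples of exp(-margin_i), with margin_i = y_i · F(x_i) for the current combined hypothesis F; the abstract does not define the paper's new potential, so it is not shown. The function name `adaboost_distribution` is hypothetical.

```python
import math


def adaboost_distribution(margins):
    """Normalize the steepest-descent direction of the exponential potential.

    The potential is sum_i exp(-margin_i); its negative gradient with respect
    to margin_i is exp(-margin_i), which is always positive, so normalizing it
    yields a valid distribution over the training examples.
    """
    weights = [math.exp(-m) for m in margins]
    total = sum(weights)
    return [w / total for w in weights]


# Example: the misclassified example (negative margin) receives more weight.
dist = adaboost_distribution([1.0, -1.0])
```

Because exp(-margin) is never negative, no correction step is needed here. A potential whose gradient can have negative components, as described in the abstract, would need an extra step to turn the direction into a distribution.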

17.
Shoham et al. identify several important agendas which can help direct research in multi-agent learning. We propose two additional agendas, called “modelling” and “design”, which cover the problems we need to consider before our agents can start learning. We then consider research goals for modelling, design, and learning, and identify the problem of finding learning algorithms that guarantee convergence to Pareto-dominant equilibria against a wide range of opponents. Finally, we conclude with an example: starting from an informally-specified multi-agent learning problem, we illustrate how one might formalize and solve it by stepping through the tasks of modelling, design, and learning.

18.
To date, little research has examined gender differences in how convenience is perceived in mobile commerce (m-commerce). The current work presents and tests a theoretical model partially based on the Technology Acceptance Model (TAM), and posits a sequential relationship among four primary dimensions: (1) intrinsic attributes of the mobile device (portability and interface design); (2) ease of use; (3) extrinsic attributes of the mobile device (simultaneity, speed, and searchability); and (4) perceived convenience of m-commerce. We posit that physical attributes of the mobile device (portability and interface design) are antecedents of ease of use, which in turn determines three extrinsic attributes (simultaneity, speed, and searchability). The final dependent variable is perceived convenience. Based on prior research on TAM and gender theories, the study proposes 16 hypotheses, of which our data support 12. Our results indicate that the link between interface design and ease of use holds a key to motivating females’ use of m-commerce. In closing, implications are discussed, and important limitations are recognized along with suggestions for future research.

19.
The approach of ordinal mind change complexity, introduced by Freivalds and Smith, uses (notations for) constructive ordinals to bound the number of mind changes made by a learning machine. This approach provides a measure of the extent to which a learning machine has to keep revising its estimate of the number of mind changes it will make before converging to a correct hypothesis for languages in the class being learned. Recently, this notion, which also yields a measure for the difficulty of learning a class of languages, has been used to analyze the learnability of rich concept classes.

The present paper further investigates the utility of ordinal mind change complexity. It is shown that for identification from both positive and negative data and n ≥ 1, the ordinal mind change complexity of the class of languages formed by unions of up to n + 1 pattern languages is only ω ×0 notn(n) (where notn(n) is a notation for n, ω is a notation for the least limit ordinal and ×0 represents ordinal multiplication). This result nicely extends an observation of Lange and Zeugmann that pattern languages can be identified from both positive and negative data with 0 mind changes.

Existence of an ordinal mind change bound for a class of learnable languages can be seen as an indication of its learning “tractability”. Conditions are investigated under which a class has an ordinal mind change bound for identification from positive data. It is shown that an indexed family of languages has an ordinal mind change bound if it has finite elasticity and can be identified by a conservative machine. It is also shown that the requirement of conservative identification can be sacrificed for the purely topological requirement of M-finite thickness. Interaction between identification by monotonic strategies and the existence of an ordinal mind change bound is also investigated.


20.
This paper studies the performance of gradient descent (GD) when applied to the problem of online linear prediction in arbitrary inner product spaces. We prove worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of online GD. Whereas our algorithms always predict using linear functions as hypotheses, none of our results requires the data to be linearly related. In fact, the bounds proved on the total prediction loss are typically expressed as a function of the total loss of the best fixed linear predictor with bounded norm. All the upper bounds are tight to within constants. Matching lower bounds are provided in some cases. Finally, we apply our results to the problem of online prediction for classes of smooth functions.
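
The basic online GD protocol referred to above can be sketched as follows, under stated assumptions: plain Euclidean vectors stand in for an arbitrary inner product space, and the learning rate is a fixed constant supplied by the caller. The paper's variants presumably tune this rate using a priori information, which is not reproduced here. The function name `online_gradient_descent` is hypothetical.

```python
def online_gradient_descent(examples, learning_rate):
    """Run online GD on squared loss.

    examples: sequence of (x, y) pairs, x a list of floats, y a float.
    Returns the final weight vector and the total squared prediction loss.
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    total_loss = 0.0
    for x, y in examples:
        # Predict with the current linear hypothesis before seeing y.
        y_hat = sum(wi * xi for wi, xi in zip(w, x))
        error = y_hat - y
        total_loss += error ** 2
        # Gradient of (y_hat - y)^2 with respect to w is 2 * error * x.
        w = [wi - learning_rate * 2.0 * error * xi for wi, xi in zip(w, x)]
    return w, total_loss


# Toy sequence (hypothetical): the same example repeated 50 times.
weights, loss = online_gradient_descent([([1.0], 2.0)] * 50, learning_rate=0.1)
```

The quantity returned as `total_loss` is the sum of squared prediction errors that the worst-case bounds in the abstract are stated for; on this toy sequence the weight approaches 2, the best fixed linear predictor.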
