期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

On the Limits of Proper Learnability of Subclasses of DNF Formulas

Pillaipakkamnatt Krishnan Raghavan Vijay 《Machine Learning》1996,25(2-3):237-263

Bshouty, Goldman, Hancock and Matar have shown that up to term DNF formulas can be properly learned in the exact model with equivalence and membership queries. Given standard complexity-theoretical assumptions, we show that this positive result for proper learning cannot be significantly improved in the exact model or the PAC model extended to allow membership queries. Our negative results are derived from two general techniques for proving such results in the exact model and the extended PAC model. As a further application of these techniques, we consider read-thrice DNF formulas. Here we improve on Aizenstein, Hellerstein, and Pitt's negative result for proper learning in the exact model in two ways. First, we show that their assumption of NP co-NP can be replaced with the weaker assumption of P NP. Second, we show that read-thrice DNF formulas are not properly learnable in the extended PAC model, assuming RP NP. 相似文献

2.

On the Learnability of Recursive Data

Barbara Hammer 《Mathematics of Control, Signals, and Systems (MCSS)》1999,12(1):62-79

We establish some general results concerning PAC learning: We find a characterization of the property that any consistent algorithm is PAC. It is shown that the shrinking width property is equivalent to PUAC learnability. By counterexample, PAC and PUAC learning are shown to be different concepts. We find conditions ensuring that any nearly consistent algorithm is PAC or PUAC, respectively.?The VC dimension of recurrent neural networks and folding networks is infinite. For restricted inputs, however, bounds exist. The bounds for restricted inputs are transferred to folding networks.?We find conditions on the probability of the input space ensuring polynomial learnability: the probability of sequences or trees has to converge to zero sufficiently fast with increasing length or height.?Finally, we find an example for a concept class that requires exponentially growing sample sizes for accurate generalization. Date received: September 5, 1997. Date revised: May 29, 1998. 相似文献

3.

A Dichotomy Theorem for Learning Quantified Boolean Formulas

Dalmau Víictor 《Machine Learning》1999,35(3):207-224

We consider the following classes of quantified boolean formulas. Fix a finite set of basic boolean functions. Take conjunctions of these basic functions applied to variables and constants in arbitrary ways. Finally quantify existentially or universally some of the variables. We prove the following dichotomy theorem: For any set of basic boolean functions, the resulting set of formulas is either polynomially learnable from equivalence queries alone or else it is not PAC-predictable even with membership queries under cryptographic assumptions. Furthermore, we identify precisely which sets of basic functions are in which of the two cases. 相似文献

4.

The Strength of Weak Learnability 总被引：136，自引：0，他引：136

Schapire Robert E. 《Machine Learning》1990,5(2):197-227

This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent.A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error . 相似文献

5.

Learning Probabilistic Read-once Formulas on Product Distributions

Schapire Robert E. 《Machine Learning》1994,14(1):47-81

This paper presents a polynomial-time algorithm for inferring a probabilistic generalization of the class of read-once Boolean formulas over the usual basis {AND, OR, NOT}. The algorithm effectively infers a good approximation of the target formula when provided with random examples which are chosen according to anyproduct distribution, i.e., any distribution in which the setting of each input bit is chosen independently of the settings of the other bits. Since the class of formulas considered includes ordinary read-once Boolean formulas, our result shows that such formulas are PAC learnable (in the sense of Valiant) against any product distribution (for instance, against the uniform distribution). Further, this class of probabilistic formulas includes read-once formulas whose behavior has been corrupted by large amounts of random noise. Such noise may affect the formula's output (misclassification noise), the input bits (attribute noise), or it may affect the behavior of individual gates of the formula. Thus, in this setting, we show that read-once formula's can be inferred (approximately), despite large amounts of noise affecting the formula's behavior. 相似文献

6.

On the Computational Complexity of Approximating Distributions by Probabilistic Automata

Abe Naoki Warmuth Manfred K. 《Machine Learning》1992,9(2-3):205-260

Machine Learning - We introduce a rigorous performance criterion for training algorithms for probabilistic automata (PAs) and hidden Markov models (HMMs), used extensively for speech recognition,... 相似文献

7.

The Complexity of Learning According to Two Models of a Drifting Environment

Long Philip M. 《Machine Learning》1999,37(3):337-354

We show that a bound on the rate of drift of the distribution generating the examples is sufficient for agnostic learning to relative accuracy , where c > 0 is a constant; this matches a known necessary condition to within a constant factor. We establish a sufficient condition for the realizable case, also matching a known necessary condition to within a constant factor. We provide a relatively simple proof of a bound of + on the sample complexity of agnostic learning in a fixed environment. 相似文献

8.

Learning from a Population of Hypotheses

Kearns Michael Sebastian Seung H. 《Machine Learning》1995,18(2-3):255-276

We introduce a new formal model in which a learning algorithm must combine a collection of potentially poor but statistically independent hypothesis functions in order to approximate an unknown target function arbitrarily well. Our motivation includes the question of how to make optimal use of multiple independent runs of a mediocre learning algorithm, as well as settings in which the many hypotheses are obtained by a distributed population of identical learning agents. 相似文献

9.

Toward Efficient Agnostic Learning

Kearns Michael J. Schapire Robert E. Sellie Linda M. 《Machine Learning》1994,17(2-3):115-141

In this paper we initiate an investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions. The ultimate goal in this direction is informally termed agnostic learning, in which we make virtually no assumptions on the target function. The name derives from the fact that as designers of learning algorithms, we give up the belief that Nature (as represented by the target function) has a simple or succinct explanation. We give a number of positive and negative results that provide an initial outline of the possibilities for agnostic learning. Our results include hardness results for the most obvious generalization of the PAC model to an agnostic setting, an efficient and general agnostic learning method based on dynamic programming, relationships between loss functions for agnostic learning, and an algorithm for a learning problem that involves hidden variables. 相似文献

10.

Theory Revision with Queries: DNF Formulas

Goldsmith Judy Sloan Robert H. Turán György 《Machine Learning》2002,47(2-3):257-295

The theory revision, or concept revision, problem is to correct a given, roughly correct concept. This problem is considered here in the model of learning with equivalence and membership queries. A revision algorithm is considered efficient if the number of queries it makes is polynomial in the revision distance between the initial theory and the target theory, and polylogarithmic in the number of variables and the size of the initial theory. The revision distance is the minimal number of syntactic revision operations, such as the deletion or addition of literals, needed to obtain the target theory from the initial theory. Efficient revision algorithms are given for three classes of disjunctive normal form expressions: monotone k-DNF, monotone m-term DNF and unate two-term DNF. A negative result shows that some monotone DNF formulas are hard to revise. 相似文献

11.

The complexity of learning concept classes with polynomial general dimension

Johannes Köbler Wolfgang Lindner 《Theoretical computer science》2006

相似文献

12.

Improved Bounds on Quantum Learning Algorithms

Alp?Atici Email author Rocco?A.?Servedio 《Quantum Information Processing》2005,4(5):355-386

In this article we give several new results on the complexity of algorithms that learn Boolean functions from quantum queries and quantum examples.

	Hunziker et al.[Quantum Information Processing, to appear] conjectured that for any class C of Boolean functions, the number of quantum black-box queries which are required to exactly identify an unknown function from C is , where is a combinatorial parameter of the class C. We essentially resolve this conjecture in the affirmative by giving a quantum algorithm that, for any class C, identifies any unknown function from C using quantum black-box queries.
	We consider a range of natural problems intermediate between the exact learning problem (in which the learner must obtain all bits of information about the black-box function) and the usual problem of computing a predicate (in which the learner must obtain only one bit of information about the black-box function). We give positive and negative results on when the quantum and classical query complexities of these intermediate problems are polynomially related to each other.
	Finally, we improve the known lower bounds on the number of quantum examples (as opposed to quantum black-box queries) required for ɛ, Δ-PAC learning any concept class of Vapnik-Chervonenkis dimension d over the domain from to . This new lower bound comes closer to matching known upper bounds for classical PAC learning.

Pacs: 03.67.Lx, 89.80.+h, 02.70.-c 相似文献

13.

On the Complexity of Function Learning

Auer Peter Long Philip M. Maass Wolfgang Woeginger Gerhard J. 《Machine Learning》1995,18(2-3):187-230

The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as or . In particular relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models.We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any learning model).Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes.Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from ^d into . Previously, the class of linear functions from ^d into was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning.Finally we give a sufficient condition for an arbitrary class of functions from into that allows us to learn the class of all functions that can be written as the pointwise maximum ofk functions from . This allows us to exhibit a number of further nontrivial classes of functions from into for which there exist efficient learning algorithms. 相似文献

14.

On PAC learning algorithms for rich Boolean function classes

Lisa Hellerstein Rocco A. Servedio 《Theoretical computer science》2007

相似文献

15.

On the Sample Complexity for Nonoverlapping Neural Networks

Schmitt Michael 《Machine Learning》1999,37(2):131-141

A neural network is said to be nonoverlapping if there is at most one edge outgoing from each node. We investigate the number of examples that a learning algorithm needs when using nonoverlapping neural networks as hypotheses. We derive bounds for this sample complexity in terms of the Vapnik-Chervonenkis dimension. In particular, we consider networks consisting of threshold, sigmoidal and linear gates. We show that the class of nonoverlapping threshold networks and the class of nonoverlapping sigmoidal networks on n inputs both have Vapnik-Chervonenkis dimension (nlog n). This bound is asymptotically tight for the class of nonoverlapping threshold networks. We also present an upper bound for this class where the constants involved are considerably smaller than in a previous calculation. Finally, we argue that the Vapnik-Chervonenkis dimension of nonoverlapping threshold or sigmoidal networks cannot become larger by allowing the nodes to compute linear functions. This sheds some light on a recent result that exhibited neural networks with quadratic Vapnik-Chervonenkis dimension. 相似文献

16.

Complexity theoretic hardness results for query learning

H. Aizenstein T. Hegedüs L. Hellerstein L. Pitt 《Computational Complexity》1998,7(1):19-53

We investigate the complexity of learning for the well-studied model in which the learning algorithm may ask membership and equivalence queries. While complexity theoretic techniques have previously been used to prove hardness results in various learning models, these techniques typically are not strong enough to use when a learning algorithm may make membership queries. We develop a general technique for proving hardness results for learning with membership and equivalence queries (and for more general query models). We apply the technique to show that, assuming , no polynomial-time membership and (proper) equivalence query algorithms exist for exactly learning read-thrice DNF formulas, unions of halfspaces over the Boolean domain, or some other related classes. Our hardness results are representation dependent, and do not preclude the existence of representation independent algorithms.?The general technique introduces the representation problem for a class F of representations (e.g., formulas), which is naturally associated with the learning problem for F. This problem is related to the structural question of how to characterize functions representable by formulas in F, and is a generalization of standard complexity problems such as Satisfiability. While in general the representation problem is in , we present a theorem demonstrating that for "reasonable" classes F, the existence of a polynomial-time membership and equivalence query algorithm for exactly learning F implies that the representation problem for F is in fact in co-NP. The theorem is applied to prove hardness results such as the ones mentioned above, by showing that the representation problem for specific classes of formulas is NP-hard. Received: December 6, 1994 相似文献

17.

Randomly Fallible Teachers: Learning Monotone DNF with an Incomplete Membership Oracle

Angluin Dana Slonim Donna K. 《Machine Learning》1994,14(1):7-26

We introduce a new fault-tolerant model of algorithmic learning using an equivalence oracle and anincomplete membership oracle, in which the answers to a random subset of the learner's membership queries may be missing. We demonstrate that, with high probability, it is still possible to learn monotone DNF formulas in polynomial time, provided that the fraction of missing answers is bounded by some constant less than one. Even when half the membership queries are expected to yield no information, our algorithm will exactly identifym-term,n-variable monotone DNF formulas with an expectedO(mn ²) queries. The same task has been shown to require exponential time using equivalence queries alone. We extend the algorithm to handle some one-sided errors, and discuss several other possible error models. It is hoped that this work may lead to a better understanding of the power of membership queries and the effects of faulty teachers on query models of concept learning. 相似文献

18.

An introduction to some statistical aspects of PAC learning theory

M. Vidyasagar 《Systems & Control Letters》1998,34(3):181

In this paper, a brief introduction is given to some statistical aspects of PAC (probably approximately correct) learning theory. It is shown that there is a close connection between the principal results in PAC learning theory and those in empirical process theory, the latter being a well-established branch of probability theory. The main results in each area are summarized without proofs, and the reader is directed to appropriate sources in the literature. 相似文献

19.

Learning Changing Concepts by Exploiting the Structure of Change 总被引：1，自引：0，他引：1

Bartlett Peter L. Ben-David Shai Kulkarni Sanjeev R. 《Machine Learning》2000,41(2):153-174

This paper examines learning problems in which the target function is allowed to change. The learner sees a sequence of random examples, labelled according to a sequence of functions, and must provide an accurate estimate of the target function sequence. We consider a variety of restrictions on how the target function is allowed to change, including infrequent but arbitrary changes, sequences that correspond to slow walks on a graph whose nodes are functions, and changes that are small on average, as measured by the probability of disagreements between consecutive functions. We first study estimation, in which the learner sees a batch of examples and is then required to give an accurate estimate of the function sequence. Our results provide bounds on the sample complexity and allowable drift rate for these problems. We also study prediction, in which the learner must produce online a hypothesis after each labelled example and the average misclassification probability over this hypothesis sequence should be small. Using a deterministic analysis in a general metric space setting, we provide a technique for constructing a successful prediction algorithm, given a successful estimation algorithm. This leads to sample complexity and drift rate bounds for the prediction of changing concepts. 相似文献

20.

On Polynomial-Time Learnability in the Limit of Strictly Deterministic Automata

Yokomori Takashi 《Machine Learning》1995,19(2):153-179

This paper deals with the polynomial-time learnability of a language class in the limit from positive data, and discusses the learning problem of a subclass of deterministic finite automata (DFAs), called strictly deterministic automata (SDAs), in the framework of learning in the limit from positive data. We first discuss the difficulty of Pitt's definition in the framework of learning in the limit from positive data, by showing that any class of languages with an infinite descending chain property is not polynomial-time learnable in the limit from positive data. We then propose new definitions for polynomial-time learnability in the limit from positive data. We show in our new definitions that the class of SDAs is iteratively, consistently polynomial-time learnable in the limit from positive data. In particular, we present a learning algorithm that learns any SDA M in the limit from positive data, satisfying the properties that (i) the time for updating a conjecture is at most O(lm), (ii) the number of implicit prediction errors is at most O(ln), where l is the maximum length of all positive data provided, m is the alphabet size of M and n is the size of M, (iii) each conjecture is computed from only the previous conjecture and the current example, and (iv) at any stage the conjecture is consistent with the sample set seen so far. This is in marked contrast to the fact that the class of DFAs is neither learnable in the limit from positive data nor polynomial-time learnable in the limit. 相似文献