Bshouty, Goldman, Hancock and Matar have shown that up to term DNF formulas can be properly learned in the exact model with equivalence and membership queries. Given standard complexity-theoretical assumptions, we show that this positive result for proper learning cannot be significantly improved in the exact model or the PAC model extended to allow membership queries. Our negative results are derived from two general techniques for proving such results in the exact model and the extended PAC model. As a further application of these techniques, we consider read-thrice DNF formulas. Here we improve on Aizenstein, Hellerstein, and Pitt's negative result for proper learning in the exact model in two ways. First, we show that their assumption of NP ≠ co-NP can be replaced with the weaker assumption of P ≠ NP. Second, we show that read-thrice DNF formulas are not properly learnable in the extended PAC model, assuming RP ≠ NP.

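An Algorithm for Generating the Disjunctive Normal Form of the Discernibility Function in Rough Set Theory   Cited by: 1 (self-citations: 0, citations by others: 1)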
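Based on rough set theory, this paper studies algorithms for automatically generating the disjunctive normal form of the discernibility function (DF) and proposes a method for computing the conjunctive-term matrix (CM) from the discernibility matrix. A mathematical model is established for converting the conjunctive-term matrix (CM) of attribute reduction into the disjunctive-term matrix (DM), and the implementation flow of the algorithm is given. Based on this model, a direct-search conversion method is proposed that saves working memory, lowers the time complexity of the algorithm, and improves the efficiency of rule generation. Finally, the effectiveness of the algorithm is verified on examples from the UCI repository.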
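
To make the objects in this abstract concrete, here is a minimal Python sketch of the textbook route from a decision table to the DNF of its discernibility function: collect one clause per pair of objects with different decisions, then multiply the clauses out while applying the absorption law. This is not the CM-to-DM direct-search method the abstract proposes. The function names and the toy table are assumptions made for illustration only.

```python
from itertools import combinations

def discernibility_clauses(rows, decisions):
    """One clause per pair of objects with different decisions: the set of
    attribute indices on which the two objects differ."""
    clauses = set()
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k, (x, y) in enumerate(zip(a, b)) if x != y)
            if diff:  # pairs that agree on every attribute are skipped
                clauses.add(diff)
    return clauses

def absorb(sets):
    """Absorption law: drop every set that strictly contains another one."""
    return {s for s in sets if not any(o < s for o in sets)}

def cnf_to_dnf(clauses):
    """Multiply out a conjunction of disjunctions into a minimal DNF.
    Each returned frozenset is one conjunctive term (a reduct candidate)."""
    terms = {frozenset()}
    for clause in absorb(clauses):
        terms = absorb({t | {a} for t in terms for a in clause})
    return terms

# Hypothetical three-object, three-attribute decision table.
rows = [(0, 0, 0), (1, 1, 0), (0, 1, 1)]
decisions = [0, 1, 1]
reducts = cnf_to_dnf(discernibility_clauses(rows, decisions))
```

Multiplying out clause by clause can produce exponentially many intermediate terms in the worst case, which is the cost a direct-search conversion is meant to reduce.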

We establish some general results concerning PAC learning: We find a characterization of the property that any consistent algorithm is PAC. It is shown that the shrinking width property is equivalent to PUAC learnability. By counterexample, PAC and PUAC learning are shown to be different concepts. We find conditions ensuring that any nearly consistent algorithm is PAC or PUAC, respectively. The VC dimension of recurrent neural networks and folding networks is infinite. For restricted inputs, however, bounds exist. The bounds for restricted inputs are transferred to folding networks. We find conditions on the probability of the input space ensuring polynomial learnability: the probability of sequences or trees has to converge to zero sufficiently fast with increasing length or height. Finally, we find an example for a concept class that requires exponentially growing sample sizes for accurate generalization. Date received: September 5, 1997. Date revised: May 29, 1998.

Dalmau  Víctor 《Machine Learning》1999,35(3):207-224
We consider the following classes of quantified boolean formulas. Fix a finite set of basic boolean functions. Take conjunctions of these basic functions applied to variables and constants in arbitrary ways. Finally quantify existentially or universally some of the variables. We prove the following dichotomy theorem: For any set of basic boolean functions, the resulting set of formulas is either polynomially learnable from equivalence queries alone or else it is not PAC-predictable even with membership queries under cryptographic assumptions. Furthermore, we identify precisely which sets of basic functions are in which of the two cases.

The Strength of Weak Learnability   Cited by: 136 (self-citations: 0, citations by others: 136)
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error.
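
The conversion described in this abstract is usually presented as a recursive majority vote over three hypotheses trained on successively filtered distributions. The Python sketch below covers only the arithmetic of that recursion. It assumes the commonly quoted bound that one level maps an error of at most beta to at most 3*beta^2 - 2*beta^3, and it omits the filtering of examples entirely. The function names are illustrative and do not come from the paper.

```python
def majority_of_three_error(beta):
    """Assumed error bound after one level of three-hypothesis majority voting."""
    return 3 * beta**2 - 2 * beta**3

def levels_needed(beta, eps, max_levels=1000):
    """Count how many recursion levels push the bound down to eps or below."""
    if not 0 <= beta < 0.5:
        raise ValueError("weak-learner error must lie in [0, 0.5)")
    if eps <= 0:
        raise ValueError("target error must be positive")
    levels = 0
    while beta > eps:
        if levels == max_levels:
            raise RuntimeError("bound did not reach eps within max_levels")
        beta = majority_of_three_error(beta)
        levels += 1
    return levels
```

Under this assumed bound, any starting error strictly below 1/2 shrinks at every level, while an error of exactly 1/2 is a fixed point. This is why a weak learner must be at least slightly better than random guessing for the recursion to make progress.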

This paper presents a polynomial-time algorithm for inferring a probabilistic generalization of the class of read-once Boolean formulas over the usual basis {AND, OR, NOT}. The algorithm effectively infers a good approximation of the target formula when provided with random examples which are chosen according to any product distribution, i.e., any distribution in which the setting of each input bit is chosen independently of the settings of the other bits. Since the class of formulas considered includes ordinary read-once Boolean formulas, our result shows that such formulas are PAC learnable (in the sense of Valiant) against any product distribution (for instance, against the uniform distribution). Further, this class of probabilistic formulas includes read-once formulas whose behavior has been corrupted by large amounts of random noise. Such noise may affect the formula's output (misclassification noise), the input bits (attribute noise), or it may affect the behavior of individual gates of the formula. Thus, in this setting, we show that read-once formulas can be inferred (approximately), despite large amounts of noise affecting the formula's behavior.

Abe  Naoki  Warmuth  Manfred K. 《Machine Learning》1992,9(2-3):205-260
Machine Learning - We introduce a rigorous performance criterion for training algorithms for probabilistic automata (PAs) and hidden Markov models (HMMs), used extensively for speech recognition,...

Long  Philip M. 《Machine Learning》1999,37(3):337-354
We show that a bound on the rate of drift of the distribution generating the examples is sufficient for agnostic learning to relative accuracy , where c > 0 is a constant; this matches a known necessary condition to within a constant factor. We establish a sufficient condition for the realizable case, also matching a known necessary condition to within a constant factor. We provide a relatively simple proof of a bound of + on the sample complexity of agnostic learning in a fixed environment.

Kearns  Michael  Sebastian Seung  H. 《Machine Learning》1995,18(2-3):255-276
We introduce a new formal model in which a learning algorithm must combine a collection of potentially poor but statistically independent hypothesis functions in order to approximate an unknown target function arbitrarily well. Our motivation includes the question of how to make optimal use of multiple independent runs of a mediocre learning algorithm, as well as settings in which the many hypotheses are obtained by a distributed population of identical learning agents.

In this paper we initiate an investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions. The ultimate goal in this direction is informally termed agnostic learning, in which we make virtually no assumptions on the target function. The name derives from the fact that as designers of learning algorithms, we give up the belief that Nature (as represented by the target function) has a simple or succinct explanation. We give a number of positive and negative results that provide an initial outline of the possibilities for agnostic learning. Our results include hardness results for the most obvious generalization of the PAC model to an agnostic setting, an efficient and general agnostic learning method based on dynamic programming, relationships between loss functions for agnostic learning, and an algorithm for a learning problem that involves hidden variables.

Goldsmith  Judy  Sloan  Robert H.  Turán  György 《Machine Learning》2002,47(2-3):257-295
The theory revision, or concept revision, problem is to correct a given, roughly correct concept. This problem is considered here in the model of learning with equivalence and membership queries. A revision algorithm is considered efficient if the number of queries it makes is polynomial in the revision distance between the initial theory and the target theory, and polylogarithmic in the number of variables and the size of the initial theory. The revision distance is the minimal number of syntactic revision operations, such as the deletion or addition of literals, needed to obtain the target theory from the initial theory. Efficient revision algorithms are given for three classes of disjunctive normal form expressions: monotone k-DNF, monotone m-term DNF and unate two-term DNF. A negative result shows that some monotone DNF formulas are hard to revise.

In this article we give several new results on the complexity of algorithms that learn Boolean functions from quantum queries and quantum examples.
  Hunziker et al.[Quantum Information Processing, to appear] conjectured that for any class C of Boolean functions, the number of quantum black-box queries which are required to exactly identify an unknown function from C is , where is a combinatorial parameter of the class C. We essentially resolve this conjecture in the affirmative by giving a quantum algorithm that, for any class C, identifies any unknown function from C using quantum black-box queries.
  We consider a range of natural problems intermediate between the exact learning problem (in which the learner must obtain all bits of information about the black-box function) and the usual problem of computing a predicate (in which the learner must obtain only one bit of information about the black-box function). We give positive and negative results on when the quantum and classical query complexities of these intermediate problems are polynomially related to each other.
  Finally, we improve the known lower bounds on the number of quantum examples (as opposed to quantum black-box queries) required for (ε, δ)-PAC learning any concept class of Vapnik-Chervonenkis dimension d over the domain from to . This new lower bound comes closer to matching known upper bounds for classical PAC learning.
PACS: 03.67.Lx, 89.80.+h, 02.70.-c

Auer  Peter  Long  Philip M.  Maass  Wolfgang  Woeginger  Gerhard J. 《Machine Learning》1995,18(2-3):187-230
The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as or . In particular relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models. We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any learning model). Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes. Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from d into . Previously, the class of linear functions from d into was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning. Finally we give a sufficient condition for an arbitrary class of functions from into that allows us to learn the class of all functions that can be written as the pointwise maximum of k functions from . This allows us to exhibit a number of further nontrivial classes of functions from into for which there exist efficient learning algorithms.

Schmitt  Michael 《Machine Learning》1999,37(2):131-141
A neural network is said to be nonoverlapping if there is at most one edge outgoing from each node. We investigate the number of examples that a learning algorithm needs when using nonoverlapping neural networks as hypotheses. We derive bounds for this sample complexity in terms of the Vapnik-Chervonenkis dimension. In particular, we consider networks consisting of threshold, sigmoidal and linear gates. We show that the class of nonoverlapping threshold networks and the class of nonoverlapping sigmoidal networks on n inputs both have Vapnik-Chervonenkis dimension (n log n). This bound is asymptotically tight for the class of nonoverlapping threshold networks. We also present an upper bound for this class where the constants involved are considerably smaller than in a previous calculation. Finally, we argue that the Vapnik-Chervonenkis dimension of nonoverlapping threshold or sigmoidal networks cannot become larger by allowing the nodes to compute linear functions. This sheds some light on a recent result that exhibited neural networks with quadratic Vapnik-Chervonenkis dimension.

We investigate the complexity of learning for the well-studied model in which the learning algorithm may ask membership and equivalence queries. While complexity theoretic techniques have previously been used to prove hardness results in various learning models, these techniques typically are not strong enough to use when a learning algorithm may make membership queries. We develop a general technique for proving hardness results for learning with membership and equivalence queries (and for more general query models). We apply the technique to show that, assuming NP ≠ co-NP, no polynomial-time membership and (proper) equivalence query algorithms exist for exactly learning read-thrice DNF formulas, unions of halfspaces over the Boolean domain, or some other related classes. Our hardness results are representation dependent, and do not preclude the existence of representation independent algorithms. The general technique introduces the representation problem for a class F of representations (e.g., formulas), which is naturally associated with the learning problem for F. This problem is related to the structural question of how to characterize functions representable by formulas in F, and is a generalization of standard complexity problems such as Satisfiability. While in general the representation problem is in , we present a theorem demonstrating that for "reasonable" classes F, the existence of a polynomial-time membership and equivalence query algorithm for exactly learning F implies that the representation problem for F is in fact in co-NP. The theorem is applied to prove hardness results such as the ones mentioned above, by showing that the representation problem for specific classes of formulas is NP-hard. Received: December 6, 1994

We introduce a new fault-tolerant model of algorithmic learning using an equivalence oracle and an incomplete membership oracle, in which the answers to a random subset of the learner's membership queries may be missing. We demonstrate that, with high probability, it is still possible to learn monotone DNF formulas in polynomial time, provided that the fraction of missing answers is bounded by some constant less than one. Even when half the membership queries are expected to yield no information, our algorithm will exactly identify m-term, n-variable monotone DNF formulas with an expected O(mn^2) queries. The same task has been shown to require exponential time using equivalence queries alone. We extend the algorithm to handle some one-sided errors, and discuss several other possible error models. It is hoped that this work may lead to a better understanding of the power of membership queries and the effects of faulty teachers on query models of concept learning.
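
For orientation, the Python sketch below shows the classical fault-free baseline that this abstract builds on, not the incomplete-oracle algorithm itself: exact learning of monotone DNF in which every membership query is answered. Each counterexample from the equivalence oracle is shrunk, one bit at a time, to a minimal positive point, and that point is added as a term. The brute-force teacher in `make_oracles` and all function names are assumptions made for illustration, and the teacher is only practical for small n.

```python
from itertools import product

def evaluate(terms, x):
    """A monotone DNF is a set of terms; a term is a frozenset of variable indices."""
    return any(all(x[i] for i in t) for t in terms)

def learn_monotone_dnf(n, membership, equivalence):
    """Exact learner with a complete membership oracle (fault-free setting)."""
    hypothesis = set()
    while True:
        x = equivalence(hypothesis)  # None means the hypothesis is correct
        if x is None:
            return hypothesis
        x = list(x)  # a positive point of the target that the hypothesis rejects
        for i in range(n):
            if x[i]:
                x[i] = 0
                if not membership(tuple(x)):
                    x[i] = 1  # bit i is needed to stay positive, restore it
        hypothesis.add(frozenset(i for i in range(n) if x[i]))

def make_oracles(n, target):
    """Simulated teacher: answers queries by evaluating the hidden target."""
    def membership(x):
        return evaluate(target, x)
    def equivalence(hypothesis):
        for x in product((0, 1), repeat=n):
            if evaluate(hypothesis, x) != evaluate(target, x):
                return x
        return None
    return membership, equivalence

# Hypothetical target over four variables: x0 x1 OR x2 OR x1 x3.
target = {frozenset({0, 1}), frozenset({2}), frozenset({1, 3})}
learned = learn_monotone_dnf(4, *make_oracles(4, target))
```

In this fault-free setting each term costs one equivalence query and at most n membership queries. With missing answers the shrinking step can stall on an unanswered query, which is the difficulty the abstract's algorithm is designed to handle.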

In this paper, a brief introduction is given to some statistical aspects of PAC (probably approximately correct) learning theory. It is shown that there is a close connection between the principal results in PAC learning theory and those in empirical process theory, the latter being a well-established branch of probability theory. The main results in each area are summarized without proofs, and the reader is directed to appropriate sources in the literature.

Learning Changing Concepts by Exploiting the Structure of Change   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper examines learning problems in which the target function is allowed to change. The learner sees a sequence of random examples, labelled according to a sequence of functions, and must provide an accurate estimate of the target function sequence. We consider a variety of restrictions on how the target function is allowed to change, including infrequent but arbitrary changes, sequences that correspond to slow walks on a graph whose nodes are functions, and changes that are small on average, as measured by the probability of disagreements between consecutive functions. We first study estimation, in which the learner sees a batch of examples and is then required to give an accurate estimate of the function sequence. Our results provide bounds on the sample complexity and allowable drift rate for these problems. We also study prediction, in which the learner must produce online a hypothesis after each labelled example and the average misclassification probability over this hypothesis sequence should be small. Using a deterministic analysis in a general metric space setting, we provide a technique for constructing a successful prediction algorithm, given a successful estimation algorithm. This leads to sample complexity and drift rate bounds for the prediction of changing concepts.

