Similar Documents
20 similar documents found (search time: 46 ms)
1.
Tracking Drifting Concepts By Minimizing Disagreements
In this paper we consider the problem of tracking a subset of a domain (called the target) which changes gradually over time. A single (unknown) probability distribution over the domain is used to generate random examples for the learning algorithm and to measure the speed at which the target changes. Clearly, the more rapidly the target moves, the harder it is for the algorithm to maintain a good approximation of the target. Therefore we evaluate algorithms based on how much movement of the target can be tolerated between examples while predicting with accuracy ε. Furthermore, the complexity of the class H of possible targets, as measured by d, its VC-dimension, also affects the difficulty of tracking the target concept. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class H can be approximated to within a factor k, then there is a simple tracking algorithm for H which can achieve a probability ε of making a mistake if the target movement rate is at most a constant times ε²/(k(d + k) ln(1/ε)), where d is the Vapnik-Chervonenkis dimension of H. Also, we show that if H is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of 7d + 1, yielding an efficient tracking algorithm for H which tolerates drift rates up to a constant times ε²/(d² ln(1/ε)). In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times ε²/d.
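The tracking setup lends itself to a toy illustration: keep a window of recent examples and re-fit by minimizing disagreements. The sketch below does this for 1-D threshold concepts under gradual drift; it is an illustration of the idea, not the paper's algorithm or its bounds, and all names (`track_threshold`, `drifting_stream`, the window size) are made up.

```python
import random

def track_threshold(stream, window_size=50):
    """Track a drifting 1-D threshold concept: keep a sliding window of
    recent examples and, after each one, pick the threshold that
    minimizes disagreements with the window."""
    window = []          # recent (x, label) pairs
    threshold = 0.5      # current hypothesis: predict 1 iff x >= threshold
    for x, label in stream:
        yield int(x >= threshold), label
        window.append((x, label))
        if len(window) > window_size:
            window.pop(0)
        # candidate thresholds: midpoints between sorted sample points
        xs = sorted(p for p, _ in window)
        candidates = [0.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [1.0]
        threshold = min(candidates,
                        key=lambda t: sum(int(p >= t) != y for p, y in window))

def drifting_stream(n=2000, drift=0.0002):
    """Toy drifting target: the true threshold moves slowly to the right."""
    t = 0.3
    for _ in range(n):
        x = random.random()
        yield x, int(x >= t)
        t = min(0.9, t + drift)

errors = sum(p != y for p, y in track_threshold(drifting_stream()))
print("mistake rate:", errors / 2000)
```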

2.
A Note on Learning from Multiple-Instance Examples
Blum, Avrim; Kalai, Adam. Machine Learning, 1998, 30(1): 23-29.
We describe a simple reduction from the problem of PAC-learning from multiple-instance examples to that of PAC-learning with one-sided random classification noise. Thus, all concept classes learnable with one-sided noise, which includes all concepts learnable in the usual 2-sided random noise model plus others such as the parity function, are learnable from multiple-instance examples. We also describe a more efficient (and somewhat technically more involved) reduction to the Statistical-Query model that results in a polynomial-time algorithm for learning axis-parallel rectangles with sample complexity Õ(d²r/ε²), saving roughly a factor of r over the results of Auer et al. (1997).
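The flavor of the first reduction can be sketched in a few lines: label every instance with its bag's label, and the resulting mislabels fall on one side only. This is a schematic sketch, not the paper's exact reduction or its noise-rate accounting; the function name and toy data are illustrative.

```python
def flatten_bags(bags):
    """Reduce multiple-instance data to single-instance data by giving
    every instance its bag's label.  Instances from negative bags are
    always labeled correctly (a negative bag contains no positive
    instance); instances from positive bags may be mislabeled, which is
    one-sided random classification noise on the positive labels.
    `bags` is a list of (instances, bag_label) pairs."""
    return [(x, bag_label) for instances, bag_label in bags
                           for x in instances]

# toy example: 1-D points, a bag is positive iff some point >= 0.8
bags = [([0.1, 0.9, 0.3], 1),   # positive bag: only 0.9 is truly positive
        ([0.2, 0.4], 0)]        # negative bag: all labels correct
print(flatten_bags(bags))
```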

3.
The Strength of Weak Learnability
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
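Schapire's original construction is a recursive majority vote of three hypotheses; the later AdaBoost-style reweighting scheme gives a compact way to see the same weak-to-strong conversion in code. The sketch below is that illustration, not the paper's construction; `weak_learn` is an assumed, caller-supplied weak learner.

```python
import math

def boost(examples, weak_learn, rounds=20):
    """AdaBoost-style illustration of weak-to-strong conversion:
    repeatedly call a weak learner on reweighted data and combine the
    weak hypotheses by a weighted majority vote.
    `examples` is a list of (x, y) with y in {-1, +1};
    `weak_learn(examples, weights)` returns a function x -> {-1, +1}."""
    n = len(examples)
    weights = [1.0 / n] * n
    combined = []                        # list of (alpha, hypothesis)
    for _ in range(rounds):
        h = weak_learn(examples, weights)
        err = sum(w for w, (x, y) in zip(weights, examples) if h(x) != y)
        if err == 0:                     # perfect hypothesis: keep it, stop
            combined.append((1.0, h))
            break
        if err >= 0.5:                   # no edge over random guessing left
            break
        alpha = 0.5 * math.log((1 - err) / err)
        combined.append((alpha, h))
        # reweight: emphasize the examples this hypothesis got wrong
        weights = [w * math.exp(-alpha * y * h(x))
                   for w, (x, y) in zip(weights, examples)]
        total = sum(weights)
        weights = [w / total for w in weights]
    def strong(x):
        return 1 if sum(a * h(x) for a, h in combined) >= 0 else -1
    return strong
```

With, say, single-feature decision stumps as `weak_learn`, a few dozen rounds typically drive the training error of `strong` to zero on separable data.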

4.
A challenging problem within machine learning is how to make good inferences from data sets in which pieces of information are missing. While it is valuable to have algorithms that perform well for specific domains, to gain a fundamental understanding of the problem, one needs a “theory” about how to learn with incomplete data. The important contribution of such a theory is not so much the specific algorithmic results, but rather that it provides good ways of thinking about the problem formally. In this paper we introduce the unspecified attribute value (UAV) learning model as a first step towards a theoretical framework for studying the problem of learning from incomplete data in the exact learning framework. In the UAV learning model, an example x is classified positive (resp., negative) if all possible assignments for the unspecified attributes result in a positive (resp., negative) classification. Otherwise the classification given to x is “?” (for unknown). Given an example x in which some attributes are unspecified, the oracle UAV-MQ responds with the classification of x. Given a hypothesis h, the oracle UAV-EQ returns an example x (that could have unspecified attributes) for which h(x) is incorrect. We show that any class of functions learnable in Angluin’s exact model using the MQ and EQ oracles is also learnable in the UAV model using the MQ and UAV-EQ oracles as long as the counterexamples provided by the UAV-EQ oracle have a logarithmic number of unspecified attributes. We also show that any class learnable in the exact model using the MQ and EQ oracles is also learnable in the UAV model using the UAV-MQ and UAV-EQ oracles as well as an oracle to evaluate a given boolean formula on an example with unspecified attributes. (For some hypothesis classes such as decision trees and unate formulas the evaluation can be done in polynomial time without an oracle.) We also study the learnability of a universal class of decision trees under the UAV model and of DNF formulas under a representation-dependent variation of the UAV model.
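The UAV classification rule can be stated as a brute-force check over completions; the cost is exponential in the number of unspecified attributes, which is exactly why the paper restricts the UAV-EQ counterexamples to logarithmically many. A minimal sketch with illustrative names:

```python
from itertools import product

def uav_classify(h, x):
    """Classify a partially specified example in the UAV model.
    `x` is a tuple over {0, 1, None}, None meaning unspecified;
    `h` maps a fully specified tuple to True/False.
    Returns True/False if every completion agrees, else '?'."""
    unknown = [i for i, v in enumerate(x) if v is None]
    labels = set()
    for bits in product([0, 1], repeat=len(unknown)):
        y = list(x)
        for i, b in zip(unknown, bits):
            y[i] = b
        labels.add(h(tuple(y)))
        if len(labels) == 2:             # completions disagree
            return '?'
    return labels.pop()

# toy hypothesis: the conjunction x1 AND x3 over 3 attributes
h = lambda v: bool(v[0] and v[2])
print(uav_classify(h, (1, None, 1)))     # True: x2 is irrelevant
print(uav_classify(h, (None, 0, 1)))     # '?': depends on x1
```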

5.
Previous partially supervised classification methods can partition unlabeled data into positive examples and negative examples for a given class by learning from positive labeled examples and unlabeled examples, but they cannot further group the negative examples into meaningful clusters even if there are many different classes among the negative examples. Here we propose an automatic method to obtain a natural partitioning of mixed data (labeled data + unlabeled data) by maximizing a stability criterion defined on classification results from an extended label propagation algorithm over all possible values of the model order (the number of classes) in the mixed data. Our experimental results on benchmark corpora for the word sense disambiguation task indicate that this model order identification algorithm with the extended label propagation algorithm as the base classifier outperforms SVM, a one-class partially supervised classification algorithm, and the model order identification algorithm with semi-supervised k-means clustering as the base classifier when labeled data is incomplete.
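A minimal sketch of plain label propagation, the family of base classifiers the paper extends, is shown below; the paper's extended variant and its stability-based model-order search are not reproduced. Matrix names and the toy data are illustrative.

```python
import numpy as np

def label_propagation(W, labels, iters=100):
    """Plain label propagation: labels diffuse over the row-normalized
    affinity graph W while the labeled points stay clamped.
    `labels` holds class ids for labeled points and -1 for unlabeled."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()) - {-1})
    Y = np.zeros((len(labels), len(classes)))
    for i, y in enumerate(labels):
        if y != -1:
            Y[i, classes.index(y)] = 1.0
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    F = Y.copy()
    for _ in range(iters):
        F = P @ F
        F[labels != -1] = Y[labels != -1]  # clamp the labeled points
    return np.array(classes)[F.argmax(axis=1)]

# toy data: two clusters on a line, one labeled point in each
X = np.array([0.0, 0.1, 0.2, 1.0, 1.1, 1.2])
W = np.exp(-(X[:, None] - X[None, :]) ** 2 / 0.1)
labels = np.array([0, -1, -1, 1, -1, -1])
print(label_propagation(W, labels))        # [0 0 0 1 1 1]
```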

6.
In the maximum constraint satisfaction problem (Max CSP), one is given a finite collection of positive-weight constraints on overlapping sets of variables, and the goal is to assign values from a given domain to the variables so that the total weight of satisfied constraints is maximized. We consider this problem and its variant Max AW CSP where the weights are allowed to be both positive and negative, and study how the complexity of the problems depends on the allowed constraint types. We prove that Max AW CSP over an arbitrary finite domain exhibits a dichotomy: it is either polynomial-time solvable or NP-hard. Our proof builds on two results that may be of independent interest: one is that the problem of finding a maximum H-colourable subdigraph in a given digraph is either NP-hard or trivial depending on H, and the other a dichotomy result for Max CSP with a single allowed constraint type.
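To fix the problem definition, here is a brute-force Max (AW) CSP solver for toy instances; it is exponential, of course, and the paper's dichotomy concerns exactly which constraint languages allow avoiding this. Names and the toy instance are illustrative.

```python
from itertools import product

def max_csp(n_vars, domain, constraints):
    """Exhaustive Max CSP: try every assignment and keep the one that
    maximizes total satisfied weight.  `constraints` is a list of
    (weight, scope, relation), where `scope` is a tuple of variable
    indices and `relation` is a set of allowed value tuples.
    Weights may be negative, as in Max AW CSP."""
    best_value, best_assignment = float('-inf'), None
    for assignment in product(domain, repeat=n_vars):
        value = sum(w for w, scope, rel in constraints
                    if tuple(assignment[i] for i in scope) in rel)
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment

# toy instance over {0,1}: one XOR constraint, one negative-weight equality
constraints = [(2, (0, 1), {(0, 1), (1, 0)}),    # weight 2 if x0 != x1
               (-1, (1, 2), {(0, 0), (1, 1)})]   # weight -1 if x1 == x2
print(max_csp(3, [0, 1], constraints))           # -> (2, (0, 1, 0))
```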

7.
In this article, H∞ structured model reduction is addressed for linear discrete systems. Two important classes of systems are considered for structured model reduction, i.e. Markov jump systems and uncertain systems. The problem we deal with is the development of algorithms with the flexibility to allow any structure in the reduced-order system design, such as the structure of an original system, decentralisation of a networked system, pole assignment of the reduced system, etc. The algorithms are derived such that the associated model reduction error is guaranteed to satisfy a prescribed H∞ norm-bound constraint. A new condition for the existence of desired reduced-order models preserving a certain structure is presented as a set of linear matrix inequalities (LMIs) and non-convex equality constraints. Effective computational algorithms involving LMIs are suggested to solve the matrix inequalities characterising a solution of the structured model reduction problem. Numerical examples demonstrate the advantages of the proposed model reduction method.

8.
Reliable and probably useful learning, proposed by Rivest and Sloan, is a variant of probably approximately correct learning. In this model the hypothesis must never misclassify an instance but is allowed to answer “I don't know” with a low probability. We derive upper and lower bounds for the sample complexity of reliable and probably useful learning in terms of the combinatorial characteristics of the concept class to be learned. This is done by reducing reliable and probably useful learning to learning with one-sided error. The bounds also hold for a slightly weaker model that allows the learner to output with a low probability a hypothesis that makes misclassifications. We see that in these models learning with one oracle is more difficult than learning with two oracles. Our results imply that monotone Boolean conjunctions or disjunctions cannot be learned reliably and probably usefully from a polynomial number of examples. Rectangles in ℝ^n for n ≥ 2 cannot be learned from any finite number of examples. A preliminary version of this paper appeared under the title “Reliable and useful learning” in Proceedings of the 2nd Annual Workshop on Computational Learning Theory, Morgan Kaufmann, San Mateo, CA, 1989, pp. 365–380. This work was supported by the Academy of Finland.

9.
The Satisfactory Bisection problem asks whether a given graph has a partition of its vertex set into two parts of the same cardinality such that each vertex has at least as many neighbors in its own part as in the other part. A related variant of this problem, called Co-Satisfactory Bisection, requires that each vertex has at most as many neighbors in its own part as in the other part. A vertex satisfying the degree constraint above in a partition is called ‘satisfied’ or ‘co-satisfied,’ respectively. After establishing the NP-completeness of both problems, we study approximation results in two directions. We prove that maximizing the number of (co-)satisfied vertices in a bisection admits no polynomial-time approximation scheme (unless P=NP), whereas constant-factor approximation algorithms can be obtained in polynomial time. Moreover, minimizing the difference of the cardinalities of the vertex classes in a bipartition that (co-)satisfies all vertices admits no polynomial-time approximation scheme either.
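Checking which vertices a given bisection satisfies is straightforward; the hardness lies in finding such a bisection, not in verifying one. A small sketch with illustrative names:

```python
def satisfied_vertices(adj, part):
    """Given a graph as an adjacency list and a bisection `part`
    (part[v] in {0, 1}, both parts of equal size), return the vertices
    that are 'satisfied': at least as many neighbors in their own part
    as in the other part."""
    result = []
    for v, neighbors in adj.items():
        inside = sum(part[u] == part[v] for u in neighbors)
        if inside >= len(neighbors) - inside:
            result.append(v)
    return result

# 4-cycle 0-1-2-3: the bisection {0,1} vs {2,3} satisfies every vertex
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
part = {0: 0, 1: 0, 2: 1, 3: 1}
print(satisfied_vertices(adj, part))   # [0, 1, 2, 3]
```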

10.
The main problem considered consists in testing the hypothesis H_0 that letters of an alphabet A = {a_1, a_2, ..., a_k} are generated with equal probabilities 1/k against the complex alternative hypothesis H_1, the negation of H_0. In many applications, in particular those connected with cryptography, k is large, but possible deviations from the uniform distribution are small. Therefore, application of Pearson's χ² test, which is one of the most widespread and efficient tests, requires samples of a very large size, certainly exceeding k. We propose a so-called adaptive χ² test, whose power can be considerably higher than that of the traditional method in the case described. This conclusion is based on a theoretical analysis of the proposed criterion for some classes of alternatives as well as on experimental results related to discriminating between enciphered Russian texts and random sequences.
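For reference, the standard Pearson χ² statistic for uniformity is a few lines of code; the paper's adaptive variant, which groups letters based on a training subsample, is not reproduced here. Names and the toy sample are illustrative.

```python
from collections import Counter

def pearson_chi2(sample, alphabet):
    """Pearson's χ² statistic for testing uniformity of letters over a
    k-letter alphabet.  Under H_0 the statistic is approximately
    χ²-distributed with k-1 degrees of freedom, so large values are
    evidence against uniformity."""
    n, k = len(sample), len(alphabet)
    expected = n / k
    counts = Counter(sample)
    return sum((counts.get(a, 0) - expected) ** 2 / expected
               for a in alphabet)

stat = pearson_chi2("abababababba", "ab")
print(stat)   # 0.0: both letters occur 6 times in 12 draws
```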

11.
Given a reachable discrete-time linear system (A, b), the reachable set is a cone when a positivity constraint is imposed on the input. The problem to be studied is the geometrical structure of the reachable set R = cone(b, Ab, A²b, ...) in terms of the spectrum of A. In particular, conditions which ensure that R, or its closure, is a polyhedral proper cone are derived. The impact of the given results on finite-time reachability and positive realizability is also discussed.
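The generators of the reachable cone are easy to enumerate; the sketch below (numpy, illustrative names) computes b, Ab, A²b, ... for a rotation matrix, where the complex eigenvalues rotate the generator at each step, the kind of spectral behavior the paper analyzes.

```python
import numpy as np

def cone_generators(A, b, n_terms=None):
    """Return the generators b, Ab, A^2 b, ... of the reachable cone of
    (A, b) under nonnegative inputs, stacked as columns.  Whether
    finitely many generators suffice (a polyhedral cone) depends on the
    spectrum of A, which is the paper's subject."""
    n_terms = n_terms or A.shape[0]
    gens, v = [], b.astype(float)
    for _ in range(n_terms):
        gens.append(v)
        v = A @ v
    return np.column_stack(gens)

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by 90 degrees
b = np.array([1.0, 0.0])
print(cone_generators(A, b, 4))           # columns b, Ab, A^2 b, A^3 b
```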

12.
This paper is concerned with the problem of finding a hypothesis consistent with given positive and negative examples, where the hypothesis class consists of all sets of at most two tree patterns and represents the class of unions of at most two tree pattern languages. In particular, we consider the problem from the point of view of the consistency problem for this class. The consistency problem is the problem of deciding whether there exists a hypothesis consistent with given positive and negative examples within some fixed hypothesis space. Efficient solvability of that problem is closely related to the possibility of efficient machine learning or machine discovery. Unfortunately, however, the consistency problem is known to be NP-complete for many hypothesis spaces. In this paper, the problem for this class is also shown to be NP-complete. In order to overcome this computational hardness, we try to use additional information obtained by making queries. First, we give an algorithm that, using restricted subset queries, solves the consistency problem for this class in time polynomial in the total size of the given positive and negative examples. Next, we show that each subset query made by the algorithm can be replaced by several membership queries under some condition on the set of function symbols. As a result, the consistency problem for this class is solved in polynomial time using membership queries.

13.
A version space is a collection of concepts consistent with a given set of positive and negative examples. Mitchell [Artificial Intelligence 18 (1982) 203-226] proposed representing a version space by its boundary sets: the maximally general (G) and maximally specific consistent concepts (S). For many simple concept classes, the size of G and S is known to grow exponentially in the number of positive and negative examples. This paper argues that previous work on alternative representations of version spaces has disguised the real question underlying version space reasoning. We instead show that tractable reasoning with version spaces turns out to depend on the consistency problem, i.e., determining if there is any concept consistent with a set of positive and negative examples. Indeed, we show that tractable version space reasoning is possible if and only if there is an efficient algorithm for the consistency problem. Our observations give rise to new concept classes for which tractable version space reasoning is now possible, e.g., 1-decision lists, monotone depth two formulas, and halfspaces.
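The reduction from version-space prediction to the consistency problem can be sketched directly: to classify x, test consistency of the sample extended with each candidate label. A minimal sketch, with thresholds on the line as a toy concept class; all names are illustrative.

```python
def vs_classify(sample, x, consistent):
    """Version-space prediction via a consistency oracle:
    `consistent(sample)` decides whether any concept in the class is
    consistent with the labeled sample.  The version space classifies x
    positively iff adding (x, 0) makes the sample inconsistent,
    negatively iff adding (x, 1) does, and abstains otherwise."""
    can_be_neg = consistent(sample + [(x, 0)])
    can_be_pos = consistent(sample + [(x, 1)])
    if can_be_pos and not can_be_neg:
        return 1
    if can_be_neg and not can_be_pos:
        return 0
    return '?'   # both labels are still possible

def threshold_consistent(sample):
    """Toy class: h_t(x) = 1 iff x >= t.  A consistent threshold exists
    iff every positive example exceeds every negative example."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    return not pos or not neg or min(pos) > max(neg)

sample = [(0.2, 0), (0.8, 1)]
print(vs_classify(sample, 0.9, threshold_consistent))  # 1
print(vs_classify(sample, 0.5, threshold_consistent))  # '?'
```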

14.
In this paper, we address a fundamental problem related to the induction of Boolean logic: Given a set of data, represented as a set of binary “true n-vectors” (or “positive examples”) and a set of “false n-vectors” (or “negative examples”), we establish a Boolean function (or an extension) f, so that f is true (resp., false) in every given true (resp., false) vector. We shall further require that such an extension belongs to a certain specified class of functions, e.g., the class of positive functions, the class of Horn functions, and so on. The class of functions represents our a priori knowledge or hypothesis about the extension f, which may be obtained from experience or from the analysis of mechanisms that may or may not cause the phenomena under consideration. The real-world data may contain errors, e.g., measurement and classification errors might come in when obtaining data, or there may be some other influential factors not represented as variables in the vectors. In such situations, we have to give up the goal of establishing an extension that is perfectly consistent with the given data, and we are satisfied with an extension f having the minimum number of misclassifications. Both problems, i.e., the problem of finding an extension within a specified class of Boolean functions and the problem of finding a minimum-error extension in that class, will be extensively studied in this paper. For certain classes we shall provide polynomial algorithms, and for other cases we prove their NP-hardness.
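For the class of positive (monotone) functions the existence test is particularly simple: a positive extension exists iff no true vector is componentwise dominated by a false vector, in which case "f(x) = 1 iff x dominates some true vector" is such an extension. A small sketch of that check, with illustrative names:

```python
def positive_extension_exists(true_vectors, false_vectors):
    """A positive (monotone) Boolean extension of the data exists iff
    no true vector is componentwise <= some false vector; otherwise
    monotonicity would force f(false vector) = 1."""
    dominated = lambda t, s: all(a <= b for a, b in zip(t, s))
    return not any(dominated(t, s)
                   for t in true_vectors for s in false_vectors)

print(positive_extension_exists([(1, 0, 1)], [(0, 1, 1)]))  # True
print(positive_extension_exists([(0, 0, 1)], [(0, 1, 1)]))  # False: (0,0,1) <= (0,1,1)
```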

15.
16.
We consider the single-machine scheduling problem of minimizing the number of late jobs. We omit here one of the standard assumptions in scheduling theory, namely that the processing times are deterministic. In this scheduling environment, the completion times will be stochastic variables as well. Instead of looking at the expected number of on-time jobs, we present a new model to deal with the stochastic completion times, which is based on using a chance constraint to define whether a job is on time or late: a job is on time if the probability that it is completed by the deterministic due date is at least equal to a certain given minimum success probability. We have studied this problem for four classes of stochastic processing times. The jobs in the first three classes have processing times that follow: (i) a gamma distribution with shape parameter p_j and scale parameter β, where β is common to all jobs; (ii) a negative binomial distribution with parameters p_j and r, where r is the same for each job; (iii) a normal distribution with parameters p_j and σ_j². The jobs in the fourth class have equally disturbed processing times, that is, the processing times consist of a deterministic part and a random component that is independently, identically distributed for each job. We show that the first two cases have a common characteristic that makes it possible to solve these problems in O(n log n) time through the algorithm by Moore and Hodgson. To analyze the third and fourth problems we need the additional assumption that the due dates and the minimum success probabilities are agreeable. We show that under this assumption the third problem is NP-hard in the ordinary sense, whereas the fourth problem is solvable by Moore and Hodgson's algorithm. We further indicate how the problem of maximizing the expected number of on-time jobs (with respect to the standard definition) can be tackled if we add the constraint that the on-time jobs are sequenced in a given order and when we require that the probability that a job is on time amounts to at least some given lower bound. Supported by EC Contract IST-1999-14186 (Project alcom-FT).
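The deterministic Moore-Hodgson algorithm that the polynomially solvable cases reduce to is short enough to sketch. Below is the classical O(n log n) procedure, with the paper's chance-constrained due-date handling abstracted away; the toy instance is illustrative.

```python
import heapq

def moore_hodgson(jobs):
    """Moore-Hodgson: maximize the number of on-time jobs on a single
    machine.  `jobs` is a list of (processing_time, due_date).
    Scan jobs in due-date order; whenever the schedule runs late, drop
    the longest job scheduled so far.  Runs in O(n log n)."""
    heap, t = [], 0                        # heap of -processing_time
    for p, d in sorted(jobs, key=lambda j: j[1]):
        heapq.heappush(heap, -p)
        t += p
        if t > d:                          # late: remove the longest job
            t += heapq.heappop(heap)       # popped value is negative
    return len(heap)                       # number of on-time jobs

jobs = [(1, 2), (2, 3), (2, 5), (4, 7)]
print(moore_hodgson(jobs))                 # 3 jobs can be on time
```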

17.
It is well known that the H∞ control problem has a state space formulation in terms of differential games. For a finite time horizon control problem, the analogous differential game is considered. The disturbance is the control for the maximizing player. In order to allow for L₂ disturbances, the controls for at least one player must be allowed to be unbounded. It is shown that the value of the game is the viscosity solution of the corresponding Isaacs equation under rather general conditions.

18.
In this paper, the optimal H₂ model order reduction (MOR) problem for bilinear systems is explored. The orthogonality constraint of the cost function generated by the H₂ MOR error means that the problem is posed not on a Euclidean space but on the Stiefel manifold. The H₂ optimal MOR problem for bilinear systems is thus turned into an unconstrained optimisation on the Stiefel manifold. The explicit expression of the gradient of the cost function on this manifold is derived. Making full use of the geometric properties of the Stiefel manifold, we propose a feasible and effective iterative algorithm to solve the unconstrained H₂ minimisation problem. Moreover, the convergence of our algorithm is rigorously proved. Finally, two practical examples related to bilinear systems demonstrate the effectiveness of our algorithm.
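A generic sketch of one Riemannian gradient step on the Stiefel manifold with QR retraction, under the standard embedded metric, is given below. It shows only the manifold machinery such an algorithm relies on; the paper's specific H₂ cost, gradient expression, and convergence analysis are not reproduced, and the toy cost is an assumption.

```python
import numpy as np

def stiefel_step(X, egrad, step=0.1):
    """One gradient step on St(n, p) = {X : X^T X = I_p}: project the
    Euclidean gradient onto the tangent space at X, move, then retract
    back onto the manifold via the QR decomposition."""
    G = egrad(X)
    sym = (X.T @ G + G.T @ X) / 2
    rgrad = G - X @ sym                    # tangent-space projection
    Q, R = np.linalg.qr(X - step * rgrad)
    return Q * np.sign(np.diag(R))         # QR retraction, sign-fixed

# toy cost: squared distance to a fixed matrix A, restricted to St(5, 2)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
X = np.linalg.qr(rng.standard_normal((5, 2)))[0]
for _ in range(100):
    X = stiefel_step(X, lambda X: X - A)   # gradient of 0.5*||X - A||^2
print(np.allclose(X.T @ X, np.eye(2)))     # True: stays on the manifold
```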

19.
Sample Compression, Learnability, and the Vapnik-Chervonenkis Dimension
Floyd, Sally; Warmuth, Manfred. Machine Learning, 1995, 21(3): 269-304.
Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is pac-learnable. Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class is finite. In the second half of this paper we explore the relationship between sample compression schemes and the VC dimension. We define maximum and maximal classes of VC dimension d. For every maximum class of VC dimension d, there is a sample compression scheme of size d, and for sufficiently large maximum classes there is no sample compression scheme of size less than d. We discuss briefly classes of VC dimension d that are maximal but not maximum. It is an open question whether every class of VC dimension d has a sample compression scheme of size O(d).
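The textbook instance of such a scheme is axis-aligned rectangles in d dimensions, which admit a compression scheme of size at most 2d. The sketch below is an illustration of that scheme under the assumption that the input sample is consistent with some rectangle; helper names are made up.

```python
def compress_rectangle(sample):
    """Compression function for axis-aligned rectangles: from a sample
    consistent with some rectangle, keep for each dimension a positive
    example attaining the min and the max coordinate (at most 2d
    examples).  With no positives, keep nothing."""
    pos = [x for x, y in sample if y == 1]
    if not pos:
        return []
    d = len(pos[0])
    keep = set()
    for i in range(d):
        keep.add(min(pos, key=lambda x: x[i]))
        keep.add(max(pos, key=lambda x: x[i]))
    return list(keep)

def reconstruct_rectangle(compression_set):
    """Reconstruction function: the smallest box containing the kept
    examples (the empty box if the set is empty).  By construction it
    classifies the whole original sample correctly."""
    if not compression_set:
        return lambda x: 0
    d = len(compression_set[0])
    lo = [min(p[i] for p in compression_set) for i in range(d)]
    hi = [max(p[i] for p in compression_set) for i in range(d)]
    return lambda x: int(all(lo[i] <= x[i] <= hi[i] for i in range(d)))

sample = [((1, 1), 1), ((3, 2), 1), ((0, 0), 0), ((4, 4), 0)]
h = reconstruct_rectangle(compress_rectangle(sample))
print([h(x) == y for x, y in sample])   # [True, True, True, True]
```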

20.
This paper focuses on the H∞ model reduction problem for positive fractional order systems. For a stable positive fractional order system, we aim to construct a positive reduced-order fractional system such that the associated error system is stable with a prescribed H∞ performance. Then, based on the bounded real lemma for fractional order systems, a sufficient condition is given to characterize the model reduction problem with a prescribed H∞-norm error bound in terms of a linear matrix inequality (LMI). Furthermore, by introducing a new flexible real matrix variable, the desired reduced-order system matrices are decoupled from the complex matrix variable and further parameterized by the new matrix variable. A corresponding iterative LMI algorithm is also proposed. Finally, several illustrative examples are given to show the effectiveness of the proposed algorithms.
