Similar Documents
20 similar documents found (search time: 15 ms)
1.
A Survey of Methods for Scaling Up Inductive Algorithms
One of the defining challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. This paper summarizes, categorizes, and compares existing work on scaling up inductive algorithms. We concentrate on algorithms that build decision trees and rule sets, in order to provide focus and specific details; the issues and techniques generalize to other types of data mining. We begin with a discussion of important issues related to scaling up. We highlight similarities among scaling techniques by categorizing them into three main approaches. For each approach, we then describe, compare, and contrast the different constituent techniques, drawing on specific examples from published papers. Finally, we use the preceding analysis to suggest how to proceed when dealing with a large problem, and where to focus future research.

2.
An Intelligent Robot Control Method Based on Decision-Tree Learning
Establishing a mapping from the state space to the action space through training on sample examples is at the core of learning-control theory. This paper presents a decision-tree learning method for building such a mapping from the state space to the action space: a space-transformation method based on decision-tree learning. The method can be applied to intelligent robot control, pattern recognition, knowledge acquisition for expert systems, and related areas. Numerical experiments on intelligent robot control reported in the paper show that the space-transformation method based on decision-tree learning is effective.
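To make the state-to-action mapping concrete, here is a minimal sketch using a generic decision-tree learner; the state features, actions, and training samples below are hypothetical illustrations, not from the paper.

```python
# Hedged sketch: a decision tree as a state-space -> action-space mapping.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical state samples: [distance_to_goal, obstacle_ahead, heading_error]
states = [
    [5.0, 0, 0.1],   # far from goal, path clear, roughly on course
    [5.0, 1, 0.0],   # far from goal, path blocked
    [0.3, 0, 0.0],   # at the goal
    [4.0, 0, 0.9],   # far from goal, path clear, badly off course
]
actions = ["forward", "turn_left", "stop", "turn_right"]

# The fitted tree is the learned mapping from states to operations.
controller = DecisionTreeClassifier().fit(states, actions)
print(controller.predict([[4.5, 0, 0.05]]))  # expected: ['forward']
```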

3.
4.
Most manifold learning techniques transform high-dimensional data sets into a low-dimensional space. With such techniques, retraining is usually necessary after unseen data samples are added to the data set. However, retraining is time-consuming and offers no guarantee of mapping the data into exactly the same coordinates, presenting a barrier to the use of manifold learning as a preprocessing step in predictive modeling. To solve this problem, learning a mapping from high-dimensional representations to low-dimensional coordinates via a structured support vector machine is proposed. After the mapping is trained, low-dimensional representations of unobserved data samples can easily be predicted. Experiments on several datasets show that the proposed method outperforms existing out-of-sample extension methods.
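A minimal sketch of this out-of-sample idea, with ordinary ridge regression standing in for the structured SVM (an assumption made for brevity): fit the manifold learner once, learn the high-to-low-dimensional mapping, then predict coordinates for unseen samples without retraining.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 20))   # hypothetical high-dimensional data
X_new = rng.normal(size=(10, 20))      # unseen samples added later

# Low-dimensional coordinates from a one-time manifold-learning run.
Y_train = Isomap(n_components=2).fit_transform(X_train)

# Learn the mapping X -> Y; the paper uses a structured SVM, ridge
# regression is merely a stand-in to keep the sketch short.
mapping = Ridge(alpha=1.0).fit(X_train, Y_train)
Y_new = mapping.predict(X_new)  # embeddings for unseen data, no retraining
```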

5.
6.
Improved Boosting Algorithms Using Confidence-rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multi-label case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the single-label case that is simpler than, but as effective as, techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
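A minimal sketch of confidence-rated boosting in this style, with decision stumps whose leaves predict half the smoothed log-odds of the weighted class counts; the smoothing constant and round count here are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, rounds=10, eps=1e-3):
    """Confidence-rated boosting sketch; y must contain labels in {-1, +1}."""
    y = np.asarray(y)
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        leaves = stump.apply(X)
        conf = {}
        for leaf in np.unique(leaves):
            in_leaf = leaves == leaf
            wp = w[in_leaf & (y == 1)].sum()
            wm = w[in_leaf & (y == -1)].sum()
            # Confidence-rated prediction: half the smoothed log-odds.
            conf[leaf] = 0.5 * np.log((wp + eps) / (wm + eps))
        h = np.array([conf[leaf] for leaf in leaves])
        w *= np.exp(-y * h)   # wrongly / weakly predicted examples gain weight
        w /= w.sum()
        ensemble.append((stump, conf))
    return ensemble

def predict(ensemble, X):
    score = sum(np.array([conf[leaf] for leaf in stump.apply(X)])
                for stump, conf in ensemble)
    return np.sign(score)  # combined hypothesis: sign of summed confidences
```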

7.
Bauer, Eric; Kohavi, Ron. Machine Learning, 1999, 36(1-2): 105-139.
Methods for voting classification algorithms, such as Bagging and AdaBoost, have been shown to be very successful in improving the accuracy of certain classifiers for artificial and real-world datasets. We review these algorithms and describe a large empirical study comparing several variants in conjunction with a decision tree inducer (three variants) and a Naive-Bayes inducer. The purpose of the study is to improve our understanding of why and when these algorithms, which use perturbation, reweighting, and combination techniques, affect classification error. We provide a bias and variance decomposition of the error to show how different methods and variants influence these two terms. This allowed us to determine that Bagging reduced variance of unstable methods, while boosting methods (AdaBoost and Arc-x4) reduced both the bias and variance of unstable methods but increased the variance for Naive-Bayes, which was very stable. We observed that Arc-x4 behaves differently than AdaBoost if reweighting is used instead of resampling, indicating a fundamental difference. Voting variants, some of which are introduced in this paper, include: pruning versus no pruning, use of probabilistic estimates, weight perturbations (Wagging), and backfitting of data. We found that Bagging improves when probabilistic estimates in conjunction with no-pruning are used, as well as when the data was backfit. We measure tree sizes and show an interesting positive correlation between the increase in the average tree size in AdaBoost trials and its success in reducing the error. We compare the mean-squared error of voting methods to non-voting methods and show that the voting methods lead to large and significant reductions in the mean-squared errors. Practical problems that arise in implementing boosting algorithms are explored, including numerical instabilities and underflows. We use scatterplots that graphically show how AdaBoost reweights instances, emphasizing not only hard areas but also outliers and noise.
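For reference, a minimal sketch of the Bagging side of this comparison: bootstrap replicates of the training set vote with equal weight (the ensemble size below is an arbitrary choice of this sketch).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees(X, y, n_estimators=25, seed=0):
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(y), size=len(y))  # bootstrap: sample with replacement
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def vote(trees, X):
    preds = np.array([t.predict(X) for t in trees])
    winners = []
    for column in preds.T:                 # one column per test example
        values, counts = np.unique(column, return_counts=True)
        winners.append(values[np.argmax(counts)])  # unweighted majority vote
    return np.array(winners)
```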

8.
9.
Parallel Formulations of Decision-Tree Classification Algorithms
Classification decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and fraud detection. Highly parallel algorithms for constructing classification decision trees are desirable for dealing with large data sets in a reasonable amount of time. Algorithms for building classification decision trees have a natural concurrency, but are difficult to parallelize due to the inherently dynamic nature of the computation. In this paper, we present parallel formulations of an induction-based classification decision tree learning algorithm. We describe two basic parallel formulations, one based on the Synchronous Tree Construction Approach and the other on the Partitioned Tree Construction Approach. We discuss the advantages and disadvantages of these methods and propose a hybrid method that employs the good features of both. We also analyze the computation and communication costs of the proposed hybrid method. Moreover, experimental results on an IBM SP-2 demonstrate excellent speedups and scalability.
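As a single-machine illustration of the concurrency in synchronous tree construction (processors cooperate on each node by scoring candidate attributes in parallel), here is a hedged sketch; it shows the pattern only, not the paper's SP-2 message-passing implementation.

```python
from concurrent.futures import ProcessPoolExecutor
from collections import Counter
import math

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(args):
    rows, labels, attr = args            # rows: list of dicts keyed by attribute
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return attr, entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    # Score all candidate attributes of the current node concurrently, then
    # combine the results, mirroring the synchronous construction approach.
    # (Run under an `if __name__ == "__main__":` guard for process pools.)
    with ProcessPoolExecutor() as pool:
        scores = pool.map(gain, [(rows, labels, a) for a in attrs])
    return max(scores, key=lambda s: s[1])[0]
```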

10.
In Valiant's distribution-independent model of concept learning, Angluin and Laird introduced a formal model of a noise process, called the classification noise process, to study how to compensate for randomly introduced errors, or noise, in classifying the example data. In this article, we investigate the problem of designing efficient learning algorithms in the presence of classification noise. First, we develop a technique for building efficient robust learning algorithms, called noise-tolerant Occam algorithms, and show that they can be used to construct a polynomial-time algorithm for learning a class of Boolean functions in the presence of classification noise. Next, as an instance of such problems, we focus on learning Boolean functions represented by decision trees. We present a noise-tolerant Occam algorithm for k-DL (the class of decision lists with conjunctive clauses of size at most k at each decision, introduced by Rivest) and hence conclude that k-DL is polynomially learnable in the presence of classification noise. Further, we extend the noise-tolerant Occam algorithm for k-DL to one for r-DT (the class of decision trees of rank at most r, introduced by Ehrenfeucht and Haussler) and conclude that r-DT is polynomially learnable in the presence of classification noise.

11.
Scaling Up Inductive Logic Programming by Learning from Interpretations
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small by the general standards of the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning-from-interpretations framework for inductive logic programming this assumption is unnecessary, which makes it possible to scale up existing ILP algorithms. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system TILDE (Top-down Induction of Logical DEcision trees): TILDE-classic, which loads all data in main memory, and TILDE-LDS, which loads the examples one by one. We compare the implementations experimentally, showing that TILDE-LDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales linearly in the number of examples.

12.
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech tagging, compared to alternative lazy learning and decision tree approaches (viz. IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in the information gain of features are too small, IGTree, like information-gain-weighted IB1, performs worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
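A hedged sketch of the IGTree idea as described: features are examined in decreasing order of information gain, instances are compressed into a trie, and each node stores the majority class of the instances it covers as a default for unseen paths. Everything beyond that outline is a simplification of this sketch.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(instances, labels, feature):
    groups = {}
    for inst, label in zip(instances, labels):
        groups.setdefault(inst[feature], []).append(label)
    return entropy(labels) - sum(len(g) / len(labels) * entropy(g)
                                 for g in groups.values())

def build(instances, labels, order):
    node = {"default": Counter(labels).most_common(1)[0][0], "children": {}}
    if not order or len(set(labels)) == 1:
        return node                       # homogeneous: compression stops here
    feature, rest = order[0], order[1:]
    node["feature"] = feature
    groups = {}
    for inst, label in zip(instances, labels):
        groups.setdefault(inst[feature], ([], []))
        groups[inst[feature]][0].append(inst)
        groups[inst[feature]][1].append(label)
    for value, (insts, labs) in groups.items():
        node["children"][value] = build(insts, labs, rest)
    return node

def igtree(instances, labels):
    # Most informative feature first, per IGTree's ordering heuristic.
    order = sorted(instances[0], key=lambda f: gain(instances, labels, f),
                   reverse=True)
    return build(instances, labels, order)

def classify(node, instance):
    while "feature" in node and instance.get(node["feature"]) in node["children"]:
        node = node["children"][instance[node["feature"]]]
    return node["default"]                # fall back on the stored majority class
```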

13.
Decision trees that are based on information-theory are useful paradigms for learning from examples. However, in some real-world applications, known information-theoretic methods frequently generate non-monotonic decision trees, in which objects with better attribute values are sometimes classified to lower classes than objects with inferior values. This property is undesirable for problem solving in many application domains, such as credit scoring and insurance premium determination, where monotonicity of subsequent classifications is important. An attribute-selection metric is proposed here that takes both the error as well as monotonicity into account while building decision trees. The metric is empirically shown capable of significantly reducing the degree of non-monotonicity of decision trees without sacrificing their inductive accuracy.
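One simple way to combine error and monotonicity in a splitting score, as a hedged sketch only (the paper's exact metric is not reproduced here): penalize cross-branch label pairs that violate the expected ordering.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def nonmonotonicity(left_labels, right_labels):
    """Fraction of cross-branch pairs violating monotonicity. Assumes classes
    are ordered integers and the right branch holds the higher attribute
    values, so its labels should never be smaller; both branches non-empty."""
    violations = sum(1 for a in left_labels for b in right_labels if b < a)
    return violations / (len(left_labels) * len(right_labels))

def split_score(left_labels, right_labels, lam=0.5):
    n = len(left_labels) + len(right_labels)
    impurity = (len(left_labels) / n * entropy(left_labels)
                + len(right_labels) / n * entropy(right_labels))
    # Lower is better: impurity plus a monotonicity penalty (lam is assumed).
    return impurity + lam * nonmonotonicity(left_labels, right_labels)
```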

14.
As realistic combat training deepens in the armed forces, traditional methods of analyzing training scores can no longer meet the needs of scientifically organized training. This paper introduces a shooting-training example and applies the ID3 inductive learning algorithm from machine learning in artificial intelligence to classify and analyze shooting performance, identifying internal factors that affect military training scores along with some other conclusions.
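A minimal sketch of the ID3 attribute selection this kind of analysis relies on; the attributes and records below are hypothetical, not from the paper's data.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(records, labels, attr):
    by_value = {}
    for rec, lab in zip(records, labels):
        by_value.setdefault(rec[attr], []).append(lab)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in by_value.values())
    return entropy(labels) - remainder

# Hypothetical shooting-training records: conditions -> pass/fail
records = [
    {"weather": "clear", "practice": "high", "posture": "good"},
    {"weather": "windy", "practice": "low",  "posture": "poor"},
    {"weather": "clear", "practice": "low",  "posture": "good"},
    {"weather": "windy", "practice": "high", "posture": "good"},
]
labels = ["pass", "fail", "fail", "pass"]

# ID3 picks the attribute with the largest information gain as the root test.
print(max(records[0], key=lambda a: information_gain(records, labels, a)))
# prints 'practice' for this toy data: it perfectly separates pass from fail
```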

15.
Liu, W.Z.; White, A.P. Machine Learning, 1994, 15(1): 25-41.
Recent work by Mingers and by Buntine and Niblett on the performance of various attribute selection measures has addressed the topic of random selection of attributes in the construction of decision trees. This article is concerned with the mechanisms underlying the relative performance of conventional and random attribute selection measures. The three experiments reported here employed synthetic data sets, constructed so as to have the precise properties required to test specific hypotheses. The principal underlying idea was that the performance decrement typical of random attribute selection is due to two factors. First, there is a greater chance that informative attributes will be omitted from the subset selected for the final tree. Second, there is a greater risk of overfitting, which is caused by attributes of little or no value in discriminating between classes being locked into the tree structure near the root. The first experiment showed that the performance decrement increased with the number of available pure-noise attributes. The second experiment indicated that there was little decrement when all the attributes were of equal importance in discriminating between classes. The third experiment showed that a rather greater performance decrement (than in the second experiment) could be expected if the attributes were all informative, but to different degrees.

16.
As realistic combat training deepens in the armed forces, traditional methods of analyzing training scores can no longer meet the needs of scientifically organized training. This paper introduces a shooting-training example and applies the ID3 inductive learning algorithm from machine learning in artificial intelligence to classify and analyze shooting performance, identifying internal factors that affect military training scores along with some other conclusions.

17.
This research proposes a new model for constructing decision trees using interval-valued fuzzy membership values. Most existing fuzzy decision trees do not consider the uncertainty associated with their membership values; however, precise membership values are not always obtainable. In this paper, we represent fuzzy membership values as intervals to model uncertainty and employ the look-ahead based fuzzy decision tree induction method to construct decision trees. We also investigate the significance of different neighbourhood values and define a new parameter, insensitive to specific data sets, using fuzzy sets. Examples are provided to demonstrate the effectiveness of the approach.

18.
A Learning Algorithm for Decision Trees with Interval-Valued Attributes
王熙照; 洪家荣. 《软件学报》 (Journal of Software), 1998, 9(8): 637-640.
This paper proposes a learning algorithm for decision trees with interval-valued attributes. The domain of an interval-valued attribute is neither the unordered set of the discrete case nor the totally ordered set of the continuous case, but a partially ordered set. As a generalization of the ID3 algorithm to the interval-valued setting, the algorithm selects the expansion attribute by minimizing a partition information entropy. By analyzing non-stationary points, the number of partition-entropy computations is reduced, which improves the algorithm's efficiency.
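For reference, the generic weighted partition entropy that such an attribute-selection criterion minimizes can be written as below; the interval-valued refinement of the original algorithm is not reproduced here.

```latex
% Entropy of a partition of sample set S into cells S_1, ..., S_m induced
% by a candidate test on an attribute (generic form only; the paper's
% interval-valued details extend this).
\[
  E(S; S_1, \dots, S_m) = \sum_{j=1}^{m} \frac{|S_j|}{|S|}
  \left( - \sum_{k} p_{jk} \log_2 p_{jk} \right),
  \qquad
  p_{jk} = \frac{|\{\, x \in S_j : \operatorname{class}(x) = k \,\}|}{|S_j|}
\]
```

The attribute whose best partition minimizes E is chosen for expansion.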

19.
Extensive research has been performed on developing knowledge-based intelligent monitoring systems to improve the reliability of manufacturing processes. Because obtaining knowledge from human experts is expensive, new techniques are needed to obtain the knowledge automatically from collected data using data mining techniques. Inductive learning has become one of the widely used data mining methods for generating decision rules from data. To deal with the noise and uncertainties in data collected from industrial processes and systems, this paper presents a new method that uses fuzzy logic techniques to improve the performance of the classical inductive learning approach. In contrast to the classical inductive learning method, which discretizes continuous-valued attributes with hard cut points, the proposed approach uses soft discretization to make the resulting systems less sensitive to uncertainty and noise. The effectiveness of the proposed approach is illustrated in an application to monitoring machining conditions in an uncertain environment. Experimental results show that the new fuzzy inductive learning method gives improved accuracy compared with classical inductive learning techniques.
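A minimal sketch of soft discretization as described: rather than a hard cut at threshold t, each example belongs to both sides of the cut to a fuzzy degree, and the split entropy is computed over those fractional counts. The sigmoid membership function and its width are assumptions of this sketch.

```python
import math

def memberships(value, t, width=1.0):
    """Degree to which `value` falls above the cut t, and below it (width > 0)."""
    z = max(-60.0, min(60.0, (value - t) / width))  # clamp to avoid overflow
    above = 1.0 / (1.0 + math.exp(-z))
    return 1.0 - above, above

def fuzzy_entropy(weighted_counts):
    total = sum(weighted_counts.values())
    return -sum(w / total * math.log2(w / total)
                for w in weighted_counts.values() if w > 0)

def soft_split_entropy(values, labels, t, width=1.0):
    low, high = {}, {}
    for v, label in zip(values, labels):
        m_low, m_high = memberships(v, t, width)
        low[label] = low.get(label, 0.0) + m_low    # fractional, not 0/1, counts
        high[label] = high.get(label, 0.0) + m_high
    n_low, n_high = sum(low.values()), sum(high.values())
    return (n_low / (n_low + n_high)) * fuzzy_entropy(low) \
         + (n_high / (n_low + n_high)) * fuzzy_entropy(high)
```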

20.
Despite the fact that artificial neural networks (ANNs) are universal function approximators, their black-box nature (that is, their lack of direct interpretability or expressive power) limits their utility. In contrast, univariate decision trees (UDTs) have expressive power, although they are usually not as accurate as ANNs. We propose an improvement, C-Net, for both the expressiveness of ANNs and the accuracy of UDTs by consolidating both technologies to generate multivariate decision trees (MDTs). In addition, we introduce a new concept, recurrent decision trees, in which C-Net uses recurrent neural networks to generate an MDT with a recurrent feature: a memory is associated with each node in the tree, with a recursive condition replacing the conventional linear one. Furthermore, we show empirically that, in our test cases, the proposed method achieves a balance of comprehensibility and accuracy intermediate between ANNs and UDTs: MDTs are more expressive than ANNs and more accurate than UDTs. Moreover, in all cases MDTs are more compact (i.e., smaller in tree size) than UDTs.
