期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Multivariate Decision Trees 总被引：24，自引：0，他引：24

Brodley Carla E. Utgoff Paul E. 《Machine Learning》1995,19(1):45-77

Unlike a univariate decision tree, a multivariate decision tree is not restricted to splits of the instance space that are orthogonal to the features' axes. This article addresses several issues for constructing multivariate decision trees: representing a multivariate test, including symbolic and numeric features, learning the coefficients of a multivariate test, selecting the features to include in a test, and pruning of multivariate decision trees. We present several new methods for forming multivariate decision trees and compare them with several well-known methods. We compare the different methods across a variety of learning tasks, in order to assess each method's ability to find concise, accurate decision trees. The results demonstrate that some multivariate methods are in general more effective than others (in the context of our experimental assumptions). In addition, the experiments confirm that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree. 相似文献

2.

Coding Decision Trees 总被引：4，自引：0，他引：4

Wallace C.S. Patrick J.D. 《Machine Learning》1993,11(1):7-22

相似文献

3.

An Efficient Extension to Mixture Techniques for Prediction and Decision Trees

Pereira Fernando C. Singer Yoram 《Machine Learning》1999,36(3):183-199

We present an efficient method for maintaining mixtures of prunings of a prediction or decision tree that extends the previous methods for node-based prunings (Buntine, 1990; Willems, Shtarkov, & Tjalkens, 1995; Helmbold & Schapire, 1997; Singer, 1997) to the larger class of edge-based prunings. The method includes an online weight-allocation algorithm that can be used for prediction, compression and classification. Although the set of edge-based prunings of a given tree is much larger than that of node-based prunings, our algorithm has similar space and time complexity to that of previous mixture algorithms for trees. Using the general online framework of Freund and Schapire (1997), we prove that our algorithm maintains correctly the mixture weights for edge-based prunings with any bounded loss function. We also give a similar algorithm for the logarithmic loss function with a corresponding weight-allocation algorithm. Finally, we describe experiments comparing node-based and edge-based mixture models for estimating the probability of the next word in English text, which show the advantages of edge-based models. 相似文献

4.

Implementation and Evaluation of Decision Trees with Range and Region Splitting

Yasuhiko Morimoto Takeshi Fukuda Shinichi Morishita Takeshi Tokuyama 《Constraints》1997,2(3-4):401-427

We propose an extension of an entropy-based heuristic for constructing a decision tree from a large database with many numeric attributes. When it comes to handling numeric attributes, conventional methods are inefficient if any numeric attributes are strongly correlated. Our approach offers one solution to this problem. For each pair of numeric attributes with strong correlation, we compute a two-dimensional association rule with respect to these attributes and the objective attribute of the decision tree. In particular, we consider a family R of grid-regions in the plane associated with the pairof attributes. For R R, the data canbe split into two classes: data inside R and dataoutside R. We compute the region R_opt R that minimizes the entropy of the splitting,and add the splitting associated with R_opt (foreach pair of strongly correlated attributes) to the set of candidatetests in an entropy-based heuristic. We give efficient algorithmsfor cases in which R is (1) x-monotone connected regions, (2) based-monotone regions, (3) rectangles, and (4) rectilinear convex regions. The algorithm has been implemented as a subsystem of SONAR (System for Optimized Numeric Association Rules) developed by the authors. We have confirmed that we can compute the optimal region efficiently. And diverse experiments show that our approach can create compact trees whose accuracy is comparable with or better than that of conventional trees. More importantly, we can grasp non-linear correlation among numeric attributes which could not be found without our region splitting. 相似文献

5.

Enlarging the Margins in Perceptron Decision Trees 总被引：4，自引：0，他引：4

Kristin P. Bennett Nello Cristianini John Shawe-Taylor Donghui Wu 《Machine Learning》2000,41(3):295-313

Capacity control in perceptron decision trees is typically performed by controlling their size. We prove that other quantities can be as relevant to reduce their flexibility and combat overfitting. In particular, we provide an upper bound on the generalization error which depends both on the size of the tree and on the margin of the decision nodes. So enlarging the margin in perceptron decision trees will reduce the upper bound on generalization error. Based on this analysis, we introduce three new algorithms, which can induce large margin perceptron decision trees. To assess the effect of the large margin bias, OC1 (Journal of Artificial Intelligence Research, 1994, 2, 1–32.) of Murthy, Kasif and Salzberg, a well-known system for inducing perceptron decision trees, is used as the baseline algorithm. An extensive experimental study on real world data showed that all three new algorithms perform better or at least not significantly worse than OC1 on almost every dataset with only one exception. OC1 performed worse than the best margin-based method on every dataset. 相似文献

6.

On Learning Decision Trees with Large Output Domains

N. H. Bshouty C. Tamon D. K. Wilson 《Algorithmica》1998,20(1):77-100

For two disjoint sets of variables, X and Y , and a class of functions C , we define DT(X,Y,C) to be the class of all decision trees over X whose leaves are functions from C over Y . We study the learnability of DT(X,Y,C) using membership and equivalence queries. Boolean decision trees, , were shown to be exactly learnable by Bshouty but does this imply the learnability of decision trees that have nonboolean leaves? A simple encoding of all possible leaf values will work provided that the size of C is reasonable. Our investigation involves several cases where simple encoding is not feasible, i.e., when |C| is large. We show how to learn decision trees whose leaves are learnable concepts belonging to a class C , DT(X,Y,C) , when the separation between the variables X and Y is known. A simple algorithm for decision trees whose leaves are constants, , is also presented. Each case above requires at least s separate executions of the algorithm due to Bshouty where s is the number of distinct leaves of the tree but we show that if C is a bounded lattice, is learnable using only one execution of this algorithm. Received September 23, 1995; revised January 15, 1996. 相似文献

7.

Combining Classifiers with Meta Decision Trees 总被引：4，自引：0，他引：4

Todorovski Ljupčo Džeroski Sašo 《Machine Learning》2003,50(3):223-249

The paper introduces meta decision trees (MDTs), a novel method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining classifiers generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, stacking with MDTs combines classifiers better than voting and stacking with ODTs. In addition, the MDTs are much more concise than the ODTs and are thus a step towards comprehensible combination of multiple classifiers. MDTs also perform better than several other approaches to stacking. 相似文献

8.

Noise-Tolerant Occam Algorithms and Their Applications to Learning Decision Trees

Sakakibara Yasubumi 《Machine Learning》1993,11(1):37-62

In the distribution-independent model of concept learning of Valiant, Angluin and Laird have introduced a formal model of noise process, called classification noise process, to study how to compensate for randomly introduced errors, or noise, in classifying the example data. In this article, we investigate the problem of designing efficient learning algorithms in the presence of classification noise. First, we develop a technique of building efficient robust learning algorithms, called noise-tolerant Occam algorithms, and show that using them, one can construct a polynomial-time algorithm for learning a class of Boolean functions in the presence of classification noise. Next, as an instance of such problems of learning in the presence of classification noise, we focus on the learning problem of Boolean functions represented by decision trees. We present a noise-tolerant Occam algorithm for k-DL (the class of decision lists with conjunctive clauses of size at most k at each decision introduced by Rivest) and hence conclude that k-DL is polynomially learnable in the presence of classification noise. Further, we extend the noise-tolerant Occam algorithm for k-DL to one for r-DT (the class of decision trees of rank at most r introduced by Ehrenfeucht and Haussler) and conclude that r-DT is polynomially learnable in the presence of classification noise. 相似文献

9.

Incremental Learning of Linear Model Trees

Duncan Potts Claude Sammut 《Machine Learning》2005,61(1-3):5-48

A linear model tree is a decision tree with a linear functional model in each leaf. Previous model tree induction algorithms have been batch techniques that operate on the entire training set. However there are many situations when an incremental learner is advantageous. In this article a new batch model tree learner is described with two alternative splitting rules and a stopping rule. An incremental algorithm is then developed that has many similarities with the batch version but is able to process examples one at a time. An online pruning rule is also developed. The incremental training time for an example is shown to only depend on the height of the tree induced so far, and not on the number of previous examples. The algorithms are evaluated empirically on a number of standard datasets, a simple test function and three dynamic domains ranging from a simple pendulum to a complex 13 dimensional flight simulator. The new batch algorithm is compared with the most recent batch model tree algorithms and is seen to perform favourably overall. The new incremental model tree learner compares well with an alternative online function approximator. In addition it can sometimes perform almost as well as the batch model tree algorithms, highlighting the effectiveness of the incremental implementation. Editor: Johannes Fürnkranz 相似文献

10.

ART: A Hybrid Classification Model 总被引：1，自引：0，他引：1

Berzal Fernando Cubero Juan-Carlos Sánchez Daniel Serrano José María 《Machine Learning》2004,54(1):67-92

This paper presents a new family of decision list induction algorithms based on ideas from the association rule mining context. ART, which stands for Association Rule Tree, builds decision lists that can be viewed as degenerate, polythetic decision trees. Our method is a generalized Separate and Conquer algorithm suitable for Data Mining applications because it makes use of efficient and scalable association rule mining techniques. 相似文献

11.

Using Model Trees for Classification 总被引：1，自引：0，他引：1

Frank Eibe Wang Yong Inglis Stuart Holmes Geoffrey Witten Ian H. 《Machine Learning》1998,32(1):63-76

Model trees, which are a type of decision tree with linear regression functions at the leaves, form the basis of a recent successful technique for predicting continuous numeric values. They can be applied to classification problems by employing a standard method of transforming a classification problem into a problem of function approximation. Surprisingly, using this simple transformation the model tree inducer M5, based on Quinlan's M5, generates more accurate classifiers than the state-of-the-art decision tree learner C5.0, particularly when most of the attributes are numeric. 相似文献

12.

C-Net: A Method for Generating Non-deterministic and Dynamic Multivariate Decision Trees

H. A. Abbass M. Towsey G. Finn 《Knowledge and Information Systems》2001,3(2):184-197

Despite the fact that artificial neural networks (ANNs) are universal function approximators, their black box nature (that is, their lack of direct interpretability or expressive power) limits their utility. In contrast, univariate decision trees (UDTs) have expressive power, although usually they are not as accurate as ANNs. We propose an improvement, C-Net, for both the expressiveness of ANNs and the accuracy of UDTs by consolidating both technologies for generating multivariate decision trees (MDTs). In addition, we introduce a new concept, recurrent decision trees, where C-Net uses recurrent neural networks to generate an MDT with a recurrent feature. That is, a memory is associated with each node in the tree with a recursive condition which replaces the conventional linear one. Furthermore, we show empirically that, in our test cases, our proposed method achieves a balance of comprehensibility and accuracy intermediate between ANNs and UDTs. MDTs are found to be intermediate since they are more expressive than ANNs and more accurate than UDTs. Moreover, in all cases MDTs are more compact (i.e., smaller tree size) than UDTs. Received 27 January 2000 / Revised 30 May 2000 / Accepted in revised form 30 October 2000 相似文献

13.

多层梯度提升树在药品鉴别中的应用

杜师帅邱天李灵巧胡锦泉郑安兵冯艳春胡昌勤杨辉华《计算机科学与探索》2020,14(2):260-273

近红外光谱分析技术高效应用于药品分析领域。针对高维非线性的小规模近红外数据,传统的药品鉴别算法存在特征学习能力不足的缺陷,基于神经网络的方法有局部最优及过拟合等问题,且两者易忽略样本的不均衡性。针对以上劣势,提出一种基于特征选择与代价敏感学习的多层梯度提升树(CS_FGBDT)药品分类方法。首先采用Savitsky-Golay平滑和一阶导数对原始数据进行预处理;其次利用随机森林对预处理光谱自适应提取特征,并由多层梯度提升树进行特征映射;然后结合代价敏感学习机制将样本不均衡性的负效应降到最小。实验结果表明,在胶囊和药片两种不平衡数据集上对算法进行对比评估,该模型具有更高的预测精度和稳定性,是一种有效的药品鉴别方法。相似文献

14.

增量决策树算法研究 总被引：2，自引：1，他引：2

尹阿东郭秀颖宫雨龙誉高学东《微机发展》2005,15(2):63-66

文中主要解决传统的ID3算法不能处理增量数据集构造决策树的问题。在传统ID3决策树算法和原有增量算法的基础上,利用信息论中熵变原理的特点,对与增量决策树算法相关的三个定理进行相应的改进,在理论上证明了改进的增量决策树算法的有效性和可靠性。同时对增量决策树算法和ID3算法的复杂度进行了对比分析,得出增量决策树算法的实例费用和信息熵费用都高于ID3算法的结论。最后通过一个实验证明,改进的增量决策树算法能够构造出与ID3算法形态基本相同的决策树。相似文献

15.

IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms 总被引：4，自引：0，他引：4

Walter Daelemans Antal Van Den Bosch Ton Weijters 《Artificial Intelligence Review》1997,11(1-5):407-423

We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech-tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features is too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values. 相似文献

16.

Trading Accuracy for Simplicity in Decision Trees 总被引：4，自引：0，他引：4

Bohanec Marko Bratko Ivan 《Machine Learning》1994,15(3):223-250

When communicating concepts, it is often convenient or even necessary to define a concept approximately. A simple, although only approximately accurate concept definition may be more useful than a completely accurate definition which involves a lot of detail. This paper addresses the problem: given a completely accurate, but complex, definition of a concept, simplify the definition, possibly at the expense of accuracy, so that the simplified definition still corresponds to the concept sufficiently well. Concepts are represented by decision trees, and the method of simplification is tree pruning. Given a decision tree that accurately specifies a concept, the problem is to find a smallest pruned tree that still represents the concept within some specified accuracy. A pruning algorithm is presented that finds an optimal solution by generating adense sequence of pruned trees, decreasing in size, such that each tree has the highest accuracy among all the possible pruned trees of the same size. An efficient implementation of the algorithm, based on dynamic programming, is presented and empirically compared with three progressive pruning algorithms using both artificial and real-world data. An interesting empirical finding is that the real-world data generally allow significantly greater simplification at equal loss of accuracy. 相似文献

17.

动态决策树算法研究 总被引：1，自引：0，他引：1

尹阿东谢霖铨龙誉杨立东《计算机工程与应用》2004,40(33):103-105,132

该文在增量决策树算法的基础上,提出一种能够处理变化数据集的减量决策树算法,提出并证明了减量决策树算法中的三个基本定理,保证了减量决策树算法的可靠性。同时将传统的增量决策树算法与该文所提出的减量决策树算法相结合,构造出一种动态决策树算法,该算法很好地解决了发生增减变化的动态数据集构造决策树的问题,另外动态决策树算法的提出也促进了在线规则提取的发展与完善。相似文献

18.

Active Sampling for Class Probability Estimation and Ranking 总被引：1，自引：0，他引：1

Maytal Saar-Tsechansky Foster Provost 《Machine Learning》2004,54(2):153-178

In many cost-sensitive environments class probability estimates are used by decision makers to evaluate the expected utility from a set of alternatives. Supervised learning can be used to build class probability estimates; however, it often is very costly to obtain training data with class labels. Active learning acquires data incrementally, at each phase identifying especially useful additional data for labeling, and can be used to economize on examples needed for learning. We outline the critical features of an active learner and present a sampling-based active learning method for estimating class probabilities and class-based rankings. BOOTSTRAP-LV identifies particularly informative new data for learning based on the variance in probability estimates, and uses weighted sampling to account for a potential example's informative value for the rest of the input space. We show empirically that the method reduces the number of data items that must be obtained and labeled, across a wide variety of domains. We investigate the contribution of the components of the algorithm and show that each provides valuable information to help identify informative examples. We also compare BOOTSTRAP-LV with UNCERTAINTY SAMPLING, an existing active learning method designed to maximize classification accuracy. The results show that BOOTSTRAP-LV uses fewer examples to exhibit a certain estimation accuracy and provide insights to the behavior of the algorithms. Finally, we experiment with another new active sampling algorithm drawing from both UNCERTAINTY SAMPLING and BOOTSTRAP-LV and show that it is significantly more competitive with BOOTSTRAP-LV compared to UNCERTAINTY SAMPLING. The analysis suggests more general implications for improving existing active sampling algorithms for classification. 相似文献

19.

Induction of Decision Trees 总被引：390，自引：5，他引：390

Quinlan J.R. 《Machine Learning》1986,1(1):81-106

The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of overcoming it are compared. The paper concludes with illustrations of current research directions. 相似文献

20.

一种新的决策树分裂属性选择方法

刘星毅《微机发展》2008,18(5):70-72

分类问题是数据挖掘和机器学习中的一个核心问题。为了得到最大程度的分类准确率,决策树分类过程中,非常关键的是结点分裂属性的选择。常见的分裂结点属性选择方法可以分为信息熵方法、GINI系数方法等。分析了目前常见的选择分裂属性方法——基于信息熵方法的优、缺点,提出了基于卡方检验的决策树分裂属性的选择方法,用真实例子和设置模拟实验说明了文中算法的优越性。实验结果显示文中算法在分类错误率方面好于以信息熵为基础的方法。相似文献