Similar Literature (20 results)
1.
Extracting decision trees from trained neural networks (total citations: 4; self: 0, by others: 4)
In this paper we present a methodology for extracting decision trees from input data generated from trained neural networks instead of doing it directly from the data. A genetic algorithm is used to query the trained network and extract prototypes. A prototype selection mechanism is then used to select a subset of the prototypes. Finally, a standard induction method like ID3 or C5.0 is used to extract the decision tree. The extracted decision trees can be used to understand the working of the neural network besides performing classification. This method is able to extract different decision trees of high accuracy and comprehensibility from the trained neural network.
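
The pipeline this abstract describes (query the trained network for labelled prototypes, then induce a tree from them) can be sketched as follows; scikit-learn is assumed, and uniform random sampling stands in for the paper's genetic-algorithm querying and prototype selection:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Train the "black box" network on the original data.
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                    random_state=0).fit(X, y)

# Query the trained network on synthetic prototypes. The paper uses a
# genetic algorithm for querying; uniform sampling inside the attribute
# ranges is a simplified stand-in.
rng = np.random.default_rng(0)
prototypes = rng.uniform(X.min(axis=0), X.max(axis=0),
                         size=(5000, X.shape[1]))
proto_labels = net.predict(prototypes)

# Induce a standard decision tree (ID3/C5.0 in the paper; CART here)
# from the network-labelled prototypes.
tree = DecisionTreeClassifier(max_depth=4).fit(prototypes, proto_labels)
print("fidelity to network:", tree.score(X, net.predict(X)))
```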

2.
Constructing classification decision trees based on neural networks (total citations: 5; self: 2, by others: 3)
Symbol-based methods are currently the main approach to classification rule extraction, while connectionist methods based on neural networks are used far less, mainly because neural networks, despite their high classification accuracy, make it hard to extract the classification rules and knowledge they implicitly encode. To address this problem, and building on the specific characteristics of neural networks, this paper proposes a new method for constructing classification decision trees from a neural network. The method first trains a network to model the relationship between each attribute and the classification result, and then builds the classification decision tree from the derivative relationships extracted between the attributes and the classification output. A concrete tree-construction algorithm is given. To improve the extraction of the relationships implicit in the network, the concept of relationship-strengthening constraints is introduced and a concrete model is built. Results from a practical application demonstrate the effectiveness of the algorithm.
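
The derivative relationship between attributes and the classification output can be approximated numerically; the sketch below is a loose stand-in for the paper's method (which also adds relationship-strengthening constraints), using finite differences against any trained probabilistic classifier:

```python
import numpy as np

def attribute_derivatives(predict_proba, X, eps=1e-3):
    """Finite-difference estimate of d P(class) / d attribute, averaged
    over the data: a rough numerical stand-in for the paper's
    derivative-based attribute/class relationships."""
    base = predict_proba(X)
    grads = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] += eps
        grads[j] = np.abs((predict_proba(Xp) - base) / eps).mean()
    return grads  # larger value => attribute influences the output more
```

Attributes with larger average derivative magnitude have a stronger influence on the classification result and are natural candidates for split attributes when the tree is built.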

3.
Artificial neural networks (ANNs) are a powerful and widely used pattern recognition technique. However, they remain "black boxes" giving no explanation for the decisions they make. This paper presents a new algorithm for extracting a logistic model tree (LMT) from a neural network, which gives a symbolic representation of the knowledge hidden within the ANN. Landwehr's LMTs are based on standard decision trees, but the terminal nodes are replaced with logistic regression functions. This paper reports the results of an empirical evaluation that compares the new decision tree extraction algorithm with Quinlan's C4.5 and ExTree. The evaluation used 12 standard benchmark datasets from the University of California, Irvine machine-learning repository. The results of this evaluation demonstrate that the new algorithm produces decision trees that have higher accuracy and higher fidelity than decision trees created by both C4.5 and ExTree.

4.
FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks (total citations: 4; self: 0, by others: 4)
Before symbolic rules are extracted from a trained neural network, the network is usually pruned so as to obtain more concise rules. Typical pruning algorithms require retraining the network, which incurs additional cost. This paper presents FERNN, a fast method for extracting rules from trained neural networks without network retraining. Given a fully connected trained feedforward network with a single hidden layer, FERNN first identifies the relevant hidden units by computing their information gains. For each relevant hidden unit, its activation values are divided into two subintervals such that the information gain is maximized. FERNN finds the set of relevant network connections from the input units to this hidden unit by checking the magnitudes of their weights. The connections with large weights are identified as relevant. Finally, FERNN generates rules that distinguish the two subintervals of the hidden activation values in terms of the network inputs. Experimental results show that the size and the predictive accuracy of the tree generated are comparable to those extracted by another method which prunes and retrains the network.
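
The per-hidden-unit relevance test can be illustrated as follows; this is a minimal sketch assuming NumPy, which searches a unit's activation values for the two-subinterval split with maximal information gain as the abstract describes, not the original FERNN code:

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_activation_split(activations, labels):
    """FERNN-style relevance test for one hidden unit: find the threshold
    on its activation values that maximizes information gain."""
    order = np.argsort(activations)
    a, y = activations[order], labels[order]
    base = entropy(y)
    best_gain, best_t = 0.0, None
    for i in range(1, len(a)):
        if a[i] == a[i - 1]:
            continue
        left, right = y[:i], y[i:]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(y)
        if gain > best_gain:
            best_gain, best_t = gain, (a[i] + a[i - 1]) / 2
    return best_gain, best_t  # high gain => the hidden unit is relevant
```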

5.
A decision tree algorithm hybridized with neural networks (total citations: 7; self: 0, by others: 7)
Neural networks usually achieve higher accuracy than decision trees and regression algorithms because they can fit more complex models. Since a decision tree typically branches on a single variable at a time, its decision regions are hyperrectangles, which makes it simpler than a neural network but also less expressive. Neural networks, however, need comparatively long training times, and their models are less interpretable than decision trees or Naive Bayes. After comparing how the two algorithms handle complex models, this paper proposes a new algorithm, NNTree, a hybrid of decision trees and neural networks: internal nodes branch on single variables as in an ordinary decision tree, while leaf nodes contain neural network classifiers. The method retains the interpretability of decision trees and their efficiency on large data sets, improves the learning behaviour of neural networks, and yields a classifier whose accuracy can substantially exceed either algorithm alone, especially on larger data sets with complex models.
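
A minimal sketch of the hybrid structure, assuming scikit-learn: the router is an ordinary univariate tree, with neural network classifiers at the leaves. The paper's NNTree training procedure may differ in how leaves are decided; this is illustrative only:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

class NNLeafTree:
    """A shallow univariate tree routes each example to a leaf; a small
    neural network in that leaf makes the final prediction."""

    def __init__(self, depth=2):
        self.router = DecisionTreeClassifier(max_depth=depth)
        self.leaves = {}

    def fit(self, X, y):
        self.router.fit(X, y)
        ids = self.router.apply(X)              # leaf id for every example
        for leaf in np.unique(ids):
            mask = ids == leaf
            if len(np.unique(y[mask])) == 1:    # pure leaf: constant label
                self.leaves[leaf] = y[mask][0]
            else:                               # impure leaf: local network
                net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000)
                self.leaves[leaf] = net.fit(X[mask], y[mask])
        return self

    def predict(self, X):
        ids = self.router.apply(X)
        out = np.empty(len(X), dtype=object)
        for leaf in np.unique(ids):
            mask = ids == leaf
            model = self.leaves[leaf]
            out[mask] = (model.predict(X[mask])
                         if hasattr(model, "predict") else model)
        return out
```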

6.
Setiono, R.; Liu, Huan. Computer, 1996, 29(3): 71-77
Neural networks often surpass decision trees in predicting pattern classifications, but their predictions cannot be explained. This algorithm's symbolic representations make each prediction explicit and understandable. Our approach to understanding a neural network uses symbolic rules to represent the network decision process. The algorithm, NeuroRule, extracts these rules from a neural network. The network can be interpreted by the rules which, in general, preserve network accuracy and explain the prediction process. We based NeuroRule on a standard three-layer feedforward network. NeuroRule consists of four phases. First, it builds a weight-decay backpropagation network so that weights reflect the importance of the network's connections. Second, it prunes the network to remove irrelevant connections and units while maintaining the network's predictive accuracy. Third, it discretizes the hidden unit activation values by clustering. Finally, it extracts rules from the network with discretized hidden unit activation values.
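
Phase three, discretizing hidden activations by clustering, might look like this; k-means is one reasonable clustering choice, not necessarily the one NeuroRule itself uses:

```python
import numpy as np
from sklearn.cluster import KMeans

def discretize_activations(hidden_acts, n_clusters=3):
    """Replace each hidden unit's continuous activation values with the
    nearest cluster centre, so the unit takes only a few discrete values
    that extracted rules can refer to."""
    discretized = np.empty_like(hidden_acts)
    for j in range(hidden_acts.shape[1]):
        col = hidden_acts[:, [j]]                  # one unit's activations
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(col)
        discretized[:, j] = km.cluster_centers_[km.labels_, 0]
    return discretized
```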

7.
Research on the incremental IHMCAP algorithm and its application (total citations: 5; self: 0, by others: 5)
The incremental IHMCAP algorithm adopts the FTART neural network, which suits hybrid learning, and thereby resolves the trade-off between the accuracy of symbolic learning and that of neural network learning. The algorithm also has a strong incremental learning capability: when new examples are added to the system, the existing decision tree and neural network need not be rebuilt; a single pass of incremental learning adjusts the original structure and improves learning accuracy, making the algorithm fast and efficient.

8.
Knowledge, 1999, 12(3): 95-99
There exist several methods for transforming decision trees to neural networks. These methods typically construct the networks by directly mapping decision nodes or rules to the neural units. As a result, the networks constructed are often larger than necessary. This article describes a pruning-based method for mapping decision trees to neural networks, which can compress the network by removing unimportant and redundant units and connections. In addition, equivalent decision trees extracted from the pruned networks are simpler than those induced by well-known algorithms such as ID3 and C4.5.
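
The direct (pre-pruning) mapping step can be sketched as follows: each decision-node test x[j] <= t becomes a sigmoid unit with a single non-zero input weight. This is the standard construction the article starts from; its pruning of the resulting network is not shown:

```python
import numpy as np

def tree_to_network_layer(thresholds, feature_idx, n_features,
                          sharpness=10.0):
    """Map decision-node tests to a hidden layer: unit u approximates the
    step function [x[j] <= t] with a steep sigmoid."""
    W = np.zeros((len(thresholds), n_features))
    b = np.zeros(len(thresholds))
    for u, (j, t) in enumerate(zip(feature_idx, thresholds)):
        W[u, j] = -sharpness        # steep sigmoid approximates the step
        b[u] = sharpness * t        # unit fires ~1 when x[j] <= t
    return W, b                     # hidden activation: sigmoid(W @ x + b)
```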

9.
Although artificial neural networks can represent a variety of complex systems with a high degree of accuracy, these connectionist models are difficult to interpret. This significantly limits the applicability of neural networks in practice, especially where a premium is placed on the comprehensibility or reliability of systems. A novel artificial neural-network decision tree algorithm (ANN-DT) is therefore proposed, which extracts binary decision trees from a trained neural network. The ANN-DT algorithm uses the neural network to generate outputs for samples interpolated from the training data set. In contrast to existing techniques, ANN-DT can extract rules from feedforward neural networks with continuous outputs. These rules are extracted from the neural network without making assumptions about the internal structure of the neural network or the features of the data. A novel attribute selection criterion based on a significance analysis of the variables on the neural-network output is examined. It is shown to have significant benefits in certain cases when compared with the standard criteria of minimum weighted variance over the branches. In three case studies the ANN-DT algorithm compared favorably with CART, a standard decision tree algorithm.
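
The sampling step that distinguishes ANN-DT from purely data-driven induction can be sketched like this, assuming NumPy; new query points are interpolated between random pairs of training examples and then labelled by the network:

```python
import numpy as np

def interpolated_samples(X, n_samples, rng=None):
    """Draw new points by interpolating between random pairs of training
    examples, so synthetic queries stay inside the region the network
    was trained on."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(X), size=n_samples)
    j = rng.integers(0, len(X), size=n_samples)
    alpha = rng.random((n_samples, 1))
    return X[i] + alpha * (X[j] - X[i])

# The network (possibly with continuous outputs) then labels these points,
# and a binary tree is grown on the labelled set.
```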

10.
Univariate decision trees are classifiers currently used in many data mining applications. This classifier discovers partitions in the input space via hyperplanes that are orthogonal to the axes of attributes, producing a model that can be understood by human experts. One disadvantage of univariate decision trees is that they produce complex and inaccurate models when decision boundaries are not orthogonal to axes. In this paper we introduce Fisher's Tree, a classifier that takes advantage of the dimensionality reduction of Fisher's linear discriminant and uses the decomposition strategy of decision trees to come up with an oblique decision tree. Our proposal generates an artificial attribute that is used to split the data in a recursive way. The Fisher's decision tree induces oblique trees whose accuracy, size, number of leaves and training time are competitive with respect to other decision trees reported in the literature. We use more than ten publicly available data sets to demonstrate the effectiveness of our method.
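
One node of such an oblique tree might be built as below, assuming scikit-learn; the LDA projection supplies the artificial attribute, while the paper's full recursion, stopping rules and threshold search are omitted:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fisher_split(X, y):
    """One node of a Fisher-style oblique tree: project the data onto
    Fisher's linear discriminant and treat the projection as a single
    artificial attribute to split on."""
    lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
    z = lda.transform(X).ravel()    # the artificial attribute
    threshold = z.mean()            # placeholder; a real node would search
                                    # z for the best cut point
    return lda, threshold, z <= threshold   # boolean mask: left branch
```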

11.
Effective data mining using neural networks (total citations: 4; self: 0, by others: 4)
Classification is one of the data mining problems receiving great attention recently in the database community. The paper presents an approach to discover symbolic classification rules using neural networks. Neural networks have not been thought suited for data mining because the classifications they make are not explicitly stated as symbolic rules suitable for verification or interpretation by humans. With the proposed approach, concise symbolic rules with high accuracy can be extracted from a neural network. The network is first trained to achieve the required accuracy rate. Redundant connections of the network are then removed by a network pruning algorithm. The activation values of the hidden units in the network are analyzed, and classification rules are generated using the results of this analysis. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of standard data mining test problems.

12.
Design and implementation of a hybrid multi-concept acquisition system (total citations: 1; self: 0, by others: 1)
This paper describes HMCAS, an incremental hybrid multi-concept acquisition system. It proposes a learning algorithm combining probability-based symbolic learning with neural network learning, which can induce, from a set of examples belonging to a concept set, concept descriptions represented as hybrid decision trees that meet user-specified accuracy requirements. In HMCAS, symbolic learning and neural network learning are tightly coupled and can be switched flexibly, giving high learning efficiency, strong inductive ability and incremental learning capability. The neural network learning in HMCAS can use either a BP network or an FTART network, and its inference mechanism provides hybrid…

13.
Design of a hybrid multi-concept acquisition algorithm and its noise tolerance (total citations: 1; self: 0, by others: 1)
The IHMCAP (incremental hybrid multi-concepts acquisition procedure) algorithm combines probability-based symbolic learning with neural network learning. By introducing the FTART (field theory-based adaptive resonance theory) neural network, it resolves the trade-off between the accuracy of symbolic learning and that of neural network learning, bringing the two different levels of reasoning closer together. The algorithm uses a distinctive incremental learning mechanism: when new examples are added, a single pass of incremental learning adjusts the original structure, without regenerating the decision tree and neural network, to improve learning accuracy quickly and efficiently. This incremental mechanism also reduces the algorithm's sensitivity to noisy data, so IHMCAP can be applied to real-time online learning tasks.

14.
Lim, Tjen-Sien; Loh, Wei-Yin; Shih, Yu-Shan. Machine Learning, 2000, 40(3): 203-228
Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called POLYCLASS at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is QUEST with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. POLYCLASS, for example, is third from last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The QUEST and logistic regression algorithms are substantially faster. Among decision tree algorithms with univariate splits, C4.5, IND-CART, and QUEST have the best combinations of error rate and speed. But C4.5 tends to produce trees with twice as many leaves as those from IND-CART and QUEST.

15.
NeC4.5: neural ensemble based C4.5 (total citations: 5; self: 0, by others: 5)
Decision trees offer good comprehensibility, while neural network ensembles offer strong generalization ability. These merits are integrated into a novel decision tree algorithm, NeC4.5. This algorithm first trains a neural network ensemble. Then, the trained ensemble is employed to generate a new training set by replacing the desired class labels of the original training examples with the outputs of the trained ensemble. Some extra training examples are also generated from the trained ensemble and added to the new training set. Finally, a C4.5 decision tree is grown from the new training set. Since its learning results are decision trees, the comprehensibility of NeC4.5 is better than that of a neural network ensemble. Moreover, experiments show that the generalization ability of NeC4.5 decision trees can be better than that of C4.5 decision trees.
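
A sketch of the NeC4.5 recipe using scikit-learn stand-ins: bagged MLPs for the neural ensemble, Gaussian jitter for the extra-example generation (the paper's scheme may differ), and CART in place of C4.5:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Step 1: train a neural network ensemble.
ensemble = BaggingClassifier(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000),
    n_estimators=5, random_state=0).fit(X, y)

# Step 2: relabel the original training set with the ensemble's outputs
# and add extra examples generated near the training data.
rng = np.random.default_rng(0)
X_extra = X + rng.normal(0, X.std(axis=0) * 0.05, size=X.shape)
X_new = np.vstack([X, X_extra])
y_new = ensemble.predict(X_new)

# Step 3: grow a single comprehensible tree from the relabelled set.
tree = DecisionTreeClassifier(max_depth=5).fit(X_new, y_new)
```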

16.
Credit scoring is the term used to describe methods utilized for classifying applicants for credit into classes of risk. This paper evaluates two induction approaches, rough sets and decision trees, as techniques for classifying credit (business) applicants. Inductive learning methods, like rough sets and decision trees, have a better knowledge representational structure than neural networks or statistical procedures because they can be used to derive production rules. While decision trees have already been used for credit granting, the rough sets approach is rarely utilized in this domain. In this paper, we use production rules obtained on a sample of 1102 business loans in order to compare the classification abilities of the two techniques. We show that decision trees obtain better results, with 87.5% good classifications with a pruned tree against 76.7% for rough sets. However, decision trees make more type II errors than rough sets, but fewer type I errors.

17.
Rough sets for adapting wavelet neural networks as a new classifier system (total citations: 2; self: 2, by others: 0)
Classification is an important theme in data mining. Rough sets and neural networks are two techniques applied to data mining problems. Wavelet neural networks have recently attracted great interest because of their advantages over conventional neural networks: they are universal approximators and achieve faster convergence. This paper presents a hybrid system to efficiently extract classification rules from a decision table. The neurons of this hybrid network instantiate approximate reasoning knowledge gleaned from the input data. The new model uses rough set theory to reduce the computational effort needed for building the network structure by means of a reduct algorithm, and a rule set (knowledge) is generated from the decision table. By combining wavelets, frequency analysis, rough sets and dynamic scaling with a neural network, a novel and reliable classifier architecture is obtained; its effectiveness is verified by experiments comparing it with traditional rough set and neural network approaches.

18.
Latent attribute space tree classifiers (total citations: 2; self: 0, by others: 2)
He Ping; Xu Xiaohua; Chen Ling. 软件学报 (Journal of Software), 2009, 20(7): 1735-1745
This paper proposes the latent attribute space tree classifier (LAST) framework. By transforming the original attribute space into a latent attribute space in which the data are easier to separate, or which better matches the characteristics of decision tree classification, the framework breaks through the decision-boundary limitations of traditional decision tree algorithms and improves the generalization of tree classifiers. Within the LAST framework, two SVD (singular value decomposition) oblique decision tree (SODT) algorithms are proposed. They perform SVD on the global or local data to construct an orthogonal latent attribute space, and then build an ordinary univariate decision tree, or tree node, in that space, thereby indirectly obtaining an approximately optimal oblique decision tree in the original space. The SODT algorithms can handle data sets whose global and local distributions are the same or different, make full use of the structural information of both labelled and unlabelled data, produce classification results unaffected by random reordering of the samples, and have the same time complexity as univariate decision tree algorithms. Experiments on complex data sets show that, compared with traditional univariate decision tree algorithms and other oblique decision tree algorithms, SODT achieves higher classification accuracy, builds decision trees of more stable size, and is more robust overall, with tree-construction time close to that of C4.5 and far below that of other oblique decision tree algorithms.
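
The global variant of SODT can be sketched as follows, assuming NumPy and scikit-learn; axis-parallel splits on the SVD-rotated attributes correspond to oblique splits in the original space:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_svd_oblique_tree(X, y, k=None, **tree_kw):
    """Rotate the data into the orthogonal basis given by SVD, then grow
    an ordinary univariate tree in the latent attribute space."""
    mu = X.mean(axis=0)
    Xc = X - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt.T[:, :k]                  # latent attribute directions
    tree = DecisionTreeClassifier(**tree_kw).fit(Xc @ V, y)
    return tree, V, mu

def predict_svd_oblique_tree(model, X):
    tree, V, mu = model
    return tree.predict((X - mu) @ V)
```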

19.
The incremental learning algorithm in the hybrid learning model HLM (total citations: 4; self: 0, by others: 4)
The hybrid learning model HLM organically combines the concept acquisition algorithm HMCAP with the neural network algorithm FTART, and can learn multiple concepts and continuous attributes. Its incremental learning algorithm is built on a binary hybrid decision tree structure and the FTART network: when new examples are added to the system, a single pass of incremental learning adjusts the original structure, without regenerating the decision tree and neural network, improving learning accuracy quickly and efficiently. This paper mainly describes the incremental learning algorithm of this model.

20.
A shortcoming of univariate decision tree learners is that they do not learn intermediate concepts and select only one of the input features in the branching decision at each intermediate tree node. It has been empirically demonstrated that cascading other classification methods, which learn intermediate concepts, with decision tree learners can alleviate such representational bias of decision trees and potentially improve classification performance. However, a more complex model that fits training data better may not necessarily perform better on unseen data, commonly referred to as the overfitting problem. To find the most appropriate degree of such cascade generalization, a decision forest (i.e., a set of decision trees with other classification models cascaded to different degrees) needs to be generated, from which the best decision tree can then be identified. In this paper, the authors propose an efficient algorithm for generating such decision forests. The algorithm uses an extended decision tree data structure and constructs any node that is common to multiple decision trees only once. The authors have empirically evaluated the algorithm using 32 data sets for classification problems from the University of California, Irvine (UCI) machine learning repository and report on results demonstrating the efficiency of the algorithm in this paper.
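
One member of such a decision forest, cascaded to a given degree, can be sketched as below, assuming scikit-learn; each degree appends a base learner's class-probability outputs as new attributes before the final tree is grown. The paper's shared extended-tree data structure, which builds common nodes only once, is not reproduced here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def fit_cascade_tree(X, y, degree=1, **tree_kw):
    """Cascade generalization: augment the attributes with a base
    learner's class probabilities, repeated `degree` times, then grow a
    tree that can branch on these learned intermediate concepts."""
    bases, X_aug = [], X
    for _ in range(degree):
        base = LogisticRegression(max_iter=1000).fit(X_aug, y)
        X_aug = np.hstack([X_aug, base.predict_proba(X_aug)])
        bases.append(base)
    tree = DecisionTreeClassifier(**tree_kw).fit(X_aug, y)
    return bases, tree

def predict_cascade_tree(model, X):
    bases, tree = model
    X_aug = X
    for base in bases:
        X_aug = np.hstack([X_aug, base.predict_proba(X_aug)])
    return tree.predict(X_aug)
```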
