首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 140 毫秒
1.
数据挖掘就是从海量的数据中挖掘出可能有潜在价值的信息的技术。决策树方法是一种典型的分类算法.首先对数据进行处理,利用归纳算法生成可读的规则和决策树模型,然后使用决策树模型对新数据进行分析。该文以大学生专业方向指导辅助系统的开发过程为实例从理论上论述了数据挖掘的概念、数据挖掘研究内容和本质以及进行数据挖掘的主要方法。讲述了使用MATLAB7.0开发实现决策树算法子系统的方法和实现,并且对生成的决策树模型进行分析。  相似文献   

2.
本文从数据挖掘的概念与步骤出发,针对数据挖掘的一般应用,采用决策树的方法,建立了通用的决策树模型.给出了模型中的两类数据和六个操作的一般描述,模型的重点是决策树建立、决策树剪技和规则集推导.  相似文献   

3.
数据挖掘是一种对大量信息进行分析和处理的技术,银行客户关系管理系统使用数据挖掘技术实施有效的客户关系管理,为企业决策提供有价值的信息,提高市场竞争能力。决策树方法是数据挖掘技术常用的分类方法,本文对决策树ID3算法的进行了介绍,阐述了数据挖掘技术在银行客户关系管理系统的应用。  相似文献   

4.
基于信息增益法的决策树构造方法   总被引:6,自引:1,他引:6  
决策树数据挖掘技术是目前最有影响和使用最多的一种数据挖掘技术。决策树构造的方法很多,本文提出一种基于信息增益法的决策树构造方法。给出了相应的决策树构造算法,并通过一个实例对其进行了说明。最后,本文对噪声问题、子树复制和碎叶等问题提出了解决思路。  相似文献   

5.
基于数据挖掘的决策树方法分析   总被引:1,自引:0,他引:1  
决策树方法因其简单、直观、准确率高等特点在数据挖掘及数据分析中得到了广泛的应用。在介绍了决策树方法的一般知识后,深入分析了决策树的生成算法与模型,并对决策树的剪枝过程进行了探讨。  相似文献   

6.
并行决策树算法的研究   总被引:5,自引:0,他引:5  
数据挖掘在科研和商业应用中正发挥着越来越重要的作用。随着数据量的增加,数据挖掘工具处理海量数据的能力问题显得日益突出。研究并行算法,是解决这个问题的有效途径。分类器是数据挖掘的一种基本方法,决策树是一种最重要的分类器。文章首先介绍了分类器中的决策树算法,然后设计了一种并行决策树算法,最后探讨了该并行算法在PVM系统下的实现。  相似文献   

7.
数据挖掘是一种重要的数据分析方法,决策树是数据挖掘中的一种主要技术,如何构造出最优决策树是许多研究者关心的问题。本文通过Rough集方法对决策表进行属性约简和属性值约简,去除决策表中与决策无关的冗余信息。在简化的决策表基础上构造近似最优决策树,本文给出了近似最优决策树的生成算法,并通过实例说明。  相似文献   

8.
保持隐私的决策树的生成过程研究   总被引:1,自引:0,他引:1  
路慧萍  童学锋 《计算机应用》2005,25(6):1382-1384
介绍了保持隐私的数据挖掘技术,研究了决策树分类器在保持隐私的数据挖掘中的应用。在传统的决策树算法中引入标量积协议,既保持决策树算法本身的优点,又满足了保持隐私的需求。  相似文献   

9.
简要回顾了数据挖掘的应用背景和常用的数据挖掘方法,重点研究了数据挖掘方法中的决策树算法,并对其主要成就进行评述,提出今后开展研究的建议。  相似文献   

10.
远程教育考试成绩分析决策树的构造方法   总被引:5,自引:2,他引:5  
介绍了数据挖掘技术在远程教育学生考试成绩分析上的应用和用ID3算法构造决策树的方法,并结合一组学生考试成绩样本数据,采用决策树分析方法进行了分类,给出了一个远程教育中成功应用数据挖掘的思路和模式。  相似文献   

11.
本文首先阐述了数据挖掘中决策树的基本思想,然后简单介绍了决策树经典算法(ID3算法),重点基于ID3算法论述了对于决策树的影响4个要素,并使用真实的数据详细地分析了4个要素,实验表明,只要4个要素中的任何一个改变,决策树必须要重新被构建。  相似文献   

12.
We present a novel Fourier analysis-based approach to combine, transmit, and visualize decision trees in a mobile environment. Fourier representation of a decision tree has several interesting properties that are particularly useful for mining data streams from small mobile computing devices connected through limited-bandwidth wireless networks. We present algorithms to compute the Fourier spectrum of a decision tree and outlines a technique to construct a decision tree from its Fourier spectrum. It offers a framework to aggregate decision trees in their Fourier representations. It also describes the MobiMine, a mobile data stream mining system, that uses the developed techniques for mining stock-market data from handheld devices.  相似文献   

13.
Mining with streaming data is a hot topic in data mining. When performing classification on data streams, traditional classification algorithms based on decision trees, such as ID3 and C4.5, have a relatively poor efficiency in both time and space due to the characteristics of streaming data. There are some advantages in time and space when using random decision trees. An incremental algorithm for mining data streams, SRMTDS (Semi-Random Multiple decision Trees for Data Streams), based on random decision trees is proposed in this paper. SRMTDS uses the inequality of Hoeffding bounds to choose the minimum number of split-examples, a heuristic method to compute the information gain for obtaining the split thresholds of numerical attributes, and a Naive Bayes classifier to estimate the class labels of tree leaves. Our extensive experimental study shows that SRMTDS has an improved performance in time, space, accuracy and the anti-noise capability in comparison with VFDTc, a state-of-the-art decision-tree algorithm for classifying data streams.  相似文献   

14.
This paper introduces orthogonal decision trees that offer an effective way to construct a redundancy-free, accurate, and meaningful representation of large decision-tree-ensembles often created by popular techniques such as bagging, boosting, random forests, and many distributed and data stream mining algorithms. Orthogonal decision trees are functionally orthogonal to each other and they correspond to the principal components of the underlying function space. This paper offers a technique to construct such trees based on the Fourier transformation of decision trees and eigen-analysis of the ensemble in the Fourier representation. It offers experimental results to document the performance of orthogonal trees on the grounds of accuracy and model complexity.  相似文献   

15.
首先剖析数据挖掘系统与数据库及数据仓库之间的关系,然后提出了内嵌式数据挖掘系统的思想、实现方法及其关键技术,描述了使用决策树算法对内嵌式数据挖掘系统的实现方法,并展望了内嵌式数据挖掘系统的未来.  相似文献   

16.
Because clinical research is carried out in complex environments, prior domain knowledge, constraints, and expert knowledge can enhance the capabilities and performance of data mining. In this paper we propose an unexpected pattern mining model that uses decision trees to compare recovery rates of two different treatments, and to find patterns that contrast with the prior knowledge of domain users. In the proposed model we define interestingness measures to determine whether the patterns found are interesting to the domain. By applying the concept of domain-driven data mining, we repeatedly utilize decision trees and interestingness measures in a closed-loop, in-depth mining process to find unexpected and interesting patterns. We use retrospective data from transvaginal ultrasound-guided aspirations to show that the proposed model can successfully compare different treatments using a decision tree, which is a new usage of that tool. We believe that unexpected, interesting patterns may provide clinical researchers with different perspectives for future research.  相似文献   

17.
The decision tree-based classification is a popular approach for pattern recognition and data mining. Most decision tree induction methods assume training data being present at one central location. Given the growth in distributed databases at geographically dispersed locations, the methods for decision tree induction in distributed settings are gaining importance. This paper describes one such method that generates compact trees using multifeature splits in place of single feature split decision trees generated by most existing methods for distributed data. Our method is based on Fisher's linear discriminant function, and is capable of dealing with multiple classes in the data. For homogeneously distributed data, the decision trees produced by our method are identical to decision trees generated using Fisher's linear discriminant function with centrally stored data. For heterogeneously distributed data, a certain approximation is involved with a small change in performance with respect to the tree generated with centrally stored data. Experimental results for several well-known datasets are presented and compared with decision trees generated using Fisher's linear discriminant function with centrally stored data.  相似文献   

18.
Univariate decision trees are classifiers currently used in many data mining applications. This classifier discovers partitions in the input space via hyperplanes that are orthogonal to the axes of attributes, producing a model that can be understood by human experts. One disadvantage of univariate decision trees is that they produce complex and inaccurate models when decision boundaries are not orthogonal to axes. In this paper we introduce the Fisher’s Tree, it is a classifier that takes advantage of dimensionality reduction of Fisher’s linear discriminant and uses the decomposition strategy of decision trees, to come up with an oblique decision tree. Our proposal generates an artificial attribute that is used to split the data in a recursive way.The Fisher’s decision tree induces oblique trees whose accuracy, size, number of leaves and training time are competitive with respect to other decision trees reported in the literature. We use more than ten public available data sets to demonstrate the effectiveness of our method.  相似文献   

19.
决策树是数据挖掘中常用的分类方法。针对高等院校学生就业问题中出现由噪声造成的不一致性数据,本文提出了基于变精度粗糙集的决策树模型,并应用于学生就业数据分析。该方法以变精度粗糙集的分类质量的量度作为信息函数,对条件属性进行选择,作为树的节点,自上而下地分割数据集,直到满足某种终止条件。它充分考虑了属性间的依赖性和冗余性,允许在构造决策树的过程中划入正域的实例类别存在一定的不一致性。实验表明,该算法能够有效地处理不一致性数据集,并能正确合理地将就业数据分类,最终得到若干有价值的结论,供决策分析。该算法大大提高了决策规则的泛化能力,减化了树的结构。  相似文献   

20.
在数据挖掘中,分期是一个很重要的问题,有很多流行的分类器可以创建决策树木产生类模型。本文介绍了通过信息增益或熵的比较来构造一棵决策树的数桩挖掘算法思想,给出了用粗糙集理论构造决策树的一种方法,并用曲面造型方面的实例说明了决策树的生成过程。通过与ID3方法的比较,该种方法可以降低决策树的复杂性,优化决策树的结构,能挖掘较好的规则信息。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号