Similar literature
Found 20 similar documents (search time: 46 ms)
1.
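Decision tree algorithms and their application to data mining of breast disease images   Total citations: 5 (self-citations: 1, citations by others: 5)
This paper introduces the basic principle by which the ID3 algorithm builds a decision tree, with emphasis on the pruning problem and two typical pruning algorithms: reduced-error pruning and minimal cost-complexity pruning. The decision tree and pruning algorithms described are then applied to data mining of breast disease images, yielding rules of practical reference value and a very high classification accuracy, which shows that decision tree algorithms have broad application prospects in medical image data mining.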
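The ID3 principle mentioned in this abstract can be illustrated with a minimal sketch of its split criterion, information gain. The attribute names, labels, and records below are made up for illustration and are not taken from the paper's data; the helper names are likewise hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one attribute.

    rows is a list of dicts (attribute name -> value); labels is the
    parallel list of class labels.
    """
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 splits on the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy, made-up records (hypothetical attributes, not the paper's features).
rows = [
    {"shape": "round", "texture": "fine"},
    {"shape": "round", "texture": "coarse"},
    {"shape": "irregular", "texture": "fine"},
    {"shape": "irregular", "texture": "coarse"},
]
labels = ["class_a", "class_a", "class_b", "class_b"]
chosen = best_attribute(rows, labels, ["shape", "texture"])
```

In this toy table `shape` separates the two classes perfectly while `texture` carries no information, so ID3 would split on `shape` first. Pruning, which the abstract emphasizes, is a separate step that this sketch does not cover.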

2.
In this exploratory research, we aim to find the most important service experience variables that determine customers' purchasing decisions and the clerks' influence on customers' purchases. This study was conducted as a case study of a children's apparel company, denoted Company L, which has 243 retail stores. Company L has implemented Point of Sale (POS) systems in its retail stores and would like to know what functions could be added to induce storefront employees to deliver better customer service. We therefore focus on observing the services provided by storefront employees and how they are reflected in a customer's purchasing decision in a retail store. The study generated decision trees via Weka, an open-source data mining software platform, to analyze multiple data sources in order to (1) understand what makes a good service experience for a customer, (2) get explicit knowledge from service encounter information, and (3) externalize the tacit knowledge of storefront service experiences. These findings can be used to improve Company L's POS system so that it guides storefront employees to learn from trained decision rules. Moreover, the company can internalize service experience knowledge by aggregating learned rules from its retail stores.

3.
The rapid growth of Taiwan's economy has been accompanied by the country's developing market for luxury products. To successfully establish the new market demand chain for the luxury industry in Taiwan, it is essential to understand customer preferences. This study therefore applies association rules and cluster analysis to mine knowledge about luxury product-buying customers in Taiwan. The results of the knowledge extraction, illustrated as knowledge patterns, rules, and knowledge maps, are used to make recommendations for future developments in the luxury products industry.

4.
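Liu Xiaoping, 《计算机仿真》 (Computer Simulation), 2006, 23(4): 103-105, 113
Data mining is the process of extracting hidden knowledge from large volumes of raw data. Most data mining tools use rule discovery and decision tree classification to find patterns and rules in data, and at their core is an induction algorithm. Compared with traditional statistical methods, classification results obtained with machine learning techniques are more interpretable. When mining a particular dataset without the relevant domain knowledge, users or decision makers find it hard to determine which induction algorithm to choose, so a variety of algorithms must be tried. With MLC++, decision makers can easily compare how effective different classification algorithms are on a particular dataset and select a suitable one. System developers can also use MLC++ to design various hybrid algorithms.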

5.
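Classification is an important problem in data mining. Its goal is to find rules; specifically, to find, from a given data set, rules that partition the data into several disjoint groups. Existing methods for mining classification rules in large databases are still mainly decision tree methods based on symbolic learning. This paper studies a new rule extraction algorithm that can extract good rules from neural networks.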

6.
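A minimal rule extraction algorithm based on value reduction and decision trees   Total citations: 7 (self-citations: 0, citations by others: 7)
Luo Qiujin, Chen Shilian, 《计算机应用》 (Journal of Computer Applications), 2005, 25(8): 1853-1855
Value reduction in rough set theory and decision trees in data mining are both effective classification methods, but each has limitations. The two are combined to produce a new value-core-based minimization method for pruning decision trees. A criterion for judging reduced rules is proposed, which narrows the scope of reduction, and the generated rules are finally maximized to keep the information covered by the rules consistent. Experiments verify the effectiveness of the algorithm.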

7.
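Databases, data warehouses, and other information repositories hold a great deal of data and knowledge relevant to decisions in business, scientific research, and other activities. Data analysis in data mining commonly takes two forms, classification and prediction: the data in the database are first classified and generalized, the classification rules then yield the more valuable data, and those data can in turn be used to predict information about future trends. Among common classification algorithms, the decision tree algorithm scales well: it can be applied to large databases, handles many data types, produces classification patterns that are easy to convert into classification rules, and gives results that are easy to understand. This paper first introduces several commonly used classification algorithms, then describes the decision tree procedure in detail along with its advantages and disadvantages in practical classification applications.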
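The claim that a tree's classification patterns convert easily into rules can be sketched as follows: each root-to-leaf path becomes one IF-THEN rule. The nested-dict tree layout and the toy tree are assumptions made for this illustration, not a format used by the paper.

```python
def tree_to_rules(node, path=()):
    """Yield (conditions, label) for every root-to-leaf path.

    A leaf is any non-dict value; an internal node is a dict of the form
    {"attribute": name, "branches": {value: subtree}} (hypothetical layout).
    """
    if not isinstance(node, dict):
        yield list(path), node
        return
    for value, subtree in node["branches"].items():
        yield from tree_to_rules(subtree, path + ((node["attribute"], value),))


# Toy tree, made up for illustration.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
    },
}

rules = list(tree_to_rules(tree))
for conditions, label in rules:
    antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {antecedent} THEN class = {label}")
```

Each rule is simply the conjunction of the tests along one path, which is why tree models are usually considered easy to read.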

8.
《Intelligent Data Analysis》1998,2(1-4):165-185
Classification, which involves finding rules that partition a given dataset into disjoint groups, is one class of data mining problems. The approaches proposed so far for mining classification rules from databases are mainly decision tree methods based on symbolic learning. In this paper, we combine an artificial neural network and a genetic algorithm to mine classification rules. Experiments demonstrate that our method generates rules that perform better than those of the decision tree approach, and that it extracts fewer rules than C4.5.

9.
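Mining classification rules from traffic flow data   Total citations: 2 (self-citations: 0, citations by others: 2)
Gong Shuai, 《计算机工程与应用》 (Computer Engineering and Applications), 2006, 42(6): 219-220, 232
This paper gives an overview of classification algorithms in data mining and briefly introduces the C5.0 decision tree algorithm. Taking traffic flow data from Beijing's "three horizontal, two vertical" arterial roads as an example, it uses C5.0 to extract classification rules for traffic flow, which are used to analyze traffic flow regularities, information patterns, and data trends. The classification tree is also quantified to provide decision support for traffic signal design, road network planning, road design, and road network node design.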

10.
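Constructing classification decision trees based on neural networks   Total citations: 5 (self-citations: 2, citations by others: 3)
Symbolic methods are currently the main approach to extracting classification rules, while connectionist methods based on neural networks are seldom used. The main reason is that, although neural networks classify accurately, the classification rules and knowledge they encode are hard to extract. To address this problem, this paper exploits specific characteristics of neural networks and proposes a new method for constructing a classification decision tree from a neural network. The method trains a neural network to establish the relationship between each attribute and the classification result, then builds the decision tree by extracting the derivative relationship between each attribute and the classification result. A concrete tree construction algorithm is given. To improve extraction of the relationships encoded in the network, the concept of a relationship-reinforcing constraint is introduced and a concrete model is built. Results from a practical application demonstrate the effectiveness of the algorithm.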

11.
Decision trees have been widely used in data mining and machine learning as a comprehensible knowledge representation. While ant colony optimization (ACO) algorithms have been successfully applied to extract classification rules, decision tree induction with ACO algorithms remains an almost unexplored research area. In this paper we propose a novel ACO algorithm to induce decision trees, combining commonly used strategies from both traditional decision tree induction algorithms and ACO. The proposed algorithm is compared against three decision tree induction algorithms, namely C4.5, CART and cACDT, on 22 publicly available data sets. The results show that the predictive accuracy of the proposed algorithm is statistically significantly higher than the accuracy of both C4.5 and CART, which are well-known conventional algorithms for decision tree induction, and the accuracy of the ACO-based cACDT decision tree algorithm.

12.
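Using student grades, enrollment records, and related data from a university, a data warehouse for analyzing student performance is created. The ID3 algorithm is used to build a decision tree mining model based on student grade levels, classification rules are extracted from the decision tree, and the Analysis Services tool is used to validate the mining results.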

13.
An important issue in text mining is how to make use of multiple pieces of discovered knowledge to improve future decisions. In this paper, we propose a new approach to combining multiple sets of rules for text categorization using Dempster's rule of combination. We develop a boosting-like technique for generating multiple sets of rules based on rough set theory, and we model the classification decisions from multiple sets of rules as pieces of evidence that can be combined by Dempster's rule of combination. We apply these methods, individually and in combination, to 10 groups of the 20-newsgroups benchmark data collection (Baker and McCallum 1998). Our experimental results show that the best combination of the multiple sets of rules performs statistically significantly better on the 10 groups than the best single set of rules. A comparative analysis of the Dempster-Shafer and majority voting (MV) methods, along with an overfitting study, confirms the advantage and robustness of our approach.
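Dempster's rule of combination, on which this abstract relies, can be sketched in a few lines. The code below is a generic illustration of the rule itself, not the authors' system: the mass values and the two category names are made up, and representing focal elements as `frozenset` objects is an assumption of this sketch.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses (a focal element)
    to its mass; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass falling on contradictory pairs
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalize so that the remaining masses sum to 1.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Two hypothetical rule sets voting on the category of one document.
SPORT, POLITICS = frozenset({"sport"}), frozenset({"politics"})
EITHER = SPORT | POLITICS  # mass on the whole frame expresses ignorance

evidence_1 = {SPORT: 0.6, EITHER: 0.4}
evidence_2 = {SPORT: 0.5, POLITICS: 0.3, EITHER: 0.2}
fused = dempster_combine(evidence_1, evidence_2)
```

Unlike majority voting, the rule lets a source leave part of its belief uncommitted (the mass on `EITHER`), and it discounts the share of belief that the two sources assign to contradictory hypotheses.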

14.
In the financial industry, continually changing economic conditions and characteristics involving uncertainty and risk have made financial forecasts even more difficult, increasing the need for more reliable ways to forecast a bank's operating performance. However, earlier studies that analyze performance with statistical methods usually become more complex when the relationships in the input/output data are nonlinear. Furthermore, strict data assumptions, such as linearity, normality, and independence, often limit real-world applications. Additionally, a drawback of traditional rough sets is that data must first be discretized to improve classification accuracy. To remedy these shortcomings, the study proposes a hybrid procedure, which combines professional knowledge, attribute granularity, and a rough sets classifier, for automatically classifying profit growth rate (PGR) to solve real problems faced by investors. The procedure is illustrated on a practical dataset of publicly traded financial holding stocks in Taiwan's stock markets. The experimental results reveal that the proposed procedure outperforms the listed methods in terms of accuracy and provides useful insights for responding to rapidly changing stock market conditions. Importantly, the output created by the rough sets LEM2 (Learning from Examples Module, version 2) algorithm is a set of comprehensible rules applied in a knowledge-based investment system for investors.

15.
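Ding Chunrong, Li Longshu, 《微机发展》 (Microcomputer Development), 2007, 17(11): 110-113
Decision trees are a common classification method in data mining. When a tree is constructed, the criterion for choosing the splitting attribute directly affects classification quality, and traditional decision tree algorithms usually rely on information-theoretic measures. Based on rough set theory, this paper proposes a decision tree rule extraction algorithm that uses attribute significance and dependency degree as the attribute selection criteria. The algorithm extracts explicit classification rules, produces a simpler structure than the traditional ID3 algorithm, and improves classification efficiency.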
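The two rough set quantities named in this abstract have standard textbook definitions, sketched below. The dependency degree is the fraction of objects whose condition-attribute values determine the decision unambiguously, and the significance of an attribute is the drop in that fraction when the attribute is removed. Whether the paper uses exactly these forms is an assumption; the table and function names are made up for illustration.

```python
from collections import defaultdict


def dependency_degree(rows, condition_attrs, decision_attr):
    """gamma(C, D) = |POS_C(D)| / |U| for a decision table given as dicts."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].append(row[decision_attr])
    # An equivalence class lies in the positive region if all of its
    # objects share the same decision value.
    positive = sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)
    return positive / len(rows)


def significance(rows, condition_attrs, decision_attr, attr):
    """Loss of dependency when attr is dropped from the condition set."""
    rest = [a for a in condition_attrs if a != attr]
    return dependency_degree(rows, condition_attrs, decision_attr) - dependency_degree(
        rows, rest, decision_attr
    )


# Toy decision table (made up): d depends on a only.
table = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
]
sig_a = significance(table, ["a", "b"], "d", "a")
sig_b = significance(table, ["a", "b"], "d", "b")
```

Here removing `a` destroys the dependency entirely while removing `b` changes nothing, so a tree builder using this criterion would split on `a`.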

16.
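The article first introduces the SQL Server 2008 business intelligence platform and the theory behind decision tree techniques, then applies a series of preprocessing steps to the source data. A decision tree model for analyzing CET-4 scores is created using the decision tree technique of the SQL Server 2008 business intelligence platform and the Data Mining Extensions (DMX) language. Evaluation with a classification matrix and a lift chart shows that the model has high reliability and classification accuracy, and some of the rules derived from the model can serve as an important reference for the management and reform of English teaching.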

17.
The Taiwanese government's recent deregulation of the telecommunications industry has brought about acute competition among Internet Service Providers (ISPs). Taiwan's ISP industry is characterized by heavy pressure to raise revenue after the hefty capital investments of the last decade and by a lack of knowledge for developing competitive strategies. To attract subscribers, all ISP dealers are making an all-out effort to improve their service management. This study proposes a Business Intelligence (BI) process for ISP dealers in Taiwan to assist management in developing effective service management strategies. We explore customers' usage characteristics and preference knowledge by applying the attribute-oriented induction (AOI) method to users' IP traffic data. Using the self-organizing map (SOM) method, we divide customers into clusters with different usage behavior patterns. We then apply RFM modeling to calibrate the customer value of each cluster, which enables management to develop direct and effective marketing strategies. For network resource management, this research mines facility utilization across the administrative districts of the region, which can assist management in planning effective investment in network facilities. With actual data from one major ISP, we develop a BI decision support system with visual presentation, which has been well received by its management staff.

18.
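Decision trees are an important method in inductive learning and data mining, used mainly for classification and prediction. This article introduces the concept of a generalized decision tree, which unifies classification rule sets and decision tree structures. It also proposes a novel method for constructing decision trees with a DNA-encoded genetic algorithm: the C4.5 algorithm first classifies the dataset to obtain an initial rule set, which the proposed algorithm then optimizes and uses to build the decision tree. Experiments show that the method effectively avoids the drawbacks of the traditional tree construction process and parallelizes well.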

19.
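When applied to classification decisions, traditional association rule mining tends to miss infrequent rules and has low prediction accuracy. To obtain rules that are correct, reasonable, and more complete, an improved method, DT-AR (decision tree-association rule algorithm), is proposed, which uses a decision tree pruning strategy to supplement the association rule set. The method uses the FP-Growth (frequent pattern growth) algorithm to obtain the association rule set and the C4.5 algorithm to build a post-pruned decision tree and extract classification rules. After iterative confidence-based filtering, these rules are merged with the association rule set by set union, and classification is done by voting with confidence as the weight. Experimental results show that the rules obtained classify the datasets more accurately than those from traditional association rule mining and decision tree pruning.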
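The final two steps described in this abstract, merging the two rule sets and voting with confidence as the weight, might look roughly like the sketch below. This is a loose reading of the abstract, not the authors' DT-AR implementation: the rule representation, the single confidence threshold (the abstract describes an iterative filter), the choice to keep the higher confidence for duplicate rules, and all rule values are assumptions.

```python
from collections import defaultdict


def merge_rules(association_rules, tree_rules, min_confidence):
    """Union of the association rules and the tree rules that pass the filter.

    A rule is (antecedent dict, class label, confidence). When the same
    rule appears in both sets, the higher confidence is kept (assumption).
    """
    kept_tree_rules = [r for r in tree_rules if r[2] >= min_confidence]
    merged = {}
    for antecedent, label, confidence in list(association_rules) + kept_tree_rules:
        key = (frozenset(antecedent.items()), label)
        merged[key] = max(confidence, merged.get(key, 0.0))
    return [(dict(items), label, conf) for (items, label), conf in merged.items()]


def classify(record, rules, default=None):
    """Every matching rule votes for its class, weighted by its confidence."""
    votes = defaultdict(float)
    for antecedent, label, confidence in rules:
        if all(record.get(k) == v for k, v in antecedent.items()):
            votes[label] += confidence
    return max(votes, key=votes.get) if votes else default


# Made-up rules for illustration only.
association_rules = [({"outlook": "sunny"}, "no", 0.8)]
tree_rules = [
    ({"humidity": "normal"}, "yes", 0.9),
    ({"windy": "true"}, "no", 0.4),      # dropped by the confidence filter
    ({"outlook": "sunny"}, "no", 0.7),   # duplicate of an association rule
]
rules = merge_rules(association_rules, tree_rules, min_confidence=0.5)
prediction = classify({"outlook": "sunny", "humidity": "normal"}, rules)
```

In the example record both surviving rules fire, and the class backed by the larger total confidence wins the vote.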

20.
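Identity authentication is an important problem in network security. Combining Web log mining and decision tree classification, this paper proposes a new authentication approach, personalized identity verification, which can verify a user's identity a second time after the user has logged in to the system.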
