1.
2.
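For classifying data sets with a binary target variable, this paper proposes a new hybrid classification method based on an error model that can improve classification accuracy. Tests on real data sets show that the proposed algorithm outperforms other hybrid algorithms as well as existing classification methods used on their own; in particular, when the two methods disagree in a high proportion of their predictions, the method significantly improves prediction accuracy.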
3.
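Liu Hongmei, 《数字社区&智能家居》, 2009, (3)
Analyzes and compares representative current associative classification algorithms and summarizes the open problems of classification based on association rules, so that users can choose a suitable algorithm for their needs and researchers can study and improve the algorithms to obtain better-performing classifiers.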
4.
Using Model Trees for Classification (cited by: 1; self-citations: 0; citations by others: 1)
Model trees, which are a type of decision tree with linear regression functions at the leaves, form the basis of a recent successful technique for predicting continuous numeric values. They can be applied to classification problems by employing a standard method of transforming a classification problem into a problem of function approximation. Surprisingly, using this simple transformation the model tree inducer M5′, based on Quinlan's M5, generates more accurate classifiers than the state-of-the-art decision tree learner C5.0, particularly when most of the attributes are numeric.
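The transformation mentioned in this abstract can be illustrated with a minimal sketch: fit one function approximator per class to a 0/1 membership indicator, then predict the class whose approximated value is largest. This is an assumption-laden toy, not the paper's method: a one-attribute least-squares line stands in for the model tree, and the names `fit_line`, `train`, `predict` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on a single numeric attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = mean_y - b * mean_x
    return a, b


def train(xs, labels):
    """Fit one regressor per class to that class's 0/1 membership indicator."""
    models = {}
    for c in sorted(set(labels)):
        indicator = [1.0 if label == c else 0.0 for label in labels]
        models[c] = fit_line(xs, indicator)
    return models


def predict(models, x):
    """Return the class whose fitted function gives the largest value at x."""
    return max(models, key=lambda c: models[c][0] + models[c][1] * x)


# Hypothetical toy data: one numeric attribute, two classes.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]
models = train(xs, labels)
print(predict(models, 1.5), predict(models, 8.5))
```

Swapping `fit_line` for a stronger regressor (for example a model tree) leaves `train` and `predict` unchanged, which is the appeal of the reduction.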
5.
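Applying data mining techniques and component-based software development, this article proposes an association-rule-based development model for component-based enterprise data mining systems that suits a variety of domains. On this basis, it carries out the design of a component-based enterprise resource analysis system. It also discusses the architecture, overall design, and key technologies of the CRAS system in detail, analyzes the system's applications and some related issues, and closes with conclusions.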
6.
Lazy Learning of Bayesian Rules (cited by: 19; self-citations: 0; citations by others: 19)
The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute inter-dependencies for the local naive Bayesian classifier. However, Bayesian tree learning still suffers from the small disjunct problem of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. This paper proposes the application of lazy learning techniques to Bayesian tree induction and presents the resulting lazy Bayesian rule learning algorithm, called LBR. This algorithm can be justified by a variant of Bayes theorem which supports a weaker conditional attribute independence assumption than is required by naive Bayes. For each test example, it builds a most appropriate rule with a local naive Bayesian classifier as its consequent. It is demonstrated that the computational requirements of LBR are reasonable in a wide cross-section of natural domains. Experiments with these domains show that, on average, this new algorithm obtains lower error rates significantly more often than the reverse in comparison to a naive Bayesian classifier, C4.5, a Bayesian tree learning algorithm, a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes, and a lazy decision tree learning algorithm. It also outperforms, although the result is not statistically significant, a selective naive Bayesian classifier.
7.
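A Survey of Data Mining (cited by: 1; self-citations: 0; citations by others: 1)
Fang Yuankang, 《数字社区&智能家居》, 2007, (9): 1189-1190, 1199
Surveys three main functions of data mining (association rules, classification and prediction, and cluster analysis) and closes with the outlook for the field.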
8.
A standard approach to determining decision trees is to learn them from examples. A disadvantage of this approach is that once a decision tree is learned, it is difficult to modify it to suit different decision making situations. Such problems arise, for example, when an attribute assigned to some node cannot be measured, or there is a significant change in the costs of measuring attributes or in the frequency distribution of events from different decision classes. An attractive approach to resolving this problem is to learn and store knowledge in the form of decision rules, and to generate from them, whenever needed, a decision tree that is most suitable in a given situation. An additional advantage of such an approach is that it facilitates building compact decision trees, which can be much simpler than the logically equivalent conventional decision trees (by compact trees are meant decision trees that may contain branches assigned a set of values, and nodes assigned derived attributes, i.e., attributes that are logical or mathematical functions of the original ones). The paper describes an efficient method, AQDT-1, that takes decision rules generated by an AQ-type learning system (AQ15 or AQ17), and builds from them a decision tree optimizing a given optimality criterion. The method can work in two modes: the standard mode, which produces conventional decision trees, and compact mode, which produces compact decision trees. The preliminary experiments with AQDT-1 have shown that the decision trees generated by it from decision rules (conventional and compact) have outperformed those generated from examples by the well-known C4.5 program both in terms of their simplicity and their predictive accuracy.
9.
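Zhu Ximei, 《数字社区&智能家居》, 2006, (2): 36-37
Association rule mining is one of the most important branches of data mining. It studies interesting associations or correlations among itemsets in large volumes of data; a typical example is market basket analysis, which reveals which products customers tend to buy together and so helps store managers plan a better store layout. For example, if analysis shows that customers who buy a computer usually go on to buy financial management software, the two can be placed close together to increase sales. This article introduces Apriori, the classic association rule mining algorithm, presents the basic concepts of association rules, analyzes the algorithm's running efficiency, and proposes improvements.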
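The level-wise search of Apriori and the support and confidence measures behind the computer and financial-software example can be sketched as follows. This is a minimal illustration, not the article's implementation or its proposed improvement; the transactions, the item names, and the 0.5 support threshold are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates of size k that are frequent.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical market-basket data.
transactions = [
    frozenset({"computer", "finance_software"}),
    frozenset({"computer", "finance_software", "printer"}),
    frozenset({"computer", "printer"}),
    frozenset({"printer"}),
]
frequent = apriori(transactions, min_support=0.5)
rule_conf = confidence(frequent,
                       frozenset({"computer"}),
                       frozenset({"finance_software"}))
print(round(rule_conf, 3))
```

The prune step is what keeps the candidate set small: an itemset is only counted against the data if all of its smaller subsets were already found frequent.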
10.
11.
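As machine-learning-based automatic text classification has become the mainstream technique, such methods often neglect the effective use of rule-based classification. This paper combines rule-based classification with machine-learning-based classification, treating rule judgment as a component classifier, and proposes a two-layer text classification model supplemented by rules together with an optimized algorithm for learning classification rules. A two-layer classifier combining rules with N-Gram statistical classification was designed and implemented on this basis, and experiments comparing the two-layer model with a standalone N-Gram model show that the rule-supplemented two-layer classifier achieves better classification performance.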
12.
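Mining Classification Rules from Traffic Flow Data (cited by: 2; self-citations: 0; citations by others: 2)
Gong Shuai, 《计算机工程与应用》, 2006, 42(6): 219-220, 232
Outlines classification algorithms in data mining and briefly introduces the C5.0 decision tree algorithm. Taking traffic flow data from Beijing's "three horizontal, two vertical" arterial roads as an example, it uses C5.0 to extract classification rules for traffic flow, which are used to analyze traffic flow regularities, information patterns, and data trends; the classification tree is also quantified to provide decision support for traffic signal design, road network planning, road design, and network node design.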
13.
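A Comparative Study of Association Rule Mining and Classification Rule Mining (cited by: 1; self-citations: 0; citations by others: 1)
Association rule mining and classification rule mining are both important techniques in the field of data mining. This paper first briefly introduces the basics of both, then compares them mainly in terms of mining purpose, the methods their rule-discovery algorithms use, and the design ideas behind the algorithms, and finally describes the connections between them.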
14.
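In some areas of image association rule mining, rules with high confidence must be extracted while the support requirement is relatively low. This paper proposes a method for mining high-confidence image association rules while still taking support into account. To extract image association rules efficiently, it uses a raster data format called bSQ (bit Sequential). It then searches level by level to build a rule tree, avoiding the large number of frequent itemsets that traditional methods generate at low support. Finally, pixel-level association rules are extracted across multiple images using techniques such as multi-image rule extraction priorities and image data cubes. Experiments show that the method effectively extracts high-confidence association rules from image data and is feasible.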
15.
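An Automatic Video Classification Method Integrating Data Mining (cited by: 1; self-citations: 0; citations by others: 1)
To address the low prediction accuracy of automatic video classification, this paper proposes an automatic video classification method that integrates data mining techniques. Videos are first segmented to form a video attribute database; decision trees and classification association rules are then each used to mine that database, yielding a decision tree rule set and a classification association rule set; finally, a merge-and-prune algorithm combines the two rule sets into a final, more accurate video classification rule set. Experiments confirm that decision tree rules and classification association rules are consistent in their predictions, and that the merged rule set predicts video classes more accurately than either rule set alone.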
16.
Functional Trees (cited by: 1; self-citations: 0; citations by others: 1)
João Gama, Machine Learning, 2004, 55(3): 219-250
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model trees algorithms explore multiple representation languages but using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves. The experimental evaluation shows that all functional trees variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study and for classification and regression, the use of multivariate decision nodes has more impact in the bias component of the error, while the use of multivariate decision leaves has more impact in the variance component.
17.
18.
If we lack relevant problem-specific knowledge, cross-validation methods may be used to select a classification method empirically. We examine this idea here to show in what senses cross-validation does and does not solve the selection problem. As illustrated empirically, cross-validation may lead to higher average performance than application of any single classification strategy, and it also cuts the risk of poor performance. On the other hand, cross-validation is no more or less a form of bias than simpler strategies, and applying it appropriately ultimately depends in the same way on prior knowledge. In fact, cross-validation may be seen as a way of applying partial information about the applicability of alternative classification strategies.
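Selecting a classification method by cross-validation, as this abstract describes, might look like the following sketch: estimate each candidate's accuracy over k held-out folds and keep the best scorer. The two candidate classifiers (a majority-class baseline and a one-attribute nearest-neighbour rule), the function names, and the data are hypothetical stand-ins chosen only to keep the example self-contained.

```python
def majority_classifier(train):
    """Always predict the most common training label (ties go alphabetically)."""
    labels = [y for _, y in train]
    winner = max(sorted(set(labels)), key=labels.count)
    return lambda x: winner


def nearest_neighbour_classifier(train):
    """Predict the label of the closest training point on a single attribute."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def cv_accuracy(make_classifier, data, k):
    """k-fold cross-validated accuracy; fold f holds out items f, f+k, f+2k, ..."""
    correct = 0
    for fold in range(k):
        test = data[fold::k]
        train = [p for i, p in enumerate(data) if i % k != fold]
        clf = make_classifier(train)
        correct += sum(1 for x, y in test if clf(x) == y)
    return correct / len(data)


def select_by_cv(candidates, data, k=3):
    """Return the best-scoring candidate name together with all the scores."""
    scores = {name: cv_accuracy(make, data, k)
              for name, make in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical data: (attribute value, class label) pairs.
data = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.9, "b"), (1.0, "b"), (1.1, "b")]
candidates = {"majority": majority_classifier,
              "1-nn": nearest_neighbour_classifier}
best, scores = select_by_cv(candidates, data, k=3)
print(best, scores)
```

Note that the choice of candidates and of k is itself prior knowledge supplied by the user, which is the abstract's point about cross-validation being a form of bias like any other.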
19.
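A Survey of Association Rule Mining (cited by: 2; self-citations: 0; citations by others: 2)
Zhu Ximei, 《数字社区&智能家居》, 2006, (5)
Association rule mining is one of the most important branches of data mining. It studies interesting associations or correlations among itemsets in large volumes of data; a typical example is market basket analysis, which reveals which products customers tend to buy together and so helps store managers plan a better store layout. For example, if analysis shows that customers who buy a computer usually go on to buy financial management software, the two can be placed close together to increase sales. This article introduces Apriori, the classic association rule mining algorithm, presents the basic concepts of association rules, analyzes the algorithm's running efficiency, and proposes improvements.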
20.
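Application of the Data Mining Association Rule Algorithm in Marketing Strategy (cited by: 1; self-citations: 0; citations by others: 1)
This paper first briefly introduces the background from which data mining emerged and the main data mining techniques in the field, then uses examples to focus on the association rule algorithm, which is widely used among knowledge-type data mining techniques, and its application in marketing strategy.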