Similar articles
1.
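Association rules are one of the major research topics in data mining. Traditional association rule mining algorithms are suited only to binary and categorical attributes. To handle quantitative attributes better, an adaptive quantitative association rule mining algorithm based on fuzzy concepts is proposed. The algorithm overcomes the shortcomings of traditional discrete partitioning and improves on existing methods for computing the support of fuzzy association rules. It introduces a clustering-based method for generating membership functions automatically, so that the discovery of fuzzy association rules no longer depends on membership functions supplied by human experts, and the resulting rules are expressed naturally and concisely, which makes them easier for experts to understand. Experiments show that the algorithm is effective.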
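The abstract does not give the paper's support formula or its clustering procedure, so the following is only a minimal sketch of the general idea. It assumes the cluster centres are already known (for example from k-means), builds one triangular or shoulder-shaped membership function per centre, and takes the fuzzy support of an itemset to be the average, over all records, of the minimum membership degree. The names `make_fuzzy_sets` and `fuzzy_support`, the choice of the minimum as t-norm, and the sample data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

Membership = Callable[[float], float]


def make_fuzzy_sets(centers: Sequence[float]) -> List[Membership]:
    """Build one membership function per (distinct) cluster centre.

    Each function is 1 at its own centre and falls linearly to 0 at the
    neighbouring centres; the outermost sets stay at 1 beyond their centre.
    """
    cs = sorted(set(centers))

    def build(i: int) -> Membership:
        c = cs[i]
        lo = cs[i - 1] if i > 0 else None
        hi = cs[i + 1] if i + 1 < len(cs) else None

        def mu(x: float) -> float:
            if x <= c:
                return 1.0 if lo is None else max(0.0, (x - lo) / (c - lo))
            return 1.0 if hi is None else max(0.0, (hi - x) / (hi - c))

        return mu

    return [build(i) for i in range(len(cs))]


def fuzzy_support(records: Sequence[Dict[str, float]],
                  itemset: Dict[str, Membership]) -> float:
    """Average over records of the minimum membership across the itemset."""
    total = 0.0
    for record in records:
        total += min(mu(record[attr]) for attr, mu in itemset.items())
    return total / len(records)


# Hypothetical data: centres would normally come from a clustering step.
records = [
    {"age": 20, "income": 10},
    {"age": 30, "income": 30},
    {"age": 40, "income": 50},
    {"age": 60, "income": 50},
]
young, middle_aged, old = make_fuzzy_sets([20, 40, 60])
low, high = make_fuzzy_sets([10, 50])
print(fuzzy_support(records, {"age": young, "income": low}))  # 0.375
```

Because a record can belong partly to two neighbouring sets, a value near a boundary still contributes to both, which is the usual argument for fuzzy sets over crisp interval partitioning.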

2.
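Association rules are one of the major research topics in data mining. Traditional association rule mining algorithms are suited only to binary and categorical attributes. To handle quantitative attributes better, an adaptive quantitative association rule mining algorithm based on fuzzy concepts is proposed. The algorithm overcomes the shortcomings of traditional discrete partitioning and improves on existing methods for computing the support of fuzzy association rules. It introduces a clustering-based method for generating membership functions automatically, so that the discovery of fuzzy association rules no longer depends on membership functions supplied by human experts, and the resulting rules are expressed naturally and concisely, which makes them easier for experts to understand. Experiments show that the algorithm is effective.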

3.
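In informatization evaluation, traditional associative classification algorithms cannot discover short rules first, and their classification accuracy depends strongly on the order of the rules. To address this, an associative classification algorithm based on subset support and multi-rule classification is proposed. The training set is grouped by the class attribute, association rules are mined using subset support, and the test set is classified by computing the average support for each class. Experimental results show that the algorithm outperforms traditional methods in both rule discovery and classification accuracy.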

4.
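To address multi-attribute pedestrian recognition in surveillance video, a multi-class classification method that combines neural networks with association rules is proposed. First, the Faster-RCNN detection algorithm and an improved AlexNet multi-class network are used to obtain a confidence score for each pedestrian attribute in the surveillance video. The Apriori association rule algorithm is then applied to the training data, and the confidence scores from the neural network are combined with the association rule results to give an algorithm that refines the classification confidence. Finally, the accuracy of selected pedestrian attributes after association rule refinement is measured. The results show that combining neural networks with association rules can improve the recognition accuracy of some attributes.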
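As a rough illustration of the Apriori step mentioned in this abstract, the sketch below mines frequent itemsets level by level and prunes any candidate that has an infrequent subset. It does not reproduce the paper's confidence refinement, which the abstract does not specify. The function name `apriori` and the pedestrian attribute labels are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def apriori(transactions: Iterable[Set[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose relative support is at least min_support."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)

    def support(itemset: FrozenSet[str]) -> float:
        return sum(1 for t in tsets if itemset <= t) / n

    # Level 1 candidates: every single item that occurs in the data.
    current = {frozenset([item]) for t in tsets for item in t}
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        keys = list(level)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in keys for b in keys if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Hypothetical attribute annotations for four pedestrians.
transactions = [
    {"female", "long_hair", "skirt"},
    {"female", "long_hair"},
    {"male", "short_hair"},
    {"female", "long_hair", "backpack"},
]
frequent = apriori(transactions, min_support=0.5)
pair = frozenset({"female", "long_hair"})
confidence = frequent[pair] / frequent[frozenset({"female"})]
print(frequent[pair], confidence)  # 0.75 1.0
```

A rule such as female → long_hair with high confidence is the kind of co-occurrence that could be used to adjust a weak attribute score from the network, although how the paper actually does this is not stated in the abstract.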

5.
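Associative classification of Chinese text based on information gain   Total citations: 1 (self-citations: 0; citations by others: 1)
Associative classification is a classification technique that mines association rules from the training set and uses them to predict the class attribute of new data. Recent studies show that associative classification achieves higher accuracy than traditional classification methods such as C4.5. Existing associative classification methods built on the support-confidence framework merely select frequent literals to construct classification rules and ignore how useful each literal is for classification. This paper proposes a new algorithm, ACIG, which combines information gain with FoilGain to select the literals of rules in Chinese text and so improve their usefulness for classification. Experimental results show that ACIG is more accurate than other associative classification algorithms such as CPAR.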
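For reference, information gain for a candidate term can be computed as the class entropy minus the entropy that remains after splitting the documents on the presence of that term. The sketch below shows only this standard quantity. It does not include FoilGain or the way ACIG combines the two, which the abstract does not describe, and the tokens and labels are made-up examples.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs: Sequence[Set[str]],
                     labels: Sequence[str],
                     term: str) -> float:
    """Entropy reduction from splitting docs on whether they contain term."""
    with_term: List[str] = [y for d, y in zip(docs, labels) if term in d]
    without: List[str] = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    remaining = sum(len(part) / n * entropy(part)
                    for part in (with_term, without) if part)
    return entropy(labels) - remaining


# Hypothetical tokenised documents (segmented text reduced to token sets).
docs = [
    {"football", "match"},
    {"football", "player"},
    {"stock", "market"},
    {"stock", "match"},
]
labels = ["sports", "sports", "finance", "finance"]
print(information_gain(docs, labels, "football"))  # 1.0
print(information_gain(docs, labels, "match"))     # 0.0
```

Here "football" separates the two classes perfectly, while "match" appears equally in both and therefore carries no information about the class, even though it is just as frequent.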

6.
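A feature selection algorithm based on association rules   Total citations: 2 (self-citations: 0; citations by others: 2)
Association rules can reveal associations between attributes in a database, and giving priority to short rules when selecting relevant attributes may yield a minimal attribute subset. On this basis, this paper proposes a feature selection algorithm based on association rules. Experimental results show that it outperforms several feature selection methods in both attribute subset size and classification accuracy. The influence of support and confidence on the algorithm is also explored. The results show that high support and confidence do not lead to high classification accuracy or small feature subsets, whereas a sufficient number of rules is a necessary condition for association-rule-based feature selection to work well.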

7.
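CBA is a classification algorithm that combines association rule mining with classification techniques, and it has been widely applied in many fields. To address CBA's low efficiency on massive data, an improved CBA algorithm is proposed. The algorithm applies rough set theory to CBA, performing attribute reduction on the decision table to improve the efficiency of generating class association rules, and applies PEP (pessimistic error pruning) to prune the candidate rules. Experimental results show that the algorithm achieves higher classification efficiency and accuracy than CBA.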

8.
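An optimal class association rule algorithm   Total citations: 1 (self-citations: 0; citations by others: 1)
Classification and association rule discovery are two important areas of data mining. Using an association rule algorithm to mine classification rules is known as class association rule mining, and it is a promising approach. This paper proposes an optimal class association rule algorithm, OCARA. The algorithm uses an optimal association rule mining algorithm to mine classification rules and sorts the optimal rule set to obtain a classifier with high classification accuracy. OCARA is compared experimentally with the traditional classification algorithm C4.5 and the general class association rule algorithms CBA and RMR on 8 UCI data sets. The results show that OCARA performs better and is an effective class association rule mining algorithm.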
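The abstract says only that the rule set is sorted to form a classifier, so the sketch below shows a generic ordered-rule classifier rather than OCARA itself. Rules are ranked by confidence, then support, then antecedent length, and a record receives the class of the first rule whose antecedent it satisfies. The ranking criteria, the `Rule` type, and the sample rules are assumptions chosen for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Set


class Rule(NamedTuple):
    antecedent: FrozenSet[str]  # attribute=value items that must all hold
    label: str                  # predicted class
    confidence: float
    support: float


def rank(rules: Sequence[Rule]) -> List[Rule]:
    """Higher confidence first, then higher support, then shorter rules."""
    return sorted(rules,
                  key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))


def classify(ranked: Sequence[Rule], record: Set[str], default: str) -> str:
    """Return the label of the first matching rule, or the default class."""
    for rule in ranked:
        if rule.antecedent <= record:
            return rule.label
    return default


# Hypothetical rules in the style of the classic weather data set.
rules = [
    Rule(frozenset({"outlook=sunny"}), "no", 0.6, 0.3),
    Rule(frozenset({"outlook=sunny", "humidity=normal"}), "yes", 0.9, 0.2),
    Rule(frozenset({"windy=true"}), "no", 0.6, 0.4),
]
ranked = rank(rules)
print(classify(ranked, {"outlook=sunny", "humidity=normal", "windy=false"}, "yes"))  # yes
print(classify(ranked, {"outlook=rainy", "windy=true"}, "yes"))                      # no
```

Because the first matching rule decides the class, the ordering directly affects accuracy, which is why the sorting step matters for this family of classifiers.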

9.
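Application of an incremental associative classification method to virus detection   Total citations: 2 (self-citations: 2; citations by others: 0)
Traditional association rule mining algorithms are mainly based on the support-confidence framework, and limits on time and space overhead prevent them from mining infrequent itemsets in depth. At present there has been little research on incremental learning for associative classification with class attributes. This paper proposes a new incremental associative classification method that solves the incremental learning problem for data with class attributes and, when data are updated frequently, extracts and maintains association rules quickly within limited time and space overhead. Experimental results show that the method maintains and updates association rules effectively, avoids relearning historical samples, and preserves the predictive ability of the classification model.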

10.
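Traditional association-rule-based text classification generally uses rule confidence as the classification criterion and completely ignores the effect of feature term frequency on classification, which leads to poor performance. To address this problem, a term-frequency-vector-based association rule text classification algorithm, TFARC (term frequency-based ARC), is proposed on the basis of the ARC-BC algorithm. The algorithm introduces term frequency vectors, redefines the confidence between a rule and a text as the classifier's classification criterion, and finds the best adjustment factor for each rule iteratively. Experimental results show that introducing term frequency does improve the accuracy of association rule text classification.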

11.
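A method for detecting BitTorrent traffic is proposed. The method analyzes the BitTorrent protocol, identifies its characteristic features, and defines rules for these features so that an intrusion detection system can recognize them. The features are then monitored on a network running SNORT (an open-source IDS) in order to detect the data flows. Finally, new directions for research on network traffic detection are pointed out.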

12.
Data mining provides the opportunity to extract useful information from large databases. Various techniques have been proposed to extract this information as efficiently as possible. However, efficiency is not our only concern in this study. The security and privacy of the extracted knowledge must be seriously considered as well. With this in mind, we study the procedure of hiding sensitive association rules in binary data sets by blocking some data values, and we present an algorithm for solving this problem. We also provide a fuzzification of the support and the confidence of an association rule in order to accommodate the existence of blocked/unknown values. In addition, we quantitatively compare the proposed algorithm with other published algorithms by running experiments on binary data sets, and we also qualitatively compare the efficiency of the proposed algorithm in hiding association rules. We utilize the notion of border rules, by assigning a weight to each rule, and we use effective data structures for the representation of the rules so as (a) to minimize the side effects created by the hiding process and (b) to speed up the selection of the victim transactions. Finally, we study the overall security of the modified database, using the C4.5 decision tree algorithm of the WEKA data mining tool, and we discuss the advantages and the limitations of blocking.
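One simple way to see why blocking makes support uncertain is to compute it as an interval. The lower bound counts only transactions in which every item is known to be present, and the upper bound also counts transactions in which the missing items are blocked. This is a sketch of that idea only, not the fuzzification defined in the paper. Representing a blocked value as `None` and the name `support_interval` are assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple

Row = Dict[str, Optional[int]]  # 1 = present, 0 = absent, None = blocked


def support_interval(rows: Sequence[Row],
                     items: Sequence[str]) -> Tuple[float, float]:
    """Return (minimum, maximum) relative support of an itemset."""
    n = len(rows)
    # Minimum: every item is definitely present.
    low = sum(1 for r in rows if all(r[i] == 1 for i in items))
    # Maximum: every item is present or blocked, so it could be present.
    high = sum(1 for r in rows if all(r[i] in (1, None) for i in items))
    return low / n, high / n


# Hypothetical binary data set with some blocked values.
rows = [
    {"a": 1, "b": 1},
    {"a": 1, "b": None},
    {"a": 0, "b": 1},
    {"a": None, "b": None},
]
print(support_interval(rows, ["a", "b"]))  # (0.25, 0.75)
```

A sensitive rule is harder to recover when the support threshold falls inside this interval, because an observer cannot tell whether the itemset is actually frequent.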

13.
This paper presents a new architecture of a fuzzy decision tree based on fuzzy rules, the fuzzy rule based decision tree (FRDT), and provides a learning algorithm. In contrast with “traditional” axis-parallel decision trees, in which only a single feature (variable) is taken into account at each node, each node of the proposed decision tree contains a fuzzy rule that involves multiple features. Fuzzy rules are employed to produce leaves of high purity. Using multiple features for a node helps minimize the size of the trees. The growth of the FRDT is realized by expanding an additional node composed of a mixture of data coming from different classes, which is the only non-leaf node of each layer. This gives rise to a new geometric structure endowed with linguistic terms, which is quite different from the “traditional” oblique decision trees endowed with hyperplanes as decision functions. A series of numeric studies are reported using data coming from UCI machine learning data sets. The comparison is carried out with regard to “traditional” decision trees such as C4.5, LADtree, BFTree, SimpleCart, and NBTree. The results of statistical tests show that the proposed FRDT exhibits the best performance in terms of both accuracy and the size of the produced trees.

14.
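A decision rule acquisition algorithm based on rough set theory and a practical method for representing decision rules are proposed. In an intelligent product design system, the type of product must be decided at the conceptual design stage. The usual product design process begins by analyzing the technical parameters and design requirements given by the user. The rule acquisition algorithm mines design knowledge from the design parameters and thereby determines which type of product best suits the user's requirements. The algorithm requires no prior knowledge. To apply the algorithm in a real system, the problem of rule representation must be solved. This paper successfully applies a relational database to the representation of design rules, so that the resulting expert system reasons conveniently and runs efficiently. The application of the method is illustrated with an intelligent relay design example.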

15.
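A strategy is proposed that maps the maintenance of consistency between ontology rules and association rules onto the sample space and solves it there. A rule-based consistency criterion is derived by building and proving a sample-space-based consistency rule model. On this basis, an ontology-based multidimensional association rule consistency maintenance algorithm, MARCMAO, is designed, which finally yields a consistent, ontology-based set of association rules. Experimental results on an ontology for predicting tea plant diseases and pests show that the strategy is feasible and effective.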

16.
The information content of rules and rule sets and its application   Total citations: 1 (self-citations: 1; citations by others: 0)
The information content of rules is categorized into inner mutual information content and outer impartation information content. The conventional objective interestingness measures based on information theory are all inner mutual information, which represents the confidence of rules and the mutual information between the antecedent and consequent. Moreover, almost all of these measures lose sight of the outer impartation information, which is conveyed to the user and helps the user to make decisions. We put forward the viewpoint that the outer impartation information content of rules and rule sets can be represented by the relations from input universe to output universe. With binary relations, the interaction of rules in a rule set can be easily represented by the operators union and intersection. Based on the entropy of relations, the outer impartation information content of rules and rule sets is well measured. Then, the conditional information content of rules and rule sets, the independence of rules and rule sets, and the inconsistent knowledge of rule sets are defined and measured. The properties of these new measures are discussed and some interesting results are proven, for example that the information content of a rule set may be bigger than the sum of the information content of the rules in the rule set, and that the conditional information content of rules may be negative. Finally, the applications of these new measures are discussed. A new method for appraising rule mining algorithms and two rule pruning algorithms, λ-choice and RPClC, are put forward. These new methods and algorithms are better at meeting the need for more efficient decision information.

17.
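Effective security data collection is the basis for accurate analysis of network threats. The collection methods in common use today, such as full collection, probabilistic collection, and adaptive collection, do not consider the validity of the collected data or the associations among the collected data, so they consume excessive resources and have a low ratio of collection benefit to cost. To address this problem, a rule-association-based method for generating security data collection strategies is designed, taking into account the factors that affect collection benefit and cost (relationships between node features, network topology, the threat state of the system, node resources, node similarity, and so on). Based on the association rules between nodes and the association rules between security events occurring in the system, the method builds a set of candidate collection items and narrows the scope of data collection. Considering both collection benefit and collection cost, it defines a multi-objective optimization function that maximizes collection benefit and minimizes collection cost, and solves it with a genetic algorithm. In comparison with common collection methods, experimental results show that the cumulative volume of data collected by the proposed method over 12 h is 1,000 to 3,000 records lower than with the other schemes, and data validity is about 4% to 10% higher, which demonstrates the effectiveness of the proposed method.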

18.
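Positive and negative fuzzy rule systems, extreme learning machines, and image classification   Total citations: 1 (self-citations: 1; citations by others: 0)
Traditional image classification generally uses only the positive rules of an image and ignores the role of negative rules. Nguyen introduced negative rules into image classification, proposed combining positive and negative fuzzy rules into a positive and negative fuzzy rule system, and applied it to the classification of remote sensing images and natural images. Experiments showed that it performs very well in image classification. The feedforward neural network model they proposed adjusts its weights by gradient descent, so its training speed is limited by poorly chosen step sizes or by convergence to local optima. The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks (SLFN) that learns quickly and generalizes well. This paper proves that the extreme learning machine and the positive and negative fuzzy rule system are equivalent in essence, and therefore applies the ELM to image classification. Experimental results show that the extreme learning machine makes good use of the combination of positive and negative fuzzy rules to classify images, with satisfactory results.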

19.
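Discussion and some modifications of the combination formula of evidence theory   Total citations: 1 (self-citations: 0; citations by others: 1)
As a method of reasoning under uncertainty, D-S evidence theory has been widely used in data fusion and target recognition. However, the D-S evidence combination formula has shortcomings that limit the application of evidence theory to some extent. In view of this, Yager improved the combination formula, but the improved formula has new problems of its own. References [2], [3], and [4] made some improvements to Yager's combination formula. This paper compares these combination formulas and makes some modifications to the formula in [4] so that it satisfies the associative law, which improves computational efficiency.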
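To make the shortcoming concrete, the sketch below implements the two classical rules the abstract refers to: Dempster's rule, which discards the conflicting mass and renormalizes, and Yager's rule, which assigns the conflicting mass to the whole frame of discernment. The modified formulas from references [2] to [4] are not given in the abstract and are not reproduced here. The function name `combine` and the example masses are assumptions for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass, frame: FrozenSet[str],
            rule: str = "dempster") -> Mass:
    """Combine two basic probability assignments over the same frame."""
    joint: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            joint[common] = joint.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if rule == "dempster":
        if conflict >= 1.0:
            raise ValueError("totally conflicting evidence cannot be combined")
        return {s: w / (1.0 - conflict) for s, w in joint.items()}
    if rule == "yager":
        # Conflicting mass is treated as ignorance and moved to the frame.
        joint[frame] = joint.get(frame, 0.0) + conflict
        return joint
    raise ValueError(f"unknown rule: {rule}")


# Two highly conflicting sources (a Zadeh-style example).
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
frame = A | B | C
m1 = {A: 0.9, B: 0.1}
m2 = {C: 0.9, B: 0.1}
print(combine(m1, m2, frame, "dempster"))  # all mass ends up on B
print(combine(m1, m2, frame, "yager"))     # about 0.01 on B, 0.99 on the frame
```

With Dempster's rule, hypothesis B receives essentially all of the belief although both sources rated it at only 0.1, which is the counterintuitive behaviour under high conflict. Yager's rule avoids this, but it is not associative, so the result of combining more than two sources depends on the order of combination.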

20.
Extracting classification rules from data is an important task of data mining and has gained considerably more attention in recent years. In this paper, a new meta-heuristic algorithm called TACO-miner is proposed for rule extraction from artificial neural networks (ANN). The proposed rule extraction algorithm works on trained ANNs in order to discover the hidden knowledge that is available in the form of connection weights within the ANN structure. The algorithm is based mainly on a meta-heuristic known as touring ant colony optimization (TACO) and has a two-step hierarchical structure. It is experimentally evaluated on six binary and n-ary classification benchmark data sets. The results of the comparative study show that TACO-miner is able to discover accurate and concise classification rules.
