Found 20 similar documents (search time: 125 ms)
1.
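Association rules are one of the major research topics in data mining. Traditional association rule mining algorithms are suitable only for binary and categorical attributes. To handle quantitative attributes better, this paper proposes an adaptive quantitative association rule mining algorithm based on fuzzy concepts. The algorithm overcomes the shortcomings of traditional discrete partitioning and improves the existing method for computing the support of fuzzy association rules. It introduces a clustering-based method for generating membership functions automatically, so that the discovery of fuzzy association rules does not depend on membership functions supplied by human experts, and the resulting rules are expressed naturally and concisely, which makes them easier for experts to understand. Experiments show that the algorithm is effective.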
2.
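Association rules are one of the major research topics in data mining. Traditional association rule mining algorithms are suitable only for binary and categorical attributes. To handle quantitative attributes better, this paper proposes an adaptive quantitative association rule mining algorithm based on fuzzy concepts. The algorithm overcomes the shortcomings of traditional discrete partitioning and improves the existing method for computing the support of fuzzy association rules. It introduces a clustering-based method for generating membership functions automatically, so that the discovery of fuzzy association rules does not depend on membership functions supplied by human experts, and the resulting rules are expressed naturally and concisely, which makes them easier for experts to understand. Experiments show that the algorithm is effective.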
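To make the idea of fuzzy support concrete, here is a minimal sketch. It is not the algorithm from the abstract: the triangular membership functions are set by hand (the paper generates them by clustering), the product is assumed as the way to combine membership degrees (the minimum is another common choice), and the attribute names and values are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function; assumes left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_support(records, itemset):
    """Mean, over all records, of the product of membership degrees.

    itemset is a list of (attribute_name, membership_function) pairs.
    The product combination is an assumption made for this sketch.
    """
    total = 0.0
    for rec in records:
        degree = 1.0
        for attr, mu in itemset:
            degree *= mu(rec[attr])
        total += degree
    return total / len(records)


# Hypothetical quantitative data and hand-picked fuzzy sets.
records = [
    {"age": 30, "income": 45},
    {"age": 35, "income": 60},
    {"age": 50, "income": 80},
]
young = lambda age: triangular(age, 20, 30, 40)
mid_income = lambda inc: triangular(inc, 30, 60, 90)

s_young = fuzzy_support(records, [("age", young)])
s_both = fuzzy_support(records, [("age", young), ("income", mid_income)])
print(s_young, s_both)
```

Unlike crisp interval partitioning, a record near a boundary (age 35 here) contributes partially to the support instead of being counted fully or not at all.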
3.
4.
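To address multi-attribute pedestrian recognition in surveillance video, a multi-class classification method combining neural networks and association rules is proposed. First, the Faster-RCNN detection algorithm and an improved AlexNet multi-class network are used to obtain a confidence score for each pedestrian attribute in the surveillance video. The Apriori association rule algorithm is then applied to the training data, and the classification confidences from the neural network are combined with the association rule results in a proposed algorithm that refines the classification confidences. Finally, the accuracy of certain pedestrian attributes after association-rule refinement is measured. The results show that effectively combining neural networks with association rules can improve recognition accuracy for some attributes.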
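The two quantities that Apriori-style mining relies on, support and confidence, can be sketched as follows. This only illustrates the standard definitions. It is not the paper's confidence-refinement algorithm, and the attribute labels are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    s_ante = support(transactions, antecedent)
    if s_ante == 0:
        return 0.0  # rule never applies; treat its confidence as zero
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_ante


# Hypothetical per-pedestrian attribute labels from training data.
transactions = [
    {"female", "long_hair", "skirt"},
    {"female", "long_hair"},
    {"male", "short_hair"},
    {"female", "short_hair"},
]

s = support(transactions, {"female", "long_hair"})
c_forward = confidence(transactions, {"long_hair"}, {"female"})
c_backward = confidence(transactions, {"female"}, {"long_hair"})
print(s, c_forward, c_backward)
```

A rule with high confidence, such as `long_hair -> female` in this toy data, is the kind of co-occurrence pattern that could be used to adjust a classifier's per-attribute scores.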
5.
6.
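A feature selection algorithm based on association rules (total citations: 2, self-citations: 0, citations by others: 2)
Association rules can uncover associations among attributes in a database. Preferring short rules when selecting relevant attributes may yield a minimal attribute subset. On this basis, this paper proposes a feature selection algorithm based on association rules. Experimental results show that it outperforms several feature selection methods in both attribute subset size and classification accuracy. The paper also explores how support and confidence affect the algorithm: high support and confidence do not lead to high classification accuracy or small feature subsets, whereas a sufficient number of rules is a necessary condition for the association-rule-based feature selection algorithm to perform well.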
7.
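禹蒲阳 (Yu Puyang) 《计算机应用与软件》 (Computer Applications and Software) 2010,27(8)
CBA is a classification algorithm that combines association rule mining with classification techniques, and it has been widely applied in many fields. To address CBA's low efficiency on massive data, an improved CBA algorithm is proposed. The algorithm applies rough set theory to CBA, performing attribute reduction on the decision table to make the generation of class association rules more efficient, and uses pessimistic error pruning (PEP) to prune candidate rules. Experimental results show that the algorithm achieves higher classification efficiency and accuracy than CBA.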
8.
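An optimal class association rule algorithm (total citations: 1, self-citations: 0, citations by others: 1)
Classification and association rule discovery are two important areas of data mining. Mining classification rules with association rule algorithms, known as class association rule mining, is a promising approach. This paper proposes an optimal class association rule algorithm, OCARA. The algorithm mines classification rules with an optimal association rule mining algorithm and sorts the optimal rule set to obtain a classifier with high accuracy. OCARA is compared experimentally with the traditional classification algorithm C4.5 and the general class association rule algorithms CBA and RMR on 8 UCI data sets. The results show that OCARA performs better, indicating that it is an effective class association rule mining algorithm.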
9.
10.
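Traditional association-rule text classification generally uses rule confidence as the classification criterion and completely ignores the influence of feature term frequency, which leads to poor performance. To address this problem, TFARC (term frequency-based ARC), an association rule text classification algorithm based on term frequency vectors, is proposed on the basis of the ARC-BC algorithm. The algorithm introduces term frequency vectors, redefines the credibility of rules and texts as the classifier's classification criterion, and uses an iterative method to find the best adjustment factor for each rule. Experimental results show that introducing term frequency does improve the accuracy of association-rule text classification.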
11.
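A method for detecting BitTorrent traffic is proposed. The method analyzes the BitTorrent protocol, identifies its characteristic features, and defines rules for these features so that an intrusion detection system can recognize them. A network equipped with SNORT (an open-source IDS) then monitors for these features in order to detect the traffic. Finally, new directions for research on network traffic detection are pointed out.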
12.
Efficient algorithms for distortion and blocking techniques in association rule hiding (total citations: 1, self-citations: 0, citations by others: 1)
Vassilios S. Verykios Emmanuel D. Pontikakis Yannis Theodoridis Liwu Chang 《Distributed and Parallel Databases》2007,22(1):85-104
Data mining provides the opportunity to extract useful information from large databases. Various techniques have been proposed in this context in order to extract this information in the most efficient way. However, efficiency is not our only concern in this study. The security and privacy issues over the extracted knowledge must be seriously considered as well. By taking this into consideration, we study the procedure of hiding sensitive association rules in binary data sets by blocking some data values and we present an algorithm for solving this problem. We also provide a fuzzification of the support and the confidence of an association rule in order to accommodate for the existence of blocked/unknown values. In addition, we quantitatively compare the proposed algorithm with other already published algorithms by running experiments on binary data sets, and we also qualitatively compare the efficiency of the proposed algorithm in hiding association rules. We utilize the notion of border rules, by putting weights in each rule, and we use effective data structures for the representation of the rules so as (a) to minimize the side effects created by the hiding process and (b) to speed up the selection of the victim transactions. Finally, we study the overall security of the modified database, using the C4.5 decision tree algorithm of the WEKA data mining tool, and we discuss the advantages and the limitations of blocking.
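One way to picture what blocking does to support is to treat it as an interval once some values are unknown. The sketch below assumes that reading: the lower bound counts only rows where every item is definitely 1, and the upper bound also counts rows where an item is blocked (`"?"`). The abstract does not spell out the paper's exact fuzzification, so this is an illustrative assumption, and the data are hypothetical.

```python
def support_interval(rows, items):
    """Return (min_support, max_support) for `items` in a binary table.

    rows  : list of lists holding 1, 0, or "?" (a blocked/unknown value)
    items : column indexes that make up the itemset
    """
    n = len(rows)
    # Rows that certainly contain the itemset.
    certain = sum(1 for r in rows if all(r[i] == 1 for i in items))
    # Rows that might contain it if every "?" were really a 1.
    possible = sum(1 for r in rows if all(r[i] in (1, "?") for i in items))
    return certain / n, possible / n


# Hypothetical binary data set with some blocked values.
rows = [
    [1, 1],
    [1, "?"],
    [0, 1],
    ["?", "?"],
]

low, high = support_interval(rows, [0, 1])
print(low, high)
```

If the lower bound falls below the mining threshold while the upper bound stays above it, an observer can no longer be certain whether the itemset is frequent, which is the intuition behind hiding by blocking.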
13.
This paper presents a new architecture of a fuzzy decision tree based on fuzzy rules – fuzzy rule based decision tree (FRDT) and provides a learning algorithm. In contrast with “traditional” axis-parallel decision trees in which only a single feature (variable) is taken into account at each node, the node of the proposed decision trees involves a fuzzy rule which involves multiple features. Fuzzy rules are employed to produce leaves of high purity. Using multiple features for a node helps us minimize the size of the trees. The growth of the FRDT is realized by expanding an additional node composed of a mixture of data coming from different classes, which is the only non-leaf node of each layer. This gives rise to a new geometric structure endowed with linguistic terms which are quite different from the “traditional” oblique decision trees endowed with hyperplanes as decision functions. A series of numeric studies are reported using data coming from UCI machine learning data sets. The comparison is carried out with regard to “traditional” decision trees such as C4.5, LADtree, BFTree, SimpleCart, and NBTree. The results of statistical tests have shown that the proposed FRDT exhibits the best performance in terms of both accuracy and the size of the produced trees.
14.
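A decision rule acquisition algorithm based on rough set theory and a practical method for representing decision rules are proposed. In an intelligent product design system, the product type must be decided at the conceptual design stage. A typical product design process starts from analyzing the technical parameters and design requirements given by the user. The rule acquisition algorithm mines design knowledge from the design parameters to determine which type of product best fits the user's requirements. The algorithm requires no prior knowledge. To apply the algorithm in a real system, the rule representation problem must be solved. This paper successfully applies a relational database to the representation of design rules, so that the resulting expert system reasons conveniently and runs efficiently. An intelligent relay design example illustrates the application of the method.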
15.
16.
The information content of rules is categorized into inner mutual information content and outer impartation information content. Actually, the conventional objective interestingness measures based on information theory are all inner mutual information, which represent the confidence of rules and the mutual information between the antecedent and consequent. Moreover, almost all of these measures lose sight of the outer impartation information, which is conveyed to the user and helps the user make decisions. We put forward the viewpoint that the outer impartation information content of rules and rule sets can be represented by the relations from input universe to output universe. By binary relations, the interaction of rules in a rule set can be easily represented by operators: union and intersection. Based on the entropy of relations, the outer impartation information content of rules and rule sets is well measured. Then, the conditional information content of rules and rule sets, the independence of rules and rule sets and the inconsistent knowledge of rule sets are defined and measured. The properties of these new measures are discussed and some interesting results are proven: for example, the information content of a rule set may be larger than the sum of the information content of rules in the rule set, and the conditional information content of rules may be negative. Finally, the applications of these new measures are discussed. A new method for appraising rule mining algorithms and two rule pruning algorithms, λ-choice and RPClC, are put forward. These new methods and algorithms have an advantage in satisfying the need for more efficient decision information.
17.
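Effective security data collection is the foundation for accurate analysis of network threats. Commonly used collection methods such as full collection, probabilistic collection, and adaptive collection do not consider the validity of the collected data or the correlations among collected data; they consume excessive resources and have a low benefit-to-cost ratio. To address this problem, a rule-association-based method for generating security data collection strategies is designed, taking into account the factors that affect collection benefit and cost (relationships among node features, network topology, system threat status, node resource status, node similarity, and so on). The method builds candidate collection items from association rules among nodes and association rules among security events occurring in the system, thereby narrowing the scope of data collection. It considers collection benefit and cost jointly, defines a multi-objective optimization function that maximizes benefit and minimizes cost, and solves it with a genetic algorithm. In comparison with commonly used collection methods, experimental results show that the proposed method reduces the cumulative volume of data collected over 12 h by 1,000 to 3,000 records relative to other schemes and improves data validity by about 4% to 10%, demonstrating its effectiveness.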
18.
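Traditional image classification generally uses only the positive rules of images and ignores the role of negative rules. Nguyen introduced negative rules into image classification, proposing to combine positive and negative fuzzy rules into a positive-negative fuzzy rule system, and applied it to the classification of remote sensing and natural images. Experiments showed that it achieved good results in image classification. The feedforward neural network model they proposed adjusts weights by gradient descent, so training speed is limited by poorly chosen step sizes or by convergence to local optima. The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks (SLFN) that offers fast learning and good generalization. This paper proves that the ELM and the positive-negative fuzzy rule system are essentially equivalent and therefore applies the ELM to image classification. Experimental results show that the ELM can make good use of the combined positive and negative fuzzy rules to classify images, with satisfactory results.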
19.
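As a method of reasoning under uncertainty, D-S evidence theory has been widely used in data fusion and target recognition. However, the D-S combination formula has shortcomings that limit the application of evidence theory. For this reason, Yager modified the combination formula, but the modified formula introduces new problems. References [2], [3], and [4] made some improvements to Yager's combination formula. This paper compares these combination formulas and makes some corrections to the formula of reference [4] so that it satisfies the associative law, improving computational efficiency.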
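For readers unfamiliar with the two baseline formulas, the following sketch contrasts Dempster's rule with Yager's modification on a highly conflicting example. It covers only these two standard rules, not the corrected formula from the paper or those of references [2] to [4]. The hypothesis names and mass values are hypothetical.

```python
from itertools import product


def _combine(m1, m2):
    """Multiply masses pairwise; return (mass per non-empty intersection, conflict)."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    return combined, conflict


def dempster(m1, m2):
    """Dempster's rule: discard the conflict and renormalize by 1 - conflict."""
    combined, conflict = _combine(m1, m2)
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def yager(m1, m2, frame):
    """Yager's rule: assign the conflicting mass to the whole frame (ignorance)."""
    combined, conflict = _combine(m1, m2)
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined


# Two sources that almost completely disagree (hypothetical masses).
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
frame = A | B | C
m1 = {A: 0.99, B: 0.01}
m2 = {C: 0.99, B: 0.01}

d = dempster(m1, m2)
y = yager(m1, m2, frame)
print(d, y)
```

With these inputs Dempster's rule puts essentially all belief on `B`, which both sources considered very unlikely, while Yager's rule moves the conflicting mass to the full frame and leaves `B` with only a tiny share.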
20.
TACO-miner: An ant colony based algorithm for rule extraction from trained neural networks (total citations: 1, self-citations: 0, citations by others: 1)
Lale Özbakır Adil Baykasoğlu Sinem Kulluk Hüseyin Yapıcı 《Expert systems with applications》2009,36(10):12295-12305
Extracting classification rules from data is an important data mining task that has gained considerably more attention in recent years. In this paper, a new meta-heuristic algorithm called TACO-miner is proposed for rule extraction from artificial neural networks (ANN). The proposed rule extraction algorithm works on the trained ANNs in order to discover the hidden knowledge which is available in the form of connection weights within the ANN structure. The proposed algorithm is mainly based on a meta-heuristic known as touring ant colony optimization (TACO) and consists of a two-step hierarchical structure. The proposed algorithm is experimentally evaluated on six binary and n-ary classification benchmark data sets. Results of the comparative study show that TACO-miner is able to discover accurate and concise classification rules.