Similar literature
Found 20 similar documents; search took 15 ms
1.
Association rules are among the most widely used techniques for discovering correlations among attributes in a database. Several efficient methods have been proposed to obtain these rules with respect to an optimization goal, such as maximizing the number of large itemsets and interesting rules, or the support and confidence values of the discovered rules. This paper first introduces optimized fuzzy association rule mining in terms of three important criteria: strongness, interestingness and comprehensibility. It then proposes multi-objective Genetic Algorithm (GA) based approaches for discovering these optimized rules. Optimization according to a given criterion may take one of two forms. The first tries to determine the appropriate fuzzy sets of quantitative attributes in a prespecified rule, also called a certain rule. The second deals with finding both uncertain rules and their appropriate fuzzy sets. Experimental results on a real data set show the effectiveness and applicability of the proposed approach.

2.
Association rule mining is one of the important data mining activities and has received substantial attention in the literature. It is a computationally and I/O intensive task. In this paper, we propose a solution approach for mining optimized fuzzy association rules of different orders. We also propose an approach that uses clustering techniques to define membership functions for all the continuous attributes in a database. Although single-objective genetic algorithms are used extensively, they degenerate the solution. In our approach, extraction and optimization of fuzzy association rules are done together using a multi-objective genetic algorithm that considers objectives such as fuzzy support, fuzzy confidence and rule length. The effectiveness of the proposed approach is tested on a computer activity dataset, to analyze the performance of a multiprocessor system, and on network audit data, to detect anomaly-based intrusions. Experiments show that the proposed method is efficient in many scenarios.
V. S. Ananthanarayana
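The objectives named in abstract 2 (fuzzy support, fuzzy confidence and rule length) can be illustrated with a minimal sketch. The abstract does not define them, so the definitions below are assumptions: support is the mean over transactions of the minimum membership degree across a rule's items (min t-norm), and confidence is the ratio of rule support to antecedent support. The names `fuzzy_support`, `fuzzy_confidence` and `rule_objectives`, and the sample data, are hypothetical.

```python
from typing import Dict, List, Tuple

# A fuzzy item is an (attribute, fuzzy set) pair, e.g. ("cpu", "high").
FuzzyItem = Tuple[str, str]
# A fuzzy transaction maps each fuzzy item to a membership degree in [0, 1].
FuzzyTransaction = Dict[FuzzyItem, float]


def fuzzy_support(transactions: List[FuzzyTransaction], items: List[FuzzyItem]) -> float:
    """Mean over transactions of the minimum membership across `items`.

    Assumption: the min t-norm combines memberships; `items` is non-empty.
    """
    if not transactions:
        return 0.0
    total = sum(min(t.get(item, 0.0) for item in items) for t in transactions)
    return total / len(transactions)


def fuzzy_confidence(transactions: List[FuzzyTransaction],
                     antecedent: List[FuzzyItem],
                     consequent: List[FuzzyItem]) -> float:
    """Support of the whole rule divided by support of the antecedent."""
    denom = fuzzy_support(transactions, antecedent)
    if denom == 0.0:
        return 0.0
    return fuzzy_support(transactions, antecedent + consequent) / denom


def rule_objectives(transactions, antecedent, consequent):
    """Objective vector (support, confidence, rule length) for one rule."""
    return (
        fuzzy_support(transactions, antecedent + consequent),
        fuzzy_confidence(transactions, antecedent, consequent),
        len(antecedent) + len(consequent),
    )


# Hypothetical data: four transactions with two fuzzy items each.
transactions = [
    {("cpu", "high"): 0.8, ("mem", "high"): 0.6},
    {("cpu", "high"): 0.4, ("mem", "high"): 0.9},
    {("cpu", "high"): 0.0, ("mem", "high"): 0.2},
    {("cpu", "high"): 1.0, ("mem", "high"): 0.5},
]
support, confidence, length = rule_objectives(
    transactions, [("cpu", "high")], [("mem", "high")]
)
```

A multi-objective GA would evaluate a vector like this for every candidate rule; whether rule length is maximized or minimized is a design choice the abstract leaves open.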

3.
In the domain of association rule mining (ARM), discovering rules for numerical attributes is still a challenging issue. Most of the popular approaches for numerical ARM require a priori data discretization to handle the numerical attributes. Moreover, discovering relations among data often requires more than one objective (quality measure), and in most cases these objectives conflict. In such a situation, it is recommended to obtain the optimal trade-off between objectives. This paper treats the numerical ARM problem from a multi-objective perspective by proposing a multi-objective particle swarm optimization algorithm (MOPAR) that discovers numerical association rules (ARs) in a single step. To identify more efficient ARs, several objectives are defined in the proposed multi-objective optimization approach, including confidence, comprehensibility, and interestingness. Finally, the best ARs are extracted using Pareto optimality. To deal with numerical attributes, we use rough values containing lower and upper bounds to represent the intervals of attributes. In the experimental section of the paper, we analyze the effect of the operators used in this study, compare our method to the most popular evolutionary-based proposals for ARM, and present an analysis of the mined ARs. The results show that MOPAR extracts reliable (with confidence values close to 95%), comprehensible, and interesting numerical ARs when attaining the optimal trade-off between confidence, comprehensibility and interestingness.
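Extracting the best rules by Pareto optimality, as abstract 3 describes, amounts to keeping the rules whose objective vectors are not dominated by any other rule. The sketch below is an illustration only and is not MOPAR itself: it assumes every objective is to be maximized, and the function names and the sample score tuples (confidence, comprehensibility, interestingness) are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere.

    Assumption: all objectives are maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of the non-dominated score vectors (simple O(n^2) scan)."""
    front = []
    for i, candidate in enumerate(scores):
        dominated = any(
            dominates(other, candidate) for j, other in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical rule scores: (confidence, comprehensibility, interestingness).
rule_scores = [
    (0.95, 0.4, 0.3),
    (0.90, 0.6, 0.5),
    (0.85, 0.5, 0.4),  # dominated by the second rule
    (0.95, 0.4, 0.2),  # dominated by the first rule
]
best = pareto_front(rule_scores)
```

The first two rules survive because each beats the other on at least one objective, which is the trade-off the abstract refers to.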

4.
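Yan Wei (闫伟), Zhang Hao (张浩), Lu Jianfeng (陆剑峰). 《计算机应用》 (Journal of Computer Applications), 2005, 25(11): 2676-2678
Fuzzy clustering from data mining is used to analyze the interval values of historical data in process enterprises, and fuzzy association rules are then used to mine useful rules. The paper first describes the RFCM algorithm for fuzzy clustering and the Apriori algorithm for association rules, analyzes the workflow of the Fuzzy_ClustApriori algorithm for fuzzy association rules, and applies RFCM to real data to obtain fuzzy numbers for the different classes. The fuzzified parameter points are then processed following the steps of Fuzzy_ClustApriori, yielding valuable fuzzy rules that provide a theoretical basis for production optimization in process enterprises.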

5.
In this paper, a genetic algorithm (GA) is proposed as a search strategy for mining not only positive but also negative quantitative association rules (ARs) within databases. Unlike conventional methods, ARs are mined directly without generating frequent itemsets. The proposed GA is database-independent: it does not rely on minimum support and minimum confidence thresholds, which are hard to determine for each database. Instead of a randomly generated initial population, a uniform population is used, which keeps the initial population close to the solutions and distributes it uniformly over the feasible region. An adaptive mutation probability, a new operator called the uniform operator that ensures genetic diversity, and an efficient adjusted fitness function are used to mine all interesting ARs from the last population in a single run of the GA. The efficiency of the proposed GA is validated on synthetic and real databases.

6.
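An artificial immune algorithm based on immune principles is proposed for mining fuzzy association rules. The algorithm performs its optimization by borrowing the clonal selection principle of biological immune systems. Working directly from the given data, it uses the optimization mechanism to automatically determine the fuzzy sets for each attribute so that the number of derived fuzzy association rules satisfying the conditions is maximized. Performance comparisons with related algorithms on real data sets show the effectiveness of the proposed algorithm.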

7.
Online mining of fuzzy multidimensional weighted association rules   (Total citations: 1; self-citations: 1; citations by others: 0)
This paper addresses the integration of fuzziness with On-Line Analytical Processing (OLAP) based association rule mining. It contributes to the ongoing research on multidimensional online association rule mining by proposing a general architecture that utilizes a fuzzy data cube for knowledge discovery. A data cube is mainly constructed to give users the flexibility to view data from different perspectives, as some dimensions of the cube contain multiple levels of abstraction. The first step of the process described in this paper introduces the fuzzy data cube as a remedy to the problem of handling quantitative values of dimensional attributes in a cube. This facilitates the online mining of fuzzy association rules at different levels within the constructed fuzzy data cube. Then, we investigate combining the concepts of weight and multiple levels to mine fuzzy weighted multi-cross-level association rules from the constructed fuzzy data cube. For this purpose, three different methods are introduced for single-dimension, multidimensional and hybrid (integrating the other two methods) fuzzy weighted association rule mining. Each of the three methods utilizes a fuzzy data cube constructed to suit the particular method. To the best of our knowledge, this is the first effort in this direction. We compared the proposed approach to an existing approach that does not utilize fuzziness. Experimental results obtained for each of the three methods on a synthetic dataset and on the adult data of the United States census in year 2000 demonstrate the effectiveness and applicability of the proposed fuzzy OLAP based mining approach. OLAP is one of the most popular tools for on-line, fast and effective multidimensional data analysis. In the OLAP framework, data is mainly stored in data hypercubes (simply called cubes).

8.
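A method is proposed that uses fuzzy attribute sets and the support of association rules to achieve efficient incremental-update mining of association rules. First, the input data set is fuzzily discretized to determine the corresponding fuzzy attribute sets, the fuzzy support counts, and the original fuzzy cluster centers of each attribute. Next, each candidate is checked against the minimum support condition and, if it qualifies, added to the updated collection of fuzzy frequent attribute sets. Finally, changes in the fuzzy frequent attribute sets and the negative border are compared to obtain the final updated fuzzy frequent attribute sets and the corresponding association rules. Tests on real flight data verify that the algorithm avoids the time cost of repeated, multi-level database scans and that the fuzzy association rule mining algorithm extracts incremental association rules efficiently and accurately.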

9.
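A new document clustering method based on term clusters and association rules is proposed. The document collection is first segmented into words, and term clusters are formed according to the average mutual information between terms. The term clusters are used to represent the document vector space model, association rules are used to mine an initial clustering of the documents, and cluster analysis of this initial result yields the final document clusters. Experimental results show that, compared with traditional clustering methods, the method runs faster and markedly improves clustering effectiveness and quality.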

10.
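Cluster analysis is of great importance in pattern recognition and image processing and has broad application prospects. The fuzzy C-means algorithm (FCM) is a commonly used clustering method, but it easily becomes trapped in local optima. A method is proposed that performs fuzzy cluster analysis of images based on FCM and a genetic algorithm. Texture features are extracted from the input image, and principal component analysis reduces the dimensionality of the extracted feature vectors, which lowers the complexity of the clustering algorithm and improves the accuracy of the results. FCM and the genetic algorithm are then combined to perform fuzzy cluster analysis of the image data. Experimental results show that the method achieves good classification results.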

11.
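Han Tao (韩涛), Zhang Chunhai (张春海), Li Hua (李华). 《计算机工程与设计》 (Computer Engineering and Design), 2005, 26(7): 1842-1844, 1899
Association is an important research topic in data mining. This paper studies fuzzy association rule mining. To address the problem that ordinary association rules cannot precisely express the associations among fuzzy information in a database, a new fuzzy association rule mining algorithm, FARM_New, is proposed. The results show that the algorithm is effective and speeds up fuzzy mining.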

12.
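Research on time validity in association mining   (Total citations: 1; self-citations: 0; citations by others: 1)
Traditional association mining algorithms use support and confidence as the criteria for judging whether a rule is valuable. This scheme, however, cannot reflect the time-sensitive nature of data such as Web data and data accumulated over long periods. This paper establishes, for the first time, a new time-based model for re-estimating the value of rules, and introduces time validity as a new measure of rule value. Finally, a new parallel algorithm based on this time-based model is presented. The algorithm allows incremental mining to be used during the mining process and lets users optimize the process interactively.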

13.
In the last decade, interest in microarray technology has increased exponentially due to its ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene association networks from gene expression profiles is a relevant task, and several statistical techniques have been proposed to build them. The difficulty lies in discovering which genes are more relevant and in identifying the direct regulatory relationships among them. We developed a multi-objective evolutionary algorithm for mining quantitative association rules to deal with this problem. We applied our methodology, named GarNet, to a well-known microarray data set of the yeast cell cycle. The performance analysis of GarNet was organized in three steps, similarly to the study performed by Gallo et al. GarNet outperformed the benchmark methods in most cases in terms of quality metrics of the networks, such as accuracy and precision, which were measured using the YeastNet database as the true network. Furthermore, the results were consistent with previous biological knowledge.

14.
This paper presents an investigation into two fuzzy association rule mining models for enhancing prediction performance. The first model (the FCM–Apriori model) integrates Fuzzy C-Means (FCM) and the Apriori approach for road traffic performance prediction. FCM is used to define the membership functions of fuzzy sets, and the Apriori approach is employed to identify the Fuzzy Association Rules (FARs). The proposed model extracts knowledge from a database for a Fuzzy Inference System (FIS) that can be used to predict a future value. The knowledge extraction process and the performance of the model are demonstrated through two case studies of road traffic data sets of different sizes. The experimental results show the merits and capability of the proposed KD model in FAR-based knowledge extraction. The second model (the FCM–MSapriori model) integrates FCM and a Multiple Support Apriori (MSapriori) approach to extract the FARs. These FARs provide the knowledge base to be utilized within the FIS for prediction evaluation. Experimental results have shown that the FCM–MSapriori model predicted the future values effectively and outperformed the FCM–Apriori model and other models reported in the literature.
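How FCM can define membership functions for a quantitative attribute, as abstract 14 describes, can be sketched with the standard fuzzy C-means updates on one-dimensional data. This is a generic textbook FCM under stated assumptions and not the paper's implementation: the fuzzifier m = 2, the evenly spaced initial centers, the stopping tolerance, and the names `fcm_memberships` and `fcm_1d` are all hypothetical choices.

```python
from typing import List


def fcm_memberships(x: float, centers: List[float], m: float = 2.0) -> List[float]:
    """Membership of x in each cluster: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    dists = [abs(x - c) for c in centers]
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        # x coincides with one or more centers: share full membership among them.
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


def fcm_1d(data: List[float], k: int, m: float = 2.0,
           max_iter: int = 100, tol: float = 1e-6) -> List[float]:
    """Return k cluster centers for 1-D data (requires k >= 2)."""
    lo, hi = min(data), max(data)
    # Assumption: deterministic, evenly spaced initial centers.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(max_iter):
        u = [fcm_memberships(x, centers, m) for x in data]
        new_centers = []
        for i in range(k):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[i])  # keep a center that attracts no weight
            else:
                new_centers.append(sum(w * x for w, x in zip(weights, data)) / total)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


# Hypothetical attribute values forming two groups (e.g. low and high traffic flow).
values = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers = fcm_1d(values, k=2)
degrees = fcm_memberships(3.0, [1.0, 5.0])
```

Once the centers are fixed, `fcm_memberships` acts as the membership function that turns a crisp value into degrees for each fuzzy set, which is the input an Apriori-style miner for fuzzy association rules would consume.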

15.
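Based on an analysis of how redundancy arises in association rules, a penalty function for setting the support threshold and an improved genetic algorithm are proposed. The improved algorithm adopts novel techniques such as frequent-item distribution, prime-factor encoding, mate selection and sharing functions, so that chromosomes always mine within regions dense in frequent items, which effectively prunes the combinatorial search space. Transactions are also converted to numerical values, which effectively compresses the storage space of the transaction database and increases computation speed. Experiments indicate that the improved mining method has some advantage in the efficiency and precision of discovering valuable rules.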

16.
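Business activities and engineering practice often accumulate large-scale data sets that carry important information. Because such data sets are updated frequently and are large, incremental association rule mining based on the traditional Apriori algorithm has two drawbacks: it is hard to achieve good efficiency, and a support threshold set too low produces many redundant rules, while one set too high filters out useful rules with modest support, making the algorithm slow to sense these new rules. Therefore, drawing on the mechanisms of genetic algorithms combined with immune evolution theory from nature and related bionic mechanisms, an incremental association rule mining method called IOGA (Immune Optimization based Genetic Algorithm) is proposed. Experiments show that, when applied to incremental association rule mining on large-scale data sets, the method detects rule changes promptly and discovers useful rules, reduces the generation of redundant rules, and markedly improves mining efficiency.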

17.
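An attack recognition system based on fuzzy association rule mining   (Total citations: 1; self-citations: 0; citations by others: 1)
Reducing the false negative rate and the false positive rate in attack recognition is an urgent problem. The paper analyzes the requirements of attack recognition and the relevant concepts of fuzzy association rule mining, and builds an attack recognition system on that basis. The system not only satisfies the requirements of attack recognition well but also recognizes both anomaly attacks and misuse attacks, substantially reduces the false negative and false positive rates, and greatly strengthens the survivability of information systems.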

18.
Elicitation of classification rules by fuzzy data mining   (Total citations: 1; self-citations: 0; citations by others: 1)
Data mining techniques can be used to find potentially useful patterns in data and to ease the knowledge acquisition bottleneck in building prototype rule-based systems. Based on the partition methods presented in the simple-fuzzy-partition-based method (SFPBM) proposed by Hu et al. (Comput. Ind. Eng. 43(4) (2002) 735), this paper proposes a new two-phase fuzzy data mining technique for finding fuzzy if–then rules for classification problems: the first phase finds frequent fuzzy grids by using a pre-specified simple fuzzy partition method to divide each quantitative attribute, and the second generates fuzzy classification rules from the frequent fuzzy grids. To improve the classification performance of the proposed method, we incorporate the adaptive rules proposed by Nozaki et al. (IEEE Trans. Fuzzy Syst. 4(3) (1996) 238) to adjust the confidence of each classification rule. With respect to generalization ability, simulation results on the iris data demonstrate that the proposed method may effectively derive fuzzy classification rules from training samples.
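Dividing a quantitative attribute with a pre-specified simple fuzzy partition, as abstract 18 describes, can be illustrated as follows. The abstract does not give the membership shapes, so this sketch assumes the common choice of K symmetric triangular functions with evenly spaced peaks over the attribute range; the function name `triangular_partition` and the example range are hypothetical.

```python
from typing import List


def triangular_partition(x: float, lo: float, hi: float, k: int) -> List[float]:
    """Membership of x in each of k evenly spaced triangular fuzzy sets on [lo, hi].

    Assumption: peak i sits at lo + i * width, with width = (hi - lo) / (k - 1),
    and each triangle falls to zero at the neighbouring peaks.
    """
    if k < 2 or hi <= lo:
        raise ValueError("need k >= 2 and hi > lo")
    width = (hi - lo) / (k - 1)
    return [max(0.0, 1.0 - abs(x - (lo + i * width)) / width) for i in range(k)]


# Hypothetical attribute on [0, 10] split into three sets (e.g. small, medium, large).
at_peak = triangular_partition(5.0, 0.0, 10.0, 3)
between = triangular_partition(2.5, 0.0, 10.0, 3)
```

Applying the same partition to every attribute gives a fuzzy grid, so a candidate grid cell is a combination of one fuzzy set per attribute. With this shape, memberships of any in-range value sum to 1.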

19.
Data mining techniques that manage imprecision are very useful for obtaining meaningful and interesting information for the user. Among other techniques, fuzzy association rules have been developed as a powerful tool for dealing with imprecision in databases and for offering a good representation of the knowledge found. In this paper we introduce a formal model for managing imprecision in fuzzy transactional databases using the restriction level representation theory, a recent representation of imprecision that extends that of fuzzy sets. This theory introduces some new operators that keep the usual crisp properties even when negation is involved. The model allows us to mine fuzzy association rules in a straightforward way, extending the accuracy measures from the crisp case. In addition, we introduce several ways of representing and summarizing the obtained results in order to offer new and very interesting semantics. As an application, we show how to extract fuzzy association rules involving both the presence and the absence of items using the proposed model, and we also perform some experiments with real fuzzy transactional datasets.

20.
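Li Bin (李斌), Ma Ge (马戈), Sun Zhihui (孙志挥). 《计算机应用》 (Journal of Computer Applications), 2004, 24(12): 105-107
The incremental update problem for databases is considered from the perspective of changes to the item set. The well-known FUP algorithm is improved, and a fast incremental update algorithm for changed item sets, FUPIC (Fast UPdate algorithm for Itemsets Changed), is proposed.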
