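Mining Association Rules from Quantitative-Attribute Data Using Fuzzy Sets   Cited by: 3 (self-citations: 0, by others: 3)
Wang Yong (王咏), Shen Ruimin (申瑞民). Computer Simulation (《计算机仿真》), 2004, 21(8): 129-131
Most association rule mining methods operate on Boolean-attribute data, yet real applications often need to mine associations from quantitative attributes. This paper proposes an algorithm that introduces fuzzy sets into the classic Apriori candidate-set algorithm: quantitative attributes in the dataset are partitioned according to fuzzy set definitions, converting the original transaction data into fuzzy-set-based data, after which the Apriori algorithm is applied to discover potential association rules.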
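The fuzzification step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the trapezoidal membership functions, the three-way partition in `FUZZY_SETS`, the attribute names `x` and `y`, and the use of `min` to combine membership degrees are all assumptions made for the example.

```python
INF = float("inf")

def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set (a <= b <= c <= d)."""
    if b <= x <= c:
        return 1.0                      # plateau
    if a < x < b:
        return (x - a) / (b - a)        # rising edge
    if c < x < d:
        return (d - x) / (d - c)        # falling edge
    return 0.0

# Hypothetical partition of a quantitative attribute into three fuzzy sets.
FUZZY_SETS = {
    "low":    (-INF, -INF, 10, 20),
    "medium": (10, 20, 30, 40),
    "high":   (30, 40, INF, INF),
}

def fuzzify(record):
    """Map {attribute: number} to {(attribute, label): membership degree}."""
    return {
        (attr, label): trapezoid(value, *params)
        for attr, value in record.items()
        for label, params in FUZZY_SETS.items()
    }

def fuzzy_support(fuzzy_records, itemset):
    """Average over records of the minimum membership across the itemset."""
    total = sum(min(rec[item] for item in itemset) for rec in fuzzy_records)
    return total / len(fuzzy_records)

# Made-up quantitative transactions, for illustration only.
records = [{"x": 5, "y": 35}, {"x": 15, "y": 45}, {"x": 25, "y": 25}, {"x": 8, "y": 50}]
fuzzy_records = [fuzzify(r) for r in records]
support = fuzzy_support(fuzzy_records, [("x", "low"), ("y", "high")])
```

Once every record is expressed as fuzzy items with degrees, an Apriori-style level-wise search can use `fuzzy_support` in place of the usual transaction count.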

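A Data-Mining-Based Preprocessing Method for Traditional Chinese Medicine Data   Cited by: 7 (self-citations: 0, by others: 7)
Zhu Jinwei (朱金伟), Ju Shiguang (鞠时光), Xin Yan (辛燕). Computer Engineering (《计算机工程》), 2006, 32(15): 280-282, F0003
Regional differences in Chinese herbal medicine culture introduce considerable uncertainty into traditional Chinese medicine (TCM) data. To address the data problems of a data-mining-based decision support system for new drug development, a set of processing methods for standardizing raw TCM data is proposed. Data reduction, clustering, and fuzzy set theory are applied to improve the quality of the TCM data, so that important rules were successfully mined from the preprocessed database of herbal prescriptions, providing strong decision support for developing new Chinese medicines.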

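A GA Optimization Method for Membership Functions Based on Completeness and Semantics   Cited by: 2 (self-citations: 0, by others: 2)
A GA encoding method for optimizing membership functions is proposed that guarantees their completeness and semantic interpretability, and simulation results are given.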

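Vector-Based Mining of Association Rules in Large Databases   Cited by: 6 (self-citations: 1, by others: 6)
A new mining algorithm based on vector operations is proposed. It is particularly well suited to parallel computation, and it scans the database only once during the entire mining process, whereas the traditional Apriori algorithm requires multiple scans. Mining efficiency is therefore greatly improved.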
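One common way to get single-scan support counting from vector operations is sketched below. The abstract does not specify the paper's vector representation, so this sketch assumes one bit vector per item (stored as a Python integer, bit `tid` set when transaction `tid` contains the item); the function names and the sample transactions are hypothetical.

```python
from functools import reduce

def build_item_vectors(transactions):
    """Scan the transactions once and build one bit vector per item."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors

def support_count(vectors, itemset):
    """Count transactions containing every item of a non-empty itemset."""
    combined = reduce(lambda a, b: a & b, (vectors.get(i, 0) for i in itemset))
    return bin(combined).count("1")     # number of set bits

# Made-up transactions, for illustration only.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
vectors = build_item_vectors(transactions)
count_ac = support_count(vectors, ["a", "c"])
```

After the single scan, every candidate itemset is counted with bitwise ANDs on the vectors alone, and the per-item vectors can be split across workers because the AND is independent per bit position.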

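Ma Hui (马慧), Tang Yong (汤庸), Pan Yan (潘炎). Computer Engineering (《计算机工程》), 2006, 32(17): 132-134
With the rapid growth of data in all forms, mining the temporal information in business data has drawn wide attention. This paper proposes a temporal association rule with valid time intervals and gives an FP-tree-based mining method. Using the idea of partitioned mining, the method expresses an itemset's valid time interval in units of partitions and builds an FP-tree for each partition, which greatly simplifies counting an itemset's occurrences within its valid time interval and thus allows temporal confidence to be computed more efficiently. Finally, an example illustrates how the method executes.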

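Because of its practicality, temporal association rule mining under calendar constraints has attracted growing attention from researchers. Since users in practice find it hard to describe time patterns precisely, mining temporal association rules based on fuzzy calendars is more realistic. With fuzzy concepts and fuzzy operations, time intervals are easy to describe. For a user-specified calendar pattern, different time intervals can carry different weights according to their membership degrees. Building on fuzzy calendar algebra and combining incremental mining with progressive counting, this paper proposes an association rule mining method under fuzzy calendar constraints. Both theoretical analysis and experimental results show that the algorithm is efficient and feasible.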

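To mine fuzzy association rules from set-valued relational databases, the competitive agglomeration algorithm is used to partition the values that records take on quantitative attributes into several fuzzy sets. An algorithm is then given for mining fuzzy association rules over quantitative attributes in set-valued relational databases; it converts the problem of mining fuzzy association rules on quantitative attributes into the problem of mining association rules on Boolean attributes. Finally, an example demonstrates that the mining algorithm is reasonable.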

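A New Approach to Association Rule Mining   Cited by: 3 (self-citations: 0, by others: 3)
The proposed approach (hereafter "record-weighted association rule mining") assigns a weight to each historical record to reflect a practical requirement of data mining: different records contribute differently to the mining result. On this basis, support, confidence, and the mining algorithm are revised accordingly, and the RWApriori-Tid algorithm is proposed.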
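A plausible form of the revised measures is sketched below. The abstract does not give the paper's exact definitions, so this assumes that weighted support is the weight of the records containing the itemset divided by the total weight, and that confidence is the ratio of two such supports. The function names, records, and weights are made up for illustration, and this is not the RWApriori-Tid algorithm itself.

```python
def weighted_support(weighted_records, itemset):
    """Share of the total record weight carried by records containing itemset."""
    itemset = set(itemset)
    total = sum(weight for _, weight in weighted_records)
    covered = sum(weight for items, weight in weighted_records if itemset <= items)
    return covered / total

def weighted_confidence(weighted_records, antecedent, consequent):
    """Weighted support of the whole rule divided by that of its antecedent."""
    both = weighted_support(weighted_records, set(antecedent) | set(consequent))
    return both / weighted_support(weighted_records, antecedent)

# (items, weight) pairs; the weights are hypothetical.
weighted_records = [
    ({"a", "b"}, 1.0),
    ({"a"}, 0.5),
    ({"a", "b", "c"}, 2.0),
    ({"b"}, 0.5),
]
supp_ab = weighted_support(weighted_records, {"a", "b"})
conf_a_to_b = weighted_confidence(weighted_records, {"a"}, {"b"})
```

With all weights equal, both measures reduce to ordinary support and confidence. The sketch assumes the antecedent occurs in at least one record; otherwise the confidence division fails.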

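Multidimensional Association Rule Mining Based on an Immune Genetic Algorithm   Cited by: 7 (self-citations: 1, by others: 7)
Gao Jian (高坚). Computer Engineering and Applications (《计算机工程与应用》), 2003, 39(32): 185-186, 225
Association rule mining is an important research topic in data mining. This paper presents an association rule mining algorithm based on an immune genetic algorithm. The algorithm has good robustness and implicit parallelism, performs global optimization search quickly and effectively, and is particularly suited to mining large-scale, massive databases.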

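Li Naiqian (李乃乾), Shen Junyi (沈钧毅). Computer Engineering (《计算机工程》), 2002, 28(11): 13-14, 22
A new method for mining quantitative association rules based on fuzzy concepts is proposed. It uses a set of fuzzy concepts defined over the domains of quantitative attributes to express associations between attributes, overcoming the shortcomings of traditional discrete partitioning and making the rules natural, concise, and easier for experts to understand. A mining algorithm is also given.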

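Application of Genetic Algorithms to Association Rule Mining   Cited by: 14 (self-citations: 0, by others: 14)
This paper attempts to mine association rules with a genetic algorithm and, in the context of an intelligent reader evaluation system for libraries, gives an example of association rule mining based on a genetic algorithm.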

An ACS-based framework for fuzzy data mining   Cited by: 1 (self-citations: 0, by others: 1)
Data mining is often used to find interesting and meaningful patterns in huge databases. It may generate different kinds of knowledge, such as classification rules, clusters, and association rules, among others. Much research has been done on data mining, most of it focused on mining from binary-valued data. Fuzzy data mining was thus proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems. However, little work has been done on applying ACS to fuzzy data mining. This thesis thus attempts to propose an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded into binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is then transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (the one with the highest fitness value) can be used to mine fuzzy association rules from a database. Finally, experiments are conducted to compare the proposed framework with other approaches and to show its performance.

Data mining is most commonly used in attempts to induce association rules from transaction data. In earlier work, we used fuzzy and GA concepts to discover both useful fuzzy association rules and suitable membership functions from quantitative values. The evaluation of fitness values was, however, quite time-consuming. Due to dramatic increases in available computing power and concomitant decreases in computing costs over the last decade, learning or mining by applying parallel processing techniques has become a feasible way to overcome the slow-learning problem. In this paper, we thus propose a parallel genetic-fuzzy mining algorithm based on the master-slave architecture to extract both association rules and membership functions from quantitative transactions. The master processor uses a single population, as a simple genetic algorithm does, and distributes the tasks of fitness evaluation to slave processors. The evolutionary processes, such as crossover, mutation and production, are performed by the master processor. It is very natural and efficient to run the proposed algorithm on the master-slave architecture. The time complexities of both the sequential and the parallel genetic-fuzzy mining algorithms have also been analyzed, with results showing the benefit of the parallel one. When the number of generations is large, the speed-up can be nearly linear. The experimental results also show this point. Applying the master-slave parallel architecture to speed up the genetic-fuzzy data mining algorithm is thus a feasible way to overcome the low-speed fitness evaluation problem of the original algorithm.
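The master-slave division of labor described in this abstract can be sketched roughly as follows: the master holds the population and hands only the fitness evaluations to workers. The `fitness` function here is a hypothetical stand-in (the paper's fitness involves mining with the encoded membership functions, which is not reproduced), and a thread pool is used only to keep the sketch self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

def fitness(chromosome):
    """Hypothetical stand-in for the expensive fitness evaluation."""
    return sum(chromosome)

def evaluate_population(population, workers=4):
    """Master step: distribute fitness evaluation, collect results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))

# Made-up chromosomes, for illustration only.
population = [[1, 2], [3, 4], [0, 0]]
scores = evaluate_population(population)
```

Crossover, mutation, and selection would then run on the master using `scores`. For CPU-bound fitness functions in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) is the more likely choice, since threads share the interpreter lock.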

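A Data Organization Method Based on Association Rule Mining   Cited by: 3 (self-citations: 0, by others: 3)
Kong Lingfu (孔令富), Wang Han (王晗), Lian Qiusheng (练秋生). Computer Engineering (《计算机工程》), 2006, 32(21): 12-14, 5
For the binary-conversion approach used in data mining, the concepts related to binary sequence sets are defined to provide a basis for it. The representation of transactions and association rules in binary sequence sets is analyzed, along with its space and time complexity. Experiments verify that organizing data as binary sequence sets is effective and feasible for association rule mining.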

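Tong Qiang (佟强), Zhou Yuanchun (周园春), Wu Kaichao (吴开超), Yan Baoping (阎保平). Computer Engineering (《计算机工程》), 2007, 33(10): 34-35, 69
A new method for mining quantitative association rules is proposed. It uses a clustering algorithm to group the transaction records in the database into clusters and projects the clusters onto the domains of the numeric attributes, forming overlapping, meaningful intervals. Experimental results show that the method mines quantitative association rules effectively and can discover important rules that earlier algorithms may miss.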

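A clustering technique based on fuzzy set theory is proposed. The method and process of implementing clustering with this technique in a relational database are described in detail, and the program flow and implementation are given. The clustered data objects can be used to obtain classification knowledge and information, and can also provide a low-noise data source for subsequent association rule mining.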

Time series analysis has always been an important and interesting research field because time series appear frequently in different applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data. Time series data, however, are usually quantitative values. We thus extend our previous fuzzy mining approach to handle time-series data and find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also conducted to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they will be friendlier to humans than quantitative representations.
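The sliding-window step mentioned in this abstract can be sketched in a few lines. The function name, window width, and sample series are hypothetical; the fuzzification and itemset analysis that follow in the paper are not reproduced here.

```python
def sliding_windows(series, width):
    """Return every run of `width` consecutive points, in order of start index."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]

# Made-up series, for illustration only.
series = [1, 2, 3, 4, 5]
windows = sliding_windows(series, 3)
```

Each window then plays the role of one transaction: its points are mapped to linguistic terms through membership functions before fuzzy itemsets are counted.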

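Research on Optimizing the AprioriTid Algorithm for Association Rule Mining   Cited by: 19 (self-citations: 0, by others: 19)
An optimized AprioriTid algorithm based on transaction compression and item compression is proposed. Its distinguishing features are that itemsets are identified by keys and that the transaction data undergo both transaction and item compression. This eliminates the pruning and pattern-matching steps of the Apriori and AprioriTid algorithms, reduces the size of the transaction database to be scanned, and improves the efficiency of rule discovery. Experiments show that the optimized algorithm is markedly more efficient than AprioriTid.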

This paper presents a novel method based on the harmonic mean, geometric mean, arithmetic mean and root mean square to help reduce fuzzy rules. The objective of the proposed method is to produce fuzzy models with both a small number of interpretable rules and sufficiently high precision. Systems using the reduced rules are compared with systems using the original rules to verify the efficacy of the new method in terms of the defuzzified outputs. As a practical example of a nonlinear system, an inverted pendulum is controlled by a minimal set of rules to illustrate the performance and applicability of the proposed method.
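For reference, the four averages named in this abstract can be computed as below. How the paper applies them to merge or reduce fuzzy rules is not stated in the abstract, so this sketch only evaluates the means on made-up positive values; it assumes Python 3.8 or later for `statistics.fmean` and `statistics.geometric_mean`.

```python
import math
from statistics import fmean, geometric_mean, harmonic_mean

def four_means(values):
    """Harmonic, geometric, arithmetic, and root-mean-square of positive values."""
    return {
        "harmonic": harmonic_mean(values),
        "geometric": geometric_mean(values),
        "arithmetic": fmean(values),
        "rms": math.sqrt(fmean([v * v for v in values])),
    }

# Made-up values, for illustration only.
means = four_means([1.0, 2.0, 4.0])
```

For positive inputs the four values are always ordered harmonic ≤ geometric ≤ arithmetic ≤ root mean square, so they give a graded family of ways to combine several numbers into one representative value.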

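Using fuzzy set theory, attribute-based fuzzy clustering is discussed in depth in terms of its concepts, principles, and algorithms. An application example is given at the end, and practice shows that the algorithm is effective.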

