Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Genetic-Fuzzy Data Mining With Divide-and-Conquer Strategy   Total citations: 1, self-citations: 0, citations by others: 1
Data mining is most commonly used in attempts to induce association rules from transaction data. Most previous studies focused on binary-valued transaction data. Transaction data in real-world applications, however, usually consist of quantitative values. This paper, thus, proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. A genetic algorithm (GA)-based framework for finding membership functions suitable for mining problems is proposed. The fitness of each set of membership functions is evaluated by the fuzzy supports of the linguistic terms in the large 1-itemsets and by the suitability of the derived membership functions. Evaluation by the fuzzy supports of large 1-itemsets is much faster than evaluation that considers all itemsets or interesting association rules. It also allows the derivation of the membership functions for different items to be handled by divide-and-conquer. The proposed GA framework, thus, maintains multiple populations, each for one item's membership functions. The final best sets of membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. Experiments are conducted to analyze different fitness functions and different settings of supports and confidences. Experiments are also conducted to compare the proposed algorithm, the one with uniform fuzzy partition, and the existing one without divide-and-conquer, with results validating the performance of the proposed algorithm.
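As a rough illustration of the fitness ingredient described in this abstract, the sketch below computes the fuzzy support of each linguistic term of one item and keeps the terms that reach a minimum support (the large 1-itemsets). The triangular shape, the term names, the sample quantities, and the helper names `triangular`, `fuzzy_supports`, and `large_terms` are all assumptions made for illustration; the paper's actual fitness function also includes a suitability factor that is not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a triangular membership function (left, peak, right).
Triangle = Tuple[float, float, float]


def triangular(x: float, tri: Triangle) -> float:
    """Degree to which quantity x belongs to the triangular fuzzy set."""
    left, peak, right = tri
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_supports(quantities: List[float], terms: Dict[str, Triangle]) -> Dict[str, float]:
    """Fuzzy support of each linguistic term: mean membership over all transactions."""
    n = len(quantities)
    return {
        name: sum(triangular(q, tri) for q in quantities) / n
        for name, tri in terms.items()
    }


def large_terms(supports: Dict[str, float], min_support: float) -> List[str]:
    """Linguistic terms whose fuzzy support reaches the minimum support."""
    return sorted(name for name, s in supports.items() if s >= min_support)


# Illustrative data only: quantities of one item in six transactions.
milk = [1, 2, 2, 5, 8, 9]
terms = {"low": (0, 1, 5), "middle": (1, 5, 9), "high": (5, 9, 13)}
supports = fuzzy_supports(milk, terms)
frequent = large_terms(supports, min_support=0.3)
```

Because each item's supports depend only on that item's own membership functions, a computation like this can be run per item, which is what makes one population per item plausible.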

2.
Fuzzy mining approaches have recently been discussed for deriving fuzzy knowledge. Since items may have their own characteristics, different minimum supports and membership functions may be specified for different items. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting minimum supports and membership functions for items from quantitative transactions. In that paper, the minimum supports and membership functions of all items were encoded in one chromosome, which could make convergence difficult. In this paper, an enhanced approach is proposed, which processes the items with a divide-and-conquer strategy. The approach is called the divide-and-conquer genetic-fuzzy mining algorithm for items with Multiple Minimum Supports (DGFMMS), and is designed for finding minimum supports, membership functions, and fuzzy association rules. Possible solutions are evaluated by their requirement satisfaction divided by their suitability of derived membership functions. The proposed GA framework maintains multiple populations, each for one item’s minimum support and membership functions. The final best minimum supports and membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. Experimental results also show the effectiveness of the proposed approach.

3.
Fuzzy data mining is used to extract fuzzy knowledge from linguistic or quantitative data. It is an extension of traditional data mining and the derived knowledge is relatively meaningful to human beings. In the past, we proposed a mining algorithm to find suitable membership functions for fuzzy association rules based on ant colony systems. In that approach, precision was limited by the use of binary bits to encode the membership functions. This paper elaborates on the original approach to increase the accuracy of results by adding multi-level processing. A multi-level ant colony framework is thus designed and an algorithm based on the structure is proposed to achieve the purpose. The proposed approach first transforms the fuzzy mining problem into a multi-stage graph, with each route representing a possible set of membership functions. The new approach then extends the previous one, using multi-level processing to solve the problem in which the maximum quantities of item values in the transactions may be large. The membership functions derived in a given level are refined in the subsequent level. The final membership functions in the last level are then output to the rule-mining phase to find fuzzy association rules. Experiments are also performed to show the performance of the proposed approach. The experimental results show that the proposed multi-level ant colony systems mining approach can obtain improved results.

4.
An ACS-based framework for fuzzy data mining   Total citations: 1, self-citations: 0, citations by others: 1
Data mining is often used to find interesting and meaningful patterns in huge databases. It may generate different kinds of knowledge, such as classification rules, clusters, and association rules, among others. Much research has been done on data mining, most of it focused on mining from binary-valued data. Fuzzy data mining was thus proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems. However, little work has been done on applying ACS to fuzzy data mining. This thesis thus attempts to propose an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded into binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is then transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (with the highest fitness value) can then be used to mine fuzzy association rules from a database. Finally, experiments are carried out to compare the proposed framework with other approaches and to show its performance.
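One way to picture "each route representing a possible set of membership functions" is to treat every stage of the graph as one bit, so that a route is a bit string that decodes into membership-function parameters. The sketch below assumes this reading; the 4-bit precision, the (center, span) parameterization, and the names `bits_to_int` and `decode_route` are hypothetical and are not taken from the thesis. The ant colony search itself is not shown.

```python
from typing import List, Tuple

BITS_PER_PARAM = 4  # assumed precision per parameter, not from the source


def bits_to_int(bits: List[int]) -> int:
    """Interpret a list of 0/1 values as an unsigned binary number."""
    value = 0
    for b in bits:
        value = value * 2 + b
    return value


def decode_route(route: List[int]) -> List[Tuple[int, int]]:
    """Split a route (one 0/1 choice per stage) into (center, span) pairs.

    Each pair stands for one hypothetical isosceles-triangle membership function.
    """
    step = 2 * BITS_PER_PARAM
    if len(route) % step != 0:
        raise ValueError("route length must be a multiple of %d" % step)
    functions = []
    for start in range(0, len(route), step):
        center = bits_to_int(route[start:start + BITS_PER_PARAM])
        span = bits_to_int(route[start + BITS_PER_PARAM:start + step])
        functions.append((center, span))
    return functions


# A 16-stage route chosen by hand; an ant would build it one stage at a time.
route = [0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1]
decoded = decode_route(route)
```

With a fixed number of bits per parameter, the end points can only take values on a grid, which is one way to see why precision is tied to the encoding length.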

5.
Today, the development of e-commerce has provided many transaction databases with useful information for investigators exploring dependencies among the items. In data mining, the dependencies among different items can be shown using an association rule. The new fuzzy-genetic (FG) approach is designed to mine fuzzy association rules from a quantitative transaction database. Three important advantages are associated with using the FG approach: (1) association rules can be extracted from transaction databases with quantitative values; (2) extracting proper membership functions and support threshold values with the genetic algorithm has a positive effect on the results of the mining process; (3) association rules expressed in a fuzzy representation are easier for humans to understand. In this paper, we design a comprehensive and fast algorithm that mines level-crossing fuzzy association rules on multiple concept levels, learning support threshold values and membership functions with the cluster-based master–slave integrated FG approach. Mining the fuzzy association rules on multiple concept levels helps find more important, useful, accurate, and practical information.

6.
A genetic-fuzzy mining approach for items with multiple minimum supports   Total citations: 2, self-citations: 2, citations by others: 0
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Mining association rules from transaction data is most commonly seen among the mining techniques. Most of the previous mining approaches set a single minimum support threshold for all the items and identify the relationships among transactions using binary values. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions under a single minimum support. In real applications, different items may have different criteria to judge their importance. In this paper, we thus propose an algorithm which combines clustering, fuzzy and genetic concepts for extracting reasonable multiple minimum support values, membership functions and fuzzy association rules from quantitative transactions. It first uses the k-means clustering approach to gather similar items into groups. All items in the same cluster are considered to have similar characteristics and are assigned similar values for initializing a better population. Each chromosome is then evaluated by the criteria of requirement satisfaction and suitability of membership functions to estimate its fitness value. Experimental results also show the effectiveness and the efficiency of the proposed approach.

7.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. In real-world applications, transactions may contain quantitative values and each item may have a lifespan from a temporal database. In this paper, we thus propose a data mining algorithm for deriving fuzzy temporal association rules. It first transforms each quantitative value into a fuzzy set using the given membership functions. Meanwhile, item lifespans are collected and recorded in a temporal information table through a transformation process. The algorithm then calculates the scalar cardinality of each linguistic term of each item. A mining process based on fuzzy counts and item lifespans is then performed to find fuzzy temporal association rules. Experiments are finally performed on two simulation datasets and the foodmart dataset to show the effectiveness and the efficiency of the proposed approach.
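The idea of counting an item only within its lifespan can be sketched as follows. This is a minimal illustration under stated assumptions: the lifespan is taken to run from the item's first appearance to the last transaction, the support is the fuzzy count (scalar cardinality) divided by the number of transactions in that lifespan, and the ramp-shaped membership function `high` and the function name `temporal_fuzzy_support` are hypothetical. The paper's own definitions may differ.

```python
from typing import Callable, Dict, List

Transaction = Dict[str, float]  # item name -> purchased quantity


def high(quantity: float) -> float:
    """Hypothetical membership function for the term 'high': a ramp from 2 to 6."""
    return max(0.0, min(1.0, (quantity - 2) / 4))


def temporal_fuzzy_support(
    transactions: List[Transaction],
    item: str,
    membership: Callable[[float], float],
) -> float:
    """Fuzzy count of the item divided by the length of its lifespan.

    Assumption: transactions are time-ordered and the lifespan starts at the
    first transaction containing the item.
    """
    start = next((i for i, t in enumerate(transactions) if item in t), None)
    if start is None:
        return 0.0
    lifespan = transactions[start:]
    fuzzy_count = sum(membership(t[item]) for t in lifespan if item in t)
    return fuzzy_count / len(lifespan)


# Illustrative data only: item "b" enters the database at the second transaction.
db = [{"a": 6}, {"a": 4, "b": 6}, {}, {"a": 2, "b": 4}]
support_a = temporal_fuzzy_support(db, "a", high)
support_b = temporal_fuzzy_support(db, "b", high)
```

In this toy data, item "b" has the same fuzzy count as item "a" but a higher support, because its count is divided by three transactions rather than four.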

8.
Data mining is most commonly used in attempts to induce association rules from transaction data. In the past, we used the fuzzy and GA concepts to discover both useful fuzzy association rules and suitable membership functions from quantitative values. The evaluation for fitness values was, however, quite time-consuming. Due to dramatic increases in available computing power and concomitant decreases in computing costs over the last decade, learning or mining by applying parallel processing techniques has become a feasible way to overcome the slow-learning problem. In this paper, we thus propose a parallel genetic-fuzzy mining algorithm based on the master–slave architecture to extract both association rules and membership functions from quantitative transactions. The master processor uses a single population as a simple genetic algorithm does, and distributes the tasks of fitness evaluation to slave processors. The evolutionary processes, such as crossover, mutation and production are performed by the master processor. It is very natural and efficient to run the proposed algorithm on the master–slave architecture. The time complexities for both sequential and parallel genetic-fuzzy mining algorithms have also been analyzed, with results showing the good effect of the proposed one. When the number of generations is large, the speed-up can be nearly linear. The experimental results also show this point. Applying the master–slave parallel architecture to speed up the genetic-fuzzy data mining algorithm is thus a feasible way to overcome the low-speed fitness evaluation problem of the original algorithm.
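The division of labor in this abstract (a master that keeps the single population and farms out fitness evaluation) can be sketched with Python's standard `concurrent.futures` module. The fitness function `toy_fitness` is a placeholder, not the fuzzy-support-based fitness of the paper, and `evaluate_population` is a hypothetical name. A thread pool is used only to keep the sketch self-contained; for CPU-bound fitness functions in CPython, a process pool or separate machines would be needed to see a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Chromosome = Sequence[float]


def toy_fitness(chromosome: Chromosome) -> float:
    """Placeholder fitness: higher when every gene is close to 0.5."""
    return -sum((gene - 0.5) ** 2 for gene in chromosome)


def evaluate_population(
    population: List[Chromosome],
    fitness: Callable[[Chromosome], float],
    workers: int = 4,
) -> List[float]:
    """Master side: hand each chromosome to a worker, collect results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


# The master keeps the population; selection, crossover, and mutation would
# also run here, using the scores returned by the workers.
population = [[0.5, 0.5], [0.0, 1.0], [0.4, 0.6]]
scores = evaluate_population(population, toy_fitness)
best = population[scores.index(max(scores))]
```

Only the evaluation step is distributed here, which matches the abstract's statement that the evolutionary operators stay on the master processor.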

9.
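An improved fuzzy association rule algorithm that combines Apriori and Kuok's algorithm is proposed. Based on definitions of membership functions, a decision tree structure, and rule-set similarity, the improved mining algorithm is used to mine association rules over numeric attributes. Experimental results show that the algorithm performs well in both rule generation and time efficiency.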

10.
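Application of fuzzy data mining and genetic algorithms in intrusion detection   Total citations: 2, self-citations: 0, citations by others: 2
The application of data mining and genetic algorithms in intrusion detection is discussed. Fuzzy association rules and fuzzy frequent sequence mining are described in detail, followed by an account of how a genetic algorithm can optimize the membership functions of the fuzzy sets and thereby improve the performance of an intrusion detection system.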

11.
《Knowledge》2006,19(1):57-66
This paper proposes a new method that employs a genetic algorithm to find fuzzy association rules for classification problems. It is based on an effective method for discovering fuzzy association rules, namely the fuzzy grids based rules mining algorithm (FGBRMA). Some important parameters, including the number and shapes of membership functions in each quantitative attribute and the minimum fuzzy support, are not easily specified by the user. These parameters are therefore determined automatically. A binary string, or chromosome, is composed of two substrings: one for each quantitative attribute, encoded by the method proposed by Ishibuchi and Murata, and the other for the minimum fuzzy support. In each generation, the fitness value of each chromosome, which maximizes the classification accuracy rate and minimizes the number of fuzzy rules, is obtained. When the termination condition is reached, the chromosome with the maximum fitness value is used to test performance. In terms of classification generalization ability, the simulation results on the iris data and the appendicitis data demonstrate that the proposed method performs well in comparison with other classification methods.

12.
Wang Ling, Gui Lingpeng, Zhu Hui 《Applied Intelligence》2022,52(2):1389-1405

Traditional temporal association rules mining algorithms cannot dynamically update the temporal association rules within the valid time interval with increasing data. In this paper, a new algorithm called incremental fuzzy temporal association rule mining using fuzzy grid table (IFTARMFGT) is proposed by combining the advantages of boolean matrix with incremental mining. First, multivariate time series data are transformed into discrete fuzzy values that contain the time intervals and fuzzy membership. Second, in order to improve the mining efficiency, the concept of boolean matrices was introduced into the fuzzy membership to generate a fuzzy grid table to mine the frequent itemsets. Finally, in view of the Fast UPdate (FUP) algorithm, fuzzy temporal association rules are incrementally mined and updated without repeatedly scanning the original database by considering the lifespan of each item and inheriting the information from previous mining results. The experiments show that our algorithm provides better efficiency and interpretability in mining temporal association rules than other algorithms.


13.
It is not an easy task to know a priori the most appropriate fuzzy sets that cover the domains of quantitative attributes for fuzzy association rules mining. In general, it is unrealistic to expect that experts can always provide such sets. Finding the most appropriate fuzzy sets becomes a more complex problem when items are not considered to have equal importance and the support and confidence parameters required for the association rules mining process are specified as linguistic terms. Existing clustering-based automated methods are not satisfactory because they do not consider the optimization of the discovered membership functions. In order to tackle this problem, we propose a genetic algorithm (GA)-based clustering method, which dynamically adjusts the fuzzy sets to provide maximum profit based on user-specified linguistic minimum support and confidence terms. This is achieved by tuning the base values of the membership functions for each quantitative attribute with respect to two different evaluation functions: maximizing the number of large itemsets and maximizing the average of the confidence intervals of the generated rules. To the best of our knowledge, this is the first effort in this direction. Experiments conducted on 100K transactions from the adult database of the United States census in year 2000 demonstrate that the proposed clustering method exhibits good performance in terms of the number of produced large itemsets and interesting association rules.

14.
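This paper proposes a genetic-algorithm-based method for generating membership functions. The method uses a genetic algorithm to optimize an initial population and obtain a membership-function encoding with relatively high fitness. The optimized membership functions are then corrected according to the actual standards for airport noise data, yielding an encoding of trapezoidal membership functions. Finally, the data are fuzzified with the resulting membership functions, and the FP-trees algorithm is used to generate fuzzy association rules. The method is proposed for quantitative attributes; its advantage is that it makes the good membership functions found by the genetic algorithm better suited to the actual dataset.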

15.
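To mine fuzzy association rules from set-valued relational databases, the competitive agglomeration algorithm is applied to partition the values that records take on quantitative attributes into several fuzzy sets. An algorithm for mining fuzzy association rules over quantitative attributes in set-valued relational databases is then given; it converts the problem of mining fuzzy association rules over quantitative attributes into the problem of mining association rules over Boolean attributes. Finally, an example illustrates the soundness of the mining algorithm.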

16.
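The application of a fuzzy association rule mining model to a calciner is studied, and the fuzzy clustering algorithm and association rules are described. The fuzzy clustering algorithm KFCM is applied to real data to obtain the membership degrees of the data in different classes. The rule-mining algorithm MFAR then processes the fuzzified parameters to obtain valuable fuzzy rules, which addresses the practical bottleneck of acquiring expert experience. Experiments show that the method provides a theoretical basis and production-optimization guidance for temperature control of the calciner in cement production.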

17.
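Mining temporal association rules under calendar constraints is attracting growing attention from researchers because of its practicality. Since users in practice can rarely describe time patterns precisely, mining temporal association rules based on fuzzy calendars is of greater practical significance. With fuzzy concepts and fuzzy operations, time intervals are easy to describe. For a user-specified calendar pattern, different time intervals can carry different weights according to their membership degrees. Building on fuzzy calendar algebra and combining the ideas of incremental mining and progressive counting, this paper proposes an association rule mining method based on fuzzy calendar constraints. Both theoretical analysis and experimental results show that the algorithm is efficient and feasible.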

18.
The goal of data mining is to find interesting and meaningful patterns in large databases. In some real applications, many data are quantitative and linguistic. Fuzzy data mining was thus proposed to discover fuzzy knowledge from this kind of data. In the past, two mining algorithms based on ant colony systems were proposed to find suitable membership functions for fuzzy association rules. They transformed the problem into a multi-stage graph, with each route representing a possible set of membership functions, and then used the ant colony system to solve it. They, however, searched for solutions in a discrete solution space in which the end points of membership functions could be adjusted only in a discrete way. The paper, thus, extends the original approaches to a continuous search space, and a fuzzy mining algorithm based on the continuous ant approach is proposed. The end points of the membership functions may be moved in the continuous real-number space. The encoding representation and the operators are also designed to suit the continuous space, such that the actual global optimal solution is contained in the search space. In addition, the proposed approach does not have fixed edges and nodes in the search process. It can dynamically produce search edges according to the distribution functions of pheromones in the solution space. Thus, it can reach a better near-global optimal solution than the previous two ant-based fuzzy mining approaches. The experimental results show the good performance of the proposed approach as well.

19.
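Association rules are one of the important research topics in data mining. Traditional association rule mining algorithms are suited only to binary and categorical attributes. To handle quantitative attributes better, an adaptive quantitative association rule mining algorithm based on fuzzy concepts is proposed. The algorithm overcomes the shortcomings of traditional discrete partitioning and improves the existing method of computing the support of fuzzy association rules. A clustering-based method for automatically generating membership functions is introduced, so that the discovery of fuzzy association rules does not depend on membership functions supplied by human experts and the rules are expressed naturally and concisely, which helps experts understand them. Experiments show that the algorithm is effective.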

20.
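To address the mining of fuzzy association rules from uncertain data, a fuzzy association rule mining method is proposed in which the membership functions (MFs) are optimized by the group search optimizer (GSO) algorithm. First, the uncertain data are represented with a 3-tuple linguistic representation model. Then, given an initial MF, and with maximization of fuzzy itemset support and semantic interpretability as the fitness function, the best MF is obtained through GSO optimization and learning. Finally, using the best MF obtained, an improved FFP-growth algorithm mines fuzzy association rules from the uncertain data. Experimental results show that the method can adaptively optimize the MF for the dataset and thereby mine association rules from uncertain data effectively.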
