Similar literature (20 results)
1.
Data mining is most commonly used in attempts to induce association rules from transaction data. Transactions in real-world applications, however, usually consist of quantitative values. This paper thus proposes a fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. We present a GA-based framework for finding membership functions suitable for mining problems and then use the final best set of membership functions to mine fuzzy association rules. The fitness of each chromosome is evaluated by the number of large 1-itemsets generated from part of the previously proposed fuzzy mining algorithm and by the suitability of the membership functions. Experimental results also show the effectiveness of the framework.
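The "number of large 1-itemsets" term in the fitness above can be illustrated with a minimal sketch. It assumes triangular membership functions and a fuzzy support equal to the summed membership degree divided by the number of transactions; the names `triangular` and `large_1_itemsets`, the sample data, and the threshold are all hypothetical, and the suitability term is not modeled.

```python
from typing import Dict, List, Tuple

Triangle = Tuple[float, float, float]  # (left, peak, right); an assumed shape


def triangular(x: float, tri: Triangle) -> float:
    """Membership degree of x in a triangular fuzzy set."""
    left, peak, right = tri
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def large_1_itemsets(
    transactions: List[Dict[str, float]],
    membership: Dict[str, Dict[str, Triangle]],
    min_support: float,
) -> List[Tuple[str, str]]:
    """Return the (item, linguistic term) pairs whose fuzzy support
    reaches min_support. An absent item counts as quantity 0."""
    n = len(transactions)
    large = []
    for item, terms in membership.items():
        for term, tri in terms.items():
            fuzzy_count = sum(triangular(t.get(item, 0.0), tri) for t in transactions)
            if fuzzy_count / n >= min_support:
                large.append((item, term))
    return large


# Hypothetical data: three purchases of one item with two linguistic terms.
transactions = [{"milk": 2}, {"milk": 6}, {"milk": 10}]
membership = {"milk": {"low": (0, 2, 6), "high": (4, 10, 16)}}
result = large_1_itemsets(transactions, membership, min_support=0.4)
```

In a GA setting, each chromosome would decode to one `membership` dictionary, and `len(result)` would feed the fitness alongside a suitability measure.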

2.
Fuzzy mining approaches have recently been discussed for deriving fuzzy knowledge. Since items may have their own characteristics, different minimum supports and membership functions may be specified for different items. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting minimum supports and membership functions for items from quantitative transactions. In that paper, the minimum supports and membership functions of all items were encoded in a single chromosome, so the search might not converge easily. In this paper, an enhanced approach is proposed, which processes the items using a divide-and-conquer strategy. The approach is called the divide-and-conquer genetic-fuzzy mining algorithm for items with multiple minimum supports (DGFMMS), and is designed for finding minimum supports, membership functions, and fuzzy association rules. Possible solutions are evaluated by their requirement satisfaction divided by the suitability of the derived membership functions. The proposed GA framework maintains multiple populations, each for one item’s minimum support and membership functions. The final best minimum supports and membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. Experimental results also show the effectiveness of the proposed approach.

3.
Cluster-Based Evaluation in Fuzzy-Genetic Data Mining   Cited by: 2 (self-citations: 0; citations by others: 2)
Data mining is commonly used in attempts to induce association rules from transaction data. Most previous studies focused on binary-valued transaction data. Transactions in real-world applications, however, usually consist of quantitative values. In the past, we proposed a fuzzy-genetic data-mining algorithm for extracting both association rules and membership functions from quantitative transactions. It used a combination of large 1-itemsets and membership-function suitability to evaluate the fitness values of chromosomes. The calculation for large 1-itemsets could take a lot of time, especially when the database to be scanned could not be entirely loaded into main memory. In this paper, an enhanced approach, called the cluster-based fuzzy-genetic mining algorithm, is thus proposed to speed up the evaluation process while keeping nearly the same quality of solutions as the previous one. It divides the chromosomes in a population into clusters by the k-means clustering approach and evaluates each individual according to both its cluster's information and its own. Experimental results also show the effectiveness and efficiency of the proposed approach.
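A rough sketch of the cluster-based evaluation idea follows. It assumes chromosomes are numeric vectors, that the expensive database-scanning term is computed once per cluster on the member nearest the cluster center, and that it is divided by a cheap per-chromosome term. The helper names (`kmeans`, `cluster_based_fitness`, `expensive`, `cheap`) and the way the two terms are combined are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 10) -> Tuple[List[int], List[List[float]]]:
    """Tiny deterministic k-means: the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def cluster_based_fitness(
    chromosomes: List[Vector],
    k: int,
    expensive: Callable[[Vector], float],  # e.g. needs a database scan
    cheap: Callable[[Vector], float],      # e.g. a suitability measure; must be nonzero
) -> List[float]:
    """Evaluate the expensive term once per cluster and share it."""
    labels, centers = kmeans(chromosomes, k)
    shared = {}
    for c in set(labels):
        members = [ch for ch, lab in zip(chromosomes, labels) if lab == c]
        representative = min(members, key=lambda ch: dist2(ch, centers[c]))
        shared[c] = expensive(representative)
    return [shared[lab] / cheap(ch) for ch, lab in zip(chromosomes, labels)]
```

With a population of P chromosomes and k clusters, the expensive term is computed k times per generation instead of P times, which is where the speed-up would come from under these assumptions.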

4.
Data mining is most commonly used in attempts to induce association rules from transaction data. In the past, we used fuzzy and GA concepts to discover both useful fuzzy association rules and suitable membership functions from quantitative values. The evaluation of fitness values was, however, quite time-consuming. Due to dramatic increases in available computing power and concomitant decreases in computing costs over the last decade, learning or mining by applying parallel processing techniques has become a feasible way to overcome the slow-learning problem. In this paper, we thus propose a parallel genetic-fuzzy mining algorithm based on the master–slave architecture to extract both association rules and membership functions from quantitative transactions. The master processor uses a single population, as a simple genetic algorithm does, and distributes the tasks of fitness evaluation to slave processors. The evolutionary processes, such as crossover, mutation and production, are performed by the master processor. The proposed algorithm maps naturally and efficiently onto the master–slave architecture. The time complexities of both the sequential and parallel genetic-fuzzy mining algorithms have also been analyzed, and the analysis favors the parallel one. When the number of generations is large, the speed-up can be nearly linear, which the experimental results confirm. Applying the master–slave parallel architecture to speed up the genetic-fuzzy data mining algorithm is thus a feasible way to overcome the slow fitness evaluation of the original algorithm.
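The division of labor described above can be sketched as follows: a master keeps one population, farms out fitness evaluation to a pool of workers, and performs selection, crossover, and mutation itself. This is only an illustration under assumptions: chromosomes are bit lists, selection is simple truncation, and a thread pool stands in for the slave processors (a process pool or message-passing setup would be needed for real CPU parallelism in Python). All function names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Chromosome = List[int]


def evaluate_population(
    population: List[Chromosome],
    fitness: Callable[[Chromosome], float],
    workers: int = 4,
) -> List[float]:
    """Master side: distribute fitness evaluation to the worker pool.
    pool.map returns results in the same order as the population."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


def next_generation(
    population: List[Chromosome],
    scores: List[float],
    rng: random.Random,
    mutation_rate: float = 0.1,
) -> List[Chromosome]:
    """Master side: truncation selection, one-point crossover, bit-flip mutation."""
    ranked = [ch for _, ch in sorted(zip(scores, population), key=lambda p: p[0], reverse=True)]
    parents = ranked[: max(2, len(ranked) // 2)]
    children = []
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))  # requires chromosomes of length >= 2
        child = a[:cut] + b[cut:]
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        children.append(child)
    return children
```

Because only `evaluate_population` touches the workers, the evolutionary operators stay sequential on the master, matching the single-population design the abstract describes.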

5.
A genetic-fuzzy mining approach for items with multiple minimum supports   Cited by: 2 (self-citations: 2; citations by others: 0)
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Mining association rules from transaction data is among the most common mining techniques. Most previous mining approaches set a single minimum support threshold for all the items and identify the relationships among transactions using binary values. In the past, we proposed a genetic-fuzzy data-mining algorithm for extracting both association rules and membership functions from quantitative transactions under a single minimum support. In real applications, different items may have different criteria to judge their importance. In this paper, we thus propose an algorithm which combines clustering, fuzzy and genetic concepts for extracting reasonable multiple minimum support values, membership functions and fuzzy association rules from quantitative transactions. It first uses the k-means clustering approach to gather similar items into groups. All items in the same cluster are considered to have similar characteristics and are assigned similar values for initializing a better population. Each chromosome is then evaluated by the criteria of requirement satisfaction and suitability of membership functions to estimate its fitness value. Experimental results also show the effectiveness and the efficiency of the proposed approach.

6.
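This paper proposes a genetic-algorithm-based method for generating membership functions. The method uses a genetic algorithm to optimize an initial population and obtain a membership function encoding with high fitness. It then corrects the optimized membership functions according to the actual standards for airport noise data, yielding an encoding of trapezoidal membership functions. Finally, the resulting membership functions are used to fuzzify the data, and the FP-trees algorithm is used to generate fuzzy association rules. The method is proposed for quantitative attributes; its advantage is that it makes the relatively good membership functions obtained by the genetic algorithm better suited to real data sets.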

7.
Fuzzy data mining is used to extract fuzzy knowledge from linguistic or quantitative data. It is an extension of traditional data mining and the derived knowledge is relatively meaningful to human beings. In the past, we proposed a mining algorithm to find suitable membership functions for fuzzy association rules based on ant colony systems. In that approach, precision was limited by the use of binary bits to encode the membership functions. This paper elaborates on the original approach to increase the accuracy of results by adding multi-level processing. A multi-level ant colony framework is thus designed, and an algorithm based on this structure is proposed. The proposed approach first transforms the fuzzy mining problem into a multi-stage graph, with each route representing a possible set of membership functions. The new approach then extends the previous one, using multi-level processing to handle cases in which the maximum quantities of item values in the transactions may be large. The membership functions derived in a given level are refined in the subsequent level. The final membership functions in the last level are then output to the rule-mining phase to find fuzzy association rules. Experiments are also performed to evaluate the proposed approach. The experimental results show that the proposed multi-level ant colony systems mining approach can obtain improved results.

8.
Today, the development of e-commerce has provided many transaction databases with useful information for investigators exploring dependencies among items. In data mining, the dependencies among different items can be shown using an association rule. The new fuzzy-genetic (FG) approach is designed to mine fuzzy association rules from a quantitative transaction database. Three important advantages are associated with using the FG approach: (1) the association rules can be extracted from a transaction database with quantitative values; (2) extracting proper membership functions and support threshold values with the genetic algorithm has a positive effect on the mining results; (3) association rules expressed in a fuzzy representation are more understandable to humans. In this paper, we design a comprehensive and fast algorithm that mines level-crossing fuzzy association rules on multiple concept levels, learning support threshold values and membership functions with the cluster-based master–slave integrated FG approach. Mining fuzzy association rules on multiple concept levels helps find more important, useful, accurate, and practical information.

9.
An ACS-based framework for fuzzy data mining   Cited by: 1 (self-citations: 0; citations by others: 1)
Data mining is often used to find interesting and meaningful patterns in huge databases. It may generate different kinds of knowledge, such as classification rules, clusters, association rules, and others. Much research on data mining has been published, most of it focused on mining from binary-valued data. Fuzzy data mining was thus proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems. However, little work has been done on applying ACS to fuzzy data mining. This thesis thus proposes an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded into binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is then transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (the one with the highest fitness value) can then be used to mine fuzzy association rules from a database. Finally, experiments compare the proposed framework with other approaches and show its performance.

10.
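In real-world mining applications, discovering association rules in transaction databases often involves three concerns: using different criteria to judge the importance of different items, managing the taxonomic relationships among items, and handling quantitative data sets. This paper therefore proposes a multi-level fuzzy association rule mining algorithm that applies multiple minimum supports to quantitative transaction databases to extract implicit knowledge from itemsets. The algorithm uses two kinds of support constraints and a top-down progressive refinement method to derive frequent itemsets, and it can also discover cross-level fuzzy association rules. An example shows that the multi-level fuzzy association rules derived under multiple minimum support constraints are easy to understand and meaningful, and that the algorithm has good efficiency and scalability.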

11.
《Knowledge》2006,19(1):57-66
This paper proposes a new method that employs the genetic algorithm to find fuzzy association rules for classification problems, based on an effective method for discovering fuzzy association rules, namely the fuzzy grids based rules mining algorithm (FGBRMA). Some important parameters, including the number and shapes of membership functions in each quantitative attribute and the minimum fuzzy support, are not easily specified by the user. Thus, these parameters are determined automatically by a binary string, or chromosome, composed of two substrings: one for each quantitative attribute, using the coding method proposed by Ishibuchi and Murata, and the other for the minimum fuzzy support. In each generation, the fitness value of each chromosome is computed; it rewards a higher classification accuracy rate and a smaller number of fuzzy rules. When the termination condition is reached, the chromosome with the maximum fitness value is used to test performance. In terms of classification generalization ability, the simulation results on the iris data and the appendicitis data demonstrate that the proposed method performs well in comparison with other classification methods.

12.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most previous approaches set a single minimum support threshold for all the items and identify the relationships among transactions using binary values. In real applications, different items may have different criteria to judge their importance. In the past, we proposed an algorithm for extracting appropriate multiple minimum support values, membership functions and fuzzy association rules from quantitative transactions. It used requirement satisfaction and suitability of membership functions to evaluate the fitness values of chromosomes. The calculation of requirement satisfaction might take a lot of time, especially when the database to be scanned could not be entirely loaded into main memory. In this paper, an enhanced approach, called the fuzzy cluster-based genetic-fuzzy mining approach for items with multiple minimum supports (FCGFMMS), is thus proposed to speed up the evaluation process while keeping nearly the same quality of solutions as the previous one. It divides the chromosomes in a population into several clusters by the fuzzy k-means clustering approach and evaluates each individual according to both its cluster's information and its own. Experimental results also show the effectiveness and the efficiency of the proposed approach.

13.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. In real-world applications, transactions may contain quantitative values, and each item in a temporal database may have a lifespan. In this paper, we thus propose a data mining algorithm for deriving fuzzy temporal association rules. It first transforms each quantitative value into a fuzzy set using the given membership functions. Meanwhile, item lifespans are collected and recorded in a temporal information table through a transformation process. The algorithm then calculates the scalar cardinality of each linguistic term of each item. A mining process based on fuzzy counts and item lifespans is then performed to find fuzzy temporal association rules. Experiments are finally performed on two simulation datasets and the foodmart dataset to show the effectiveness and the efficiency of the proposed approach.
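One way to picture how fuzzy counts and item lifespans interact is the sketch below. It assumes that an item's lifespan runs from the first transaction containing it to the end of the database, and that fuzzy support is the scalar cardinality over that lifespan divided by the number of transactions in it. Both assumptions, and the names `lifespan_start` and `fuzzy_temporal_support`, are illustrative rather than taken from the paper.

```python
from typing import Callable, Dict, List, Optional


def lifespan_start(transactions: List[Dict[str, float]], item: str) -> Optional[int]:
    """Index of the first transaction containing the item, or None."""
    for index, transaction in enumerate(transactions):
        if item in transaction:
            return index
    return None


def fuzzy_temporal_support(
    transactions: List[Dict[str, float]],
    item: str,
    membership: Callable[[float], float],
) -> float:
    """Scalar cardinality of one linguistic term, normalized by the
    number of transactions inside the item's (assumed) lifespan."""
    start = lifespan_start(transactions, item)
    if start is None:
        return 0.0
    window = transactions[start:]
    fuzzy_count = sum(membership(t.get(item, 0.0)) for t in window)
    return fuzzy_count / len(window)


# Hypothetical data: item "b" only exists in the second half of the database.
transactions = [{"a": 1}, {"a": 2}, {"b": 4}, {"b": 4}]
high = lambda quantity: min(quantity / 4, 1.0)  # assumed "high" term
support_b = fuzzy_temporal_support(transactions, "b", high)
```

Measured over the whole database, the same fuzzy count for "b" would give a support of 0.5; restricting the denominator to the lifespan keeps late-arriving items from being penalized.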

14.
Research on Association Rules with Grouped Multiple Supports   Cited by: 4 (self-citations: 1; citations by others: 3)
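Association rule mining is one of the important tasks of data mining. Traditional association rule algorithms use a single minimum support, which assumes that items occur with roughly the same frequency. This is not the case in practice, which gives rise to the problem of association rules with multiple supports, where a different support is specified for each item. In practical applications, however, items can be divided into several groups, each with one support. This paper therefore poses the problem of association rules with grouped multiple supports and gives a method for grouping items based on the properties of multiple supports. The method can reduce the number of candidate 2-itemsets. On this basis, a corresponding multiple-support association rule discovery algorithm is given, and experiments demonstrate its effectiveness.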

15.
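Rule extraction over a single-level structure suffers from low extraction accuracy, long running time, and difficulty meeting user needs. To address these problems, a quantitative data mining algorithm based on improved multi-level fuzzy association rules is proposed. It uses high-frequency itemsets and a progressively deepening iterative method to form a top-down mining process. It integrates fuzzy set theory, data mining algorithms, and multi-level taxonomy techniques to find fuzzy association rules in transaction data sets, mining the implicit knowledge in quantitative values stored in multi-level transaction databases and meeting users' customized information mining needs. Experimental results show that the proposed algorithm has marked advantages over other algorithms in mining accuracy and running time, and can open new room for the practical application of multi-level association rule extraction methods.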

16.
It is not an easy task to know a priori the most appropriate fuzzy sets that cover the domains of quantitative attributes for fuzzy association rule mining. In general, it is unrealistic to expect that experts can always provide such sets. Finding the most appropriate fuzzy sets becomes an even more complex problem when items are not considered to have equal importance and the support and confidence parameters required for the association rule mining process are specified as linguistic terms. Existing clustering-based automated methods are not satisfactory because they do not consider the optimization of the discovered membership functions. To tackle this problem, we propose a genetic algorithm (GA) based clustering method, which dynamically adjusts the fuzzy sets to provide maximum profit based on user-specified linguistic minimum support and confidence terms. This is achieved by tuning the base values of the membership functions for each quantitative attribute with respect to two different evaluation functions: maximizing the number of large itemsets and maximizing the average of the confidence intervals of the generated rules. To the best of our knowledge, this is the first effort in this direction. Experiments conducted on 100K transactions from the adult database of the United States census of 2000 demonstrate that the proposed clustering method performs well in terms of the number of large itemsets and interesting association rules produced.

17.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify the relationships among transactions using binary values; however, transactions with quantitative values are commonly seen in real-world applications. This paper thus proposes a new data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm integrates fuzzy set concepts and the Apriori mining algorithm to find interesting fuzzy association rules in given transaction data sets. Experiments with student grades at I-Shou University were also conducted to verify the performance of the proposed algorithm.
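To make the combination of fuzzy sets and Apriori-style counting more concrete, here is a small sketch. It assumes each transaction has already been converted to membership degrees keyed by "item.term" regions, that an itemset's degree in a transaction is the minimum over its regions, and that confidence is the ratio of two fuzzy supports. The function names and the sample grades are hypothetical, and candidate generation is omitted.

```python
from typing import Dict, Iterable, List

FuzzyTransaction = Dict[str, float]  # "item.term" -> membership degree


def fuzzy_support(transactions: List[FuzzyTransaction], itemset: Iterable[str]) -> float:
    """Fuzzy support: mean over transactions of the minimum degree
    across the itemset's regions (a missing region counts as 0)."""
    regions = list(itemset)
    total = sum(min(t.get(r, 0.0) for r in regions) for t in transactions)
    return total / len(transactions)


def fuzzy_confidence(
    transactions: List[FuzzyTransaction],
    antecedent: List[str],
    consequent: List[str],
) -> float:
    """Confidence of antecedent -> consequent as a ratio of fuzzy supports."""
    base = fuzzy_support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(transactions, antecedent + consequent) / base


# Hypothetical fuzzified grade records.
records = [
    {"math.high": 0.5, "physics.high": 1.0},
    {"math.high": 1.0, "physics.high": 0.5},
    {"math.high": 1.0},
    {"physics.high": 0.5},
]
pair_support = fuzzy_support(records, ["math.high", "physics.high"])
rule_confidence = fuzzy_confidence(records, ["math.high"], ["physics.high"])
```

As in classical Apriori, an itemset's fuzzy support under the minimum operator can never exceed that of any of its subsets, so the usual level-wise pruning still applies.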

18.
Research on a Cluster-Based Fuzzy-Genetic Mining Algorithm   Cited by: 2 (self-citations: 0; citations by others: 2)
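By analyzing the characteristics of continuous attribute data and existing association rule mining algorithms, this work further studies the accuracy of quantitative description and the efficiency of the algorithm. Existing fuzzy-genetic mining algorithms compute chromosome fitness by combining large 1-itemsets with membership function values, which is slow. To address this, a cluster-based fuzzy-genetic association rule mining algorithm is proposed. The algorithm uses fuzzy-genetic principles to extract association rules and membership functions from transaction data at the same time. It also uses the k-means clustering algorithm to group the chromosomes in a population and evaluates the fitness of each chromosome from both the cluster information and the chromosome's own information, which reduces the number of database scans. Test results show that the algorithm is fast and accurate.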

19.
A Genetic Fuzzy System (GFS) is basically a fuzzy system augmented by a learning process based on a genetic algorithm (GA). Fuzzy systems have demonstrated their ability to solve different kinds of problems in various application domains. Currently, there is increasing interest in augmenting fuzzy systems with learning and adaptation capabilities. Two of the most successful approaches to hybridizing fuzzy systems with learning and adaptation methods have been made in the realm of soft computing. A GA can be combined with a fuzzy system for different purposes, such as rule selection, membership function optimization, rule generation, coefficient optimization, and data classification. Here we propose an Adaptive Genetic Fuzzy System (AGFS) for optimizing rules and membership functions in medical data classification. The primary intentions of the research are: 1) to generate rules from data and, for optimized rule selection, to adapt the genetic algorithm, introducing a new operator called systematic addition to address the exploration problem in genetic algorithms; 2) to propose a simple technique for designing membership functions and discretization; and 3) to design a fitness function that takes into account the frequency of occurrence of the rules in the training data. Finally, to establish its efficiency, the performance of the proposed genetic-fuzzy classifier is evaluated with quantitative, qualitative and comparative analysis. The results show that AGFS obtained better accuracy than the existing systems.

20.
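Research on Inter-Transaction Association Analysis of Multiple Time Series   Cited by: 1 (self-citations: 0; citations by others: 1)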
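The aim of this paper is to predict the development trend of time series. The approach performs inter-transaction association rule analysis on multiple time series and uses the time difference between the antecedent and the consequent of an association rule for prediction. An inter-transaction association rule mining algorithm, ITARM, is proposed; it uses a divide-and-conquer mining method based on compressed FP-trees. After generating the frequent 1-itemsets, the algorithm uses each item in the 1-itemsets in turn as a constraint, builds a compressed FP-tree, and mines inter-transaction association rules. The paper gives the main design ideas and the pseudocode of the algorithm and tests its performance. The test results show that ITARM is an inter-transaction association rule mining algorithm with good time and space performance.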
