Similar literature
20 similar documents found
1.
Association rules form one of the most widely used techniques for discovering correlations among attributes in a database. So far, some efficient methods have been proposed to obtain these rules with respect to an optimization goal, such as maximizing the number of large itemsets and interesting rules, or the support and confidence values of the discovered rules. This paper first introduces optimized fuzzy association rule mining in terms of three important criteria: strongness, interestingness and comprehensibility. It then proposes multi-objective Genetic Algorithm (GA) based approaches for discovering these optimized rules. The optimization technique for a given criterion may take one of two forms. The first tries to determine the appropriate fuzzy sets of quantitative attributes in a prespecified rule, also called a certain rule. The second deals with finding both uncertain rules and their appropriate fuzzy sets. Experimental results on a real data set show the effectiveness and applicability of the proposed approach.
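
As a rough illustration of the quantities such a GA would tune, the sketch below computes fuzzy support and confidence for a rule over quantitative attributes. It is not taken from the paper: the triangular membership shape, the product used to combine membership degrees, and all attribute names and numbers are assumptions chosen for the example.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records, items):
    """Average, over all records, of the product of membership degrees.

    items is a list of (attribute, (a, b, c)) pairs; the product t-norm is an
    assumption, min() is another common choice.
    """
    total = 0.0
    for rec in records:
        degree = 1.0
        for attr, (a, b, c) in items:
            degree *= triangular(rec[attr], a, b, c)
        total += degree
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, divided by antecedent support."""
    sup_x = fuzzy_support(records, antecedent)
    if sup_x == 0.0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / sup_x


# Hypothetical data and fuzzy sets, for illustration only.
records = [
    {"age": 30, "income": 50},
    {"age": 40, "income": 70},
    {"age": 50, "income": 90},
]
age_middle = [("age", (20, 40, 60))]
income_high = [("income", (50, 90, 130))]

support = fuzzy_support(records, age_middle + income_high)
confidence = fuzzy_confidence(records, age_middle, income_high)
```

In a GA setting, the triple `(a, b, c)` of each fuzzy set would be part of the chromosome, and measures like these would feed the objective functions.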

2.
Researchers have realized the importance of integrating fuzziness into association rule mining in databases with binary and quantitative attributes. However, most of the earlier algorithms proposed for fuzzy association rule mining either assume that the fuzzy sets are given or employ a clustering algorithm, such as CURE, to decide on the fuzzy sets; in both cases the number of fuzzy sets is pre-specified. In this paper, we propose an automated method for deciding on the number of fuzzy sets and for the autonomous mining of both fuzzy sets and fuzzy association rules. We achieve this by developing an automated clustering method based on multi-objective Genetic Algorithms (GA); the aim of the proposed approach is to automatically cluster the values of a quantitative attribute in order to obtain a large number of large itemsets in less time. We compare the proposed multi-objective GA based approach with two other approaches: 1) the CURE-based approach, CURE being known as one of the most efficient clustering algorithms; 2) the clustering approach of Chien et al., an automatic interval partition method based on variation of density. Experimental results on 100 K transactions extracted from the adult data of the USA census of 2000 showed that the proposed automated clustering method performs well compared with both the CURE-based approach and Chien et al.’s work in terms of runtime, number of large itemsets and number of association rules.

3.
In the last decade, interest in microarray technology has increased exponentially due to its ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene association networks from gene expression profiles is a relevant task, and several statistical techniques have been proposed to build them. The problem lies in discovering which genes are more relevant and identifying the direct regulatory relationships among them. We developed a multi-objective evolutionary algorithm for mining quantitative association rules to deal with this problem. We applied our methodology, named GarNet, to a well-known microarray data set of the yeast cell cycle. The performance analysis of GarNet was organized in three steps, following the study performed by Gallo et al. GarNet outperformed the benchmark methods in most cases in terms of network quality metrics, such as accuracy and precision, which were measured using the YeastNet database as the true network. Furthermore, the results were consistent with previous biological knowledge.

4.
This article presents a multi-objective genetic algorithm for the problem of data clustering. A given dataset is automatically assigned to a number of groups in appropriate fuzzy partitions through the fuzzy c-means method. This work tries to exploit the advantage of fuzzy properties, which provide the capability to handle overlapping clusters. However, most fuzzy methods are based on compactness and/or separation measures that use only centroid information. Calculation from centroid information alone may not be sufficient to differentiate the geometric structures of clusters. The overlap-separation measure, which uses an aggregation operation on fuzzy membership degrees, is better equipped to handle this drawback. As another key consideration, we need a mechanism to identify appropriate fuzzy clusters without prior knowledge of the number of clusters. Given this requirement, optimization with a single criterion may not be feasible for different cluster shapes. A multi-objective genetic algorithm is therefore appropriate to search for fuzzy partitions in this situation. Apart from the overlap-separation measure, the well-known fuzzy Jm index is also optimized through genetic operations. The algorithm simultaneously optimizes the two criteria to search for optimal clustering solutions. A string of real-coded values is encoded to represent cluster centers. Strings of different lengths, varied over a range, correspond to variable numbers of clusters. These real-coded values are optimized, and the Pareto solutions corresponding to a tradeoff between the two objectives are finally produced. As shown in the experiments, the approach provides promising solutions for well-separated, hyperspherical and overlapping clusters from synthetic and real-life data sets. This is demonstrated by comparison with existing single-objective and multi-objective clustering techniques.
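
For readers unfamiliar with the Jm index mentioned above, the following is a minimal sketch of the textbook fuzzy c-means objective, J_m = sum over clusters i and points k of u_ik^m * ||x_k - v_i||^2, together with the usual membership update. It does not reproduce the article's overlap-separation measure or its genetic operators; the fuzzifier `m = 2`, the sample points and the candidate centers are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def memberships(points, centers, m=2.0):
    """Textbook FCM membership update; u[i][k] is the degree of point k in cluster i."""
    u = [[0.0] * len(points) for _ in centers]
    for k, x in enumerate(points):
        d = [sq_dist(x, v) for v in centers]
        hits = [i for i, di in enumerate(d) if di == 0.0]
        if hits:
            # The point coincides with one or more centers: share membership among them.
            for i in hits:
                u[i][k] = 1.0 / len(hits)
            continue
        for i, di in enumerate(d):
            # With squared distances the exponent 2/(m-1) becomes 1/(m-1).
            u[i][k] = 1.0 / sum((di / dj) ** (1.0 / (m - 1.0)) for dj in d)
    return u


def jm_index(points, centers, u, m=2.0):
    """Fuzzy compactness J_m: lower values mean tighter fuzzy clusters."""
    return sum(
        (u[i][k] ** m) * sq_dist(x, v)
        for i, v in enumerate(centers)
        for k, x in enumerate(points)
    )


# Hypothetical one-dimensional data with two obvious groups.
points = [(0.0,), (1.0,), (4.0,), (5.0,)]
good_centers = [(0.5,), (4.5,)]
poor_centers = [(2.0,), (3.0,)]

jm_good = jm_index(points, good_centers, memberships(points, good_centers))
jm_poor = jm_index(points, poor_centers, memberships(points, poor_centers))
```

A real-coded chromosome in the sense described above would simply be the flattened list of center coordinates, so evaluating one candidate amounts to one call of `memberships` followed by `jm_index`.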

5.
The paper presents a multi-objective genetic approach to designing interpretability-oriented fuzzy rule-based classifiers from data. The proposed approach allows us to obtain systems with various levels of compromise between accuracy and interpretability. During the learning process, the parameters of the membership functions, as well as the structure of the classifier's fuzzy rule base (i.e., the number of rules, the number of rule antecedents, etc.), evolve simultaneously using a Pittsburgh-type genetic approach. Since there is no particular coding of fuzzy rule structures in a chromosome (which reduces the computational complexity of the algorithm), original crossover and mutation operators, as well as a chromosome-repairing technique to directly transform the rules, are also proposed. To evaluate both the accuracy and the interpretability of the system, two measures are used. The first, an accuracy measure, is based on the root mean square error of the system's response. The second, an interpretability measure, is based on the arithmetic mean of three components: (a) the average length of rules (the average number of antecedents used in the rules), (b) the number of active fuzzy sets and (c) the number of active inputs of the system (an active fuzzy set or input means a set or input used by at least one fuzzy rule). Both measures are used as objectives in multi-objective (2-objective in our case) genetic optimization approaches such as the well-known SPEA2 and NSGA-II algorithms. Moreover, for the purpose of comparison with several alternative approaches, the experiments are carried out both with the so-called strong fuzzy partitions (SFPs) of attribute domains and without them. SFPs provide more semantically meaningful solutions, usually at the expense of accuracy. The operation of the proposed technique in various classification problems is tested on 20 benchmark data sets and compared to 11 alternative classification techniques. The experiments show that the proposed approach generates classifiers of significantly improved interpretability, while still characterized by competitive accuracy.

6.
Reliability-based robust design optimization (RBRDO) is one of the most important tools developed in recent years to improve both the quality and the reliability of products at an early design stage. This paper presents a comparative study of different formulation approaches for RBRDO models and their performance. The paper also proposes an evolutionary multi-objective genetic algorithm (MOGA) for one of the promising hybrid quality loss function (HQLF)-based RBRDO models. The enhanced effectiveness of the HQLF-based RBRDO model is demonstrated by optimizing suitable examples.

7.
Linguistic rules in natural language are useful and consistent with the human way of thinking. They are very important in multi-criteria decision making due to their interpretability. In this paper, our discussion concentrates on extracting linguistic rules from data sets. To this end, we first analyze how to extract complex linguistic data summaries based on fuzzy logic. Then, we formalize linguistic rules based on complex linguistic data summaries, in which the degree of confidence of a linguistic rule from a data set can be explained by linguistic quantifiers and its linguistic truth from the fuzzy logical point of view. In order to obtain a linguistic rule with a higher degree of linguistic truth, a genetic algorithm is used to optimize the number and parameters of the membership functions of linguistic values. Computational results show that the proposed method is an alternative method for extracting linguistic rules with linguistic truth from data sets.

8.
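Yan Wei (闫伟), Zhang Hao (张浩), Lu Jianfeng (陆剑峰). 《计算机应用》 (Journal of Computer Applications), 2005, 25(11): 2676-2678
Fuzzy clustering from data mining is used to analyze the interval values of historical data in a process enterprise, and fuzzy association rule mining is then used to extract useful rules. The paper first describes the RFCM algorithm for fuzzy clustering and the Apriori algorithm for association rules, and analyzes the workflow of the Fuzzy_ClustApriori algorithm that implements fuzzy association rules. The RFCM algorithm is applied to real data to obtain fuzzy numbers for the different classes. The fuzzified parameter points are then processed following the steps of the Fuzzy_ClustApriori algorithm, yielding valuable fuzzy rules that provide a theoretical basis for production optimization in process enterprises.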

9.
In this research, a data clustering algorithm named non-dominated sorting genetic algorithm-fuzzy membership chromosome (NSGA-FMC), based on the K-modes method and combining a fuzzy genetic algorithm with multi-objective optimization, is proposed to improve clustering quality on categorical data. The proposed method uses fuzzy membership values as the chromosome. In addition, thanks to this chromosome setting, a more efficient solution selection technique, which selects a solution from the non-dominated Pareto front based on the largest fuzzy membership, is integrated into the proposed algorithm. Two objective functions, fuzzy compactness within a cluster (π) and separation among clusters (sep), are used to optimize the clustering quality. A series of experiments using three UCI categorical datasets was conducted to compare the clustering results of the proposed NSGA-FMC with two existing methods: genetic algorithm fuzzy K-modes (GA-FKM) and multi-objective genetic algorithm-based fuzzy clustering of categorical attributes (MOGA (π, sep)). Adjusted Rand index (ARI), π, sep, and computation time were used as performance indexes for comparison. The experimental results showed that the proposed method can obtain better clustering quality in terms of ARI, π, and sep simultaneously, with shorter computation time.

10.
Automated warehouse management must fulfill objectives that usually conflict with each other. The decisions taken must ensure optimized usage of resources, cost reduction and better customer service. The warehouse replenishment task is a typical example of multi-objective optimization. In this paper, a genetic algorithm with a new crossover operator is developed to solve the replenishment problem. The algorithm is applied to real warehouse data and produces Pareto-optimal permutations of the stored products. A fuzzy rule base is proposed to increase the diversity of the optimal solutions.

11.
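Based on an analysis of the mechanism that gives rise to redundancy in association rules, a penalty function for setting the support threshold and an improved genetic algorithm are proposed. The improved algorithm adopts novel techniques such as frequent-item distribution, prime-factor encoding, mate selection and a sharing function, so that chromosomes always mine in regions dense with frequent items, which effectively prunes the combinatorial search space. Transactions are also converted to numerical values, which effectively compresses the storage space of the transaction database and improves computation speed. The experimental results indicate that the improved mining method has certain advantages in the efficiency and precision of discovering valuable rules.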

12.
This research is based on a new hybrid approach that aims to improve the shape optimization process. The objective is to contribute to the development of more efficient shape optimization approaches in an integrated optimal topology and shape optimization area with the help of genetic algorithms and robustness issues. An improved genetic algorithm is introduced to solve multi-objective shape design optimization problems. The specific issue addressed in this research is to overcome the limitations caused by the larger population of solutions in the pure multi-objective genetic algorithm. The combination of a genetic algorithm with robust parameter design through a smaller population of individuals results in a solution that leads to better parameter values for design optimization problems. The effectiveness of the proposed hybrid approach is illustrated and evaluated with test problems taken from the literature. It is also shown that the proposed approach can be used as a first stage in other multi-objective genetic algorithms to enhance their performance. Finally, the shape optimization of a vehicle component is presented to illustrate how the present approach can be applied to multi-objective shape design optimization problems.

13.
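A Web association rule mining method based on an immune genetic annealing algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Abstract: Based on the requirements and characteristics of association rule mining, and combining the advantages of the immune algorithm, the genetic algorithm and the simulated annealing algorithm, a Web association rule mining method based on an immune genetic annealing algorithm is proposed. Experimental results show that, compared with the genetic algorithm and the simulated annealing algorithm, association rule discovery based on the immune genetic annealing algorithm has certain advantages in Web mining.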

14.
In this paper, the solutions produced by the fuzzy c-means algorithm for a general class of problems are examined and a method to test for the local optimality of such solutions is established. An equivalent mathematical program is defined for the c-means problem utilizing a generalized norm, then the properties of the resulting optimization problem are investigated. It is shown that the gradient of the resulting objective function at the solution produced by the c-means algorithm in this case takes a special structure which can be used in terminating the algorithm. Moreover, the local optimality of the solution obtained is checked utilizing the Hessian of the criterion function. The solution is a local minimum point if the Hessian matrix at this point is positive semidefinite. Simple rules are proposed to help in checking the definiteness of the matrix.

15.
In the domain of association rule mining (ARM), discovering rules for numerical attributes is still a challenging issue. Most of the popular approaches to numerical ARM require a priori data discretization to handle the numerical attributes. Moreover, in the process of discovering relations among data, often more than one objective (quality measure) is required, and in most cases such objectives include conflicting measures. In such a situation, it is recommended to obtain the optimal trade-off between objectives. This paper deals with the numerical ARM problem from a multi-objective perspective by proposing a multi-objective particle swarm optimization algorithm (MOPAR) for numerical ARM that discovers numerical association rules (ARs) in a single step. To identify more efficient ARs, several objectives are defined in the proposed multi-objective optimization approach, including confidence, comprehensibility and interestingness. Finally, the best ARs are extracted using Pareto optimality. To deal with numerical attributes, we use rough values containing lower and upper bounds to represent the intervals of attributes. In the experimental section of the paper, we analyze the effect of the operators used in this study, compare our method to the most popular evolutionary-based proposals for ARM and present an analysis of the mined ARs. The results show that MOPAR extracts reliable (with confidence values close to 95%), comprehensible and interesting numerical ARs when attaining the optimal trade-off between confidence, comprehensibility and interestingness.
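
The Pareto-based extraction step can be pictured with a small sketch. It is a generic non-dominated filter, not MOPAR itself: the particle swarm search is omitted, all three objectives are assumed to be maximized, and the rule names and score values are invented for illustration.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rules):
    """Keep the (name, scores) pairs whose scores no other rule dominates."""
    return [
        (name, scores)
        for name, scores in rules
        if not any(dominates(other, scores) for _, other in rules)
    ]


# Hypothetical rules scored as (confidence, comprehensibility, interestingness).
rules = [
    ("r1", (0.95, 0.40, 0.30)),
    ("r2", (0.90, 0.60, 0.20)),
    ("r3", (0.85, 0.50, 0.20)),  # no better than r2 on any objective, worse on two
]

front = pareto_front(rules)
front_names = [name for name, _ in front]
```

Here `r1` and `r2` trade confidence against comprehensibility, so both stay on the front, while `r3` is dropped. The quadratic scan is fine for a handful of rules; larger archives usually call for a sorted or incremental update.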

16.
In this paper, a genetic algorithm (GA) is proposed as a search strategy for mining not only positive but also negative quantitative association rules (ARs) within databases. Unlike conventional methods, ARs are mined directly, without generating frequent itemsets. The proposed GA takes a database-independent approach that does not rely on minimum support and minimum confidence thresholds, which are hard to determine for each database. Instead of a randomly generated initial population, a uniform population is used, which forces the initial population to be not far away from the solutions and distributes it uniformly in the feasible region. An adaptive mutation probability, a new operator called the uniform operator that ensures genetic diversity, and an efficient adjusted fitness function are used for mining all interesting ARs from the last population in a single run of the GA. The efficiency of the proposed GA is validated on synthetic and real databases.

17.
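Positively correlated association rules and their application in traditional Chinese medicine   Total citations: 1, self-citations: 0, citations by others: 1
Association rules are one of the important patterns in data mining and have great practical value, but traditional association rule mining algorithms based on the support-confidence framework have many shortcomings in practice. Correlation analysis is introduced, and a genetic-algorithm-based algorithm for mining positively correlated association rules is designed. Finally, the algorithm is applied to the practical problem of analyzing and mining the clinical experience of renowned veteran practitioners of traditional Chinese medicine; the experiments show that it effectively compensates for the shortcomings of traditional association rule mining algorithms.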

18.
A significant class of decision making problems consists of choosing actions, to be carried out simultaneously, in order to achieve a trade-off between different objectives. When such decisions concern complex systems, decision support tools that include formal methods of reasoning and probabilistic models are particularly helpful. These models are often built through learning procedures, based on an available knowledge base. Nevertheless, in many fields of application (e.g. when dealing with complex political, economic and social systems), it is frequently not possible to determine the model automatically, and it must then largely be derived from the opinions and value judgements expressed by domain experts. The BayMODE decision support tool (Bayesian Multi Objective Decision Environment), which we describe in this paper, operates precisely in such contexts. The principal component of the program is a multi-objective Decision Network, where actions are executed simultaneously. If the noisy-OR assumptions are applicable, such a model has a reasonably small number of parameters, even when actions are represented as non-binary variables. This makes the model building procedure accessible and easy. Moreover, BayMODE operates with a multi-objective approach, which provides the decision maker with a set of non-dominated solutions, computed using a multi-objective genetic algorithm.
Ivan Blecic is Assistant Professor of Economic Appraisal and Evaluation at the Faculty of Architecture in Alghero (University of Sassari, Italy) and member of Interuniversity Laboratory of Analysis and Models for Planning (LAMP). He received a Ph.D. in Planning and Public Policies in 2005 from IUAV University of Venice where he has also been a research fellow at the Department of Planning. His current research interests include analysis and modelling for planning, evaluation techniques and modelling, decision support systems and methods for public participation.
Arnaldo Cecchini graduated cum laude in Physics at the University of Bologna in 1972. He is Professor of Analysis of Urban Systems at the Faculty of Architecture in Alghero (University of Sassari), Director of the Urban and Environmental Planning Course, Vice-Dean of the Faculty of Architecture in Alghero and Director of the Interuniversity Laboratory of Analysis and Models for Planning - LAMP. He is the author of more than 100 articles and papers published in books and refereed journals and is an expert in techniques of urban analysis and for public participation: simulation, gaming simulation, cellular automata, scenario techniques.
Giuseppe A. Trunfio gained a Ph.D. in Computational Mechanics in 1999 at the University of Calabria, Italy. He has been a research fellow at the Italian National Research Council where he has worked extensively on the application of parallel computing to the simulation of complex systems. He is Assistant Professor of Computer Engineering at the Department of Architecture and Planning of the University of Sassari and his current research interests include decision support, probabilistic models, neural networks, evolutionary computation and cellular automata.

19.
Multiple sequence alignment is of central importance to bioinformatics and computational biology. Although a large number of algorithms for computing a multiple sequence alignment have been designed, the efficient computation of highly accurate and statistically significant multiple alignments is still a challenge. In this paper, we propose an efficient method that uses a multi-objective genetic algorithm (MSAGMOGA) to discover optimal alignments with affine gap in multiple sequence data. The main advantage of our approach is that a large number of tradeoff (i.e., non-dominated) alignments can be obtained in a single run with respect to conflicting objectives: affine gap penalty minimization, and similarity and support maximization. To the best of our knowledge, this is the first effort with three objectives in this direction. The proposed method can be applied to any data set with a sequential character. Furthermore, it allows any choice of similarity measure for finding alignments. By analyzing the obtained optimal alignments, the decision maker can understand the tradeoff between the objectives. We compared our method with three well-known multiple sequence alignment methods: MUSCLE, SAGA and MSA-GA. The first of them is a progressive method, and the other two are based on evolutionary algorithms. Experiments on the BAliBASE 2.0 database were conducted, and the results confirm that MSAGMOGA obtains results with better accuracy and statistical significance than the three well-known methods in multiple sequence alignment with affine gap. The proposed method also finds solutions faster than the other evolutionary approaches mentioned above.

20.
In this paper, a hybrid neural network that is capable of incremental learning and classification of patterns with incomplete data is proposed. Fuzzy ARTMAP (FAM) is employed as the constituting network for pattern classification while fuzzy c-means (FCM) clustering is used as the underlying algorithm for processing training as well as test samples with missing features. To handle an incomplete training set, FAM is first trained using complete samples only. Missing features of the training samples are estimated and replaced using two FCM-based strategies. Then, network training is conducted using all the complete and estimated samples. To handle an incomplete test set, a non-substitution FCM-based strategy is employed so that a predicted output can be produced rapidly. The performance of the proposed hybrid network is evaluated using a benchmark problem, and its practical applicability is demonstrated using a medical diagnosis task. The results are compared, analysed and quantified statistically with the bootstrap method. Implications of the proposed network for pattern classification tasks with incomplete data are discussed.
