Similar literature
 Found 20 similar documents; search took 15 ms
1.
This paper presents an integrated modeling method for multi-criteria land-use suitability assessment (LSA) using classification rule discovery (CRD) by ant colony optimisation (ACO) in ArcGIS. This new attempt applies artificial intelligence algorithms to make LSA more intelligent by discovering suitability classification rules. The methodology is implemented as a tool called ACO-LSA. When solving CRD problems, the tool generates rules that are straightforward and comprehensible for users, with high classification accuracy and a simple rule list. A case study in the Macintyre Brook Catchment of southern Queensland, Australia, is presented to demonstrate the feasibility of this new modeling technique. The results highlight the major advantages of this novel approach.

2.
The cAnt-Miner algorithm is an Ant Colony Optimization (ACO) based technique for classification rule discovery in problem domains which include continuous attributes. In this paper, we propose several extensions to cAnt-Miner. The main extension is based on the use of multiple pheromone types, one for each class value to be predicted. In the proposed μcAnt-Miner algorithm, an ant first selects a class value to be the consequent of a rule and the terms in the antecedent are selected based on the pheromone levels of the selected class value; pheromone update occurs on the corresponding pheromone type of the class value. The pre-selection of a class value also allows the use of more precise measures for the heuristic function and the dynamic discretization of continuous attributes, and further allows for the use of a rule quality measure that directly takes into account the confidence of the rule. Experimental results on 20 benchmark datasets show that our proposed extension improves classification accuracy to a statistically significant extent compared to cAnt-Miner, and has classification accuracy similar to the well-known Ripper and PART rule induction algorithms.
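The idea of one pheromone table per class value can be illustrated with a minimal Python sketch. It is not the μcAnt-Miner implementation: the names `select_term` and `reinforce`, the toy terms, the class-independent heuristic values, and the proportional update rule are all assumptions made for illustration.

```python
import random

def select_term(pheromone, heuristic, class_value, rng):
    """Pick one term with probability proportional to
    pheromone[class_value][term] * heuristic[term] (roulette wheel)."""
    terms = sorted(pheromone[class_value])
    weights = [pheromone[class_value][t] * heuristic[t] for t in terms]
    return rng.choices(terms, weights=weights, k=1)[0]

def reinforce(pheromone, class_value, rule_terms, quality):
    """Deposit pheromone only in the table of the rule's own class.
    The proportional increase used here is an assumed update rule."""
    for t in rule_terms:
        pheromone[class_value][t] += pheromone[class_value][t] * quality

# Hypothetical toy data: two class values, two candidate terms.
pheromone = {
    "yes": {"outlook=sunny": 1.0, "windy=true": 1.0},
    "no": {"outlook=sunny": 1.0, "windy=true": 1.0},
}
heuristic = {"outlook=sunny": 0.7, "windy=true": 0.3}

rng = random.Random(0)
term = select_term(pheromone, heuristic, "yes", rng)  # class chosen first
reinforce(pheromone, "yes", [term], quality=0.5)      # only "yes" table changes
```

Because the class is fixed before any term is chosen, reinforcing a good rule for one class leaves the pheromone seen by ants building rules for the other classes untouched.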

3.
A belief classification rule for imprecise data   Total citations: 1, self-citations: 1, citations by others: 0
The classification of imprecise data is a difficult task in general because the different classes can partially overlap. Moreover, the available attributes used for the classification are often insufficient to make a precise discrimination of the objects in the overlapping zones. A credal partition (classification) based on belief functions has already been proposed in the literature for data clustering. It allows the objects to belong (with different masses of belief) not only to the specific classes, but also to the sets of classes called meta-classes which correspond to the disjunction of several specific classes. In this paper, we propose a new belief classification rule (BCR) for the credal classification of uncertain and imprecise data. This new BCR approach reduces the misclassification errors of the objects difficult to classify by the conventional methods thanks to the introduction of the meta-classes. The objects too far from the others are considered as outliers. The basic belief assignment (bba) of an object is computed from the Mahalanobis distance between the object and the center of each specific class. The credal classification of the object is finally obtained by the combination of these bba’s associated with the different classes. This approach offers a relatively low computational burden. Several experiments using both artificial and real data sets are presented at the end of this paper to evaluate and compare the performances of this BCR method with respect to other classification methods.
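The abstract does not give the formula that maps distances to belief masses, so the following Python sketch covers only the distance step. It computes the Mahalanobis distance in two dimensions; the function name `mahalanobis_2d` and the example class centers and covariances are hypothetical.

```python
import math

def mahalanobis_2d(x, center, cov):
    """Mahalanobis distance between point x and a class center,
    given the class's 2x2 covariance matrix ((a, b), (c, d))."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - center[0], x[1] - center[1])
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)

# With an identity covariance the result equals the Euclidean distance.
d_round = mahalanobis_2d((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
# A class that is spread out along the first axis looks "closer" along it.
d_wide = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0)))
```

Scaling by the class covariance means an object can be near one class center in Euclidean terms yet still be far from that class, which is what makes the distance useful for spotting overlap zones and outliers.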

4.
5.
6.
On optimal rule discovery   Total citations: 4, self-citations: 0, citations by others: 4
In machine learning and data mining, heuristic and association rules are two dominant schemes for rule discovery. Heuristic rule discovery usually produces a small set of accurate rules, but fails to find many globally optimal rules. Association rule discovery generates all rules satisfying some constraints, but yields too many rules and is infeasible when the minimum support is small. Here, we present a unified framework for the discovery of a family of optimal rule sets and characterize the relationships with other rule-discovery schemes such as nonredundant association rule discovery. We theoretically and empirically show that optimal rule discovery is significantly more efficient than association rule discovery independent of data structure and implementation. Optimal rule discovery is an efficient alternative to association rule discovery, especially when the minimum support is low.
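For readers unfamiliar with the constraints mentioned above, here is a small Python sketch of the standard support and confidence measures for a rule "antecedent → consequent". It does not implement the optimal rule discovery framework itself; the toy transactions and function names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante

# Hypothetical market-basket data.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 with bread
```

Lowering the minimum support admits many more itemsets, which is why exhaustive association rule discovery becomes expensive in exactly the regime the abstract highlights.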

7.
This paper deals with the problem of discovering rules that govern social interactions and relations in preliterate societies. Two older computer programs are first described which can receive data, possibly incomplete and redundant, representing kinship relations among named individuals. The programs then establish a knowledge base in the form of a directed graph, which the user can query in a variety of ways. Another program, written on top of these (rewritten in LISP), can form concepts of various properties, including kinship relations, of and between the individuals. The concepts are derived from the examples and non-examples of a certain social pattern, such as inheritance, succession, marriage, class (tribe, moiety, clan, etc.) membership, domination-subordination, incest and exogamy. The concepts become hypotheses about the rules, which are corroborated, modified or rejected by further examples and non-examples. Dedicated to Claude Levi-Strauss. Nicholas Findler is Research Professor of Computer Science, Director of the Artificial Intelligence Laboratory and Adjunct Professor of Mathematics at Arizona State University. He has worked in various areas of Artificial Intelligence since 1957 and has authored many articles and books. The two most recent books are Contributions to a Computer-Based Theory of Strategies (New York: Springer-Verlag) and An Artificial Intelligence Technique for Information and Fact Retrieval: An Application in Medical Knowledge Processing (Cambridge, MA: MIT Press). His current interests include Artificial Intelligence, Simulation of Cognitive Behavior, Heuristic Programming, Decision Making under Uncertainty and Risk, Theory of Strategies, Computational Linguistics, Information Retrieval, and Expert Systems.

8.
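An associative classification method based on attribute importance is proposed. Unlike traditional algorithms, it generates class association rules according to the degree of attribute importance. In addition, when constructing the classifier, it removes the randomness with which the CBA algorithm chooses among rules that have the same support and confidence. Experimental results show that, compared with traditional associative classification algorithms, the classification rules obtained with this method have lower complexity and effectively improve classification performance.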
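One way to picture a deterministic choice among rules with equal support and confidence is a sort key with an extra tie-breaker. The Python sketch below is an assumption-laden illustration, not the method of the paper: the `Rule` class, the `importance` field, and the exact key order are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    antecedent: tuple   # e.g. ("age=young", "income=high")
    label: str          # predicted class
    support: float
    confidence: float
    importance: float   # hypothetical attribute-importance score

def rank_rules(rules):
    """Order rules by confidence, then support, then the assumed
    importance score; the antecedent makes the order fully deterministic."""
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, -r.importance, r.antecedent),
    )

rules = [
    Rule(("income=high",), "buy", 0.30, 0.90, 0.2),
    Rule(("age=young",), "buy", 0.30, 0.90, 0.8),   # ties on support/confidence
    Rule(("student=yes",), "skip", 0.10, 0.95, 0.1),
]
ranked = rank_rules(rules)
```

With the tie-breaker in place, the two rules that share support 0.30 and confidence 0.90 always come out in the same order, regardless of how the input list was arranged.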

9.
10.
This paper presents a novel hybrid of the two complementary technologies of soft computing, viz. neural networks and fuzzy logic, to design a fuzzy rule based pattern classifier for problems with higher dimensional feature spaces. The neural network component of the hybrid, which acts as a pre-processor, is designed to take care of the all-important issue of feature selection. To circumvent the disadvantages of the popular back propagation algorithm to train the neural network, a meta-heuristic, viz. threshold accepting (TA), has been used instead. Then, a fuzzy rule based classifier takes over the classification task with a reduced feature set. A combinatorial optimisation problem is formulated to minimise the number of rules in the classifier while guaranteeing high classification power. A modified threshold accepting algorithm proposed elsewhere by the authors (Ravi V, Zimmermann H.-J. (2000) Eur J Oper Res 123: 16–28) has been employed to solve this optimisation problem. The proposed methodology has been demonstrated for (1) the wine classification problem having 13 features and (2) the Wisconsin breast cancer determination problem having 9 features. On the basis of these examples the results seem to be very interesting, as there is no reduction in the classification power in either of the problems, despite the fact that some of the original features have been completely eliminated from the study. On the contrary, the chosen features in both the problems yielded 100% classification power in some cases.
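Threshold accepting can be sketched in a few lines of Python. This is the generic form of the meta-heuristic on a toy one-dimensional problem, not the modified variant cited above and not the neural network training setup; the cost function, neighbourhood, and threshold schedule are all assumptions.

```python
import random

def threshold_accepting(cost, neighbor, start, thresholds, steps_per_level, rng):
    """Generic threshold accepting: a candidate is accepted whenever it is
    not worse than the current solution by more than the current threshold."""
    current = best = start
    for threshold in thresholds:              # schedule shrinks toward zero
        for _ in range(steps_per_level):
            candidate = neighbor(current, rng)
            if cost(candidate) - cost(current) < threshold:
                current = candidate
                if cost(current) < cost(best):
                    best = current
    return best

# Hypothetical toy problem: find the integer minimising (x - 7)^2.
def cost(x):
    return (x - 7) ** 2

def neighbor(x, rng):
    return x + rng.choice((-1, 1))

best = threshold_accepting(
    cost, neighbor, start=0,
    thresholds=[8, 4, 2, 1, 0.5], steps_per_level=200,
    rng=random.Random(1),
)
```

Unlike simulated annealing, the acceptance test is deterministic: there is no probability of accepting a worse move, only a tolerance that is tightened over time.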

11.
Proteins can be grouped into families according to some features such as hydrophobicity, composition or structure, aiming to establish common biological functions. This paper presents MAHATMA (memetic algorithm-based highly adapted tool for motif ascertainment), a system that was conceived to discover features (particular sequences of amino acids, or motifs) that occur very often in proteins of a given family but rarely occur in proteins of other families. These features can be used for the classification of unknown proteins, that is, to predict their function by analyzing their primary structure. Experiments were done with a set of enzymes extracted from the Protein Data Bank. The heuristic method used was based on genetic programming using operators specially tailored for the target problem. The final performance was measured using sensitivity, specificity and hit rate. The best results obtained for the enzyme dataset suggest that the proposed evolutionary computation method is effective in finding predictive features (motifs) for protein classification.
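Sensitivity and specificity have standard definitions in terms of confusion-matrix counts, shown in the Python sketch below. The abstract does not define "hit rate"; treating it as the overall fraction of correct predictions is an assumption, and the counts are invented for illustration.

```python
def sensitivity(tp, fn):
    """True positive rate: family members that the motif correctly matches."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: non-members that the motif correctly rejects."""
    return tn / (tn + fp)

def hit_rate(tp, tn, fp, fn):
    """ASSUMPTION: hit rate taken as overall accuracy, which the
    abstract does not confirm."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for one motif evaluated on 100 proteins.
tp, fn, tn, fp = 40, 10, 45, 5
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
hits = hit_rate(tp, tn, fp, fn)
```

A motif that occurs often inside a family but rarely outside it scores high on both sensitivity and specificity, which matches the selection goal described in the abstract.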

12.

The discovery of multi-level knowledge is important to allow queries at and across different levels of abstraction. While there are some similarities between our research and that of others in this area, the work reported in this paper does not directly involve databases and is differently motivated. Our research is interested in taking data in the form of rule-bases and finding multi-level knowledge. This paper describes our motivation, our preferred technique for acquiring the initial knowledge known as Ripple-Down Rules, the use of Formal Concept Analysis to develop an abstraction hierarchy, and our application of these ideas to knowledge bases from the domain of chemical pathology. We also provide an example of how the approach can be applied to other propositional knowledge bases and suggest that it can be used as an additional phase to many existing data mining approaches.

13.
This paper addresses an approach that recommends investment types to stock investors by discovering useful rules from past changing patterns of stock prices in databases. First, we define a new rule model for recommending stock investment types. For a frequent pattern of stock prices, if its subsequent stock prices are matched to a condition of an investor, the model recommends a corresponding investment type for this stock. The frequent pattern is regarded as a rule head, and the subsequent part a rule body. We observed that the conditions on rule bodies are quite different depending on dispositions of investors while rule heads are independent of characteristics of investors in most cases. With this observation, we propose a new method that discovers and stores only the rule heads rather than the whole rules in a rule discovery process. This allows investors to impose various conditions on rule bodies flexibly, and also improves the performance of a rule discovery process by reducing the number of rules to be discovered. For efficient discovery and matching of rules, we propose methods for discovering frequent patterns, constructing a frequent pattern base, and its indexing. We also suggest a method that finds the rules matched to a query from a frequent pattern base, and a method that recommends an investment type by using the rules. Finally, we verify the effectiveness and the efficiency of our approach through extensive experiments with real-life stock data.

14.
Social media, especially Twitter, is now one of the most popular platforms where people can freely express their opinion. However, it is difficult to extract important summary information from the many millions of tweets sent every hour. In this work we propose a new concept, sentimental causal rules, and techniques for extracting sentimental causal rules from textual data sources such as Twitter, which combine sentiment analysis and causal rule discovery. Sentiment analysis refers to the task of extracting public sentiment from textual data. The value in sentiment analysis lies in its ability to reflect popularly voiced perceptions that are stated in natural language. Causal rules, on the other hand, indicate associations between different concepts in a context where one (or several concepts) cause(s) the other(s). We believe that sentimental causal rules are an effective summarization mechanism that combines causal relations among different aspects extracted from textual data with the sentiment embedded in these causal relationships. In order to show the effectiveness of sentimental causal rules, we have conducted experiments on Twitter data collected on the Kurdish political issue in Turkey, which has been an ongoing heated public debate for many years. Our experiments on Twitter data show that sentimental causal rule discovery is an effective method to summarize information about important aspects of an issue in Twitter, which may further be used by politicians for better policy making.

15.
The paper presents a new approach to fault classification in transmission lines using a systematic fuzzy rule-based approach. Fault classification is one of the important requirements in distance relaying for accurately identifying the phases involved in the fault process. The proposed technique starts with preprocessing the fault current signal using an advanced time–frequency transform such as the S-transform to compute various statistical features. After the required features are extracted, the Decision Tree (DT), a knowledge representation method, is used for initial classification. From the DT classification boundaries, the fuzzy membership functions (MFs) and the corresponding fuzzy rule base are developed for final classification. Thus a systematic fuzzy rule base is developed for fault classification, reducing the redundancies and complexities involved compared to the heuristic fuzzy rule-based approach. A qualitative comparison is also made between the S-transform and the wavelet transform, in which the S-transform-based DT-fuzzy approach provides much better results than the latter in both simulation and experimental tests.

16.
A new scheme of knowledge-based classification and rule generation using a fuzzy multilayer perceptron (MLP) is proposed. Knowledge collected from a data set is initially encoded among the connection weights in terms of class a priori probabilities. This encoding also includes incorporation of hidden nodes corresponding to both the pattern classes and their complementary regions. The network architecture, in terms of both links and nodes, is then refined during training. Node growing and link pruning are also resorted to. Rules are generated from the trained network using the input, output, and connection weights in order to justify any decision(s) reached. Negative rules corresponding to a pattern not belonging to a class can also be obtained. These are useful for inferencing in ambiguous cases. Results on real life and synthetic data demonstrate that the speed of learning and classification performance of the proposed scheme are better than those obtained with the fuzzy and conventional versions of the MLP (involving no initial knowledge encoding). Both convex and concave decision regions are considered in the process.

17.
A drawback of traditional data-mining methods is that they do not leverage prior knowledge of users. In prior work, we proposed a method that could discover unexpected patterns in data by using domain knowledge in a systematic manner. In this paper, we present new methods for discovering a minimal set of unexpected patterns by combining the two independent concepts of minimality and unexpectedness, both of which have been well-studied in the KDD literature. We demonstrate the strengths of this approach experimentally using a case study in a marketing domain.

18.
Data mining methods have been successfully applied to different fields, and the aviation industry is one of them. A large amount of knowledge and data has accumulated in the aviation industry. These data may be stored in the form of pilot reports, maintenance reports, incident reports or delay reports. This paper explains a data mining application on the incident reports of the Federal Aviation Administration (FAA) Accident/Incident Data System database, which contains incident data records for all categories of civil aviation between the years 2000 and 2006. In this study, we applied data mining methods to the incident reports. Moreover, the rough sets concept is used to reduce the attributes of the data set. The purpose of this application is to find the effective attributes in order to reduce the number of fatalities in incidents. Categorization tools and decision trees are used to find the relations and rules concerning incidents that resulted in fatalities. As a result, some rules about fatality are obtained, and the parameters that affect fatality in an incident have been determined. The rules found are tested in terms of their accuracy and reliability, and the results are seen to be meaningful.

19.
In this article, a novel unordered classification rule list discovery algorithm is presented based on Ant Colony Optimization (ACO). The proposed classifier is compared empirically with two other ACO-based classification techniques on 26 data sets, selected from miscellaneous domains, based on several performance measures. As opposed to its ancestors, our technique has the flexibility of generating a list of IF-THEN rules with unrestricted order. It makes the generated classification model more comprehensible and easily interpretable. The results indicate that the performance of the proposed method is statistically significantly better as compared with previous versions of AntMiner based on predictive accuracy and comprehensibility of the classification model.

20.
Pattern classification based on Bayesian statistical decision theory needs a complete knowledge of the probability laws to perform the classification. In actual pattern classification, however, it is generally impossible to obtain this complete knowledge, as constant feature values are influenced by noise. Therefore, it is necessary to construct a more flexible and robust theory for pattern classification. In this paper, a pattern classification theory using feature values defined on closed intervals is formalized in the framework of the Dempster-Shafer measure. Then, in order to make up for the lack of information, an integration algorithm is proposed, which integrates the information observed by several information sources while taking source values into account.
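As background for fusing evidence from several sources in the Dempster-Shafer framework, the Python sketch below implements the classic Dempster rule of combination for two mass functions. It is not the integration algorithm of the paper, which also handles interval-valued features and source values; the class labels and mass values are hypothetical.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Classic Dempster rule: multiply masses of intersecting focal sets,
    discard the conflicting mass, and renormalise the rest."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb      # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Two hypothetical sources over the classes A and B.
A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B                           # ignorance: "A or B"
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
fused = dempster_combine(m1, m2)     # masses on A, B and AB sum to 1
```

Mass left on the set `AB` expresses that a source cannot discriminate between the two classes, which is how the framework represents incomplete knowledge without forcing a precise probability.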
