Similar literature
 Found 20 similar documents; search took 15 ms
1.
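Remote sensing image classification is one of the active research topics in remote sensing. This paper proposes a fuzzy associative remote sensing classification method (FARSC) based on adaptive interval partitioning. Tailored to the characteristics of remote sensing image classification, the algorithm uses fuzzy C-means clustering to build fuzzy intervals for continuous attributes adaptively, applies a new pruning strategy to filter itemsets so that useless rules are not generated, and fuses multiple fuzzy classification rules with a new rule-importance measure, thereby improving classification efficiency and accuracy. Experiments on UCI data and remote sensing images show that the algorithm achieves high classification accuracy and is insensitive to changes in the number of samples, which makes FARSC a practical and effective method for remote sensing image classification.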
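The abstract does not give the details of FARSC's interval construction, so the following is only a minimal sketch of the general idea: plain fuzzy C-means on a single continuous attribute, with each cluster's membership function acting as one fuzzy interval. The function names, the evenly spaced initial centers, and the fuzzifier `m=2.0` are assumptions made for illustration.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy C-means membership of value x in each cluster (entries sum to 1)."""
    dists = [abs(x - c) for c in centers]
    zeros = [j for j, d in enumerate(dists) if d == 0.0]
    if zeros:
        # x sits exactly on one or more centers: share full membership among them.
        return [1.0 / len(zeros) if j in zeros else 0.0 for j in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuzzy_c_means_1d(values, n_clusters=3, m=2.0, max_iter=100, tol=1e-6):
    """Cluster 1-D values; return (centers, membership rows, one row per value)."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic, evenly spaced initial centers.
    centers = [lo + (hi - lo) * (j + 0.5) / n_clusters for j in range(n_clusters)]
    for _ in range(max_iter):
        u = [memberships(x, centers, m) for x in values]
        updated = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            # Weighted mean of the data; keep the old center if no weight reaches it.
            updated.append(
                sum(w * x for w, x in zip(weights, values)) / total if total > 0 else old
            )
        shift = max(abs(a - b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol:
            break
    return centers, [memberships(x, centers, m) for x in values]


def fuzzy_label(x, centers, m=2.0):
    """Index of the fuzzy interval in which x has the highest membership."""
    u = memberships(x, centers, m)
    return max(range(len(u)), key=u.__getitem__)
```

Because the centers follow the data, the resulting intervals adapt to each attribute's distribution instead of using fixed-width bins.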

2.
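To address the shortcomings of existing associative classification techniques, this paper proposes IUAC, an incremental update algorithm for associative classification. The algorithm mines and updates association rules with a frequent pattern tree and stores the rules finally used for classification in a tree structure. It also adds constraints on classification rules to further limit the number of association rules used for classification. Finally, the algorithm as a whole is analyzed and discussed.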

3.
An associative classifier is a classification system based on associative classification rules. Although associative classification is more accurate than traditional classification approaches, it cannot handle numerical data and the relationships among them. An ongoing research problem is therefore how to build associative classifiers from numerical data. In this work, we focus on stock trading data with many numerical technical indicators, where the classification problem is to find sell and buy signals from the technical indicators. This study proposes a GA-based algorithm for building an associative classifier that can discover trading rules from these numerical indicators. The experimental results show that the proposed approach is an effective classification technique with high prediction accuracy and is highly competitive with the data distribution method.

4.
Classification, a data mining technique, has widespread applications including medical diagnosis and targeted marketing. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has mainly focused on increasing classifier accuracy, little effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule base by selecting rules that contribute to increasing training accuracy, thus balancing classification accuracy against the number of classification association rules. The results indicate that the proposed associative classification model achieves competitive accuracy on benchmark datasets with continuous attributes and offers better interpretability than other rule-based systems.

5.
One of the known classification approaches in data mining is rule induction (RI). RI algorithms such as PRISM usually produce If-Then classifiers, whose predictive performance is comparable to that of other traditional classification approaches such as decision trees and associative classification. These classifiers are therefore well suited to supporting users' decisions and can be utilised as decision-making tools. Nevertheless, RI methods, including PRISM and its successors, suffer from a number of drawbacks, primarily the large number of rules derived. This can be a burden, especially when the input data is high-dimensional. Pruning unnecessary rules therefore becomes essential for the success of this type of classifier. This article proposes a new RI algorithm that reduces the search space for candidate rules by pruning irrelevant items early, while the classifier is being built. Whenever a rule is generated, our algorithm updates the frequencies of the candidate items to reflect the data examples discarded with the rules already derived. This makes item frequencies dynamic rather than static and ensures that irrelevant rules are deleted at an early stage, when they no longer have enough data representation. The major benefit is a concise set of decision-making rules that are easy for the decision maker to understand and control. The proposed algorithm has been implemented in the WEKA (Waikato Environment for Knowledge Analysis) environment, so it can be utilised by different types of users such as managers, researchers and students. Experimental results using real data from the security domain as well as sixteen classification datasets from the University of California Irvine (UCI) repository reveal that the proposed algorithm is competitive in classification accuracy with known RI algorithms. Moreover, the classifiers produced by our algorithm are smaller in size, which increases their possible use in practical applications.

6.
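A ranking-based associative classification algorithm   Cited by: 1 (self: 0, others: 1)
This paper proposes a ranking-based associative classification algorithm. Drawing on the idea that best-rule selection in rule-based classification favors high-accuracy rules, and taking as many rules as possible into account, it remedies a weakness of CBA (Classification Based on Associations), which builds the classifier from only the few rules that cover the training set. The algorithm first uses association rule mining to generate rules whose consequent is a class label, then ranks the rules by length, confidence, support, and lift, deleting during ranking any rule that has no effect on the classification result. The ranked rules plus a default class form the final classifier. Experiments on 20 public UCI datasets show that the proposed algorithm achieves higher average classification accuracy than CBA.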
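The abstract names the ranking criteria but not their precedence or the exact pruning test, so the sketch below fills those in with labeled assumptions: rules are ordered by confidence, support, and lift (descending) and then by shorter antecedent, and a rule counts as having "no effect" if it is never the first matching rule for any training example. The `Rule` type and all function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # set of (attribute, value) items
    label: str             # class label in the consequent
    support: float
    confidence: float
    lift: float


def rank_rules(rules):
    # Assumed precedence: confidence, support, lift (descending), then shorter rule.
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, -r.lift, len(r.antecedent)),
    )


def build_classifier(rules, training):
    """training: list of (items, label). Returns (kept rules in rank order, default class)."""
    ranked = rank_rules(rules)
    fired = set()           # indices of rules that are the first match for some example
    uncovered_labels = []   # labels of examples that no rule matches
    for items, label in training:
        for i, rule in enumerate(ranked):
            if rule.antecedent <= items:
                fired.add(i)
                break
        else:
            uncovered_labels.append(label)
    kept = [rule for i, rule in enumerate(ranked) if i in fired]
    # Default class: majority among uncovered examples, else among all examples.
    pool = uncovered_labels or [label for _, label in training]
    default = Counter(pool).most_common(1)[0][0]
    return kept, default


def classify(kept, default, items):
    """Return the label of the first ranked rule that matches, else the default class."""
    for rule in kept:
        if rule.antecedent <= items:
            return rule.label
    return default
```

Dropping rules that never fire first leaves every training prediction unchanged, which is one concrete reading of deleting rules that do not affect the classification result.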

7.
Ant colony optimization (ACO) algorithms have been successfully applied to data classification, where they aim to discover a list of classification rules. However, because the search in ACO algorithms is essentially random, the lists of classification rules constructed by ACO-based classification algorithms are not fixed and may differ markedly even on the same training set. Those differences are generally ignored, so some beneficial information cannot be mined from the different data sets, which may lower the predictive accuracy. To overcome this shortcoming, this paper proposes a novel ACO-based classification rule discovery algorithm, named AntMinermbc, in which a new model of multiple rule sets is presented to produce multiple lists of rules. Multiple base classifiers are built in AntMinermbc, and each base classifier is expected to remedy the weaknesses of the others, which can improve the predictive accuracy by exploiting the useful information from the various base classifiers. A new heuristic function for ACO is also designed in our algorithm; it considers both correlation and coverage in order to avoid deceptively high accuracy. The performance of our algorithm is studied experimentally on 19 publicly available data sets and compared with several state-of-the-art classification approaches. The experimental results show that the predictive accuracy obtained by our algorithm is statistically higher than that of the compared methods.

8.
Building a highly-compact and accurate associative classifier   Cited by: 1 (self: 1, others: 0)
Associative classification has attracted significant research attention in recent years because it expresses knowledge as rules while achieving satisfactory accuracy. However, the rules in associative classifiers derived from typical association rule mining (e.g., Apriori-type) can easily become too numerous to understand and are sometimes redundant or conflicting. To address these issues, a recently proposed approach (i.e., GARC) appears to be superior to other existing approaches (e.g., C4.5-type, NN, SVM, CBA) in two respects: its classification accuracy is equally satisfactory, and its classifier is more compact, consisting of far fewer rules. Following this line of methodological thinking, this paper presents a novel GARC-type approach, namely GEAR, to build an associative classifier with three distinctive and desirable features. First, the rules in the GEAR classifier are more intuitively appealing; second, the GEAR classification accuracy is improved or at least as good as that of other approaches; and third, the GEAR classifier is significantly more compact in size. To this end, a number of notions, including rule redundancy and compact set, are provided, together with related properties that can be incorporated into the rule mining process as algorithmic pruning strategies. The experimental results on benchmark datasets also show that GEAR outperforms GARC and other approaches.

9.
Fang Min, Wang Baoshu. Computer Science (《计算机科学》), 2003, 30(10): 52-54
The fuzzy associative classifier is investigated in this paper. Design methods for a fuzzy associative classifier trained with a genetic algorithm are presented. The method trains the weights and back terms to obtain classification rules automatically. Radar radiant points are classified using this algorithm, and the simulation results show that the method has higher identification precision than existing fuzzy classifiers.

10.
Association rule mining is a data mining technique for discovering useful and novel patterns or relationships in databases. These rules are simple to infer, intuitive, and easy to use for classification in any domain that requires explanation of, and investigation into, how the classification works. Examples of such areas are medicine, agriculture, and education. For such a system to find wide adoption, its output should be correct and comprehensible. The amount of data has been growing very fast, and so has the search space of these problems, so traditional methods need to change. This paper discusses a rule mining classifier called DA-AC (dynamic adaptive-associative classifier), which is based on a Dynamic Particle Swarm Optimizer. Owing to its seeding method, exemplar selection, adaptive parameters, dynamic reconstruction of regions, and velocity update, it avoids premature convergence and provides a better value in every dimension. Quality is evaluated both for individual rules and for entire rulesets. Experiments were conducted on fifteen benchmark datasets to evaluate the performance of the proposed algorithm in comparison with six other state-of-the-art non-associative classifiers and eight associative classifiers. The results demonstrate the competitive performance of the proposed DA-AC in terms of predictive accuracy and number of mined patterns. The method was then applied to predict the life expectancy of postoperative thoracic surgery patients.

11.
Security administrators need to prioritise which features to focus on amidst the various possibilities and avenues of attack, especially via Web services in e-commerce applications. This study addresses the feature selection problem by proposing a predictive fuzzy associative rule model (FARM). FARM validates inputs by segregating anomalies based on fuzzy associative patterns discovered from five attributes in the intrusion datasets. These associative patterns lead to the discovery of a set of 18 interesting rules at 99% confidence and, subsequently, to categorisation into not only the "certainly allow" and "certainly deny" access decision classes but also a "probably deny" class. FARM provides 99% classification accuracy and a false alarm rate of less than 1%. Our findings indicate two benefits of using fuzzy datasets. First, fuzzy datasets enable the discovery of fuzzy association patterns, fuzzy association rules, and more sensitive classification. Second, although the root mean squared error (RMSE) and classification accuracy for fuzzy and crisp datasets do not differ much when using the Random Forest classifier, the fuzzy datasets perform much better when other classifiers are used with an increasing number of instances. Future research will involve experimentation on bigger data sets and on different data types.

12.
Associative classification is a new classification approach integrating association mining and classification. It has become a significant tool for knowledge discovery and data mining. However, high-order association mining is time-consuming when the number of attributes becomes large. The recent development of the AdaBoost algorithm indicates that boosting simple rules can often achieve better classification results than using complex rules. In view of this, we apply the AdaBoost algorithm to an associative classification system to reduce learning time and improve accuracy. In addition to exploring the advantages of the boosted associative classification system, this paper also proposes a new weighting strategy for voting among multiple classifiers.
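To make "boosting simple rules" concrete, here is a minimal sketch of standard binary AdaBoost in which each weak learner is a single-item rule. It uses the textbook AdaBoost vote weight, not the new weighting strategy the paper proposes (which the abstract does not describe); the function names and the +1/-1 label encoding are assumptions.

```python
import math


def rule_vote(item, sign, instance):
    """Single-item rule: vote `sign` if the item is present, otherwise the opposite."""
    return sign if item in instance else -sign


def adaboost_rules(data, labels, candidates, rounds=5):
    """data: list of item sets; labels: +1 or -1. Returns [(alpha, item, sign), ...]."""
    n = len(data)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the single-item rule with the lowest weighted error.
        best = None
        for item in candidates:
            for sign in (1, -1):
                err = sum(
                    w for w, x, y in zip(weights, data, labels)
                    if rule_vote(item, sign, x) != y
                )
                if best is None or err < best[0]:
                    best = (err, item, sign)
        err, item, sign = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        if err >= 0.5:
            break  # no rule is better than chance under the current weights
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, item, sign))
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        weights = [
            w * math.exp(-alpha * y * rule_vote(item, sign, x))
            for w, x, y in zip(weights, data, labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, instance):
    """Weighted vote of all boosted rules."""
    score = sum(alpha * rule_vote(item, sign, instance) for alpha, item, sign in ensemble)
    return 1 if score >= 0 else -1
```

Each round only scans one-item antecedents, which is why boosting can avoid the cost of mining high-order associations.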

13.
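This paper analyzes existing association-rule-based classification algorithms, summarizes the shortcomings of general association rule classification, and proposes a new method for constructing classifiers based on association rule mining. The method addresses the problems of traditional algorithms, which generate too many rules and yield classification models that are hard to understand.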

14.
The comprehensibility aspect of rule discovery is of emerging interest in the realm of knowledge discovery in databases. Of the many cognitive and psychological factors relating to the comprehensibility of knowledge, we focus on the use of human-amenable concepts as a representation language for expressing classification rules. Existing work in neural logic networks (or neulonets) provides the impetus for our research; its strength lies in its ability to learn and represent complex human logic in decision-making using symbolic, interpretable net rules. A novel technique is developed for neulonet learning by composing net rules using genetic programming. Coupled with a sequential covering approach for generating a list of neulonets, the straightforward extraction of human-like logic rules from each neulonet provides an alternative perspective on the greater extent of knowledge that can potentially be expressed and discovered, while the entire list of neulonets together constitutes an effective classifier. We show how the sequential covering approach is analogous to association-based classification, leading to the development of an association-based neulonet classifier. An empirical study shows that associative classification integrated with the genetic construction of neulonets performs better than general association-based classifiers, with higher accuracies and smaller rule sets. This is due to the richness of logic expression inherent in the neulonet learning paradigm.

15.
In Spatial Data Mining, the spatial dimension adds substantial complexity to the data mining task. First, spatial objects are characterized by a geometrical representation and relative positioning with respect to a reference system, which implicitly define both spatial relationships and properties. Second, spatial phenomena are characterized by autocorrelation, i.e., observations of spatially distributed random variables are not location-independent. Third, spatial objects can be considered at different levels of abstraction (or granularity). The recently proposed SPADA algorithm deals with all these sources of complexity, but it offers a solution only for the task of spatial association rule discovery. In this paper, the problem of mining spatial classifiers is addressed by building an associative classification framework on SPADA. We consider two alternative solutions for associative classification: a propositional and a structural method. In the former, SPADA obtains a propositional representation of training data even in spatial domains that are inherently non-propositional, thus allowing the application of traditional data mining algorithms. In the latter, the Bayesian framework is extended following a multi-relational data mining approach in order to cope with spatial classification tasks. Both methods are evaluated and compared on two real-world spatial datasets, and the results provide several empirical insights into them.

16.
A genetic algorithm-based rule extraction system   Cited by: 1 (self: 0, others: 1)
Individual classifiers predict unknown objects. However, they are usually domain-specific and do not scale well to data sets that are very large, high-dimensional, or imbalanced in class distribution. This article introduces an accuracy-based learning system called DTGA (decision tree and genetic algorithm) that aims to improve prediction accuracy on any classification problem irrespective of domain, size, dimensionality, and class distribution. More specifically, the proposed system consists of two rule-inducing phases. In the first phase, a base classifier, C4.5 (a decision-tree-based rule inducer), is used to produce rules from the training data set; in the second phase, a GA (genetic algorithm) refines them with the aim of providing more accurate, higher-performance rules for prediction. The system has been compared with competent non-GA-based systems (neural network, Naïve Bayes, a rule-based classifier using rough set theory, and C4.5, i.e., the base classifier of DTGA) on a number of benchmark datasets collected from the UCI (University of California at Irvine) machine learning repository. Empirical results demonstrate that the proposed hybrid approach provides marked improvement in a number of cases.

17.
Mining fuzzy association rules for classification problems   Cited by: 3 (self: 0, others: 3)
Effective data mining techniques for discovering knowledge from training samples for classification problems in industrial engineering are needed in applications such as group technology. This paper proposes a learning algorithm, which can be viewed as a knowledge acquisition tool, to effectively discover fuzzy association rules for classification problems. The consequent part of each rule is one class label. The proposed learning algorithm consists of two phases: one generates large fuzzy grids from training samples by fuzzy partitioning of each attribute, and the other generates fuzzy association rules for classification problems from the large fuzzy grids. The algorithm is implemented by scanning the training samples stored in a database only once and applying a sequence of Boolean operations to generate fuzzy grids and fuzzy rules; it can therefore be easily extended to discover other types of fuzzy association rules. The simulation results on the iris data demonstrate that the proposed learning algorithm can effectively derive fuzzy association rules for classification problems.
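As a rough illustration of the first phase, the sketch below partitions each attribute into evenly spaced triangular fuzzy sets and computes the fuzzy support of one fuzzy grid (one linguistic term per chosen attribute). The abstract does not state the membership shape or the support formula, so the triangular sets, the product of memberships, and all names here are assumptions.

```python
def triangular_memberships(x, lo, hi, k):
    """Membership of x in each of k evenly spaced triangular fuzzy sets on [lo, hi]."""
    step = (hi - lo) / (k - 1)
    return [max(0.0, 1.0 - abs(x - (lo + j * step)) / step) for j in range(k)]


def fuzzy_support(samples, grid, bounds, k):
    """Mean, over all samples, of the product of memberships in the grid's terms.

    grid maps attribute index -> index of the linguistic term on that attribute.
    bounds[attribute] is the (lo, hi) range of that attribute.
    """
    total = 0.0
    for sample in samples:
        degree = 1.0
        for attribute, term in grid.items():
            lo, hi = bounds[attribute]
            degree *= triangular_memberships(sample[attribute], lo, hi, k)[term]
        total += degree
    return total / len(samples)


def is_large(samples, grid, bounds, k, min_support):
    """A fuzzy grid is 'large' when its fuzzy support reaches the threshold."""
    return fuzzy_support(samples, grid, bounds, k) >= min_support
```

For example, with `k=3` on the range 0 to 10 the value 2.5 belongs to the first two sets with degree 0.5 each, so a sample can contribute partially to several grids at once.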

18.
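A distributed assisted associative classification algorithm for big data environments   Cited by: 4 (self: 0, others: 4)
Zhang Mingwei, Zhu Zhiliang, Liu Ying, Zhang Bin. Journal of Software (《软件学报》), 2015, 26(11): 2795-2810
In many real-world classification applications, the class label of new data must ultimately be determined by a domain expert, and the classifier's output plays only an assisting role. In addition, as the value hidden in big data receives growing attention, classifier training will gradually shift from single datasets to spatially distributed datasets, and assisted classification in big data environments will become an important branch of future classification applications. Existing classification research, however, pays little attention to such applications. Assisted classification in big data environments faces three problems: 1) the training set is a distributed big dataset; 2) spatially, the class distributions of the local data sources that make up the training set differ; 3) temporally, the training set changes dynamically and class shift occurs. Taking these problems into account, this paper proposes a distributed assisted associative classification method for big data environments. The method first gives an algorithm for building a distributed associative classifier in a big data environment; the algorithm accounts for spatial differences in class distribution through horizontal weighting and introduces an "antecedent spatial support - correlation coefficient" measurement framework to remedy the weakness of associative classification on imbalanced data. It then gives an adaptation-factor-based method for dynamically adjusting the assisted associative classifier, which makes full use of real-time feedback from domain experts while the classifier is in use, improving its performance on dynamic datasets and reducing classifier degradation and the frequency of retraining. Experimental results show that the method can quickly train an associative classifier with high classification accuracy on distributed datasets and improves classification performance as the dataset keeps growing and changing, making it an effective method for assisted classification in big data environments.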

19.
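Associative classification usually generates a large number of classification rules, which often leads to rule conflicts when classifying new instances. To address this rule-conflict problem, this paper proposes a two-stage learning framework based on improved associative classification. The associative classification algorithm is improved by generating classification rules from itemsets that are frequent and mutually associated, which effectively reduces the number of rules. The first-level rules produced by the improved algorithm are applied to separate out, in one pass, all training instances on which rules conflict. The improved algorithm is then applied to the conflicting instances in a second round of learning to obtain the second-level rules. When classifying a new instance, the first-level rules are used first; if a rule conflict occurs, the second-level rules are used to classify the instance. Experimental results show that the two-stage learning method based on improved associative classification lowers the rule-conflict rate and significantly improves classification accuracy.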
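The control flow of the two-stage framework can be sketched as follows. Rule mining itself is out of scope here: rules are assumed to be given as `(antecedent, label)` pairs, a conflict is taken to mean that matching rules predict different classes, and falling back to a default class when the second level is also inconclusive is an assumption not stated in the abstract.

```python
def matching_labels(rules, items):
    """Set of class labels predicted by all rules whose antecedent matches."""
    return {label for antecedent, label in rules if antecedent <= items}


def split_conflicts(level1, training):
    """Separate training examples into (conflict-free, conflicting) under level-1 rules."""
    clean, conflicting = [], []
    for items, label in training:
        target = conflicting if len(matching_labels(level1, items)) > 1 else clean
        target.append((items, label))
    return clean, conflicting


def classify_two_level(level1, level2, items, default):
    """Use level-1 rules; on a conflict, defer to level-2 rules learned from conflicts."""
    first = matching_labels(level1, items)
    if len(first) == 1:
        return next(iter(first))
    if len(first) > 1:
        second = matching_labels(level2, items)
        if len(second) == 1:
            return next(iter(second))
    # No rule matched, or both levels were inconclusive (assumed fallback).
    return default
```

In this reading, `split_conflicts` yields the subset on which the second round of rule mining would be run to produce `level2`.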

20.
An increasing number of computational and statistical approaches have been used for text classification, including nearest-neighbor classification, naïve Bayes classification, support vector machines, decision tree induction, rule induction, and artificial neural networks. Among these approaches, naïve Bayes classifiers have been widely used because of their simplicity. Owing to the simplicity of the Bayes formula, the naïve Bayes classification algorithm requires a relatively small amount of training data and less time in both the training and classification stages than other classifiers. However, a major shortcoming of this technique is that the classifier picks the single highest-probability category as the one to which the document is assigned. Doing this is tantamount to classifying using only one dimension of a multi-dimensional data set. The main aim of this work is to utilize the strengths of the self organizing map (SOM) to overcome the inadvertent dimensionality reduction that results from using only the Bayes formula to classify. Combining the hybrid system with new ranking techniques further improves the performance of the proposed document classification approach. This work describes the implementation of an enhanced hybrid classification approach that affords better classification accuracy by combining two familiar algorithms: the naïve Bayes classification algorithm, which is used to vectorize the document using a probability distribution, and the self organizing map (SOM) clustering algorithm, which is used as the multi-dimensional unsupervised classifier.
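The vectorization step can be illustrated with a small multinomial naïve Bayes model that returns the full posterior distribution over classes instead of only the top category. This is a sketch under assumptions (Laplace smoothing, bag-of-words tokens, hypothetical function names); the SOM stage that would consume these vectors is not shown.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (tokens, label). Returns (class counts, per-class word counts, vocabulary)."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def posterior_vector(tokens, model):
    """Return (sorted class names, posterior probability per class) for one document."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    classes = sorted(class_counts)
    log_scores = []
    for c in classes:
        total_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / n_docs)  # log prior
        for t in tokens:
            if t in vocab:  # ignore words never seen in training
                # Laplace-smoothed log likelihood (assumed smoothing scheme).
                score += math.log((word_counts[c][t] + 1) / (total_words + len(vocab)))
        log_scores.append(score)
    # Normalize in log space to avoid underflow on long documents.
    peak = max(log_scores)
    exps = [math.exp(s - peak) for s in log_scores]
    norm = sum(exps)
    return classes, [e / norm for e in exps]
```

Feeding the whole vector, rather than its argmax, to a downstream clusterer keeps the information about how close the runner-up categories were.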
