Similar literature
 20 similar documents found (search time: 546 ms)
1.
In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can significantly improve the overall performance of meta-learning for algorithm ranking. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.

2.
Mining association rules plays an important role in data mining and knowledge discovery since it can reveal strong associations between items in databases. Nevertheless, an important problem with traditional association rule mining methods is that they can generate a huge number of association rules depending on how parameters are set. Users, however, are often interested only in the strongest rules, and do not want to go through a large number of rules or wait for these rules to be generated. To address those needs, algorithms have been proposed to mine the top-k association rules in databases, where users can directly set a parameter k to obtain the k most frequent rules. However, a major issue with these techniques is that they remain very costly in terms of execution time and memory. To address this issue, this paper presents a novel algorithm named ETARM (Efficient Top-k Association Rule Miner) to efficiently find the complete set of top-k association rules. The proposed algorithm integrates two novel candidate pruning properties to reduce the search space more effectively. These properties are applied during the candidate selection process to identify items that should not be used to expand a rule based on its confidence, thereby reducing the number of candidates. An extensive experimental evaluation on six standard benchmark datasets shows that the proposed approach outperforms the state-of-the-art TopKRules algorithm in terms of both runtime and memory usage.
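As a point of reference for the top-k formulation described in this abstract, the following is a minimal brute-force sketch in Python. It is not the ETARM algorithm and applies none of its pruning properties. It assumes that rules are ranked by support count subject to a minimum confidence `minconf`; the function name `top_k_rules`, the tuple layout of a rule, and the tie-breaking order are illustrative assumptions rather than anything stated in the source.

```python
from itertools import combinations
import heapq


def top_k_rules(transactions, k, minconf):
    """Brute-force top-k association rules ranked by support count.

    Illustrative only: every itemset is enumerated, so the cost is
    exponential in the number of distinct items (no pruning at all).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Number of transactions containing every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(itemset)
            if sup == 0:
                continue
            # Split the itemset into every antecedent/consequent pair.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    ante_set = frozenset(ante)
                    conf = sup / support(ante_set)
                    if conf >= minconf:
                        cons = tuple(sorted(itemset - ante_set))
                        rules.append((sup, conf, ante, cons))
    # Keep the k rules with the highest support; ties fall back to
    # confidence and then to the rule itself, so the result is deterministic.
    return heapq.nlargest(k, rules)
```

Each returned tuple is `(support_count, confidence, antecedent, consequent)`. The gap between this exhaustive enumeration and a practical miner is exactly the search-space reduction the abstract attributes to candidate pruning.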

3.
Artificial neural networks (ANNs) are among the most widely used techniques in classification data mining. Although ANNs can achieve very high classification accuracies, their explanation capability is very limited. Therefore, one of the main challenges in using ANNs in data mining applications is to extract explicit knowledge from them. Based on this motivation, a novel approach is proposed in this paper for generating classification rules from feed-forward ANNs. Although there are several approaches in the literature for classification rule extraction from ANNs, the present approach is fundamentally different from them. In previous studies, ANN training and rule extraction are generally performed independently in a sequential (hierarchical) manner. In the present study, however, the training and rule extraction phases are integrated within a multiple objective evaluation framework for generating accurate classification rules directly. The proposed approach makes use of the differential evolution algorithm for training and the touring ant colony optimization algorithm for rule extraction. The proposed algorithm is named DIFACONN-miner. An experimental study on benchmark data sets and comparisons with some other classical and state-of-the-art rule extraction algorithms have shown that the proposed approach has great potential to discover more accurate and concise classification rules.

4.
Current trends clearly indicate that online learning has become an important learning mode. However, no effective assessment mechanism for learning performance yet exists for e-learning systems. Learning performance assessment aims to evaluate what learners learned during the learning process. Traditional summative evaluation considers only final learning outcomes and ignores the learning processes of learners. With the evolution of learning technology, learning portfolios in a web-based learning environment can be used to record the learning process, evaluate the learning performance of learners, and produce feedback that enhances their learning. Accordingly, this study presents a mobile formative assessment tool using data mining, which involves six computational intelligence theories, i.e. statistical correlation analysis, fuzzy clustering analysis, grey relational analysis, K-means clustering, fuzzy association rule mining and fuzzy inference, in order to identify the key formative assessment rules from the web-based learning portfolios of an individual learner and so improve web-based learning performance. In other words, the proposed method can help teachers to precisely assess the learning performance of individual learners using only the learning portfolios in a web-based learning environment. Hence, teachers can devote themselves to teaching and designing courseware, since they save a lot of time in measuring learning performance. More importantly, teachers can understand the main factors influencing learning performance in a web-based learning environment based on the interpretable learning performance assessment rules obtained. Experimental results indicate that the evaluation results of the proposed scheme are very close to those of summative assessment and that the factor analysis provides simple and clear learning performance assessment rules. Furthermore, the proposed learning feedback with formative assessment can clearly promote the learning performance and interest of learners.

5.
A large volume of research in temporal data mining focuses on discovering temporal rules from time-stamped data. The majority of the methods proposed so far have been devoted mainly to mining temporal rules that describe relationships between data sequences or instantaneous events, and do not consider the presence of complex temporal patterns in the dataset. Such complex patterns, for example trends or up-and-down behaviors, are often very interesting to users. In this paper we propose a new kind of temporal association rule and the related extraction algorithm; the learned rules involve complex temporal patterns in both their antecedent and consequent. Within our proposed approach, the user defines a set of complex patterns of interest that constitute the basis for the construction of the temporal rule; such complex patterns are represented and retrieved in the data through the formalism of knowledge-based Temporal Abstractions. An Apriori-like algorithm then looks for meaningful temporal relationships (in particular, precedence temporal relationships) among the complex patterns of interest. The paper presents the results obtained by the rule extraction algorithm on a simulated dataset and on two different datasets related to biomedical applications: the first concerns the analysis of time series coming from the monitoring of different clinical variables during hemodialysis sessions, while the other deals with the biological problem of inferring relationships between genes from DNA microarray data.

6.
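Association rule mining can uncover knowledge of interest hidden among spatial data. Spatial data come in diverse formats and large volumes, so existing algorithms are not well suited to them. Using RSI and yield maps as data sources, this paper proposes a two-stage spatial association rule mining algorithm based on image segmentation, which mines spatial association rules among image pixel color values. Algorithm analysis and experiments show that the algorithm is effective and feasible.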

7.
Simple association rules (SAR) and the SAR-based rule discovery   Total citations: 13, self-citations: 0, citations by others: 13
Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rule explosion is a problem of concern, as conventional mining algorithms often produce too many rules for decision makers to digest. Instead, this paper concentrates on a smaller set of rules, namely, a set of simple association rules each with its consequent containing only a single attribute. Such a rule set can be used to derive all other association rules, meaning that the original rule set based on conventional algorithms can be ‘recovered’ from the simple rules without any information loss. The number of simple rules is much smaller than the number of all rules. Moreover, corresponding algorithms are developed such that certain forms of rules (e.g. ‘P?’ or ‘?Q’) can be generated more efficiently based on simple rules.
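To make the restricted rule form in this abstract concrete, here is a minimal Python sketch that enumerates rules whose consequent is a single item and reports their support and confidence. It illustrates the rule form only; it is not the paper's algorithm and does not show how other rules are derived from simple ones. The function name `simple_rules`, the thresholds `minsup` and `minconf`, and the grocery items in the usage example are assumptions made for illustration.

```python
from itertools import combinations


def simple_rules(transactions, minsup, minconf):
    """Enumerate rules X -> y whose consequent y is a single item.

    Support is the fraction of transactions containing X and y;
    confidence is support(X and y) / support(X).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    found = []
    for size in range(1, len(items)):
        for ante in combinations(items, size):
            ante_set = frozenset(ante)
            ante_sup = support(ante_set)
            if ante_sup == 0 or ante_sup < minsup:
                continue
            for y in items:
                if y in ante_set:
                    continue
                rule_sup = support(ante_set | {y})
                conf = rule_sup / ante_sup
                if rule_sup >= minsup and conf >= minconf:
                    found.append((ante, y, rule_sup, conf))
    return found
```

For example, on the four transactions `{bread, milk}`, `{bread, butter}`, `{bread, milk, butter}` and `{milk}`, with `minsup=0.5` and `minconf=0.9`, the only rule kept is `butter -> bread`, with support 0.5 and confidence 1.0.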

8.
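Rule extraction over a single-level hierarchy suffers from low extraction accuracy and long running times, and struggles to meet user requirements. To address these problems, a quantitative data mining algorithm based on improved multi-level fuzzy association rules is proposed. Using frequent itemsets, the algorithm forms a top-down mining process through progressively deepening iterations. It integrates fuzzy set theory, data mining algorithms, and multi-level taxonomy techniques to find fuzzy association rules in transaction datasets, uncovering the implicit knowledge carried by quantitative values stored in multi-level transaction databases and meeting users' customized mining needs. Experimental results show that the proposed algorithm has a clear advantage over other algorithms in mining accuracy and running time, and may open up new room for the practical application of multi-level association rule extraction methods.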

9.
The sharp boundary problem is widespread in image classification algorithms that use traditional association rules. This problem makes classification more difficult and less accurate. In addition, massive image data produce many redundant association rules, which greatly decrease the accuracy and efficiency of image classification. To mitigate these two problems, this paper proposes a novel approach that integrates fuzzy association rules and a decision tree to accomplish the task of automatic image annotation. By applying membership functions to the original features, the approach first obtains fuzzy feature vectors, which can describe the ambiguity and vagueness of images. Then fuzzy association rules are generated from the fuzzy feature vectors with fuzzy support and fuzzy confidence. Fuzzy association rules can capture correlations between low-level visual features and high-level semantic concepts of images. Finally, to tackle the large size of the fuzzy association rule base, we adopt a decision tree to remove unnecessary rules. As a result, the algorithm complexity is decreased to a large extent. We conduct experiments on two baseline datasets, i.e. Corel5k and IAPR-TC12. The evaluation measures include precision, recall, F-measure and rule number. The experimental results show that our approach performs better than many state-of-the-art automatic image annotation approaches.
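The notions of membership function, fuzzy support and fuzzy confidence used in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it assumes triangular membership functions and the minimum operator for combining memberships, both common textbook choices that the abstract does not confirm. The function names are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records, fuzzy_items):
    """Mean over records of the minimum membership across `fuzzy_items`.

    `records` is a list of dicts mapping a fuzzy item name to its
    membership degree in [0, 1]; a missing item counts as 0.
    ASSUMPTION: memberships are combined with min (one possible t-norm).
    """
    if not records:
        return 0.0
    total = sum(min(r.get(item, 0.0) for item in fuzzy_items) for r in records)
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Fuzzy support of the whole rule divided by that of the antecedent."""
    ante_sup = fuzzy_support(records, antecedent)
    if ante_sup == 0.0:
        return 0.0
    rule_items = list(antecedent) + list(consequent)
    return fuzzy_support(records, rule_items) / ante_sup
```

Because membership degrees change gradually near a boundary, a feature value just past a threshold still contributes partially to a rule's support, which is how fuzzy rules soften the sharp boundary problem mentioned in the abstract.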

10.
Associative classification (AC), which is based on association rules, has shown great promise over many other classification techniques. To implement AC effectively, we need to tackle the very large search space of candidate rules during the rule discovery process and incorporate the discovered association rules into the classification process. This paper proposes a new approach that we call artificial immune system-associative classification (AIS-AC), which is based on AIS, for mining association rules effectively for classification. Instead of exhaustively searching for all possible association rules, AIS-AC finds, in an evolutionary manner, only a subset of association rules that are suitable for effective AC. In this paper, we also evaluate the performance of the proposed AIS-AC approach for AC based on large datasets. The performance results show that the proposed approach deals efficiently with the complexity of the rule search space while achieving good classification accuracy. This is especially important for mining association rules from large datasets in which the search space of rules is huge.

11.
Image Mining: Trends and Developments   Total citations: 9, self-citations: 0, citations by others: 9
Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to human users. Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. Image mining is more than just an extension of data mining to the image domain. It is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, databases, and artificial intelligence. In this paper, we examine the research issues in image mining and current developments in the field, in particular image mining frameworks and state-of-the-art techniques and systems. We also identify some future research directions for image mining.

12.
One of the major challenges in data mining is the extraction of comprehensible knowledge from recorded data. In this paper, a coevolutionary-based classification technique, namely COevolutionary Rule Extractor (CORE), is proposed to discover classification rules in data mining. Unlike existing approaches where candidate rules and rule sets are evolved at different stages in the classification process, the proposed CORE coevolves rules and rule sets concurrently in two cooperative populations to confine the search space and to produce good rule sets that are comprehensive. The proposed coevolutionary classification technique is extensively validated on seven datasets obtained from the University of California, Irvine (UCI) machine learning repository, which are representative artificial and real-world data from various domains. Comparison results show that the proposed CORE produces comprehensive and good classification rules for most datasets, which are competitive with existing classifiers in the literature. Simulation results presented as box plots also reveal that CORE is relatively robust and invariant to random partitioning of datasets.

13.
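张春生  庄丽艳 《计算机应用》 (Journal of Computer Applications), 2013, 33(10): 2796-2800
The Apriori association rule mining algorithm mines only a single class of related datasets, whereas real-world datasets are numerous and varied; mining across unrelated datasets to expand the number of rules is challenging. Current research on Apriori association rule algorithms focuses mainly on performance optimization and on handling different data forms, and has not broken through the boundary of unrelated datasets. To address this problem, the paper first defines related datasets, unrelated datasets, and compatible datasets, and then presents an Apriori-based algorithm for deducing association rules between compatible datasets within unrelated datasets. The deduction rules of the algorithm are given, and its correctness is proved by construction. An example demonstrates how the method is applied. The algorithm achieves Apriori-based deduction of association rules between compatible datasets, which ordinary data mining algorithms cannot do, and thus extends the application domain of association rule algorithms. Moreover, because the association rules are mined independently on each compatible dataset without exchanging raw data, privacy protection is achieved to a certain extent.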

14.
Association rule mining is one of the most popular data analysis methods for discovering associations within data. Association rule mining algorithms have been applied to various datasets because of their practical usefulness. Little attention has been paid, however, to how association mining techniques can be applied to analyze questionnaire data. Therefore, this paper first identifies the various data types that may appear in a questionnaire. Then, we introduce the questionnaire data mining problem and define the rule patterns that can be mined from questionnaire data. A unified approach is developed based on fuzzy techniques so that all the different data types can be handled in a uniform manner. After that, an algorithm is developed to discover fuzzy association rules from the questionnaire dataset. Finally, we evaluate the performance of the proposed algorithm, and the results indicate that our method is capable of finding interesting association rules that previous mining algorithms would not have found.

15.
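Association rule mining is an important research direction in data mining. This paper gives an overview of association rule mining and genetic algorithms, and proposes an association rule extraction algorithm based on an improved genetic algorithm. Finally, a worked example shows how a genetic algorithm can be used to mine association rules.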

16.
Support vector machines (SVMs) are state-of-the-art tools for classification. However, their limited explanation capability is their main weakness, which is why SVMs are typically regarded as incomprehensible black box models. In the present study, a rule extraction algorithm is proposed to extract comprehensible rules from SVMs and enhance their explanation capability. The proposed algorithm uses the support vectors from a trained SVM model in combination with genetic algorithms to construct rule sets. The proposed method can not only generate rule sets from SVMs based on mixed discrete and continuous variables but can also select important variables in the rule set simultaneously. Measurements of accuracy, sensitivity, specificity, and fidelity are used to compare the performance of the proposed method with direct learner algorithms and several rule-extraction techniques for SVMs. The results indicate that the proposed method performs at least as well as the most successful direct rule learners. Finally, an actual case of pressure ulcers was studied, and the results indicate the practicality of the proposed method in real applications.

17.
In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses or patterns characterizing the input data. If one can assume that training data contain no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and the insistence on the full completeness and consistency of the hypothesis is no longer valid. In such situations, the problem is to determine a hypothesis that represents the best trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a rule quality criterion that combines the rule coverage (a measure of completeness) and training accuracy (a measure of inconsistency). These factors are combined into a single rule quality measure through a lexicographical evaluation functional (LEF). The method has been implemented in the AQ18 learning system for natural induction and pattern discovery, and compared with several other methods. Experiments have shown that the proposed method can be easily tailored to different problems and can simulate different rule learners by modifying the parameter of the rule quality criterion.
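The trade-off between coverage and training accuracy described in this abstract can be illustrated with a small Python sketch. The abstract does not give the exact criterion, so the functional form below, a weighted geometric mean controlled by a single parameter `w`, is an assumption chosen only to show how one parameter can shift emphasis between completeness and consistency. The name `rule_quality` and its arguments are hypothetical, and the sketch does not implement a lexicographical evaluation functional.

```python
def rule_quality(pos_covered, neg_covered, total_pos, w=0.5):
    """Weighted geometric mean of rule coverage and training accuracy.

    ASSUMPTION: this functional form and the weight `w` are illustrative;
    the abstract does not state the exact criterion used in AQ18.

    pos_covered: positive training examples covered by the rule
    neg_covered: negative training examples covered by the rule
    total_pos:   all positive training examples
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    covered = pos_covered + neg_covered
    if total_pos == 0 or covered == 0:
        return 0.0
    coverage = pos_covered / total_pos  # completeness
    accuracy = pos_covered / covered    # consistency on the training data
    return (coverage ** w) * (accuracy ** (1.0 - w))
```

With `w=1.0` the score depends only on coverage, and with `w=0.0` only on training accuracy; intermediate values reward rules that balance the two, which is the kind of tuning the abstract refers to when it mentions modifying the parameter of the rule quality criterion.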

18.
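Association rule mining is an important research direction in data mining. This paper gives an overview of association rule mining and genetic algorithms, and proposes an association rule extraction algorithm based on an improved genetic algorithm. Finally, a worked example shows how a genetic algorithm can be used to mine association rules.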

19.
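董林  舒红 《计算机应用》 (Journal of Computer Applications), 2013, 33(11): 3049-3051
Obtaining interesting and valid spatial association rules usually requires running the mining operation many times, and incremental maintenance algorithms can improve mining efficiency. However, no incremental update algorithm for association rules can yet work directly on spatial data. To solve this problem, the paper analyzes strategies for maintaining rules by filtering or incremental mining after the mining thresholds change or the spatial dataset is updated, and proposes an incremental mining algorithm, ISA, for two cases: a reduced support threshold and added spatial layers. ISA does not depend on building and updating a spatial transaction table and can take spatial layers directly as input. In experiments on real data, ISA produced the same results as an Apriori-like algorithm while reducing running time by 20.0% to 71.0%; in addition, a filtering-based update of 1372772 rules took less than 0.1 s. The experimental results show that the proposed incremental maintenance strategies and algorithm for spatial association rules are feasible, correct, and efficient.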

20.
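桂现才  彭宏 《微机发展》 (Microcomputer Development), 2005, 15(10): 35-38
Discovering association rules among items in large databases is an important data mining problem, and the number of rules mined is often huge. This paper introduces the concepts of simple association rules and primitive association rules (原关联规则). Any rule in the rule set mined by traditional algorithms can be derived from the primitive association rules, and the number of primitive association rules is far smaller than the number of rules mined by traditional algorithms. The paper analyzes and compares simple and primitive association rules, gives an algorithm for mining primitive association rules, and illustrates its execution with an example.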
