Found 20 similar articles; search took 765 ms.
1.
2.
3.
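Case retrieval is the central step of a case-based reasoning (CBR) system, and retrieval quality determines the quality of the whole system. This work combines a genetic algorithm (GA) with the analytic hierarchy process (AHP) to optimize case retrieval in three respects: the case base, attribute reduction, and weight determination. Exploiting the GA's strength in search optimization, a two-dimensional encoding is combined with the weights to form a three-dimensional optimization, and the weights are learned using experience and an intermediate weight table, thereby improving the retrieval hit rate. The model was applied in experiments on a tourism-oriented multi-strategy data mining system, and the results show a clear improvement in the case retrieval hit rate.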
4.
We are witnessing the era of big data computing, where computing resources are becoming the main bottleneck in dealing with large datasets. In the case of high-dimensional data, where each view of the data is of high dimensionality, feature selection is necessary to further improve clustering and classification results. In this paper, we propose a new feature selection method, the Incremental Filtering Feature Selection (IF2S) algorithm, and a new clustering algorithm, the Temporal Interval based Fuzzy Minimal Clustering (TIFMC) algorithm, which employ the Fuzzy Rough Set for selecting an optimal subset of features and for effective grouping of large volumes of data, respectively. An extensive experimental comparison of the proposed method and other methods is done using four different classifiers. The proposed algorithms yield promising results in feature selection, clustering, and classification accuracy in the field of biomedical data mining.
5.
6.
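Rough set theory is an effective tool for information processing, and attribute reduction of decision tables is one of its core research topics. Drawing on rough set theory, this paper proposes a decision-table attribute reduction algorithm based on inclusion degree. Compared with existing decision-table attribute reduction algorithms, it has lower complexity and greater practicality. Finally, reduction experiments on examples from the UCI Machine Learning Repository show that it achieves fairly satisfactory results.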
7.
A New Attribute Reduction Method Based on Attribute Relevance  Total citations: 7 (self-citations: 0, citations by others: 7)
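This paper gives a new definition of attribute relevance based on rough set theory and, on that basis, a new attribute reduction method. The algorithm not only filters irrelevant attributes out of the attribute set but also effectively finds the redundant attributes in it, thereby yielding a satisfactory attribute reduction. Tests on UCI machine learning data sets confirm the algorithm's effectiveness.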
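As a rough illustration of what attribute reduction on a decision table involves, the sketch below uses the textbook rough-set dependency degree (the fraction of rows whose equivalence class has a single decision value) with a greedy forward search. This is not the relevance measure defined in the paper above, which the abstract does not spell out; the function names, the toy table, and the attribute names `a`, `b`, `c`, `d` are all assumptions made for the example.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, decision):
    """Dependency degree: share of rows lying in blocks with one decision value."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, cond_attrs, decision):
    """Add the attribute that raises dependency most until the full-set value is reached."""
    target = dependency(rows, cond_attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        best = max(
            (a for a in cond_attrs if a not in reduct),
            key=lambda a: dependency(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


# Hypothetical decision table: d depends on a, b is irrelevant, c duplicates a.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
reduct = greedy_reduct(rows, ["a", "b", "c"], "d")
print(reduct)
```

In this toy table the search keeps only `a`: the irrelevant attribute `b` never raises the dependency degree, and the redundant attribute `c` adds nothing once `a` has been selected.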
8.
Luiz S. Oliveira Marisa Morita Robert Sabourin 《International Journal on Document Analysis and Recognition》2006,8(4):262-279
Feature selection for ensembles has been shown to be an effective strategy for ensemble creation due to its ability to produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underpinning paradigm is “overproduce and choose”. The algorithm operates at two levels. First, it performs feature selection in order to generate a set of classifiers, and then it chooses the best team of classifiers. In order to show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we have considered the problem of handwritten digit recognition and used three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we took into account the problem of handwritten month word recognition and used three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrated that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates. Comparisons have been done by considering the recognition rates only.
9.
10.
11.
12.
《Expert systems with applications》2014,41(17):8003-8015
The current research presents a methodology for classification based on Mahalanobis Distance (MD) and Association Mining using Rough Sets Theory (RST). MD has been used in the Mahalanobis Taguchi System (MTS) to develop a classification scheme for systems having dichotomous states or categories. In MTS, selection of important features or variables to improve classification accuracy is done using Signal-to-Noise (S/N) ratios and Orthogonal Arrays (OAs). OAs have been reviewed for limitations in handling a large number of variables. Secondly, a penalty for over-fitting, or regularization, is not included in the feature selection process for the MTS classifier. Besides, there is scope to enhance the utility of MTS to a classification-cum-causality analysis method by adding comprehensive information about the underlying process which generated the data. This paper proposes to select variables based on maximization of the degree-of-dependency between Subsets of System Variables (SSVs) and system classes or categories (R). Degree-of-dependency, which reflects goodness-of-model and hence goodness of the SSV, is measured by the conditional probability of system states on a subset of variables. Moreover, a suitable regularization factor equivalent to the L0 norm is introduced in an optimization problem which jointly maximizes goodness-of-model and the effect of regularization. Dependency between SSVs and R is modeled via the equivalent sets of Rough Set Theory. Two new variants of the MTS classifier are developed, and their performance in terms of classification accuracy is evaluated on test datasets from five case studies. The proposed variants of MTS are observed to perform better than existing MTS methods and other classification techniques found in the literature.
13.
14.
《Engineering Applications of Artificial Intelligence》2003,16(1):39-43
In contingency management of a complex system, identification of error conditions, or fault diagnosis, is a very important stage. It determines the methods and techniques to be applied in the following stages of contingency management. In this paper, Rough Set Theory is used as a new fault-diagnosing tool to identify valve faults in a multi-cylinder diesel engine. This method overcomes the shortcoming of conventional methods, where each method of fault diagnosis on a diesel engine can only provide one corresponding fault category. By analysis of the final reducts generated using Rough Set Theory, it is shown that this new method is effective for valve fault diagnosis and is a powerful new tool that can be applied in contingency management.
15.
A genetic algorithm-based method for feature subset selection  Total citations: 5 (self-citations: 2, citations by others: 3)
Feng Tan Xuezheng Fu Yanqing Zhang Anu G. Bourgeois 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2008,12(2):111-120
As a commonly used technique in data preprocessing, feature selection selects a subset of informative attributes or variables to build models describing data. By removing redundant and irrelevant or noise features, feature selection can improve the predictive accuracy and the comprehensibility of the predictors or classifiers. Many feature selection algorithms with different selection criteria have been introduced by researchers. However, it is discovered that no single criterion is best for all applications. In this paper, we propose a framework based on a genetic algorithm (GA) for feature subset selection that combines various existing feature selection methods. The advantages of this approach include the ability to accommodate multiple feature selection criteria and find small subsets of features that perform well for a particular inductive learning algorithm of interest to build the classifier. We conducted experiments using three data sets and three existing feature selection methods. The experimental results demonstrate that our approach is a robust and effective approach to find subsets of features with higher classification accuracy and/or smaller size compared to each individual feature selection algorithm.
16.
Uncertainty Measure of Rough Fuzzy Sets  Total citations: 8 (self-citations: 1, citations by others: 7)
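Rough set theory is a mathematical theory for effectively handling imprecise, uncertain, and vague information, and in recent years it has been widely applied in machine learning, data mining, and intelligent data analysis. Combining knowledge roughness with information entropy, this paper presents an uncertainty measure for rough fuzzy sets (RF sets).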
17.
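Research on a Supplier Selection Decision Support System Based on Case-Based Reasoning  Total citations: 10 (self-citations: 1, citations by others: 10)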
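After introducing the basic principles of case-based reasoning, this paper analyzes the working principle, framework, and functions of a supplier selection decision support system built on case-based reasoning. It focuses on several key steps in such a system and, using a worked example, presents a case-based method for selecting and evaluating suppliers. The example is used to verify the feasibility and effectiveness of applying case-based reasoning in a supplier selection decision support system, providing a system model for enterprise supplier selection decisions.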
18.
Enislay Ramentol Yailé Caballero Rafael Bello Francisco Herrera 《Knowledge and Information Systems》2012,33(2):245-265
Imbalanced data is a common problem in classification. This phenomenon is growing in importance since it appears in most real domains. It has special relevance to highly imbalanced data-sets (when the ratio between classes is high). Many techniques have been developed to tackle the problem of imbalanced training sets in supervised learning. Such techniques have been divided into two large groups: those at the algorithm level and those at the data level. The data-level techniques that have been emphasized are those that try to balance the training sets by reducing the larger class through the elimination of samples or increasing the smaller one by constructing new samples, known as undersampling and oversampling, respectively. This paper proposes a new hybrid method for preprocessing imbalanced data-sets through the construction of new samples, using the Synthetic Minority Oversampling Technique together with the application of an editing technique based on the Rough Set Theory and the lower approximation of a subset. The proposed method has been validated by an experimental study showing good results using C4.5 as the learning algorithm.
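To make the idea of editing by lower approximation concrete, here is a minimal sketch that keeps only the samples whose whole indiscernibility class carries the minority label. It assumes discrete feature vectors compared by exact equality, which is a simplification; the abstract above does not state which relation the authors use, and the function name, labels, and data are hypothetical.

```python
from collections import defaultdict


def lower_approximation(samples, labels, target):
    """Return indices of samples whose indiscernibility class is entirely labeled target."""
    blocks = defaultdict(list)
    for i, sample in enumerate(samples):
        # Samples with identical feature vectors are treated as indiscernible.
        blocks[tuple(sample)].append(i)
    lower = []
    for block in blocks.values():
        if all(labels[i] == target for i in block):
            lower.extend(block)
    return sorted(lower)


# Hypothetical data: minority ("min") and majority ("maj") samples.
samples = [(1, 0), (1, 0), (0, 1), (0, 1), (1, 1)]
labels = ["min", "maj", "min", "min", "maj"]
kept = lower_approximation(samples, labels, "min")
print(kept)
```

Sample 0 is dropped even though it is labeled `min`, because an identical feature vector also appears with the majority label; only samples 2 and 3 certainly belong to the minority class.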
19.
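Combining rough sets with pansystems theory has become an emerging research area. Based on pansystems concepts such as the pan-weighted field/network, this paper summarizes and extends the basic concepts of rough set theory, studies the pansystems extension of rough set theory, and then constructs a pansystems-extended model of rough sets, explained through examples, opening a new path for the further improvement and extension of rough sets.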
20.
A Case-Based Explanation System for Black-Box Systems  Total citations: 4 (self-citations: 0, citations by others: 4)
Most users of machine-learning products are reluctant to use them without any sense of the underlying logic that has led to the system’s predictions. Unfortunately, many of these systems lack any transparency in the way they operate and are deemed to be black boxes. In this paper we present a Case-Based Reasoning (CBR) solution to providing supporting explanations of black-box systems. This CBR solution has two key facets: it uses local information to assess the importance of each feature and, using this, it selects the cases from the data used to build the black-box system for use in explanation. The retrieval mechanism takes advantage of the derived feature importance information to help select cases that are a better reflection of the black-box solution and thus more convincing explanations.