
1.

On Taxonomy and Evaluation of Feature Selection‐Based Learning Classifier System Ensemble Approaches for Data Mining Problems

Essam Debie Kamran Shafi Kathryn Merrick Chris Lokan 《Computational Intelligence》2017,33(3):554-578

Ensemble methods aim at combining multiple learning machines to improve the efficacy in a learning task in terms of prediction accuracy, scalability, and other measures. These methods have been applied to evolutionary machine learning techniques including learning classifier systems (LCSs). In this article, we first propose a conceptual framework that allows us to appropriately categorize ensemble-based methods for fair comparison and highlights the gaps in the corresponding literature. The framework is generic and consists of three sequential stages: a pre-gate stage concerned with data preparation; the member stage to account for the types of learning machines used to build the ensemble; and a post-gate stage concerned with the methods to combine ensemble output. A taxonomy of LCS-based ensembles is then presented using this framework. The article then focuses on comparing LCS ensembles that use feature selection in the pre-gate stage. An evaluation methodology is proposed to systematically analyze the performance of these methods. Specifically, random feature sampling and rough set feature selection-based LCS ensemble methods are compared. Experimental results show that the rough set-based approach performs significantly better than the random subspace method in terms of classification accuracy in problems with high numbers of irrelevant features. The performance of the two approaches is comparable in problems with high numbers of redundant features.

2.

Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms

Ozcift A Gulten A 《Computer methods and programs in biomedicine》2011,104(3):443-451

Improving the accuracy of machine learning algorithms is vital in designing high-performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier can be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms and evaluate their classification performance on Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, the feature dimension of the three datasets is first reduced using the correlation-based feature selection (CFS) algorithm. Second, the classification performance of the 30 machine learning algorithms is calculated for the three datasets. Third, 30 classifier ensembles are constructed based on the RF algorithm to assess the performance of the respective classifiers on the same disease data. All experiments are carried out with a leave-one-out validation strategy, and the performance of the 60 algorithms is evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms in the design of advanced CADx systems.

3.

Parallelizing Feature Selection

Jerffeson Teixeira de Souza Stan Matwin Nathalie Japkowicz 《Algorithmica》2006,45(3):433-456

Classification is a key problem in machine learning/data mining. Algorithms for classification have the ability to predict the class of a new instance after having been trained on data representing past experience in classifying instances. However, the presence of a large number of features in training data can hurt the classification capacity of a machine learning algorithm. The Feature Selection problem involves discovering a subset of features such that a classifier built only with this subset would attain predictive accuracy no worse than a classifier built from the entire set of features. Several algorithms have been proposed to solve this problem. In this paper we discuss how parallelism can be used to improve the performance of feature selection algorithms. In particular, we present, discuss and evaluate a coarse-grained parallel version of the feature selection algorithm FortalFS. This algorithm performs well compared with other solutions and it has certain characteristics that make it a good candidate for parallelization. Our parallel design is based on the master-slave design pattern. Promising results show that this approach is able to achieve near-optimum speedups in the context of Amdahl's Law.
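
The reference to Amdahl's Law can be made concrete with a small calculation. The sketch below is not taken from the paper; it only evaluates the standard Amdahl bound, and the function names and the example fraction of parallelizable work are assumptions chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(parallel_fraction: float, workers: int) -> float:
    """Speedup per worker; 1.0 would mean perfectly linear scaling."""
    return amdahl_speedup(parallel_fraction, workers) / workers


if __name__ == "__main__":
    # Hypothetical workload: 90% of the run time is parallelizable.
    for n in (2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 3))
```

A measured speedup close to this bound for the observed serial fraction is what "near-optimum in the context of Amdahl's Law" would mean under this reading.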

4.

Double-layer bayesian classifier ensembles based on frequent itemsets

Wei-Guo Yi Jing Duan Ming-Yu Lu 《International Journal of Automation and Computing》2012,9(2):215-220

Numerous models have been proposed to reduce the classification error of Naïve Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective method of reducing the classification error of a classifier, this paper proposes a double-layer Bayesian classifier ensemble (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset that the new instance contains and finally combines all the classifiers, assigning a different weight to each classifier according to the conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.

5.

Classification by ensembles from random partitions of high-dimensional data

Hongshik Ahn Hojin Moon Noha Lim Ralph L. Kodell 《Computational statistics & data analysis》2007,51(12):6166-6179

A robust classification procedure is developed based on ensembles of classifiers, with each classifier constructed from a different set of predictors determined by a random partition of the entire set of predictors. The proposed methods combine the results of multiple classifiers to achieve a substantially improved prediction compared to the optimal single classifier. This approach is designed specifically for high-dimensional data sets for which a classifier is sought. By combining classifiers built from each subspace of the predictors, the proposed methods achieve a computational advantage in tackling the growing problem of dimensionality. For each subspace of the predictors, we build a classification tree or logistic regression tree. Our study shows, using four real data sets from different areas, that our methods perform consistently well compared to widely used classification methods. For unbalanced data, our approach maintains the balance between sensitivity and specificity more adequately than many other classification methods considered in this study.
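
The idea of building one classifier per block of a random partition of the predictors can be sketched as follows. This is a minimal illustration, not the authors' procedure: the helper names are hypothetical, and plain majority voting is an assumption, since the abstract does not say how the member outputs are combined.

```python
import random
from collections import Counter


def random_partition(n_features, n_subsets, seed=None):
    """Split feature indices 0..n_features-1 into disjoint random subsets."""
    if not 1 <= n_subsets <= n_features:
        raise ValueError("need 1 <= n_subsets <= n_features")
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so subset sizes differ by at most one.
    return [sorted(indices[i::n_subsets]) for i in range(n_subsets)]


def majority_vote(predictions):
    """Return the most frequent label among the member predictions."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Each subset would be used to train one member classifier (not shown here).
    print(random_partition(n_features=10, n_subsets=3, seed=0))
    print(majority_vote(["tumor", "normal", "tumor"]))
```

Because the subsets are disjoint, each member sees a lower-dimensional problem, which is where the computational advantage mentioned in the abstract would come from.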

6.

DF-SVM: a decision forest constructed on artificially enlarged feature space by support vector machine

M. Faisal Zaman Hideo Hirose 《Artificial Intelligence Review》2013,40(4):467-494

Enlarging the feature space of the base tree classifiers in a decision forest by means of informative features extracted from an additional predictive model is advantageous for classification tasks. In this paper, we have empirically examined the performance of this type of decision forest with three different base tree classifier models: (1) the full decision tree, (2) the eight-node decision tree and (3) the two-node decision tree (or decision stump). The hybrid decision forest with these base classifiers is trained on resampled training sets of nine different sizes. We have examined the performance of all these ensembles from different points of view: we have studied the bias-variance decomposition of the misclassification error of the ensembles, and then investigated the amount of dependence and the degree of uncertainty among the base classifiers of these ensembles using information-theoretic measures. The experiment was designed to find out (1) the optimal training set size for each base classifier and (2) which base classifier is optimal for this kind of decision forest. In the final comparison, we have checked whether the subsampled version of the decision forest outperforms the bootstrapped version. All the experiments have been conducted with 20 benchmark datasets from the UCI machine learning repository. The overall results clearly indicate that, with careful selection of the base classifier and training sample size, the hybrid decision forest can be an efficient tool for real-world classification tasks.

7.

Applying a new localized generalization error model to design neural networks trained with extreme learning machine

Qiang Liu Jianping Yin Victor C. M. Leung Jun-Hai Zhai Zhiping Cai Jiarun Lin 《Neural computing & applications》2016,27(1):59-66

High accuracy and low overhead are two key features of a well-designed classifier for different classification scenarios. In this paper, we propose an improved classifier using a single-hidden layer feedforward neural network (SLFN) trained with extreme learning machine. The novel classifier first utilizes principal component analysis to reduce the feature dimension and then selects the optimal architecture of the SLFN based on a new localized generalization error model in the principal component space. Experimental and statistical results on the NSL-KDD data set demonstrate that the proposed classifier can achieve a significant performance improvement compared with previous classifiers.

8.

Aerial scene classification via an ensemble extreme learning machine classifier based on discriminative hybrid convolutional neural networks features

Lihua Ye Lei Wang Yaxin Sun Rong Zhu Yuanwang Wei 《International journal of remote sensing》2019,40(7):2759-2783

Identifying a discriminative feature can effectively improve the performance of aerial scene classification. Deep convolutional neural networks (DCNNs) have been widely used in aerial scene classification for their ability to learn discriminative features. A DCNN feature can be made more discriminative by optimizing the training loss function and using transfer learning methods. To enhance the discriminative power of a DCNN feature, the improved loss functions of pretraining models are combined with a softmax loss function and a centre loss function. To further improve performance, in this article, we propose hybrid DCNN features for aerial scene classification. First, we use DCNN models with joint loss functions and transfer learning from pretrained deep DCNN models. Second, the dense DCNN features are extracted, and the discriminative hybrid features are created using linear connection. Finally, an ensemble extreme learning machine (EELM) classifier is adopted for classification due to its general superiority and low computational cost. Experimental results on three public benchmark data sets demonstrate that the hybrid features obtained using the proposed approach and classified by the EELM classifier achieve remarkable performance.

9.

Support vector learning for fuzzy rule-based classification systems (total citations: 11; self-citations: 0; citations by others: 11)

Yixin Chen Wang J.Z. 《Fuzzy Systems, IEEE Transactions on》2003,11(6):716-728

Designing a fuzzy rule-based classification system (fuzzy classifier) with good generalization ability in a high-dimensional feature space has been an active research topic for a long time. As a powerful machine learning approach for pattern recognition problems, the support vector machine (SVM) is known to have good generalization ability. More importantly, an SVM can work very well on a high- (or even infinite-) dimensional feature space. This paper investigates the connection between fuzzy classifiers and kernel machines, establishes a link between fuzzy rules and kernels, and proposes a learning algorithm for fuzzy classifiers. We first show that a fuzzy classifier implicitly defines a translation-invariant kernel under the assumption that all membership functions associated with the same input variable are generated from location transformation of a reference function. Fuzzy inference on the IF-part of a fuzzy rule can be viewed as evaluating the kernel function. The kernel function is then proven to be a Mercer kernel if the reference functions meet a certain spectral requirement. The corresponding fuzzy classifier is named positive definite fuzzy classifier (PDFC). A PDFC can be built from the given training samples based on a support vector learning approach with the IF-part fuzzy rules given by the support vectors. Since the learning process minimizes an upper bound on the expected risk (expected prediction error) instead of the empirical risk (training error), the resulting PDFC usually has good generalization. Moreover, because of the sparsity properties of the SVMs, the number of fuzzy rules is irrelevant to the dimension of input space. In this sense, we avoid the "curse of dimensionality." Finally, PDFCs with different reference functions are constructed using the support vector learning approach. The performance of the PDFCs is illustrated by extensive experimental results. Comparisons with other methods are also provided.

10.

An improved approach to medical data sets classification: artificial immune recognition system with fuzzy resource allocation mechanism (total citations: 1; self-citations: 0; citations by others: 1)

Kemal Polat Salih Güneş 《Expert Systems》2007,24(4):252-270

The artificial immune recognition system (AIRS) has been shown to be an efficient approach to tackling a variety of problems such as machine learning benchmark problems and medical classification problems. In this study, the resource allocation mechanism of AIRS was replaced with a new one based on fuzzy logic. The new system, named Fuzzy-AIRS, was used as a classifier in the classification of three well-known medical data sets, the Wisconsin breast cancer data set (WBCD), the Pima Indians diabetes data set and the ECG arrhythmia data set. The performance of the Fuzzy-AIRS algorithm was tested for classification accuracy, sensitivity and specificity values, confusion matrix, computation time and receiver operating characteristic curves. Also, the AIRS and Fuzzy-AIRS algorithms were compared with respect to the amount of resources required in the execution of the algorithm. The highest classification accuracy obtained from applying the AIRS and Fuzzy-AIRS algorithms using 10-fold cross-validation was, respectively, 98.53% and 99.00% for classification of WBCD; 79.22% and 84.42% for classification of the Pima Indians diabetes data set; and 100% and 92.86% for classification of the ECG arrhythmia data set. Hence, these results show that Fuzzy-AIRS can be used as an effective classifier for medical problems.

11.

Ensemble feature selection with the simple Bayesian classification

《Information Fusion》2003,4(2):87-100

A popular method for creating an accurate classifier from a set of training data is to build several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. One way to generate an ensemble of accurate and diverse simple Bayesian classifiers is to use different feature subsets generated with the random subspace method. In this case, the ensemble consists of multiple classifiers constructed by randomly selecting feature subsets, that is, classifiers constructed in randomly chosen subspaces. In this paper, we present an algorithm for building ensembles of simple Bayesian classifiers in random subspaces. The EFS_SBC algorithm includes a hill-climbing-based refinement cycle, which tries to improve the accuracy and diversity of the base classifiers built on random feature subsets. We conduct a number of experiments on a collection of 21 real-world and synthetic data sets, comparing the EFS_SBC ensembles with the single simple Bayes and with the boosted simple Bayes. In many cases the EFS_SBC ensembles have higher accuracy than both the single simple Bayesian classifier and the boosted Bayesian ensemble. We find that the ensembles produced with a focus on diversity have lower generalization error, and that the degree of importance of diversity in building the ensembles is different for different data sets. We propose several methods for the integration of simple Bayesian classifiers in the ensembles. In a number of cases the techniques for dynamic integration of classifiers have significantly better classification accuracy than their simple static analogues. We suggest that a reason for this is that dynamic integration utilizes the ensemble coverage better than static integration.

12.

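Building rule-based classifier ensembles using PCA

Shi Guoqiang Niu Changyong Fan Ming 《Journal of Frontiers of Computer Science and Technology》2010,4(5):455-463

We propose PCARules, a new method for building classifier ensembles from rule-based base classifiers. Although the new method also uses weighted voting over the predictions of the base classifiers to determine the class of a sample, the way it creates training data sets for the base classifiers is completely different from bagging and boosting. Rather than creating data sets for the base classifiers by sampling, the method randomly partitions the features into K subsets, applies PCA to each subset to obtain its principal components, forms a new feature space from them, and maps all training data into the new feature space to serve as the training set of a base classifier. Experiments on 30 randomly selected data sets from the UCI machine learning repository show that the algorithm not only significantly improves the classification performance of rule-based classification methods, but also achieves higher classification accuracy than traditional ensemble methods such as bagging and boosting on most of the data sets.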

13.

An adaptive rule-based classifier for mining big biological data

《Expert systems with applications》2016

In this paper, we introduce a new adaptive rule-based classifier for multi-class classification of biological data, where several problems of classifying biological data are addressed: overfitting, noisy instances and class-imbalanced data. It is well known that rules are an interesting way of representing data in a human-interpretable form. The proposed rule-based classifier combines the random subspace and boosting approaches with an ensemble of decision trees to construct a set of classification rules without involving global optimisation. The classifier uses the random subspace approach to avoid overfitting, the boosting approach to classify noisy instances and an ensemble of decision trees to deal with the class-imbalance problem. The classifier uses two popular classification techniques: decision tree and k-nearest-neighbor algorithms. Decision trees are used for evolving classification rules from the training data, while k-nearest-neighbor is used for analysing the misclassified instances and removing vagueness between contradictory rules. It runs a series of k iterations to develop a set of classification rules from the training data and pays more attention to the misclassified instances in the next iteration, giving it a boosting flavour. This paper focuses in particular on devising an optimal ensemble classifier that helps improve the prediction accuracy of the DNA variant identification and classification task. The performance of the proposed classifier is compared with well-established existing machine learning and data mining algorithms on genomic data (148 exome data sets) of Brugada syndrome and 10 real benchmark life sciences data sets from the UCI (University of California, Irvine) machine learning repository. The experimental results indicate that the proposed classifier has exemplary classification accuracy on different types of biological data.
Overall, the proposed classifier offers good prediction accuracy for the classification of new DNA variants, where noisy and misclassified variants are optimised to increase test performance.

14.

ECG beat classification using particle swarm optimization and support vector machine

Ali KHAZAEE A. E. ZADEH 《Frontiers of Computer Science》2014,8(2):217-231

In this paper, we propose a novel ECG arrhythmia classification method using power spectral-based features and a support vector machine (SVM) classifier. The method extracts spectral features and three timing-interval features from the electrocardiogram. Non-parametric power spectral density (PSD) estimation methods are used to extract the spectral features. The proposed approach optimizes the relevant parameters of the SVM classifier through an intelligent algorithm using particle swarm optimization (PSO). These parameters are the Gaussian radial basis function (GRBF) kernel parameter σ and the penalty parameter C of the SVM classifier. ECG records from the MIT-BIH arrhythmia database are selected as test data. It is observed that the proposed power spectral-based hybrid particle swarm optimization-support vector machine (SVMPSO) classification method offers significantly improved performance over an SVM with constant, manually chosen parameters.
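
To illustrate how PSO can search over the two SVM parameters, here is a minimal, generic particle swarm sketch. It is not the paper's SVMPSO method: the swarm settings are common textbook defaults, and `stand_in_error` is a hypothetical smooth function standing in for the cross-validated SVM error over (log σ, log C), which would be needed in a real experiment.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60, seed=0,
                 inertia=0.7, cognitive=1.5, social=1.5):
    """Minimize objective(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp the coordinate back into its bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def stand_in_error(params):
    """Hypothetical error surface with its minimum at log_sigma=1, log_c=2."""
    log_sigma, log_c = params
    return (log_sigma - 1.0) ** 2 + (log_c - 2.0) ** 2


if __name__ == "__main__":
    print(pso_minimize(stand_in_error, [(-5.0, 5.0), (-5.0, 5.0)]))
```

Searching in log space is a design choice of this sketch, made because σ and C typically vary over several orders of magnitude.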

15.

Texture classification based on curvelet transform and extreme learning machine with reduced feature set

Sanae Berraho Samira El Margae Mounir Ait Kerroum Youssef Fakhri 《Multimedia Tools and Applications》2017,76(18):18425-18448

16.

Ensemble based extreme learning machine for cross-modality face matching

Yi Jin Jiuwen Cao Yizhi Wang Ruicong Zhi 《Multimedia Tools and Applications》2016,75(19):11831-11846

17.

A comparative study of classifier ensembles for bankruptcy prediction

《Applied Soft Computing》2014

The aim of bankruptcy prediction in the areas of data mining and machine learning is to develop an effective model that provides higher prediction accuracy. In the prior literature, various classification techniques have been developed and studied, among which classifier ensembles that combine multiple classifiers have been shown to outperform many single classifiers. However, in terms of constructing classifier ensembles, there are three critical issues that can affect their performance. The first is the classification technique actually used, and the other two are the method used to combine multiple classifiers and the number of classifiers to be combined. Since there are few relevant studies examining these issues, this paper conducts a comprehensive comparison of classifier ensembles built with three widely used classification techniques, namely multilayer perceptron (MLP) neural networks, support vector machines (SVM) and decision trees (DT), based on two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80–100 classifiers using the boosting method perform best. The Wilcoxon signed-rank test also demonstrates that DT ensembles built by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study of a real-world case using a Taiwan bankruptcy dataset was conducted, which also demonstrates the superiority of DT ensembles built by boosting over the others.

18.

Age estimation using a hierarchical classifier based on global and local facial features

Sung Eun Choi 《Pattern recognition》2011,44(6):1262-1281

Research on age estimation using face images has become increasingly important because it has a variety of potentially useful applications. An age estimation system is generally composed of aging feature extraction and feature classification, both of which are important for improving performance. For aging feature extraction, hybrid features, which are a combination of global and local features, have received a great deal of attention because this method can compensate for defects found in individual global and local features. As for feature classification, the hierarchical classifier, which is composed of an age group classification (e.g. the class of less than 20 years old, the class of 20-39 years old, etc.) and a detailed age estimation (e.g. 17, 23 years old, etc.), provides much better performance than other methods. However, the hybrid features and hierarchical classifier methods have only been studied independently, and no research combining them has been conducted in previous works. Consequently, we propose a new age estimation method using a hierarchical classifier based on both global and local facial features. Our research is novel in the following three ways compared to previous works. Firstly, age estimation accuracy is greatly improved through a combination of the proposed hybrid features and the hierarchical classifier. Secondly, new local feature extraction methods are proposed in order to improve the performance of the hybrid features. The wrinkle feature is extracted using a set of region-specific Gabor filters, each of which is designed based on the regional direction of the wrinkles, and the skin feature is extracted using a local binary pattern (LBP), capable of extracting the detailed textures of skin. Thirdly, the improved hierarchical classifier is based on a support vector machine (SVM) and a support vector regression (SVR).
To reduce the error propagation of the hierarchical classifier, each age group classifier is designed so that the age ranges to be estimated overlap, taking into account the false acceptance error (FAE) and false rejection error (FRE) of each classifier. The experimental results showed that the performance of the proposed method was superior to that of the previous methods when using the BERC, PAL and FG-Net aging databases.

19.

K nearest neighbor reinforced expectation maximization method

Mehmet Aci Mutlu Avci 《Expert systems with applications》2011,38(10):12585-12591

K nearest neighbor and Bayesian methods are effective methods of machine learning. Expectation maximization is an effective Bayesian classifier. In this work, a data elimination approach is proposed to improve data clustering. The proposed method is based on a hybridization of the k nearest neighbor and expectation maximization algorithms. The k nearest neighbor algorithm serves as a preprocessor for the expectation maximization algorithm, reducing the amount of training data that makes learning difficult. The suggested method is tested on the well-known machine learning data sets iris, wine, breast cancer, glass and yeast. Simulations are done in the MATLAB environment and performance results are reported.

20.

Adaptive multi-parent crossover GA for feature optimization in epileptic seizure identification

《Applied Soft Computing》2019

EEG signal analysis involves multi-frequency non-stationary brain waves from multiple channels. Segmenting these signals, extracting features to obtain the important properties of the signal, and classification are key aspects of detecting epileptic seizures. Despite the introduction of several techniques, the task is very challenging when multiple EEG channels are involved. When many channels exist, a spatial filter is required to eliminate noise and extract relevant information. This adds a new dimension of complexity to the frequency feature space. Feature selection is very important for stabilizing the classifier of the channels. Furthermore, to improve the performance of a classifier, more data is required from EEG channels for complex problems. The increase in such data poses some challenges, as it becomes difficult to identify the subject-dependent bands when the number of channels increases. Hence, an automated process is required for such identification. The approach proposed in this work tackles the multiple EEG channels problem by segmenting the EEG signals in the frequency domain based on changing spikes rather than using the traditional time-based windowing approach. To reduce the overall dimensionality while preserving the class-dependent features, an optimization approach is used. This process of selecting an optimal feature subset is an optimization problem. Thus, we propose an adaptive multi-parent crossover Genetic Algorithm (GA) for optimizing the features used in classifying epileptic seizures. The GA-based approach is used to optimize the various features obtained. It encodes the temporal and spatial filter estimates and optimizes the feature selection with respect to the classification error. The classification was done using a Support Vector Machine (SVM). The proposed technique was evaluated using the publicly available epileptic seizure data from the machine learning repository of the UCI center for machine learning and intelligent systems.
The proposed approach outperforms other approaches and achieves a high level of accuracy. These results indicate the ability of a multi-parent crossover GA to optimize the feature selection process in EEG classification.
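
A multi-parent crossover over feature-selection masks can be sketched as follows. This is only a generic, non-adaptive variant in which each gene is copied from a randomly chosen parent; the adaptive scheme of the paper is not described in the abstract, and the function names and 0/1 mask encoding are assumptions made for illustration.

```python
import random


def multi_parent_crossover(parents, rng):
    """Build one child mask by copying each gene from a randomly chosen parent."""
    if len(parents) < 2:
        raise ValueError("need at least two parents")
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("all parents must have the same length")
    return [rng.choice(parents)[i] for i in range(length)]


def selected_features(mask):
    """Indices of the features that a 0/1 mask keeps."""
    return [i for i, bit in enumerate(mask) if bit]


if __name__ == "__main__":
    rng = random.Random(1)
    # Three hypothetical parent masks over six candidate features.
    parents = [
        [1, 1, 0, 0, 1, 0],
        [1, 0, 0, 1, 1, 0],
        [1, 0, 0, 0, 0, 1],
    ]
    child = multi_parent_crossover(parents, rng)
    print(child, selected_features(child))
```

In a full GA, each child mask would be scored by the classification error of a classifier trained on the selected features, which is the fitness the abstract refers to.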