
1.

MODE: multiobjective differential evolution for feature selection and classifier ensemble

Utpal Kumar Sikdar Asif Ekbal Sriparna Saha 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2015,19(12):3529-3549


2.

Chained ensemble classifier for image annotation

Marin-Castro Heidy M. Hernandez-Resendiz Jaciel D. Escalante-Balderas Hugo J. Pellegrin Luis Tello-Leal Edgar 《Multimedia Tools and Applications》2019,78(18):26263-26285


3.

A feature selection-based speaker clustering method for paralinguistic tasks

Gábor Gosztolya László Tóth 《Pattern Analysis & Applications》2018,21(1):193-204

In recent years, computational paralinguistics has emerged as a new topic within speech technology. It concerns extracting non-linguistic information from speech (such as emotions, the level of conflict, whether the speaker is drunk). It was shown recently that many methods applied here can be assisted by speaker clustering; for example, the features extracted from the utterances could be normalized speaker-wise instead of using a global method. In this paper, we propose a speaker clustering algorithm based on standard clustering approaches like K-means and feature selection. By applying this speaker clustering technique in two paralinguistic tasks, we were able to significantly improve the accuracy scores of several machine learning methods, and we also obtained an insight into what features could be efficiently used to separate the different speakers.
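
The speaker-wise normalization mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `normalize_per_speaker` is hypothetical, z-scoring is an assumed choice of normalization, and the speaker labels are assumed to come from an earlier clustering step such as K-means.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_speaker(features, speaker_ids):
    """Z-normalize every feature dimension within each speaker group.

    features    -- list of equal-length lists of floats, one per utterance
    speaker_ids -- one cluster label per utterance (assumed to be the output
                   of a speaker clustering step, e.g. K-means)
    """
    # Group utterance indices by their (estimated) speaker.
    groups = defaultdict(list)
    for idx, speaker in enumerate(speaker_ids):
        groups[speaker].append(idx)

    normalized = [None] * len(features)
    for indices in groups.values():
        n_dims = len(features[indices[0]])
        # Per-speaker mean and population standard deviation of each dimension.
        means = [mean([features[i][d] for i in indices]) for d in range(n_dims)]
        stds = [pstdev([features[i][d] for i in indices]) for d in range(n_dims)]
        for i in indices:
            normalized[i] = [
                (features[i][d] - means[d]) / stds[d] if stds[d] > 0 else 0.0
                for d in range(n_dims)
            ]
    return normalized
```

With a global method, a loud speaker and a quiet speaker would keep their offsets after scaling; normalizing within each cluster removes that speaker-specific offset before classification.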

4.

Multiobjective optimization for classifier ensemble and feature selection: an application to named entity recognition

Asif Ekbal Sriparna Saha 《International Journal on Document Analysis and Recognition》2012,15(2):143-166

In this paper, the concept of finding an appropriate classifier ensemble for named entity recognition is posed as a multiobjective optimization (MOO) problem. Our underlying assumption is that instead of searching for the best-fitting feature set for a particular classifier, ensembling several classifiers that are trained using different feature representations could be a more fruitful approach, but it is crucial to determine the appropriate subset of classifiers that are most suitable for the ensemble. We use three heterogeneous classifiers, namely maximum entropy, conditional random field, and support vector machine, to build a number of models depending upon the various representations of the available features. The proposed MOO-based ensemble technique is evaluated for three resource-constrained languages, namely Bengali, Hindi, and Telugu. Evaluation results yield the recall, precision, and F-measure values of 92.21, 92.72, and 92.46%, respectively, for Bengali; 97.07, 89.63, and 93.20%, respectively, for Hindi; and 80.79, 93.18, and 86.54%, respectively, for Telugu. We also evaluate our proposed technique with the CoNLL-2003 shared task English data sets that yield the recall, precision, and F-measure values of 89.72, 89.84, and 89.78%, respectively. Experimental results show that the classifier ensemble identified by our proposed MOO-based approach outperforms all the individual classifiers, two different conventional baseline ensembles, and the classifier ensemble identified by a single-objective-based approach. In a part of the paper, we formulate the problem of feature selection in any classifier under the MOO framework and show that our proposed classifier ensemble attains superior performance to it.

5.

Margin-based ensemble classifier for protein fold recognition

Tao Yang Vojislav Kecman Longbing Cao Chengqi Zhang Joshua Zhexue Huang 《Expert systems with applications》2011,38(10):12348-12355

Recognition of protein folding patterns is an important step in protein structure and function predictions. Traditional sequence similarity-based approach fails to yield convincing predictions when proteins have low sequence identities, while the taxonometric approach is a reliable alternative. From a pattern recognition perspective, protein fold recognition involves a large number of classes with only a small number of training samples, and multiple heterogeneous feature groups derived from different propensities of amino acids. This raises the need for a classification method that is able to handle the data complexity with a high prediction accuracy for practical applications. To this end, a novel ensemble classifier, called MarFold, is proposed in this paper which combines three margin-based classifiers for protein fold recognition. The effectiveness of our method is demonstrated with the benchmark D-B dataset with 27 classes. The overall prediction accuracy obtained by MarFold is 71.7%, which surpasses the existing fold recognition methods by 3.1–15.7%. Moreover, one component classifier for MarFold, called ALH, has obtained a prediction accuracy of 65.5%, which is 4.7–9.5% higher than the prediction accuracies for the published methods using single classifiers. Additionally, the feature set of pairwise frequency information about the amino acids, which is adopted by MarFold, is found to be important for discriminating folding patterns. These results imply that the MarFold method and its operation engine ALH might become useful vehicles for protein fold recognition, as well as other bioinformatics tasks. The MarFold method and the datasets can be obtained from: (http://www-staff.it.uts.edu.au/～lbcao/publication/MarFold.7z).

6.

Design of a multiple classifier ensemble system based on diversity measures

薛梅郑全弟《计算机工程与设计》2010,31(23)

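To address the high classification-performance requirements and the complexity of the ensemble process in classifier ensembles, this paper analyzes the strengths and weaknesses of conventional ensemble methods, studies existing classifier diversity measures, and proposes a hierarchical classifier ensemble system built by selecting, as base classifiers, the classifiers that differ from one another as much as possible. Different base classifiers are constructed, those with higher accuracy are retained as candidates, their diversity is analyzed, and the most diverse candidates are chosen as the base classifiers that make up the ensemble system. Experiments on UCI data sets yielded good classification results, confirming the advantages of this ensemble system.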
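
The selection procedure described here (filter candidates by accuracy, then keep the most mutually diverse ones) can be sketched as follows. The abstract does not name the diversity measure, so the pairwise disagreement measure is an assumption, as are the function names and the exhaustive search over subsets, which is only practical for a small number of candidates.

```python
from itertools import combinations


def accuracy(preds, labels):
    """Fraction of samples predicted correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers predict differently."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def select_diverse(predictions, labels, min_accuracy, k):
    """Pick k accurate classifiers with the highest mean pairwise disagreement.

    predictions -- dict: classifier name -> list of predicted labels
    """
    if k < 2:
        raise ValueError("k must be at least 2 to measure pairwise diversity")
    # Step 1: keep only candidates that are accurate enough.
    candidates = sorted(
        name for name, preds in predictions.items()
        if accuracy(preds, labels) >= min_accuracy
    )
    if len(candidates) <= k:
        return candidates

    def mean_disagreement(subset):
        pairs = list(combinations(subset, 2))
        total = sum(disagreement(predictions[a], predictions[b]) for a, b in pairs)
        return total / len(pairs)

    # Step 2: exhaustive search for the most diverse k-subset (small inputs only).
    return list(max(combinations(candidates, k), key=mean_disagreement))
```

Two classifiers that make identical predictions add nothing to a vote, which is why the sketch prefers a pair that errs on different samples even when all candidates have the same accuracy.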

7.

Prototype selection for dynamic classifier and ensemble selection

Cruz Rafael M. O. Sabourin Robert Cavalcanti George D. C. 《Neural computing & applications》2018,29(2):447-457

In dynamic ensemble selection (DES) techniques, only the most competent classifiers, for the classification of a specific test sample, are selected to predict the sample’s class labels. The key in DES techniques is estimating the competence of the base classifiers for the classification of each specific test sample. The classifiers’ competence is usually estimated according to a given criterion, which is computed over the neighborhood of the test sample defined on the validation data, called the region of competence. A problem arises when there is a high degree of noise in the validation data, causing the samples belonging to the region of competence to not represent the query sample. In such cases, the dynamic selection technique might select the base classifier that overfitted the local region rather than the one with the best generalization performance. In this paper, we propose two modifications in order to improve the generalization performance of any DES technique. First, a prototype selection technique is applied over the validation data to reduce the amount of overlap between the classes, producing smoother decision borders. During generalization, a local adaptive K-Nearest Neighbor algorithm is used to minimize the influence of noisy samples in the region of competence. Thus, DES techniques can better estimate the classifiers’ competence. Experiments are conducted using 10 state-of-the-art DES techniques over 30 classification problems. The results demonstrate that the proposed scheme significantly improves the classification accuracy of dynamic selection techniques.
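
To make the region-of-competence idea concrete, here is a minimal Python sketch of a plain local-accuracy selection rule. It deliberately omits the paper's two modifications (prototype selection and the adaptive K-Nearest Neighbor step); the function names and the Euclidean distance are assumptions made for illustration.

```python
import math


def region_of_competence(query, val_points, k):
    """Indices of the k validation samples nearest to the query (Euclidean)."""
    order = sorted(range(len(val_points)),
                   key=lambda i: math.dist(query, val_points[i]))
    return order[:k]


def select_competent(query, val_points, val_labels, val_predictions, k):
    """Names of the classifiers with the best accuracy inside the region.

    val_predictions -- dict: classifier name -> predictions on val_points
    """
    region = region_of_competence(query, val_points, k)
    # Competence = local accuracy over the query's nearest validation samples.
    local_acc = {
        name: sum(preds[i] == val_labels[i] for i in region) / len(region)
        for name, preds in val_predictions.items()
    }
    best = max(local_acc.values())
    return sorted(name for name, acc in local_acc.items() if acc == best)
```

Because competence is computed only from the neighbors, mislabeled validation samples near the query directly distort the estimate, which is the weakness the abstract sets out to reduce.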


8.

Hybrid PSO feature selection-based association classification approach for breast cancer detection

Sowan Bilal Eshtay Mohammed Dahal Keshav Qattous Hazem Zhang Li 《Neural computing & applications》2023,35(7):5291-5317

Neural Computing and Applications - Breast cancer is one of the leading causes of death among women worldwide. Many methods have been proposed for automatic breast cancer diagnosis. One popular...

9.

Kernel sparse representation-based classifier ensemble for face recognition

Li Zhang Wei-Da Zhou Fan-Zhang Li 《Multimedia Tools and Applications》2015,74(1):123-137


10.

Telecom customer churn prediction model based on stepwise feature extraction and a combined classifier

《微型机与应用》2016,(13):51-54

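Telecom customer churn data sets suffer from excessively high dimensionality, and a single classifier predicts churn poorly. To address both problems, this paper combines the advantages of filter and wrapper feature selection with the stronger predictive power of combined classifiers, and proposes a churn prediction model that pairs a stepwise feature selection method, based on the Fisher ratio and the prediction risk criterion, with a combined classifier. First, features with high discriminative power are extracted from the original feature set using the Fisher ratio. Second, the prediction risk criterion is applied to further select the features that most affect the classification model's predictions. Finally, combined classifiers based on averaged and weighted probability outputs are built to further improve churn prediction. Experimental results show that the method improves churn prediction relative to single-step feature extraction and single-classifier models.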
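
The first and last steps of this pipeline can be sketched in Python. The abstract gives no formulas, so the two-class Fisher ratio used here, (mean difference squared divided by the sum of class variances), is one common definition and an assumption, as are all function names; the prediction risk step is not reproduced.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher ratio of one feature for a binary problem with labels 0 and 1."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(columns, labels):
    """Feature names sorted from most to least discriminative.

    columns -- dict: feature name -> list of values, one per customer
    """
    return sorted(columns,
                  key=lambda name: fisher_ratio(columns[name], labels),
                  reverse=True)


def combine_probabilities(probabilities, weights=None):
    """Plain or weighted average of churn probabilities from several classifiers."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)
```

In this sketch the filter step is cheap because each feature is scored independently; a wrapper criterion would then re-examine the surviving features with the classifier in the loop.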

11.

A data driven ensemble classifier for credit scoring analysis (total citations: 2, self-citations: 0, citations by others: 2)

Nan-Chen Hsieh Lun-Ping Hung 《Expert systems with applications》2010,37(1):534-545

This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from information initially supplied. This is essentially a classification task for credit scoring. Given its importance, many researchers have recently worked on an ensemble of classifiers. However, to the best of our knowledge, unrepresentative samples drastically reduce the accuracy of the deployment classifier. Few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. This strategy would work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques, mainly involving optimal associate binning to discretize continuous values; neural network, support vector machine, and Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of Bayesian network allows for a natural form of feature selection, which provides a basis for mining association rules. The learned knowledge is represented in multiple forms, including causal diagram and constrained association rules. The data driven nature of the proposed system distinguishes it from existing hybrid/ensemble credit scoring systems.

12.

A dynamic classifier ensemble selection approach for noise data (total citations: 2, self-citations: 0, citations by others: 2)

Jin Xiao Xiaoyi Jiang 《Information Sciences》2010,180(18):3402-3421

Dynamic classifier ensemble selection (DCES) plays a strategic role in the field of multiple classifier systems. The real data to be classified often include a large amount of noise, so it is important to study the noise-immunity ability of various DCES strategies. This paper introduces a group method of data handling (GMDH) to DCES, and proposes a novel dynamic classifier ensemble selection strategy, GDES-AD. It considers both accuracy and diversity in the process of ensemble selection. We experimentally test GDES-AD and six other ensemble strategies over 30 UCI data sets in three cases: the data sets do not include artificial noise, include class noise, and include attribute noise. Statistical analysis results show that GDES-AD has stronger noise-immunity ability than other strategies. In addition, we find that Random Subspace is more suitable for GDES-AD than Bagging. Further, the bias-variance decomposition experiments for the classification errors of various strategies show that the stronger noise-immunity ability of GDES-AD is mainly due to the fact that it reduces the bias in classification error more effectively.

13.

Dynamic classifier ensemble for positive unlabeled text stream classification (total citations: 1, self-citations: 0, citations by others: 1)

Shirui Pan Yang Zhang Xue Li 《Knowledge and Information Systems》2012,33(2):267-287

Most studies on streaming data classification are based on the assumption that data can be fully labeled. However, in real-life applications, it is impractical and time-consuming to manually label the entire stream for training. It is very common that only a small part of positive data and a large amount of unlabeled data are available in data stream environments. In this case, applying the traditional streaming algorithms with straightforward adaptation to a positive unlabeled stream may not work well and can lead to poor performance. In this paper, we propose a Dynamic Classifier Ensemble method for Positive and Unlabeled text stream (DCEPU) classification scenarios. We address the problem of classifying positive and unlabeled text streams with various concept drift by constructing an appropriate validation set and designing a novel dynamic weighting scheme in the classification phase. Experimental results on the benchmark dataset RCV1-v2 demonstrate that the proposed method DCEPU outperforms the existing LELC (Li et al. 2009b), DVS (with necessary adaptation) (Tsymbal et al. in Inf Fusion 9(1):56–68, 2008), and Stacking style ensemble-based algorithm (Zhang et al. 2008b).

14.

The random boosting ensemble classifier for land-use image classification

Wang Hainan Miao Yunqi 《Multimedia Tools and Applications》2018,77(22):29933-29947

This paper presents a random boosting ensemble (RBE) classifier for remote sensing image classification, which introduces the random projection feature selection and bootstrap methods to obtain base classifiers for the classifier ensemble. The RBE method is built on an improved boosting framework, which is quite efficient for the few-shot problem due to the bootstrap in use. In RBE, the kernel extreme learning machine (KELM) is applied to design base classifiers, which makes RBE quite efficient due to feature reduction. The experimental results on remote scene image classification demonstrate that RBE can effectively improve the classification performance, resulting in better generalization ability on the 21-class land-use dataset and the India pine satellite scene dataset.


15.

A probabilistic model of classifier competence for dynamic ensemble selection

Tomasz Woloszynski Marek Kurzynski 《Pattern recognition》2011,44(10-11):2656-2668

The concept of a classifier competence is fundamental to multiple classifier systems (MCSs). In this study, a method for calculating the classifier competence is developed using a probabilistic model. In the method, first a randomised reference classifier (RRC) whose class supports are realisations of the random variables with beta probability distributions is constructed. The parameters of the distributions are chosen in such a way that, for each feature vector in a validation set, the expected values of the class supports produced by the RRC and the class supports produced by a modelled classifier are equal. This allows for using the probability of correct classification of the RRC as the competence of the modelled classifier. The competences calculated for a validation set are then generalised to an entire feature space by constructing a competence function based on a potential function model or regression. Three systems based on a dynamic classifier selection and a dynamic ensemble selection (DES) were constructed using the method developed. The DES based system had a statistically significantly higher average rank than those of eight benchmark MCSs for 22 data sets and a heterogeneous ensemble. The results obtained indicate that the full vector of class supports should be used for evaluating the classifier competence as this potentially improves performance of MCSs.

16.

An adaptive ensemble classifier for mining concept drifting data streams

Dewan Md. Farid Li Zhang Alamgir Hossain Chowdhury Mofizur Rahman Rebecca Strachan Graham Sexton Keshav Dahal 《Expert systems with applications》2013,40(15):5895-5906

It is challenging to use traditional data mining techniques to deal with real-time data stream classifications. Existing mining classifiers need to be updated frequently to adapt to the changes in data streams. To address this issue, in this paper we propose an adaptive ensemble approach for classification and novel class detection in concept drifting data streams. The proposed approach uses traditional mining classifiers and updates the ensemble model automatically so that it represents the most recent concepts in data streams. For novel class detection we consider the idea that data points belonging to the same class should be closer to each other and should be far apart from the data points belonging to other classes. If a data point is well separated from the existing data clusters, it is identified as a novel class instance. We tested the performance of this proposed stream classification model against that of existing mining algorithms using real benchmark datasets from UCI (University of California, Irvine) machine learning repository. The experimental results prove that our approach shows great flexibility and robustness in novel class detection in concept drifting and outperforms traditional classification models in challenging real-life data stream applications.
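
The novel class rule stated in this abstract (a point well separated from every existing cluster is flagged) can be illustrated with a small Python sketch. The abstract does not define "well separated", so summarizing each cluster by a centroid and a radius, and flagging points outside every radius, is an assumption; `summarize` and `is_novel` are hypothetical names.

```python
import math


def summarize(points):
    """Summarize a cluster as (centroid, radius of its farthest member)."""
    dims = len(points[0])
    centroid = tuple(sum(p[d] for p in points) / len(points) for d in range(dims))
    radius = max(math.dist(p, centroid) for p in points)
    return centroid, radius


def is_novel(point, clusters):
    """True when the point lies outside the radius of every known cluster.

    clusters -- list of (centroid, radius) pairs, e.g. built with summarize()
    """
    return all(math.dist(point, centroid) > radius
               for centroid, radius in clusters)
```

In a streaming setting the cluster summaries would have to be refreshed as the ensemble is updated; this sketch only covers the separation test for a single point.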

17.

Evolved feature weighting for random subspace classifier.

L Nanni A Lumini 《Neural Networks, IEEE Transactions on》2008,19(2):363-366

The problem addressed in this letter concerns the multiclassifier generation by a random subspace method (RSM). In the RSM, the classifiers are constructed in random subspaces of the data feature space. In this letter, we propose an evolved feature weighting approach: in each subspace, the features are multiplied by a weight factor for minimizing the error rate in the training set. An efficient method based on particle swarm optimization (PSO) is here proposed for finding a set of weights for each feature in each subspace. The performance improvement with respect to the state-of-the-art approaches is validated through experiments with several benchmark data sets.

18.

Vote counting measures for ensemble classifiers

Terry Windeatt 《Pattern recognition》2003,36(12):2743-2756

Various measures, such as Margin and Bias/Variance, have been proposed with the aim of gaining a better understanding of why Multiple Classifier Systems (MCS) perform as well as they do. While these measures provide different perspectives for MCS analysis, it is not clear how to use them for MCS design. In this paper a different measure based on a spectral representation is proposed for two-class problems. It incorporates terms representing positive and negative correlation of pairs of training patterns with respect to class labels. Experiments employing MLP base classifiers, in which parameters are fixed but systematically varied, demonstrate the sensitivity of the proposed measure to base classifier complexity.

19.

An SVM classifier incorporating simultaneous noise reduction and feature selection: illustrative case examples

R. Kumar B.D. Kulkarni 《Pattern recognition》2005,38(1):41-49

A hybrid technique involving symbolization of data to remove noise and use of conditional entropy minima to extract relevant and non-redundant features is proposed in conjunction with support vector machines to obtain a more robust classification algorithm. Tested on three data sets, the technique shows improvements in classification efficiency.

20.

Optimized feature selection-based clustering approach for computer-aided detection of lung nodules in different modalities

Narayanan Barath Narayanan Hardie Russell C. Kebede Temesguen M. Sprague Matthew J. 《Pattern Analysis & Applications》2019,22(2):559-571

Pattern Analysis and Applications - Early detection of pulmonary lung nodules plays a significant role in the diagnosis of lung cancer. Computed tomography (CT) and chest radiographs (CRs) are...