Similar Documents
20 similar documents retrieved.
1.
Due to the important role of financial distress prediction (FDP) for enterprises, it is crucial to improve the accuracy of FDP models. In recent years, classifier ensembles have shown a promising advantage over single classifiers, but the study of classifier ensemble methods for FDP is still not comprehensive enough and remains to be further explored. This paper constructs AdaBoost ensembles with single attribute test (SAT) and decision tree (DT) weak learners for FDP, and empirically compares them with a single DT and a support vector machine (SVM). After designing the framework of the AdaBoost ensemble method for FDP, the article describes the AdaBoost algorithm as well as the SAT and DT algorithms in detail, followed by the combination mechanism for multiple classifiers. On an initial sample of 692 Chinese listed companies and 41 financial ratios, 30 holdout experiments are carried out for FDP one year, two years, and three years in advance. The experimental results show that the AdaBoost ensemble with SAT outperforms the AdaBoost ensemble with DT, the single DT classifier, and the single SVM classifier. In conclusion, the choice of weak learner is crucial to the performance of an AdaBoost ensemble, and the AdaBoost ensemble with SAT is more suitable for FDP of Chinese listed companies.
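A minimal sketch of the comparison described above, assuming a depth-1 decision stump as a stand-in for the single attribute test (SAT) weak learner and synthetic data in place of the 692-company sample; the hyperparameters are illustrative, not the paper's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the financial-ratio dataset.
X, y = make_classification(n_samples=692, n_features=41, n_informative=10,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "AdaBoost-SAT": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                       n_estimators=100, random_state=0),
    "AdaBoost-DT": AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                                      n_estimators=100, random_state=0),
    "Single DT": DecisionTreeClassifier(random_state=0),
    "Single SVM": SVC(kernel="rbf", gamma="scale"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: holdout accuracy = {model.score(X_te, y_te):.3f}")
```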

2.
Selective Support Vector Machine Ensemble Based on Rough Set Theory
The performance of an ensemble classifier depends largely on how the member classifiers are constructed and how they are combined. This paper proposes a selective support vector machine ensemble algorithm based on rough set theory. The algorithm first uses rough set techniques to generate a set of attribute reducts, then builds member classifiers with each reduct as the sample attribute space. Next, by computing the accuracy and diversity of each member classifier, it selects for the ensemble those members that satisfy both the individual accuracy requirement and the individual diversity requirement. Finally, tests on a group of experimental datasets from the UCI repository confirm that the method can effectively improve the generalization performance of support vector machines.
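A sketch of the selective-ensemble step under loose assumptions: random feature subsets stand in for the rough-set attribute reducts, and simple accuracy / pairwise-disagreement thresholds replace the paper's selection criteria.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
members = []                     # (feature subset, fitted SVM, validation predictions)
for _ in range(15):
    feats = rng.choice(X.shape[1], size=10, replace=False)   # stand-in "reduct"
    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr[:, feats], y_tr)
    members.append((feats, clf, clf.predict(X_val[:, feats])))

# Rank candidates by validation accuracy, always keep the best one, then add
# members that are accurate enough and disagree enough with those already kept.
members.sort(key=lambda m: np.mean(m[2] == y_val), reverse=True)
selected = [members[0]]
for feats, clf, pred in members[1:]:
    acc = np.mean(pred == y_val)
    disagreement = min(np.mean(pred != p_sel) for _, _, p_sel in selected)
    if acc >= 0.85 and disagreement >= 0.05:
        selected.append((feats, clf, pred))

# Majority vote of the selected members.
votes = np.stack([pred for _, _, pred in selected])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("selected members:", len(selected),
      "ensemble accuracy:", np.mean(ensemble_pred == y_val))
```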

3.

This paper presents a random boosting ensemble (RBE) classifier for remote sensing image classification, which introduces random projection feature selection and bootstrap methods to obtain base classifiers for the classifier ensemble. The RBE method is built on an improved boosting framework, which is quite efficient for the few-shot problem due to the bootstrap in use. In RBE, the kernel extreme learning machine (KELM) is applied to design the base classifiers, which makes RBE quite efficient due to feature reduction. The experimental results on remote scene image classification demonstrate that RBE can effectively improve classification performance, resulting in better generalization ability on the 21-class land-use dataset and the Indian Pines satellite scene dataset.
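A sketch of the random-projection-plus-bootstrap ensemble idea, assuming an RBF-kernel SVM as a stand-in for the KELM base classifier and synthetic data in place of the remote-sensing scenes; the member count and projection size are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=200, n_informative=30,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
members = []
for i in range(10):
    # Random projection reduces the feature space for this member.
    proj = GaussianRandomProjection(n_components=20, random_state=i).fit(X_tr)
    boot = rng.choice(len(X_tr), size=len(X_tr), replace=True)   # bootstrap sample
    clf = SVC(kernel="rbf", gamma="scale").fit(proj.transform(X_tr[boot]), y_tr[boot])
    members.append((proj, clf))

# Majority vote over the base classifiers.
votes = np.stack([clf.predict(proj.transform(X_te)) for proj, clf in members])
ensemble = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("ensemble accuracy:", np.mean(ensemble == y_te))
```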


4.
Decoding perceptual or cognitive states based on brain activity measured using functional magnetic resonance imaging (fMRI) can be achieved using machine learning algorithms to train classifiers of specific stimuli. However, the high dimensionality and intrinsically low signal-to-noise ratio (SNR) of fMRI data pose great challenges to such techniques. The problem is aggravated in the case of multiple-subject experiments because of the high inter-subject variability in brain function. To address these difficulties, the majority of current approaches use a single classifier. Since, in many cases, different stimuli activate different brain areas, it makes sense to use a set of classifiers, each specialized in a different stimulus. Therefore, we propose in this paper using an ensemble of classifiers for decoding fMRI data. Each classifier in the ensemble has a favorite class or stimulus and uses an optimized feature set for that particular stimulus. The output for each individual stimulus is therefore obtained from the corresponding classifier, and the final classification is achieved by simply selecting the best score. The method was applied to three empirical fMRI datasets from multiple subjects performing visual tasks with four classes of stimuli. Ensembles of Gaussian naïve Bayes (GNB) and k-nearest-neighbour (k-NN) base classifiers were tested. The ensemble of classifiers systematically outperformed a single classifier for the two most challenging datasets. In the remaining dataset, a ceiling effect was observed which probably precluded a clear distinction between the two classification approaches. Our results may be explained by the fact that different visual stimuli elicit specific patterns of brain activation, and they indicate that an ensemble of classifiers provides an advantageous alternative to commonly used single classifiers, particularly when decoding stimuli associated with specific brain areas.
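A sketch of the "one specialised classifier per stimulus" scheme, assuming Gaussian naive Bayes base classifiers, SelectKBest as the per-class feature optimisation, and synthetic high-dimensional data in place of the fMRI volumes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=500, n_informative=40,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

experts = []
for c in range(4):
    y_bin = (y_tr == c).astype(int)                        # one-vs-rest target for class c
    sel = SelectKBest(f_classif, k=50).fit(X_tr, y_bin)    # class-specific feature set
    clf = GaussianNB().fit(sel.transform(X_tr), y_bin)
    experts.append((sel, clf))

# Each expert scores only its favourite class; the best score wins.
scores = np.column_stack([clf.predict_proba(sel.transform(X_te))[:, 1]
                          for sel, clf in experts])
pred = scores.argmax(axis=1)
print("ensemble accuracy:", np.mean(pred == y_te))
```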

5.
During the last few years there has been marked attention towards hybrid and ensemble systems development, as they have proved able to be more accurate than single classifier models. However, among the hybrid and ensemble models developed in the literature there has been little consideration given to: 1) combining data filtering and feature selection methods; 2) combining classifiers of different algorithms; and 3) exploring different classifier output combination techniques other than the traditional ones found in the literature. In this paper, the aim is to improve predictive performance by presenting a new hybrid ensemble credit scoring model through the combination of two data pre-processing methods based on Gabriel Neighbourhood Graph editing (GNG) and Multivariate Adaptive Regression Splines (MARS) in the hybrid modelling phase. In addition, a new classifier combination rule based on the consensus approach (ConsA) of different classification algorithms during the ensemble modelling phase is proposed. Several comparisons are carried out in this paper, as follows: 1) comparison of individual base classifiers with the GNG and MARS methods applied separately and combined, in order to choose the best results for the ensemble modelling phase; 2) comparison of the proposed approach with all the base classifiers and with ensemble classifiers using the traditional combination methods; and 3) comparison of the proposed approach with recent related studies in the literature. Five well-known base classifiers are used, namely neural networks (NN), support vector machines (SVM), random forests (RF), decision trees (DT), and naïve Bayes (NB). The experimental results, analysis and statistical tests prove the ability of the proposed approach to improve prediction performance against all the base classifiers, hybrid and traditional combination methods in terms of average accuracy, the area under the curve (AUC), the H-measure and the Brier score. The model was validated over seven real-world credit datasets.
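A minimal stand-in for the ensemble-modelling phase only, assuming the GNG/MARS pre-processing and the ConsA combination rule are omitted and a plain soft vote over the five named base classifiers is used on synthetic data rather than the seven credit datasets.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, weights=[0.7, 0.3],
                           random_state=0)
ensemble = VotingClassifier(
    estimators=[
        ("nn", make_pipeline(StandardScaler(),
                             MLPClassifier(max_iter=1000, random_state=0))),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("rf", RandomForestClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",   # average predicted probabilities instead of hard labels
)
print("5-fold AUC:", cross_val_score(ensemble, X, y, cv=5, scoring="roc_auc").mean())
```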

6.
The amounts and types of remote sensing data have increased rapidly, and the classification of these datasets has become more and more overwhelming for a single classifier in practical applications. In this paper, an ensemble algorithm based on Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples (DECORATE) and Rotation Forest is proposed to solve the classification problem of remote sensing images. In this ensemble algorithm, RBF neural networks are employed as base classifiers. Furthermore, interpolation technology for identical distribution is used to remold the input datasets. These remolded datasets construct new classifiers besides the initial classifiers constructed by the Rotation Forest algorithm. The change in classification error is used to decide whether to add another new classifier. In this way, the diversity among these classifiers is enhanced and the accuracy of classification is improved. The adaptability of the proposed algorithm is verified in experiments implemented on standard datasets and an actual remote sensing dataset.

7.
Fault Diagnosis Based on Support Vector Machine Ensemble
To improve the accuracy of fault diagnosis, a genetic-algorithm-based ensemble learning method for support vector machines is proposed; the corresponding genetic operators are defined, and strategies for constructing the classifiers in the ensemble are discussed. Simulation results on the diagnosis of steam turbine rotor imbalance faults show that ensemble learning usually outperforms a single support vector machine, and that the proposed method outperforms traditional ensemble learning methods such as Bagging and Boosting while the resulting ensemble contains fewer classifiers; combining multiple classifier construction strategies also improves classifier diversity. The method can easily be extended to other learning algorithms such as neural networks and decision trees.
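A sketch of genetic-algorithm-based selective SVM ensembling, assuming a toy bit-string GA with uniform crossover and bit-flip mutation, arbitrary hyperparameter ranges, and synthetic data in place of the rotor-imbalance signals.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
# Pool of candidate SVMs built on bootstrap samples with varied hyperparameters.
pool_preds = []
for i in range(20):
    boot = rng.choice(len(X_tr), size=len(X_tr), replace=True)
    clf = SVC(C=10.0 ** rng.integers(-1, 3), gamma="scale").fit(X_tr[boot], y_tr[boot])
    pool_preds.append(clf.predict(X_val))
pool_preds = np.array(pool_preds)

def fitness(mask):
    """Majority-vote validation accuracy of the members selected by the bit mask."""
    if mask.sum() == 0:
        return 0.0
    vote = (pool_preds[mask.astype(bool)].mean(axis=0) >= 0.5).astype(int)
    return np.mean(vote == y_val)

# Simple GA: rank selection, uniform crossover, bit-flip mutation.
pop = rng.integers(0, 2, size=(30, 20))
for _ in range(40):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]                  # keep the 10 fittest
    children = []
    while len(children) < 20:
        a, b = parents[rng.integers(0, 10, size=2)]
        child = np.where(rng.random(20) < 0.5, a, b)         # uniform crossover
        flip = rng.random(20) < 0.05                         # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected members:", int(best.sum()), "accuracy:", fitness(best))
```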

8.
In this work a novel technique for building ensembles of classifiers for spectrogram classification is presented. We propose a simple approach for classifying signals from a large database of plant echoes; these echoes are highly complex stochastic signals, yet their spectrograms contain enough information to extract a good set of features for training the proposed ensemble of classifiers. The proposed ensemble of classifiers is a novel modified version of a recent feature-transform-based ensemble method, the Input Decimated Ensemble. In the proposed variant, different subsets of randomly extracted training patterns are used to create a set of different Neighborhood Preserving Embedding subspace projections. These feature transformations are applied to the whole dataset, and a set of decision trees is trained using these transformed spaces. Finally, the scores of this set of classifiers are combined by the sum rule. Experiments carried out on a previously proposed dataset show the superiority of this method with respect to other approaches. On the tested dataset, the proposed approach outperforms the previously proposed combination of principal component analysis and support vector machines (SVM). Moreover, we show that the fusion of the proposed ensemble and the SVM-based system outperforms both stand-alone methods.

9.
To address the problem that ensemble classifiers built from overly weak base classifiers must sacrifice a large amount of training time to reach high accuracy, an instance-based fast ensemble method for strong classifiers, FSE, is proposed. First, unqualified classifiers are eliminated with a base-classifier evaluation method, and the remaining classifiers are ranked by accuracy and diversity, yielding a group of classifiers with the highest accuracy and greatest diversity. The FSE ensemble algorithm then breaks the existing sample distribution and resamples so that the classifiers focus more on hard-to-learn samples, which in turn determines each classifier's weight in the ensemble. In experiments comparing against Boosting on UCI databases and a real-world dataset, the recognition accuracy of the Boosting ensemble reached at most 90.2% and 90.4% respectively, whereas the FSE ensemble reached 95.6% and 93.9%; at the same accuracy level, the FSE ensemble shortened training time by 75% and 80% respectively. The results show that the FSE ensemble model can effectively improve recognition accuracy and shorten training time.

10.
Predicting future stock index price movement has always been a fascinating research area, both for investors who wish to yield a profit by trading stocks and for researchers who attempt to expose the buried information in the complex stock market time series data. This prediction problem can be addressed as a binary classification problem with two class labels, one for the increasing movement and the other for the decreasing movement. In the literature, a wide range of classifiers has been tested for this application. As the performance of an individual classifier varies for a diverse dataset with respect to different performance measures, it is impractical to acknowledge a specific classifier as the best one. Hence, designing an efficient classifier ensemble instead of an individual classifier is attracting increasing attention from many researchers. Again, the selection of base classifiers and the decision of their preferences in the ensemble with respect to a variety of performance criteria can be considered a Multi Criteria Decision Making (MCDM) problem. In this paper, an integrated TOPSIS-Crow Search based weighted voting classifier ensemble is proposed for stock index price movement prediction. The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), one of the popular MCDM techniques, is used for ranking and selecting a set of base classifiers for the ensemble, whereas the weights of the classifiers used in the ensemble are tuned by the Crow Search method. The proposed ensemble model is validated for the prediction of stock index price over the historical prices of the BSE SENSEX, S&P500 and NIFTY 50 stock indices. The model has shown better performance compared to individual classifiers and other ensemble models such as majority voting, weighted voting, differential evolution and particle swarm optimization based classifier ensembles.
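A sketch of TOPSIS-based ranking of base classifiers, assuming the closeness scores are reused directly as voting weights instead of being tuned by Crow Search, and that the criterion values and predictions below are made-up illustrations rather than results on SENSEX / S&P500 / NIFTY 50 data.

```python
import numpy as np

# Rows: candidate base classifiers; columns: performance criteria
# (e.g. accuracy, F-measure, AUC), all treated as benefit-type criteria here.
decision = np.array([[0.81, 0.78, 0.84],
                     [0.79, 0.80, 0.82],
                     [0.84, 0.76, 0.80],
                     [0.77, 0.75, 0.79]])
weights = np.array([0.4, 0.3, 0.3])          # criterion weights, summing to 1

norm = decision / np.linalg.norm(decision, axis=0)       # vector normalisation
weighted = norm * weights
ideal, anti_ideal = weighted.max(axis=0), weighted.min(axis=0)
d_plus = np.linalg.norm(weighted - ideal, axis=1)        # distance to ideal solution
d_minus = np.linalg.norm(weighted - anti_ideal, axis=1)  # distance to anti-ideal
closeness = d_minus / (d_plus + d_minus)                 # TOPSIS closeness score

print("classifier ranking (best first):", np.argsort(-closeness))

# Weighted vote of the classifiers' up/down predictions using closeness as weight.
predictions = np.array([[1, 0, 1, 1],        # each row: one classifier's labels
                        [1, 1, 0, 1],
                        [0, 1, 1, 1],
                        [1, 0, 0, 0]])
vote = (closeness @ predictions) / closeness.sum()
print("ensemble prediction (1 = up):", (vote >= 0.5).astype(int))
```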

11.
Breast cancer is the most commonly occurring form of cancer in women. While mammography is the standard modality for diagnosis, thermal imaging provides an interesting alternative, as it can identify tumors of smaller size and hence lead to earlier detection. In this paper, we present an approach to analysing breast thermograms based on image features and a hybrid multiple classifier system. The employed image features provide indications of asymmetry between left and right breast regions that are encountered when a tumor is locally recruiting blood vessels on one side, leading to a change in the captured temperature distribution. The presented multiple classifier system is based on a hybridisation of three computational intelligence techniques: neural networks or support vector machines as base classifiers, a neural fuser to combine the individual classifiers, and a fuzzy measure for assessing the diversity of the ensemble and removing individual classifiers from it. In addition, we address the problem of class imbalance that often occurs in medical data analysis by training base classifiers on balanced object subspaces. Our experimental evaluation, on a large dataset of about 150 breast thermograms, convincingly shows that our approach not only provides excellent classification accuracy and sensitivity but also outperforms both canonical classification approaches and other classifier ensembles designed for imbalanced datasets.
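A sketch of the balanced-subspace plus neural-fuser idea, assuming random undersampling builds the balanced training subsets, an MLP trained on the base classifiers' probabilities plays the role of the neural fuser, the fuzzy diversity pruning step is omitted, and the data is synthetic rather than thermogram features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, weights=[0.85, 0.15],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

rng = np.random.default_rng(0)
pos = np.where(y_tr == 1)[0]
neg = np.where(y_tr == 0)[0]

bases = []
for i in range(5):
    # Balanced object subspace: all minority samples + an equal-size majority sample.
    sub = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    bases.append(SVC(probability=True, random_state=i).fit(X_tr[sub], y_tr[sub]))

def base_outputs(X_):
    # Stack each base classifier's positive-class probability as a feature.
    return np.column_stack([b.predict_proba(X_)[:, 1] for b in bases])

# "Neural fuser": learns how to combine the base classifiers' scores.
fuser = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
fuser.fit(base_outputs(X_tr), y_tr)
print("fused accuracy:", fuser.score(base_outputs(X_te), y_te))
```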

12.
Feature rankings are often used for supervised dimension reduction, especially when the discriminating power of each feature is of interest, the dimensionality of the dataset is extremely high, or computational power is too limited to perform more complicated methods. In practice, it is recommended to start dimension reduction with simple methods such as feature rankings before applying more complex approaches. Single variable classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from the capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such a feature ranking method. We study whether the classifiers influence the SVC rankings or whether the discriminative power of the features themselves has a dominant impact on the final rankings. We show that the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study whether heterogeneous classifier ensemble approaches provide more unbiased rankings and whether they improve final classification performance. Furthermore, we calculate the empirical prediction performance loss incurred by using the same classifier for SVC feature ranking and final classification instead of the optimal choices.
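A minimal sketch of single variable classifier (SVC) feature ranking: each feature is scored by the cross-validated accuracy of a classifier trained on that feature alone. The choice of a shallow decision tree as the ranking classifier is an assumption; swapping in another estimator changes the ranking, which is exactly the bias studied above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Score feature j by the CV accuracy of a classifier that sees only feature j.
scores = np.array([
    cross_val_score(DecisionTreeClassifier(max_depth=3, random_state=0),
                    X[:, [j]], y, cv=5).mean()
    for j in range(X.shape[1])
])
ranking = np.argsort(-scores)          # best-performing single features first
print("top 5 features by SVC ranking:", ranking[:5])
```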

13.
This article proposes a new approach to improve the classification performance of remotely sensed images with an aggregative model based on classifier ensemble (AMCE). AMCE is a multi-classifier system with two procedures, namely ensemble learning and predictions combination. Two ensemble algorithms (Bagging and AdaBoost.M1) were used in the ensemble learning process to stabilize and improve the performance of single classifiers (i.e. maximum likelihood classifier, minimum distance classifier, back-propagation neural network, classification and regression tree, and support vector machine (SVM)). Prediction results from single classifiers were integrated according to a diversity measurement based on an averaged double-fault indicator and different combination strategies (i.e. weighted vote, Bayesian product, logarithmic consensus, and behaviour knowledge space). The suitability of the AMCE model was examined using a Landsat Thematic Mapper (TM) image of Dongguan city (Guangdong, China), acquired on 2 January 2009. Experimental results show that the proposed model was significantly better than the most accurate single classifier (i.e. SVM) in terms of classification accuracy (i.e. from 88.83% to 92.45%) and kappa coefficient (i.e. from 0.8624 to 0.9088). A stepwise comparison illustrates that both ensemble learning and predictions combination within the AMCE model improved classification.
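A sketch of the predictions-combination step: an averaged double-fault diversity indicator over base-classifier outputs followed by a weighted vote. The three generic base classifiers, the synthetic data, and the accuracy-proportional weights are assumptions standing in for the paper's remote-sensing classifiers and full set of combination strategies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

clfs = [DecisionTreeClassifier(random_state=0), SVC(), GaussianNB()]
preds = np.array([c.fit(X_tr, y_tr).predict(X_val) for c in clfs])

# Double fault between classifiers i and j: fraction of samples both get wrong.
def double_fault(i, j):
    return np.mean((preds[i] != y_val) & (preds[j] != y_val))

pairs = [(i, j) for i in range(len(clfs)) for j in range(i + 1, len(clfs))]
print("averaged double-fault:", np.mean([double_fault(i, j) for i, j in pairs]))

# Weighted vote with accuracy-proportional weights estimated on the validation split.
acc = np.array([np.mean(p == y_val) for p in preds])
vote = (acc @ preds) / acc.sum()
print("weighted-vote accuracy:", np.mean((vote >= 0.5).astype(int) == y_val))
```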

14.
Click fraud is one of the most common forms of cybercrime in recent years, and the online advertising industry suffers huge losses from it every year. To detect fraudulent clicks effectively among massive numbers of clicks, multiple features that fully combine ad-click and temporal attributes are constructed, and an ensemble learning framework for click fraud detection, CAT-RFE, is proposed. The CAT-RFE framework has three parts: the base classifier, recursive feature elimination (RFE), and voting-based ensemble learning. CatBoost (categorical boosting), a gradient boosting model suited to categorical features, is used as the base classifier; RFE is a greedy feature-selection method that can pick good feature combinations from multiple feature sets; and voting ensemble learning combines the results of multiple base classifiers by voting. The framework uses CatBoost and RFE to obtain several good feature combinations in the feature space, and the training results under these feature combinations are then combined by voting to obtain the ensemble click-fraud detection result. Because the framework uses the same base classifier and ensemble method throughout, it avoids both the problem of widely differing classifiers constraining each other and degrading the ensemble, and the tendency of RFE to fall into local optima during feature selection, giving it better detection capability. Performance evaluations and comparative experiments on a real Internet click-fraud dataset show that the detection capability of the CAT-RFE framework exceeds that of the CatBoost model alone, the combination of CatBoost and RFE, and other machine-learning models, demonstrating that the framework is competitive. The framework provides a feasible solution for click-fraud detection in online advertising.
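A sketch of the RFE-plus-voting idea, assuming scikit-learn's GradientBoostingClassifier as a stand-in for CatBoost, synthetic data in place of the real click logs, and different feature-subset sizes per RFE run as the way to obtain several feature combinations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

members = []
for k in (8, 12, 16):                                    # different feature-subset sizes
    selector = RFE(GradientBoostingClassifier(random_state=0),
                   n_features_to_select=k, step=4).fit(X_tr, y_tr)
    clf = GradientBoostingClassifier(random_state=0).fit(selector.transform(X_tr), y_tr)
    members.append((selector, clf))

# Soft vote: average the fraud probabilities of the members.
proba = np.mean([clf.predict_proba(sel.transform(X_te))[:, 1]
                 for sel, clf in members], axis=0)
print("voting accuracy:", np.mean((proba >= 0.5).astype(int) == y_te))
```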

15.
An Ensemble Co-Training Algorithm Based on Rotation Forest
Ensemble co-training is a semi-supervised learning method that combines ensemble learning with the co-training algorithm, and rotation forest is an ensemble learning method that uses feature extraction to build diversity among base classifiers. Building on existing research on ensemble co-training algorithms, a rotation-forest-based co-training algorithm, ROFCO, is proposed. The method focuses on using unlabeled data to improve the diversity among base classifiers and the effect of feature extraction, so that while the generalization error of the base classifiers stays the same or decreases, the diversity among them is maintained or even increased, improving the ensemble effect. Experimental results show that the method achieves good performance.

16.
The diversity among base classifiers and the accuracy of each individual base classifier are two important factors affecting the generalization performance of an ensemble system. To address the difficulty of balancing diversity and accuracy, a selective ensemble algorithm for gene expression data based on a diversity-accuracy weighted harmonic average (D-A-WHA) measure is proposed. With kernel extreme learning machines (KELM) as base classifiers, the D-A-WHA measure is used to trade off the diversity and accuracy of the base classifiers, and finally a group of base classifiers with relatively high accuracy and large diversity with respect to the others is selected for the ensemble. Simulation experiments on UCI gene datasets show that, compared with traditional ensemble algorithms such as Bagging and AdaBoost, the selective ensemble algorithm based on the D-A-WHA measure significantly improves classification accuracy and stability, and can be effectively applied to the classification of cancer gene data.
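A sketch of selecting base classifiers by a weighted harmonic average of accuracy and diversity, assuming RBF-kernel SVMs as stand-ins for KELM, average disagreement with the other members as the diversity measure, and synthetic data rather than gene expression profiles; the weight and ensemble size are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=50, n_informative=15,
                           random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
preds = []
for i in range(12):
    boot = rng.choice(len(X_tr), size=len(X_tr), replace=True)   # bootstrap sample
    clf = SVC(gamma="scale").fit(X_tr[boot], y_tr[boot])
    preds.append(clf.predict(X_val))
preds = np.array(preds)

acc = np.array([np.mean(p == y_val) for p in preds])
# Diversity of member i: average disagreement with every other member.
div = np.array([np.mean([np.mean(preds[i] != preds[j])
                         for j in range(len(preds)) if j != i])
                for i in range(len(preds))])

w = 0.6                                            # weight on accuracy vs diversity
score = 1.0 / (w / acc + (1 - w) / np.maximum(div, 1e-6))   # weighted harmonic mean
top = np.argsort(-score)[:5]                       # keep the 5 best-scoring members
vote = (preds[top].mean(axis=0) >= 0.5).astype(int)
print("selective-ensemble accuracy:", np.mean(vote == y_val))
```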

17.
Traditional classifier ensembles typically integrate only the single best individual classifier into the strong classifier at each iteration, while other individual classifiers that might play a supporting role are simply discarded. To address this problem, a non-sparse multiple kernel learning method based on the Boosting framework, MKL-Boost, is proposed, drawing on the idea of classifier ensemble learning. At each iteration, a training subset is first selected from the training set, and a regularized non-sparse multiple kernel learning method is then used to train the optimal individual classifier. The resulting individual classifier considers the optimal non-sparse linear convex combination of M basic kernels; by imposing an Lp-norm constraint on the kernel combination coefficients, good kernels are retained, preserving more useful feature information, while poor kernels are discarded, ensuring selective kernel fusion. The optimal individual classifier based on the kernel combination is then integrated into the strong classifier. The proposed algorithm combines the advantages of Boosting ensemble learning with those of regularized non-sparse multiple kernel learning. Experiments show that, compared with other Boosting algorithms, MKL-Boost can achieve higher classification accuracy within fewer iterations.

18.
Detection of anomalies is a broad field of study, which is applied in different areas such as data monitoring, navigation, and pattern recognition. In this paper we propose two measures to detect anomalous behaviors in an ensemble of classifiers by monitoring their decisions: one based on the Mahalanobis distance and another based on information theory. These approaches are useful when an ensemble of classifiers is used and a decision is made by ordinary classifier fusion methods, while each classifier is devoted to monitoring part of the environment. Upon detection of anomalous classifiers, we propose a strategy that attempts to minimize the adverse effects of faulty classifiers by excluding them from the ensemble. We applied this method to an artificial dataset and to sensor-based human activity datasets, with different sensor configurations and two types of noise (additive and rotational on inertial sensors). We compared our method with two other well-known approaches, the generalized likelihood ratio (GLR) and the One-Class Support Vector Machine (OCSVM), which detect anomalies at the data/feature level.
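A sketch of Mahalanobis-distance monitoring of ensemble members' decisions, assuming each member's predict_proba vector is the monitored "decision", a noise-corrupted copy of the features simulates a faulty sensor feeding one member, and an illustrative fixed threshold flags the anomalous member for exclusion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=12, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_ref, y_tr, y_ref = train_test_split(X, y, test_size=0.4, random_state=0)

# Each member "monitors part of the environment": here, a disjoint feature slice.
slices = [slice(0, 4), slice(4, 8), slice(8, 12)]
members = [LogisticRegression(max_iter=1000).fit(X_tr[:, s], y_tr) for s in slices]

# Reference statistics of each member's decision vectors on clean data.
stats = []
for clf, s in zip(members, slices):
    D = clf.predict_proba(X_ref[:, s])
    stats.append((D.mean(axis=0), np.linalg.pinv(np.cov(D, rowvar=False))))

def mean_mahalanobis(D, mu, cov_inv):
    diff = D - mu
    q = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)   # per-row quadratic form
    return np.mean(np.sqrt(np.maximum(q, 0.0)))

# Simulate a fault: member 1 receives heavily corrupted inputs.
X_new = X_ref.copy()
X_new[:, slices[1]] += np.random.default_rng(0).normal(0, 5, X_new[:, slices[1]].shape)

for i, (clf, s) in enumerate(zip(members, slices)):
    d = mean_mahalanobis(clf.predict_proba(X_new[:, s]), *stats[i])
    # Threshold 3.0 is purely illustrative; a flagged member would be excluded.
    print(f"member {i}: mean Mahalanobis distance = {d:.2f}",
          "-> excluded" if d > 3.0 else "")
```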

19.
To effectively improve the generalization performance of support vector machines, two ensemble algorithms are proposed for training them. First, the roles of perturbing the input feature space and perturbing model parameters in increasing the diversity among member classifiers are analysed; then two ensemble training algorithms based on a double perturbation mechanism are proposed. Their common feature is that the input feature space and the model parameters are perturbed simultaneously to generate member classifiers, which are then combined by majority voting. Experimental results show that, because both the bias and the variance components of the error are reduced, the two algorithms significantly improve the generalization performance of support vector machines.
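A sketch of the double-perturbation idea: each SVM member gets both a random feature subspace (input-space perturbation) and randomised hyperparameters (model-parameter perturbation), and the members are combined by majority vote. The data is synthetic and the parameter ranges are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=30, n_informative=12,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
members = []
for _ in range(15):
    feats = rng.choice(X.shape[1], size=18, replace=False)       # feature perturbation
    C = 10.0 ** rng.uniform(-1, 2)                               # parameter perturbation
    gamma = 10.0 ** rng.uniform(-3, 0)
    members.append((feats, SVC(C=C, gamma=gamma).fit(X_tr[:, feats], y_tr)))

# Majority vote over the member SVMs.
votes = np.stack([clf.predict(X_te[:, feats]) for feats, clf in members])
majority = (votes.mean(axis=0) >= 0.5).astype(int)
print("double-perturbation ensemble accuracy:", np.mean(majority == y_te))

single = SVC(gamma="scale").fit(X_tr, y_tr)
print("single SVM accuracy:", single.score(X_te, y_te))
```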

20.
Recent research in fault classification has shown the importance of accurately selecting the features that are used as inputs to the diagnostic model. In this work, a multi-objective genetic algorithm (MOGA) is considered for the feature selection phase. Then, two different techniques for using the selected features to develop the fault classification model are compared: a single classifier based on the feature subset with the best classification performance, and an ensemble of classifiers working on different feature subsets. The motivation for developing ensembles of classifiers is that they can achieve higher accuracies than single classifiers. An important issue for an ensemble to be effective is the diversity in the predictions of the base classifiers which constitute it, i.e. their capability of erring on different sub-regions of the pattern space. In order to show the benefits of having diverse base classifiers in the ensemble, two different ensembles have been developed: in the first, the base classifiers are constructed on feature subsets found by MOGAs aimed at maximizing the fault classification performance and minimizing the number of features in the subsets; in the second, diversity among classifiers is added to the MOGA search as a third objective function to maximize. In both cases, a voting technique is used to effectively combine the predictions of the base classifiers to construct the ensemble output. For verification, numerical experiments are conducted on a case of multiple-fault classification in rotating machinery, and the results achieved by the two ensembles are compared with those obtained by a single optimal classifier.
