Similar documents
 20 similar documents found (search time: 46 ms)
1.
This paper presents the cluster‐based ensemble classifier, an approach to generating an ensemble of classifiers using multiple clusters within classified data. Clustering is incorporated to partition the data set into multiple clusters of highly correlated data that are otherwise difficult to separate, and different base classifiers are used to learn class boundaries within the clusters. As the different base classifiers engage on different difficult‐to‐classify subsets of the data, the learning of the base classifiers is more focused and accurate. A selection rather than fusion approach reaches the final verdict on patterns of unknown classes. The impact of clustering on the learning parameters and accuracy of a number of learning algorithms, including neural network, support vector machine, decision tree and k‐NN classifier, is investigated. A number of benchmark data sets from the UCI machine learning repository were used to evaluate the cluster‐based ensemble classifier, and the experimental results demonstrate its superiority over bagging and boosting.
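The cluster-then-select idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cluster centers are assumed to be given (for example from a prior k-means run), the base classifier inside each cluster is a plain 1-NN rule, and the names `nearest`, `fit_cluster_ensemble` and `predict` are hypothetical.

```python
import math

def nearest(centers, x):
    """Index of the cluster center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))

def fit_cluster_ensemble(points, labels, centers):
    """Group the training data by nearest cluster center.

    Each cluster's member list serves as the memory of its own
    1-NN base classifier (an assumed stand-in for the paper's learners).
    """
    members = [[] for _ in centers]
    for x, y in zip(points, labels):
        members[nearest(centers, x)].append((x, y))
    return members

def predict(members, centers, x):
    """Selection, not fusion: only the classifier of x's cluster decides."""
    cluster = members[nearest(centers, x)]
    if not cluster:
        # Empty cluster: fall back to all training data.
        cluster = [m for group in members for m in group]
    _, label = min(cluster, key=lambda m: math.dist(x, m[0]))
    return label

centers = [(0.0, 0.0), (10.0, 10.0)]
points = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (9.0, 9.0)]
labels = ["a", "b", "a", "b"]
model = fit_cluster_ensemble(points, labels, centers)
```

With this toy data, each query is routed to one cluster and answered by that cluster's classifier alone, which is the selection behavior the abstract contrasts with fusion.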

2.
A k-means clustering algorithm for designing binary tree classifiers is introduced for the classification of cervical cells. At each nonterminal node of the designed binary tree classifier, two sets of effective features are selected: one is based on the Bhattacharyya distance, a measure of separability between two classes; the other is based on the merits of classification accuracy. The classification results show the effectiveness of the features and of the binary tree classifier used.
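Ranking features by Bhattacharyya distance can be sketched as below. The abstract does not say how the distance is estimated, so this sketch assumes each feature is univariate Gaussian within each class, which gives the closed form D = (m1 - m2)^2 / (4 (v1 + v2)) + 0.5 ln((v1 + v2) / (2 sqrt(v1 v2))). The function names `bhattacharyya_1d` and `rank_features` are hypothetical.

```python
import math
from statistics import mean, pvariance

def bhattacharyya_1d(a, b):
    """Bhattacharyya distance between two samples of one feature,
    assuming each sample is Gaussian (variances must be non-zero)."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a), pvariance(b)
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term

def rank_features(rows_a, rows_b):
    """Feature indices sorted from most to least separable."""
    n_features = len(rows_a[0])
    scores = [
        bhattacharyya_1d([r[f] for r in rows_a], [r[f] for r in rows_b])
        for f in range(n_features)
    ]
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)

# Feature 0 separates the two classes far better than feature 1.
class_a = [(0.0, 0.0), (2.0, 2.0)]
class_b = [(4.0, 1.0), (6.0, 3.0)]
ranking = rank_features(class_a, class_b)
```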

3.
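Aircraft Type Recognition Using Multiple Classifier Fusion   Total citations: 2, self-citations: 0, citations by others: 2
To address aircraft type recognition in air-combat target identification, a recognition method based on multiple classifier fusion is proposed. The method takes tactical performance parameters as input, which helps meet the real-time requirements of air combat. Classification features for aircraft type recognition are obtained through extensive data collection, and subsets of these features serve as the features of the individual classifiers. Each individual classifier is designed as a BP network, and the well-performing sum rule is then used to fuse the classifiers and reach the final decision. Experimental results show that the fused classifier clearly outperforms each classifier taking part in the fusion, and also outperforms a single classifier given the same inputs. Another feature of the method is that it supports default reasoning, which gives it strong robustness to interference and suits real battlefield conditions.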

4.
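Dynamic Selection and Circulating Combination of Classifiers   Total citations: 1, self-citations: 0, citations by others: 1
To address the inefficient selection of the optimal subset and the inflexible combination methods in multiple classifier system design, a dynamic selection and circulating combination (DSCC) method for classifiers is proposed. The method exploits the complementarity among different classifier models to dynamically select a combination of classifiers with a high recognition rate for the target, so that the number of classifiers in the ensemble adapts to the complexity of the target being recognized, and it performs circulating combination of the system according to confidence. In handwritten digit recognition experiments, the proposed method is more flexible and efficient than other commonly used classifier selection methods and achieves a higher recognition rate.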

5.
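As a typical form of big data, data streams are continuous, unbounded, subject to concept drift, and fast-arriving, so traditional classification techniques cannot be applied directly and effectively to data stream mining. Building on the classic accuracy weighted ensemble (AWE) algorithm, this paper proposes the concept very fast decision tree update ensemble (CUE) algorithm. The algorithm improves the weight assignment of the base classifiers, and it also markedly reduces sensitivity to data chunk size and increases the diversity among base classifiers. Experiments show that CUE achieves higher classification accuracy than AWE. Finally, the dynamic classifier selection with clustering (DCSC) algorithm is proposed. Because it is based on dynamic classifier selection and has no cumbersome weighting mechanism, it is time-efficient. Experimental results confirm that DCSC is effective and efficient and that it handles concept drift effectively.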

6.
The ability to accurately predict business failure is a very important issue in financial decision-making. Incorrect decision-making in financial institutions is very likely to cause financial crises and distress. Bankruptcy prediction and credit scoring are two important problems facing financial decision support. Although many related studies develop financial distress models using machine learning techniques, more advanced machine learning techniques, such as classifier ensembles and hybrid classifiers, have not been fully assessed. The aim of this paper is to develop a novel hybrid financial distress model based on combining the clustering technique and classifier ensembles. In addition, single baseline classifiers, hybrid classifiers, and classifier ensembles are developed for comparison. In particular, two clustering techniques, Self-Organizing Maps (SOMs) and k-means, and three classification techniques, logistic regression, multilayer-perceptron (MLP) neural network, and decision trees, are used to develop these four different types of bankruptcy prediction models. As a result, 21 different models are compared in terms of average prediction accuracy and Type I & II errors. Across five related datasets, combining Self-Organizing Maps (SOMs) with MLP classifier ensembles performs the best, providing higher prediction accuracy and lower Type I & II errors.

7.
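The paper first analyzes the characteristics of rough set theory and neural networks and their complementarity, and then proposes C3RST, a new method for constructing a combined classifier. The method has two steps. First, the training data set is reduced, which determines the structure of each individual neural network classifier and the number of classifiers to include in the combined classifier. The classifiers are then combined, with the weight of each individual classifier determined by attribute significance, a basic concept of rough set theory. Finally, experiments on several standard data sets evaluate the classification performance of C3RST, and the results show that the method is effective.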

8.
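With the goal of maximizing the system sum rate, the clustering of cooperative base stations is modeled as the generation of maximum-benefit trees in a weighted connected graph, and a maximum-benefit tree clustering algorithm based on cooperation degree is proposed. The cooperation degree of a benefit tree is defined, and multiple cooperative clusters of dynamic size are generated in parallel by merging the two benefit trees with the largest cooperation degree. This resolves the limited system performance caused by traditional sequential clustering and improves the clustering performance of the system. Simulation results show that the algorithm achieves better system spectral efficiency than existing dynamic clustering algorithms and has linear complexity.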

9.
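Multi-class classification with support vector machines is studied, and a multi-class classification method based on kernel clustering is proposed. Kernel clustering maps the original sample features into a high-dimensional feature space and clusters them into groups. Each group is classified with a binary support vector machine classifier, and these binary classifiers form the nodes of a decision tree, yielding a decision classification tree. An algorithm for generating the decision tree is given, and an overlap coefficient is introduced to control overlap, which counters the accumulation of misclassifications and improves classification accuracy. Experimental results show that with this method both the speed and the accuracy of handwritten Chinese character recognition meet practical requirements.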

10.
This paper proposes a classification framework based on simple classifiers organized in a tree‐like structure. It is observed that simple classifiers, even though they have a high error rate, find similarities among classes in the problem domain. The authors propose to trade on this property by recognizing classes that are mistaken for one another and constructing overlapping subproblems. The subproblems are then solved by other classifiers, which can be very simple, giving as a result a hierarchical classifier (HC). It is shown that HC, together with the proposed training algorithm and evaluation methods, performs well as a classification framework. It is also proven that such constructs give better accuracy than the root classifier they are built upon.

11.
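A New Divisive Hierarchical Clustering SVM Multi-class Classifier   Total citations: 6, self-citations: 0, citations by others: 6
Zhang Guoyun (张国云), Zhang Jing (章兢). Control and Decision (《控制与决策》), 2005, 20(8): 931-934
A divisive hierarchical clustering SVM classification tree method is proposed. By combining fuzzy clustering with the support vector machine algorithm and using a divisive hierarchical clustering strategy, the method selectively reconstructs the training sample sets and the SVM sub-classifiers, yielding a tree-structured multi-class classifier. The results show that for a k-class pattern recognition problem the method needs only k-1 SVM sub-classifiers, which overcomes the drawbacks of too many SVM sub-classifiers and of unclassifiable regions, and that it has good classification performance. Experimental results confirm the advantages of the method.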

12.
Introducing Locality and Softness in Subspace Classification   Total citations: 4, self-citations: 2, citations by others: 2
Subspace classifiers classify a pattern based on its distance from different vector subspaces. Earlier models of subspace classification were based on the assumption that individual classes lie in unique subspaces. In later extensions, locality was introduced into subspace classification, allowing a class to be associated with more than one submanifold. The local subspace classifier is thus a piecewise linear classifier, and is more powerful than the linear classification performed by global subspace methods. We present extensions to the basic subspace method of classification based on introducing locality and softness in the classification process. Locality is introduced by (subspace) clustering the patterns into clusters, and softness is introduced by allowing a pattern to be associated with more than one cluster. Our motivation for introducing both locality and softness is based on the premise that by introducing locality, it is possible to reduce the bias, though at the cost of a possible increase in variance. By introducing softness (or aggregation), the variance can be reduced. Consequently, by introducing both locality and softness, we avoid the possibility of high variance that locality typically introduces. We derive appropriate algorithms to construct a local and soft model of subspace classifiers and present results obtained with the proposed algorithm. Received: 4 November 1998; Received in revised form: 7 December 1998; Accepted: 7 December 1998

13.
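Zhou Yu (周玉). Application Research of Computers (《计算机应用研究》), 2021, 38(6): 1683-1688
To improve the performance of neural network classifiers, a segmented sample selection method based on K-means clustering is proposed. First, K-means clusters the training samples into the known number of classes; comparing each class's samples before and after clustering identifies the set of incorrectly clustered samples and the set of correctly clustered samples. The correctly clustered samples are sorted by their distance to the cluster center and divided evenly into five segments, and the odd-numbered segments of each class together with the incorrectly clustered samples form the new training set. The method extracts highly informative samples and removes redundant ones, reducing the number of samples while improving their quality. Using this method, simulation experiments were carried out on three different neural network classifiers with artificial and UCI data sets. The results show that, with an average training sample compression ratio of 66.93%, the performance of all three neural network classifiers improved.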
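The selection step described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the K-means run itself is omitted and its result is passed in as `centers`, a mapping from each class label to a cluster center already matched to that class; a sample counts as correctly clustered when its nearest center belongs to its own class. The names `split_even` and `select_samples` are hypothetical.

```python
import math

def split_even(items, parts=5):
    """Split a list into `parts` contiguous segments of near-equal size."""
    base, extra = divmod(len(items), parts)
    segments, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        segments.append(items[start:start + size])
        start += size
    return segments

def select_samples(points, labels, centers):
    """Return the sorted indices of the samples kept for training."""
    correct = {label: [] for label in centers}
    kept = []
    for idx, (p, y) in enumerate(zip(points, labels)):
        nearest = min(centers, key=lambda c: math.dist(p, centers[c]))
        if nearest == y:
            correct[y].append(idx)
        else:
            kept.append(idx)  # incorrectly clustered samples are always kept
    for y, idxs in correct.items():
        # Sort by distance to the class's center, then keep segments 1, 3, 5.
        idxs.sort(key=lambda i: math.dist(points[i], centers[y]))
        for segment in split_even(idxs, 5)[0::2]:
            kept.extend(segment)
    return sorted(kept)

points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (9.0,), (10.0,)]
labels = ["a", "a", "a", "a", "a", "a", "b"]
centers = {"a": (0.0,), "b": (10.0,)}
selected = select_samples(points, labels, centers)
```

In the toy data, the sample at 9.0 is labeled "a" but lies nearest the "b" center, so it is kept as an incorrectly clustered sample, while the five correctly clustered "a" samples are thinned to the first, third and fifth.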

14.
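To address the high resource consumption, long detection cycles, and poor classification performance of current mainstream malicious web page detection techniques, a Stacking-based ensemble method for detecting malicious web pages is proposed, applying heterogeneous classifier ensembles to malicious web page detection. The detection model is obtained by extracting web page features, analyzing the relevant factors, and applying ensemble learning to classification. The first-level classifiers are built with the K-nearest neighbor (KNN), logistic regression, and decision tree algorithms, and the second-level meta-classifier is built with the support vector machine (SVM) algorithm. Compared with traditional malicious web page detection methods, this method raises recognition accuracy by 0.7% to a high accuracy of 98.12%, while consuming fewer resources and running faster. Experimental results show that the detection model built with the proposed method identifies malicious web pages efficiently and accurately.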

15.
16.
In this paper, we propose a new design methodology of granular fuzzy classifiers based on a concept of information granularity and information granules. The classifier uses the mechanism of information granulation with the aid of which the entire input space is split into a collection of subspaces. When designing the proposed fuzzy classifier, these information granules are constructed in a way they are made reflective of the geometry of patterns belonging to individual classes. Although the elements involved in the generated information granules (clusters) seem to be homogeneous with respect to the distribution of patterns in the input (feature) space, they still could exhibit a significant level of heterogeneity when it comes to the class distribution within the individual clusters. To build an efficient classifier, we improve the class homogeneity of the originally constructed information granules (by adjusting the prototypes of the clusters) and use a weighting scheme as an aggregation mechanism.

17.
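Fan Ying (范莹), Ji Hua (计华), Zhang Huaxiang (张化祥). Journal of Computer Applications (《计算机应用》), 2008, 28(5): 1204-1207
A new combined classifier algorithm based on fuzzy clustering is proposed. The algorithm uses fuzzy clustering to capture the distribution of the training samples and assigns each sample a weight accordingly, which determines its probability of being sampled. The classifier trained on the sampled data is then used to adjust the sampling probabilities of the training set, and new classifiers are generated in turn until a given accuracy is reached. The algorithm was tested on several standard UCI data sets and compared with Bagging and AdaBoost. The experimental results show that the new algorithm is more robust and achieves higher classification accuracy.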

18.
The random subspace method for constructing decision forests   Total citations: 28, self-citations: 0, citations by others: 28
Much of the previous attention on decision trees focuses on the splitting criteria and optimization of tree sizes. The dilemma between overfitting and achieving maximum accuracy is seldom resolved. A method to construct a decision tree based classifier is proposed that maintains highest accuracy on training data and improves on generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of components of the feature vector, that is, trees constructed in randomly chosen subspaces. The subspace method is compared to single-tree classifiers and other forest construction methods by experiments on publicly available datasets, where the method's superiority is demonstrated. We also discuss independence between trees in a forest and relate that to the combined classification accuracy.
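The core of the random subspace idea, training each tree on a pseudorandomly chosen subset of feature components, can be sketched as follows. This is a simplified illustration rather than the paper's procedure: each "tree" here is a one-level decision stump instead of a fully grown tree, the stumps are combined by majority vote, and all function names and parameters (`fit_stump`, `random_subspace_forest`, `subspace_size`, and so on) are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, features):
    """Best single-threshold split, searched only over `features`."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split in this subspace: predict the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def random_subspace_forest(rows, labels, n_trees=15, subspace_size=2, seed=0):
    """Train each stump on a pseudorandom subset of the feature indices."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    return [
        fit_stump(rows, labels, rng.sample(range(n_features), subspace_size))
        for _ in range(n_trees)
    ]

def predict_forest(forest, row):
    """Majority vote over the stumps (an assumed combination rule)."""
    return Counter(predict_stump(s, row) for s in forest).most_common(1)[0][0]

rows = [(0, 0, 0), (1, 1, 1), (5, 5, 5), (6, 6, 6)]
labels = [0, 0, 1, 1]
forest = random_subspace_forest(rows, labels)
```

Every stump uses all training rows, so diversity among the base learners comes only from the feature subsets, which is the property the abstract attributes to the method.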

19.
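Liang Jinjin (梁锦锦), Wu De (吴德). Control and Decision (《控制与决策》), 2015, 30(7): 1298-1302
To address the long training time and large memory footprint of the support vector domain classifier on large sample sets, a clustering-partitioned dual support vector domain classifier is constructed. The original space is partitioned by means clustering, with samples of high density index chosen as the initial cluster centers. A dual support vector domain classifier is built for each subspace, with a piecewise decision function constructed from the distances between a sample and the minimum enclosing hyperspheres of the positive and negative classes. A variable-scale distance is defined for the samples, and a linkage rule combines the classification results of the subspaces. Numerical experiments show that the proposed algorithm achieves high classification accuracy with little sensitivity to parameter changes, and a short classification time that decreases as the number of subspaces increases.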

20.
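An ensemble method for minimum distance classifiers based on an adaptive distance metric is proposed, together with a method for generating the individual classifiers. First, the Bootstrap technique resamples the training set with replacement to generate several sub-sample sets. An adaptive distance metric model is built from each generated sub-sample set, and training on the sub-sample set according to that model yields an individual classifier. In the ensemble, plurality voting combines the individual results into the final decision. In experiments on standard UCI data sets, the method was compared with existing methods, and the results show that the ensemble of minimum distance classifiers based on an adaptive distance metric is the most effective.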
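The pipeline in this abstract (Bootstrap resampling, one minimum distance classifier per resample, plurality voting) can be sketched as follows. The abstract does not specify the adaptive distance metric, so this sketch swaps in plain Euclidean distance to class means; the function names and the `n_classifiers` and `seed` parameters are hypothetical.

```python
import math
import random
from collections import Counter

def train_centroids(samples):
    """Fit a minimum distance classifier: one mean vector per class."""
    groups = {}
    for x, y in samples:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in groups.items()}

def predict_one(centroids, x):
    """Assign x to the class with the nearest mean (Euclidean distance,
    standing in for the paper's adaptive metric)."""
    return min(centroids, key=lambda y: math.dist(x, centroids[y]))

def bagged_min_distance(samples, n_classifiers=11, seed=0):
    """Train one classifier per Bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_classifiers):
        boot = [rng.choice(samples) for _ in samples]
        ensemble.append(train_centroids(boot))
    return ensemble

def vote(ensemble, x):
    """Plurality vote over the individual classifiers."""
    votes = Counter(predict_one(c, x) for c in ensemble)
    return votes.most_common(1)[0][0]

samples = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
           ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b")]
ensemble = bagged_min_distance(samples)
```

A resample may miss a class entirely, in which case that classifier can only vote for the classes it saw; the plurality vote is what absorbs such weak members.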
