19 similar documents found.
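To address imbalanced classification, a fuzzy support vector machine model weighted by membership degree is proposed. A conventional support vector machine is first trained on the samples, and fuzzy memberships are then constructed from the distance between each sample point and the resulting separating hyperplane; this not only removes the influence of noise points and outliers but also reduces the sample set to some extent. A balance adjustment factor is derived from the average membership and sample count of the positive and negative classes, which eliminates the shift of the separating hyperplane caused by class imbalance. Experimental results confirm the feasibility and effectiveness of the algorithm: it effectively improves classification accuracy, most noticeably on imbalanced data, and further improves on conventional and fuzzy support vector machines in both training speed and classification performance.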

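Web text classification is an active research topic in data mining. To handle the high dimensionality and class imbalance of Web text datasets, fuzzy membership and a balance factor are introduced into the proximal support vector machine, yielding a fuzzy weighted proximal support vector machine. The average density of the samples is computed first and combined with the sample counts to obtain the balance factor; this overcomes the weakness of conventional weighting schemes, which set weights from sample counts alone, and mitigates the hyperplane shift caused by imbalanced data. The fuzzy membership of each sample is then computed to eliminate classification errors caused by noise and singular points. Because the proximal support vector machine is markedly faster than the standard support vector machine, it is better suited to high-dimensional data. Experiments show that the algorithm effectively improves classification accuracy on imbalanced data and yields some improvement in training speed and classification quality on Web text.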

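Ensemble transfer learning with dynamic dataset reconstruction   Total citations: 1, self-citations: 0, citations by others: 1
Many data mining and machine learning methods rest on a basic assumption: training data and test data must follow the same distribution. In many cases this assumption does not hold, and conventional machine learning methods that ignore the distribution difference then fail to classify correctly. A new transfer learning method, DRTAT, is proposed; it dynamically splits and recombines the original training data, discards redundant data in a timely manner, and ensembles the resulting classifiers. Tests on several text datasets and UCI datasets, together with a comparison against the TrAdaboost algorithm, demonstrate the advantages of the algorithm.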

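The fuzzy C-means (FCM) clustering algorithm takes into account neither the differing importance of sample attributes nor neighborhood information. To address this, an FCM algorithm based on entropy and neighborhood constraints is proposed. The entropy of each attribute is first computed to assign attribute weights, and these weights are used to improve the distance metric. Neighborhood membership weights are then computed from the distances between neighboring samples and the center sample, and weighted to obtain the neighborhood membership, which constrains the objective function and corrects the membership iteration, thereby improving the performance of FCM clustering. Theoretical analysis and experiments on synthetic datasets and several UCI datasets show that the improved algorithm outperforms the conventional FCM, PCM, KFCM, KPCM, and DSFCM algorithms in clustering quality and robustness, demonstrating its effectiveness.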

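The fuzzy C-means (FCM) clustering algorithm is sensitive to noise and tends to converge to local minima. To address this, a fuzzy clustering algorithm based on cross entropy is proposed. The objective function of conventional FCM is redefined by introducing cross entropy, which measures the dissimilarity between sample memberships, and the resulting optimization problem is solved with the Lagrangian method and the Lambert W function. In addition, the distribution of the sample partition matrix is analyzed, and noise samples are identified from its distribution characteristics. Experiments on synthetic datasets and on standard datasets with added noise show that the algorithm improves the noise resistance of conventional FCM, is more robust, and identifies noise samples with high accuracy.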

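文明瑶, 廖伟国. 《计算机仿真》 (Computer Simulation), 2021, 38(11): 290-294
Aiming at accurate incremental data mining, an incremental mining algorithm for uncertain data based on machine learning is proposed. It builds on the fuzzy c-means (FCM) clustering algorithm: principal component analysis is used to select indicators from the original dataset, and the Relief algorithm is used to compute indicator weights, which together yield an improved FCM algorithm. The improved FCM algorithm defines its objective function through a threshold and, through sample classification, feature extraction, and clustering, drives the objective function to its minimum to carry out the mining. Experimental results show that the sample classification conformity rate reaches 99.28% and classification accuracy is around 98%, with short classification time and high efficiency; feature extraction is only slightly affected by growth in data volume; and under incremental data the mining accuracy of the improved algorithm stays between 95% and 98% while requiring few iterations.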


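The conventional fuzzy c-means (FCM) algorithm requires the memberships of a sample across all clusters to sum to one (the normalization condition), which makes the algorithm sensitive to noise and isolated points and reduces its clustering validity on samples with imbalanced distributions. To address this, an FCM clustering algorithm with an improved constraint on the fuzzy membership function is proposed. By relaxing the normalization condition, a new membership partition formula is derived, and memberships are corrected continually during clustering, so that noise samples are eliminated and clustering validity improves. Finally, comparative experiments verify the correctness of the improved algorithm.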
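The normalization condition mentioned in this abstract can be seen in the standard FCM membership update. The Python sketch below implements only that textbook update (not the relaxed formula the paper derives, which the abstract does not give); the function name and the sample points are illustrative assumptions. It shows why a distant outlier still receives substantial membership: the values are forced to sum to one.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]

centers = [(0.0, 0.0), (10.0, 0.0)]   # hypothetical cluster centers
near = fcm_memberships((1.0, 0.0), centers)
outlier = fcm_memberships((5.0, 1000.0), centers)
print(near)     # dominated by the first cluster
print(outlier)  # far from both clusters, yet split evenly between them
```

The outlier is equally far from both centers, so normalization gives it a membership of one half in each cluster even though it plausibly belongs to neither.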


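When handling unevenly distributed data, especially small datasets, the conventional KNN text classification algorithm is easily disturbed by boundary data, which noticeably degrades classification. This paper therefore proposes a KNN text classification algorithm based on fuzzy theory, which computes a membership function for the samples following fuzzy-theoretic ideas and so handles training sample weights more reasonably. Experiments show that the fuzzy-theory-based KNN algorithm effectively weakens this interference and also improves classification accuracy to some extent.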
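The abstract does not state the membership function it uses. As a rough illustration of the general idea, the Python sketch below uses an inverse-distance weighting in the spirit of fuzzy k-NN rules: each of the k nearest neighbours votes for its class with a weight that decays with distance, and the votes are normalized into class memberships. The function name, the fuzzifier `m`, and the toy data are assumptions, not the paper's method.

```python
import math
from collections import defaultdict

def fuzzy_knn(query, samples, k=3, m=2.0):
    """Return {label: membership} for query from its k nearest samples.

    samples: list of (feature_tuple, label) pairs with crisp labels.
    """
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    for features, label in nearest:
        d = math.dist(query, features)
        # Closer neighbours get a larger weight; epsilon guards against d == 0.
        weights[label] += 1.0 / (d ** exponent + 1e-12)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Hypothetical 2-D training data with one class "b" sample near the boundary.
train = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"),
         ((3.0, 3.0), "b"), ((3.1, 2.9), "b"), ((1.0, 1.0), "b")]
memberships = fuzzy_knn((0.1, 0.1), train, k=3)
predicted = max(memberships, key=memberships.get)
print(predicted, memberships)
```

With a plain majority vote the boundary sample would count as much as the two close neighbours; here its distance reduces its influence on the result.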

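沈洋. 《计算机应用研究》 (Application Research of Computers), 2020, 37(11): 3281-3286
To address the low accuracy and classification efficiency of binary tree support vector machine multi-classification algorithms, a binary tree support vector machine multi-classification algorithm based on weighted fuzzy membership (PF-BTSVM) is proposed. The algorithm builds an approximately complete binary tree from the max-min sample distance and the centroid distance, which improves the classification efficiency of the overall structure. It uses a fuzzy membership function together with positive and negative auxiliary penalty factors to filter the training set, removing noise and samples that are useless for classification; this purifies the training set and weakens the hyperplane shift that arises in imbalanced classification. Experiments on datasets show that, compared with other binary tree multi-classification algorithms, the proposed algorithm improves classification accuracy and stability while also speeding up training and classification, and that this advantage grows as the class imbalance increases.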

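To address the low computational efficiency of particle-swarm-based fuzzy clustering algorithms, a hidden-membership fuzzy c-means clustering algorithm (HMFCM) is proposed. HMFCM substitutes the FCM membership iteration formula into the FCM objective function and simplifies, yielding an HMFCM objective function that contains no fuzzy memberships; it then uses the PSO algorithm to encode and optimize the cluster centers, and finally assigns classes according to the distance between each sample and the cluster centers. Because HMFCM does not need to compute sample memberships, it lowers the complexity of the clustering algorithm and improves computational efficiency and accuracy, and the approach can be extended to other clustering algorithms based on bio-inspired optimization. Simulation experiments verify the effectiveness and timeliness of the proposed algorithm.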
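Substituting the standard FCM membership update into the FCM objective gives a membership-free form, sum over samples of (sum over centers of d^(-2/(m-1)))^(1-m). The Python sketch below checks numerically that the two forms agree; it is one plausible reading of the substitution described in this abstract, not the paper's exact HMFCM objective, and the PSO search over centers is omitted. All names and data are illustrative, and the code assumes no sample coincides with a center.

```python
import math

def memberships(point, centers, m):
    """Standard FCM membership update for one point."""
    dists = [math.dist(point, c) for c in centers]
    e = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** e for dj in dists) for di in dists]

def fcm_objective(points, centers, m):
    """Classic objective: sum of u^m * d^2, using the optimal memberships."""
    total = 0.0
    for p in points:
        u = memberships(p, centers, m)
        total += sum(ui ** m * math.dist(p, c) ** 2
                     for ui, c in zip(u, centers))
    return total

def reduced_objective(points, centers, m):
    """Membership-free form: sum over points of (sum_i d_i^(-2/(m-1)))^(1-m)."""
    e = -2.0 / (m - 1.0)
    return sum(
        sum(math.dist(p, c) ** e for c in centers) ** (1.0 - m)
        for p in points
    )

points = [(1.0, 0.0), (2.0, 1.0), (8.0, 1.0), (9.0, 0.0)]  # toy samples
centers = [(1.5, 0.5), (8.5, 0.5)]                          # candidate centers
full = fcm_objective(points, centers, m=2.0)
reduced = reduced_objective(points, centers, m=2.0)
print(full, reduced)  # the two values agree up to rounding
```

Because the reduced form depends only on the centers, an optimizer such as PSO can score a candidate set of centers without storing a membership matrix.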

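A classification method based on gene expression programming (GEP) and the maximum membership principle, MDM-GEP, is proposed. Membership degrees from fuzzy set theory are introduced to describe the fuzziness of classification, and GEP classifiers that approximate the membership function of each class are obtained on the training set. For an instance to be classified, its membership in each fuzzy set is computed, and its final class is determined by the maximum-membership principle of fuzzy pattern recognition. The algorithm was tested on three UCI datasets. The results show that MDM-GEP not only classifies well but also effectively resolves the rejection-region problem of conventional simple GEP classification methods.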
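The maximum membership principle itself is simple to sketch in Python: evaluate one membership function per class and pick the class with the largest degree. In the sketch below the triangular functions are hypothetical stand-ins for the GEP-evolved classifiers, which the abstract does not specify.

```python
def classify_by_max_membership(instance, membership_fns):
    """membership_fns: {class_label: callable returning a degree in [0, 1]}."""
    degrees = {label: fn(instance) for label, fn in membership_fns.items()}
    return max(degrees, key=degrees.get), degrees

def triangular(a, b, c):
    """Triangular membership function peaking at b (hypothetical stand-in)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

fns = {"low": triangular(0, 2, 5), "high": triangular(3, 6, 9)}
label, degrees = classify_by_max_membership(4.5, fns)
print(label, degrees)
```

Since a maximum always exists, this sketch assigns every instance to some class, including instances in the region where the two membership functions overlap.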

To extract knowledge from a set of numerical data and build up a rule-based system is an important research topic in knowledge acquisition and expert systems. In recent years, many fuzzy systems that automatically generate fuzzy rules from numerical data have been proposed. In this paper, we propose a new fuzzy learning algorithm based on the alpha-cuts of equivalence relations and the alpha-cuts of fuzzy sets to construct the membership functions of the input variables and the output variables of fuzzy rules and to induce the fuzzy rules from the numerical training data set. Based on the proposed fuzzy learning algorithm, we also implemented a program on a Pentium PC using the MATLAB development tool to deal with the Iris data classification problem. The experimental results show that the proposed fuzzy learning algorithm has a higher average classification ratio and can generate fewer rules than the existing algorithm.
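For readers unfamiliar with the term, the alpha-cut of a fuzzy set is the crisp set of elements whose membership is at least alpha. A minimal Python sketch on a discrete fuzzy set follows; the dictionary representation and the names are illustrative assumptions, and the paper's own rule-induction procedure is not reproduced here.

```python
def alpha_cut(fuzzy_set, alpha):
    """Return the crisp set of elements whose membership is >= alpha."""
    return {x for x, mu in fuzzy_set.items() if mu >= alpha}

# Hypothetical discrete fuzzy set: element -> membership degree.
tall = {"x1": 0.2, "x2": 0.5, "x3": 0.9}
print(alpha_cut(tall, 0.5))  # elements with membership of at least 0.5
```

Raising alpha can only shrink the cut, so the cuts at increasing levels form a nested family of crisp sets.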

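The membership function matrix of a fuzzy set is embedded into two-dimensional principal component analysis and two-dimensional linear discriminant analysis to form a new method based on fuzzy 2DPLA. The method first obtains the membership function matrix with a fuzzy KNN method; it then embeds this matrix into two-dimensional principal component analysis and two-dimensional linear discriminant analysis along the horizontal and vertical directions of the image matrix respectively, achieving better dimensionality reduction; finally, it uses the matrix-based Frobenius norm (F-norm) in place of the conventional vector-based 2-norm as the classification measure. In the experiments, the method was tested and validated on the Yale Face Database B, ORL, and FERET face databases. The results show that the method is robust and achieves a high recognition rate.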


Landcover classifications have large uncertainty related to the heterogeneity of similar objects and complex spatial correlations in satellite images, making it difficult to obtain ideal classification results using traditional classification methods. Therefore, to address the uncertainty in landcover classifications based on remotely sensed information, we propose a novel fuzzy c-means algorithm, which integrates adaptive interval-valued modelling and spatial information. It dynamically adjusts the interval width according to the fuzzy degree of the target membership without pre-setting any parameters, controls the fuzziness of the target, and mines the inherent distribution of the data. Furthermore, reliability-based spatial correlation modelling is used to describe the spatial relationship of the target and to improve both robustness and accuracy of the algorithm. Experimental data consisting of SPOT5 (10-m spatial resolution) or Thematic Mapper (30-m spatial resolution) satellite data for three case study areas in China are used to test this algorithm. Compared with other state-of-the-art fuzzy classification methods, our algorithm markedly improved the ground-object separability. Moreover, it balanced improvement of pixel separability and suppression of heterogeneity of intra-class objects, producing more compact landcover areas and clearer boundaries between classes.

Evolutionary design of a fuzzy classifier from data   Total citations: 6, self-citations: 0, citations by others: 6
Genetic algorithms show powerful capabilities for automatically designing fuzzy systems from data, but many proposed methods must be subjected to some minimal structure assumptions, such as rule base size. In this paper, we also address the design of fuzzy systems from data. A new evolutionary approach is proposed for deriving a compact fuzzy classification system directly from data without any a priori knowledge or assumptions on the distribution of the data. At the beginning of the algorithm, the fuzzy classifier is empty with no rules in the rule base and no membership functions assigned to fuzzy variables. Then, rules and membership functions are automatically created and optimized in an evolutionary process. To accomplish this, parameters of the variable input spread inference training (VISIT) algorithm are used to code fuzzy systems on the training data set. Therefore, we can derive each individual fuzzy system via the VISIT algorithm, and then search the best one via genetic operations. To evaluate the fuzzy classifier, a fuzzy expert system acts as the fitness function. This fuzzy expert system can effectively evaluate the accuracy and compactness at the same time. In the application section, we consider four benchmark classification problems: the iris data, wine data, Wisconsin breast cancer data, and Pima Indian diabetes data. Comparisons of our method with others in the literature show the effectiveness of the proposed method.

Fuzzy min-max neural networks. I. Classification.   Total citations: 1, self-citations: 0, citations by others: 1
A supervised learning neural network classifier that utilizes fuzzy sets as pattern classes is described. Each fuzzy set is an aggregate (union) of fuzzy set hyperboxes. A fuzzy set hyperbox is an n-dimensional box defined by a min point and a max point with a corresponding membership function. The min-max points are determined using the fuzzy min-max learning algorithm, an expansion-contraction process that can learn nonlinear class boundaries in a single pass through the data and provides the ability to incorporate new and refine existing classes without retraining. The use of a fuzzy set approach to pattern classification inherently provides a degree of membership information that is extremely useful in higher-level decision making. The relationship between fuzzy sets and pattern classification is described. The fuzzy min-max classifier neural network implementation is explained, the learning and recall algorithms are outlined, and several examples of operation demonstrate the strong qualities of this new neural network classifier.
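A hyperbox membership function of the kind described here can be sketched in a few lines of Python. The formula below is a commonly cited form for fuzzy min-max classifiers and is written from memory, so it should be checked against the paper before being relied on; the sensitivity parameter `gamma`, the function name, and the sample box are assumptions. Inputs are assumed to lie in the unit hypercube.

```python
def hyperbox_membership(point, v_min, w_max, gamma=4.0):
    """Membership of point in the hyperbox with min point v_min, max point w_max.

    Returns 1.0 inside the box and decays as the point moves away from it.
    """
    n = len(point)
    total = 0.0
    for a, v, w in zip(point, v_min, w_max):
        # Penalty for exceeding the max point in this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, a - w)))
        # Penalty for falling below the min point in this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, v - a)))
        total += above + below
    return total / (2.0 * n)

box_min, box_max = (0.2, 0.2), (0.5, 0.5)   # hypothetical hyperbox
inside = hyperbox_membership((0.3, 0.4), box_min, box_max)
outside = hyperbox_membership((0.6, 0.3), box_min, box_max)
print(inside, outside)
```

A larger `gamma` makes the membership fall off faster outside the box, so it controls how crisp the class boundary is.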

Designing of classifiers based on immune principles and fuzzy rules   Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes an algorithm to design a fuzzy classification system based on immune principles. The proposed algorithm evolves a population of antibodies based on the clonal selection and hypermutation principles. The membership function parameters and the fuzzy rule set, including the number of rules inside it, are evolved at the same time. Each antibody (candidate solution) corresponds to a fuzzy classification rule set. We compared our algorithm with other classification schemes on some benchmark datasets. The results demonstrated the effectiveness of the proposed immune algorithm.

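A new fuzzy support vector machine multi-classification algorithm*   Total citations: 5, self-citations: 3, citations by others: 2
In fuzzy multi-class problems, training samples play different roles during training, so every data point, including abnormal ones, is assigned a membership degree. For the first form of the fuzzy support vector machine (FSVM), the concept of a class center is introduced and combined with the one-against-all (1-a-a) combination method to propose a fuzzy support vector machine multi-classification algorithm based on the one-against-all combination, which is compared with classification algorithms using the one-against-one (1-a-1) and 1-a-a combinations. Numerical experiments show that the algorithm is effective, with higher classification accuracy and better generalization ability.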

In our previous papers, fuzzy model identification methods were discussed. The bacterial evolutionary algorithm for extracting a fuzzy rule base from a training set was presented. The Levenberg–Marquardt method was also proposed for determining membership functions in fuzzy systems. The combination of evolutionary and gradient-based learning techniques is usually called a memetic algorithm. In this paper, a new kind of memetic algorithm, the bacterial memetic algorithm, is introduced for fuzzy rule extraction. The paper presents how the bacterial evolutionary algorithm can be improved with the Levenberg–Marquardt technique. © 2009 Wiley Periodicals, Inc.
