Similar articles
1.
This paper studies state-of-the-art classification techniques for electroencephalogram (EEG) signals. The Fuzzy Functions Support Vector Classifier, the Improved Fuzzy Functions Support Vector Classifier, and a novel technique that combines Particle Swarm Optimization with Radial Basis Function Networks (PSO-RBFN) are examined. Their classification performance is compared on standard, publicly available EEG datasets used by brain–computer interface (BCI) researchers. In addition to the standard EEG datasets, the proposed classifier is also tested on non-EEG datasets for a thorough comparison. Within the scope of this study, several data clustering algorithms, including Fuzzy C-means, K-means, and PSO clustering, are studied and their clustering performance on the same datasets is compared. The results show that PSO-RBFN might reach the classification performance of state-of-the-art classifiers and might be a better alternative for classifying EEG signals in real-time applications. This is demonstrated by implementing the proposed classifier in a real-time BCI application for mobile robot control.

2.
This case study discriminates accident and accident-free cases on the Konya-Afyonkarahisar highway in Turkey using a hybrid method that combines a new data preprocessing method, subtractive clustering attribute weighting (SCAW), with classifier algorithms, supported by Geographical Information System (GIS) technology. Classifier algorithms such as artificial neural networks (ANN), adaptive network based fuzzy inference systems (ANFIS), support vector machines (SVM), and decision trees need data preprocessing to discriminate well on problems of this kind, so SCAW is proposed and combined with them. The experimental data were obtained using GIS; the attributes are the day, temperature, humidity, weather conditions, and month of the accident. Classification accuracy, sensitivity, and specificity were used to evaluate the hybrid method. Used alone, ANN, ANFIS, and SVM with a radial basis function (RBF) kernel achieved classification accuracies of 53.93%, 52.25%, and 38.76%, respectively. The hybrid method achieved 67.98%, 70.22%, and 61.24% when SCAW was combined with ANN, SVM (RBF kernel), and ANFIS, respectively. Combining SCAW with classifier algorithms gave very promising results in the discrimination of traffic accidents.
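The three evaluation measures named in this abstract can be computed from a confusion matrix. The following is a minimal sketch; the function name `binary_metrics` and the toy labels are hypothetical and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of true positives among all actual positives.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Specificity: share of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = accident, 0 = accident free.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
acc, sens, spec = binary_metrics(y_true, y_pred)
print(acc, sens, spec)
```

Reporting sensitivity and specificity next to accuracy shows whether a classifier favours one of the two classes, which accuracy alone can hide.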

3.
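To address the low recognition rate of minority-class samples caused by data imbalance, an algorithm is proposed that improves oversampling and random forest through weighting strategies, reducing the effect of data imbalance on the classifier from both the data preprocessing side and the algorithm side. In the data preprocessing stage, the Synthetic Minority Oversampling Technique (SMOTE) is applied to reduce the degree of imbalance; each minority-class sample is assigned a weight according to its Euclidean distance to the remaining samples, so that different samples synthesize different numbers of new samples. In the algorithm improvement stage, the Kappa coefficient is used to evaluate the classification performance of each trained decision tree in the random forest, and each tree is given a corresponding weight, so that trees with better classification ability carry more weight in the voting stage, which improves the overall classification performance of random forest on imbalanced data. Experiments on KEEL datasets show that, compared with the unimproved algorithms, the improved algorithm raises both the classification accuracy for minority-class samples and the overall classification performance.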
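The Kappa-weighted voting step can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not say how a Kappa value is mapped to a tree weight, so the code uses Cohen's kappa directly and clips negative values to zero; the helper names and the toy validation data are hypothetical.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 0.0


def weighted_vote(tree_predictions, tree_weights):
    """Pick the label with the largest summed weight for one sample."""
    scores = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        # Assumption: trees at or below chance level (kappa <= 0) get no vote.
        scores[label] += max(weight, 0.0)
    return max(scores, key=scores.get)


# Hypothetical validation labels (class 1 is the minority) and the
# predictions of three trees on that validation set.
val_true = [0, 0, 0, 0, 1, 1]
val_preds = [
    [0, 0, 0, 0, 1, 1],  # finds the minority class
    [0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 0, 0],  # always predicts the majority class
]
weights = [cohen_kappa(val_true, preds) for preds in val_preds]

# On a new sample the first tree votes 1 and the other two vote 0.
decision = weighted_vote([1, 0, 0], weights)
print(weights, decision)
```

In this toy case a plain majority vote would return class 0, while the Kappa-weighted vote follows the only tree that detects the minority class.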

4.
霍纬纲  高小霞 《控制与决策》2012,27(12):1833-1838
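A fuzzy associative classification method suited to multi-class imbalanced distributions is proposed. The method uses three genetic optimization objectives: minimizing the weighted classification error rate of the training samples during the AdaBoost.M1W ensemble learning iterations, the number of fuzzy associative classification rules in each sub-classifier, and the number of fuzzy items in the rules. This achieves a good integration of AdaBoost.M1W with fuzzy associative classification modeling. Comparative experiments on five multi-class imbalanced UCI benchmark datasets, against existing data preprocessing methods for imbalanced classification, show that the proposed method significantly improves the classification performance of fuzzy associative classification models under multi-class imbalance.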

5.
A class-consistent k-means clustering algorithm (CCKM) and its hierarchical extension (Hierarchical CCKM) are presented for generating discriminative visual words for recognition problems. In addition to using the labels of training data themselves, we associate a class label with each cluster center to enforce discriminability in the resulting visual words. Our algorithms encourage data points from the same class to be assigned to the same visual word, and those from different classes to be assigned to different visual words. More specifically, we introduce a class consistency term in the clustering process which penalizes assignment of data points from different classes to the same cluster. The optimization process is efficient and bounded by the complexity of k-means clustering. A very efficient and discriminative tree classifier can be learned for various recognition tasks via the Hierarchical CCKM. The effectiveness of the proposed algorithms is validated on two public face datasets and four benchmark action datasets.

6.
The fuzzy c-means (FCM) algorithm is a widely applied clustering technique, but its implicit assumption that each attribute of the object data has equal importance affects the clustering performance. Attribute weighted fuzzy clustering has become a very active area of research, and numerous approaches that develop numerical weights have been combined with fuzzy clustering. In this paper, interval numbers are introduced for attribute weighting in weighted fuzzy c-means (WFCM) clustering, and it is illustrated from the viewpoint of geometric probability that interval weighting can obtain appropriate weights more easily. Moreover, a genetic heuristic strategy for attribute weight searching is proposed to guide the alternating optimization (AO) of WFCM, so that improved attribute weights in interval-constrained ranges and a reasonable data partition can be obtained simultaneously. The experimental results demonstrate that the proposed algorithm is superior in clustering performance. They reveal that interval weighted clustering can act as an optimization operator on top of traditional numerical weighted clustering, and that the effects of interval weight perturbation on clustering performance can be decreased.

7.
In the big data era, more and more data are collected from multiple views, each of which reflects a distinct perspective of the data. Many multi-view data come with incompatible views and high dimensionality, both of which bring challenges for multi-view clustering. This paper proposes a strategy of simultaneous weighting on views and features to discriminate their importance. Each feature of multi-view data is given bi-level weights to express its importance at the feature level and the view level, respectively. Furthermore, we implement the proposed weighting method in the classical k-means algorithm to conduct the multi-view clustering task. An efficient gradient-based optimization algorithm is embedded into the k-means algorithm to compute the bi-level weights automatically. The convergence of the proposed weight updating method is also proved by theoretical analysis. In the experimental evaluation, synthetic datasets with varied noise and missing values are created to investigate the robustness of the proposed approach. The proposed approach is then compared with five state-of-the-art algorithms on three real-world datasets. The experiments show that the proposed method compares very favourably against the other methods.

8.
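For imbalanced classification on heterogeneous datasets, a classification method for heterogeneous imbalanced datasets, the HVDM-Adaboost-KNN algorithm (heterogeneous value difference metric-Adaboost-KNN), is proposed from three perspectives: dataset resampling, ensemble learning, and weak-classifier construction. The algorithm first balances the dataset with a clustering algorithm to obtain several balanced data subsets and builds several sub-classifiers. It uses the heterogeneous distance to compute the distance between two samples in a heterogeneous dataset, which improves the classification performance of the KNN algorithm, and then iterates with the Adaboost algorithm to obtain the final classifier. Eight UCI datasets are used to evaluate the classification performance of the algorithm on imbalanced datasets. The experimental results show that, compared with Adaboost and other algorithms, the F1 value, AUC, and G-mean all improve on heterogeneous imbalanced datasets.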

9.
Although the \(k\)-NN classifier is a popular classification method, it suffers from high computational cost and storage requirements. This paper proposes two effective cluster-based data reduction algorithms for efficient \(k\)-NN classification. Both have low preprocessing cost and can achieve high data reduction rates while maintaining \(k\)-NN classification accuracy at high levels. The first proposed algorithm is called reduction through homogeneous clusters (RHC) and is based on a fast preprocessing clustering procedure that creates homogeneous clusters. The centroids of these clusters constitute the reduced training set. The second proposed algorithm is a dynamic version of RHC that retains all its properties and, in addition, can manage datasets that cannot fit in main memory and is appropriate for dynamic environments where new training data become available gradually. Experimental results, based on fourteen datasets, illustrate that both algorithms are faster and achieve higher reduction rates than four known methods, while maintaining high classification accuracy.
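The idea of replacing the training set with the centroids of homogeneous clusters can be sketched as below. The abstract does not describe the clustering procedure, so the splitting rule is an assumption: a mixed cluster is split by seeding one center per class at the class mean and assigning each point to its nearest center in a single pass. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_homogeneous(data):
    """data: list of (point, label). Returns a list of (centroid, label)."""
    reduced, queue = [], [data]
    while queue:
        cluster = queue.pop()
        by_class = defaultdict(list)
        for point, label in cluster:
            by_class[label].append(point)
        if len(by_class) == 1:
            # Homogeneous cluster: keep only its centroid.
            label = next(iter(by_class))
            reduced.append((mean(by_class[label]), label))
            continue
        # Assumed splitting rule: one center per class, nearest-center assignment.
        centers = [mean(points) for points in by_class.values()]
        parts = defaultdict(list)
        for point, label in cluster:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(point, centers[i]))
            parts[nearest].append((point, label))
        if len(parts) == 1:
            # The split made no progress (e.g. coincident class means):
            # fall back to one cluster per class so the loop terminates.
            queue.extend([(p, lab) for p in pts] for lab, pts in by_class.items())
        else:
            queue.extend(parts.values())
    return reduced


# Hypothetical two-class toy data.
data = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
        ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b")]
reduced = reduce_homogeneous(data)
print(reduced)
```

A \(k\)-NN classifier would then search the reduced list instead of the full training set, which is where the savings in computation and storage come from.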

10.
Acoustical parameters extracted from recorded voice samples are actively pursued for accurate detection of vocal fold pathology. Most systems for detecting vocal fold pathology use high quality voice samples. This paper proposes a hybrid expert system approach to detect vocal fold pathology using compressed or low quality voice samples; it comprises feature extraction using the wavelet packet transform, clustering based feature weighting, and classification. To improve the robustness and discrimination ability of the wavelet packet transform based features (raw features), we propose clustering based feature weighting methods, including k-means clustering (KMC), fuzzy c-means (FCM) clustering, and subtractive clustering (SBC). We have investigated the effectiveness of raw and weighted features (obtained after applying the feature weighting methods) using four different classifiers: Least Square Support Vector Machine (LS-SVM) with radial basis kernel, k-nearest neighbor (kNN) classifier, probabilistic neural network (PNN), and classification and regression tree (CART). The proposed hybrid expert system approach gives a promising classification accuracy of 100% using the feature weighting methods and has potential application in remote detection of vocal fold pathology.

11.
Zhou  Jukai  Liu  Tong  Zhu  Jingting 《Multimedia Tools and Applications》2019,78(23):33415-33434

K-means clustering is one of the most popular clustering algorithms and has been embedded in other clustering algorithms, e.g. as the last step of spectral clustering. In this paper, we propose two techniques to improve the previous k-means clustering algorithm by designing two different adjacent matrices. Extensive experiments on public UCI datasets showed that the clustering results of our proposed algorithms significantly outperform three classical clustering algorithms in terms of different evaluation metrics.


12.
This study presents the application of fuzzy c-means (FCM) clustering-based feature weighting (FCMFW) to the detection of Parkinson's disease (PD). For the classification of the PD dataset taken from the University of California – Irvine machine learning database, practical values of the existing traditional and non-standard measures for distinguishing healthy people from people with PD by detecting dysphonia were used as the input of FCMFW. The main aims of the FCM clustering algorithm are both to transform a linearly non-separable dataset into a linearly separable one and to increase the distinguishing performance between classes. The weighted PD dataset is presented to a k-nearest neighbour (k-NN) classifier. Various k-values of the k-NN classifier were used and compared with each other, their effects on the classification of the PD dataset were investigated, and the best k-value was found. The experimental results demonstrate that the combination of the proposed weighting method, FCMFW, and the k-NN classifier obtains very promising results in the classification of PD.

13.
Data weighting is of paramount importance for classification performance in pattern recognition applications. In this paper, the output labels of datasets are encoded using binary codes (numbers), which yields a novel data weighting method called binary encoded output based data weighting (BEOBDW). In the proposed method, the output labels of a dataset are first encoded with binary codes, giving two encoded output labels. Depending on these encoded outputs, the data points in the dataset are weighted using the relationships between the features of the dataset and the two encoded output labels. To generalize the proposed data weighting method, five datasets were used: chain link (2 classes), two spiral (2 classes), iris (3 classes), wine (3 classes), and dermatology (6 classes). After BEOBDW was applied to the five datasets, the k-NN (nearest neighbor) classifier was used to classify the weighted datasets. A set of experiments on these datasets demonstrated that the proposed data weighting method is very efficient and has robust discrimination ability in the classification of datasets. The BEOBDW method could be used with confidence before many classification algorithms.

14.
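Data produced in many real-world fields usually have multiple classes and are imbalanced. In multi-class imbalanced classification, problems such as class overlap, noise, and multiple minority classes degrade classifier capability, and solving the multi-class imbalance problem effectively has become an important research topic in machine learning and data mining. Based on the recent literature on multi-class imbalanced classification methods, this survey analyzes and summarizes the methods from two sides, data preprocessing and algorithm-level classification, and analyzes all algorithms in detail in terms of their strengths, weaknesses, and datasets. For data preprocessing, oversampling, undersampling, hybrid sampling, and feature selection methods are introduced, and the performance of algorithms that use the same datasets is compared. Algorithm-level methods are introduced and analyzed from three angles: base-classifier optimization, ensemble learning, and multi-class decomposition techniques. Finally, future directions for research on multi-class imbalanced data classification are summarized.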

15.
An adaptive soft subspace clustering algorithm
陈黎飞  郭躬德  姜青山 《软件学报》2010,21(10):2513-2523
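Soft subspace clustering is an important tool for high-dimensional data analysis. Existing algorithms usually require the user to set some global key parameters in advance and do not consider optimizing the subspaces. A new optimization objective function for soft subspace clustering is proposed that minimizes the within-cluster compactness of the subspace clusters while maximizing the projected subspace in which each cluster lies. A new local feature weighting scheme is derived from this objective, and on that basis an adaptive k-means-type soft subspace clustering algorithm is proposed. During clustering, the algorithm dynamically computes its optimal parameters from information about the dataset and its partition. Experiments on real-world applications and synthetic datasets show that the algorithm substantially improves clustering accuracy and the stability of the clustering results.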

16.
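Objective: Hyperspectral images have a very large number of bands, which leads to the "curse of dimensionality" during interpretation and classification. To address this problem, a hyperspectral image classification algorithm based on entropy-weighted K-means clustering with global information is proposed; it builds on the K-means clustering algorithm, accounts for the importance of each band to the different clusters, and also takes between-class information into account. Method: First, band weights are introduced to describe the importance of each band to the different clusters, and an entropy information measure is defined to express these weights. Second, to avoid locally optimal clustering, a between-class distance measure is introduced to achieve globally optimal clustering. Finally, both measures are introduced into the K-means clustering objective function, and the optimal classification result is obtained by minimizing the objective function. Results: To verify the effectiveness of the proposed method, the land-cover classes in the ground-truth maps of the Salinas and Pavia University hyperspectral images were merged according to the degree of difference in their spectral reflectance, and the merged maps were used as the new reference classification maps. Experiments were carried out on the Salinas and Pavia University hyperspectral images with the proposed algorithm and the traditional K-means algorithm, and the results were evaluated and analyzed qualitatively and quantitatively. For the merged land-cover classes, whose spectral reflectance differs strongly, the proposed algorithm visually gives better classification results than the traditional K-means algorithm. In terms of classification accuracy, the overall accuracy of the proposed algorithm is 92.20% and 82.96% on the two images, against 83.39% and 67.06% for K-means, an increase of 8.81 and 15.9 percentage points. Conclusion: A hyperspectral image classification algorithm based on entropy-weighted K-means clustering with global information is proposed. The experimental results show that the algorithm achieves good classification results for land-cover targets with different degrees of spectral reflectance difference in hyperspectral images.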
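How an entropy term turns per-band dispersion into band weights can be illustrated with a closed-form update that is common in entropy-regularized k-means-type objectives: within one cluster, the weight of band j is proportional to exp(-D_j / gamma), where D_j is the within-cluster dispersion in that band. Whether the paper uses exactly this form is an assumption; the function name, the parameter `gamma`, and the numbers are hypothetical.

```python
import math


def entropy_band_weights(dispersions, gamma=1.0):
    """Softmax-style band weights for one cluster.

    dispersions[j] is the within-cluster sum of squared deviations in band j;
    gamma > 0 controls how sharply the weights concentrate on compact bands.
    """
    # Subtracting the minimum leaves the normalized weights unchanged
    # and avoids underflow for large dispersions.
    shift = min(dispersions)
    raw = [math.exp(-(d - shift) / gamma) for d in dispersions]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical dispersions of one cluster in three bands.
band_weights = entropy_band_weights([0.5, 2.0, 8.0], gamma=1.0)
print(band_weights)
```

Bands in which the cluster is compact receive larger weights, so they dominate the weighted distance used in the next assignment step; a larger `gamma` spreads the weights more evenly across bands.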

17.
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time consuming and memory demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need of computing or storing the whole affinity matrix and thereby to reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.

18.

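To address the challenge of how to better achieve cooperation among views in multi-view clustering tasks, a new view fusion strategy is proposed. The strategy first sets a partition for each view, then adaptively learns a fusion weight matrix with which the partitions of the views are fused, and finally obtains the global partition using a view ensemble method. Applying this strategy to the classical fuzzy c-means (FCM) clustering framework yields a corresponding multi-view fuzzy clustering algorithm. Experiments on both simulated datasets and UCI datasets show that the proposed algorithm adapts better to multi-view clustering tasks and achieves better clustering performance than several related clustering algorithms.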


19.
A cost-sensitive classification strategy for imbalanced intrusion detection data
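A new preprocessing algorithm, AdaP, is proposed that effectively avoids overfitting and can also be used on its own. For imbalanced intrusion detection datasets, a cost-sensitive mechanism is introduced: following the idea of minimizing misclassification cost based on a weight matrix, part of the dense training regions is removed and sparse regions are expanded while noise is filtered, which yields a strategy that combines the AdaP algorithm with the AdaCost algorithm. Experiments show that this strategy fully exploits the ability of boosting to raise the accuracy of the front-end weak classifier and the ability of the preprocessing algorithm to balance rare-class data, and that it effectively improves classification performance on imbalanced intrusion detection data.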

20.
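Imbalanced data are abundant in real life. Most traditional classification algorithms assume a balanced class distribution or equal misclassification costs, so minority-class samples tend to be misclassified when these algorithms are applied to imbalanced data. To address this problem, a new imbalanced data classification algorithm based on cost-sensitive ensemble learning, NIBoost (New Imbalanced Boost), is proposed on the basis of cost-sensitive theory. First, in each iteration an oversampling algorithm adds a certain number of minority-class samples to balance the dataset, and a classifier is trained on the new dataset. Second, that classifier is used to classify the dataset, giving the predicted class label of each sample and the classification error rate of the classifier. Finally, the weight coefficient of the classifier and the new weight of each sample are computed from the classification error rate and the predicted labels. Decision trees and naive Bayes are used as the weak classifiers in the experiments. Results on UCI datasets show that, with decision trees as the base classifier, F-value, G-mean, and AUC improve over the RareBoost algorithm by up to 5.91, 7.44, and 4.38 percentage points, respectively; the new algorithm therefore has some advantage in classifying imbalanced data.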
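The last step of each iteration, deriving a classifier coefficient and new sample weights from the error rate and the predicted labels, can be illustrated with the standard AdaBoost update. The abstract does not give NIBoost's own cost-sensitive formulas, so this sketch substitutes plain AdaBoost for them; the function name and the toy data are hypothetical.

```python
import math


def boost_round(weights, y_true, y_pred):
    """One AdaBoost-style update: returns (alpha, normalized new weights)."""
    # Weighted error rate of the current classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    # Classifier coefficient: larger for more accurate classifiers.
    alpha = 0.5 * math.log((1 - err) / err)
    # Raise the weight of misclassified samples, lower that of correct ones.
    updated = [w * math.exp(alpha if t != p else -alpha)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Hypothetical round: four equally weighted samples, the second one misclassified.
alpha, new_weights = boost_round([0.25] * 4, [1, 1, -1, -1], [1, -1, -1, -1])
print(alpha, new_weights)
```

After the update the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right. A cost-sensitive variant would scale the update differently for minority and majority samples.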
