Similar literature
 10 similar documents found (search time: 113 ms)
1.
Based on the principle of the one-against-one support vector machine (SVM) multi-class classification algorithm, this paper proposes an extended SVM method that couples an adaptive resonance theory (ART) network to reconstruct a multi-class classifier. Different coupling strategies for reconstructing a multi-class classifier from binary SVM classifiers are compared, with application to fault diagnosis of transmission lines. Majority voting, a mixture matrix and a self-organizing map (SOM) network are compared in reconstructing the global classification decision. To evaluate the method's efficiency, SVMs based on the one-against-all, decision directed acyclic graph (DDAG) and decision-tree (DT) algorithms are also compared. The comparison is done with simulations, and the best method is validated with experimental data.
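The majority-voting strategy named in this abstract can be illustrated with a minimal sketch. The pairwise classifiers below are hypothetical nearest-centre rules standing in for trained binary SVMs, and the names `one_vs_one_predict`, `CENTRES` and `make_pairwise` are assumptions made for illustration, not taken from the paper.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, pairwise):
    """Combine pairwise binary decisions by majority voting.

    pairwise[(a, b)] is a callable that returns either a or b for input x.
    Ties are broken by the order of `classes` (an arbitrary choice here).
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    best = max(votes.values())
    for c in classes:
        if votes[c] == best:
            return c

# Hypothetical stand-ins for trained binary SVMs: each pair picks the
# class whose 1-D centre is closer to x.
CENTRES = {"A": 0.0, "B": 5.0, "C": 10.0}

def make_pairwise(a, b):
    return lambda x: a if abs(x - CENTRES[a]) <= abs(x - CENTRES[b]) else b

classes = ["A", "B", "C"]
pairwise = {(a, b): make_pairwise(a, b) for a, b in combinations(classes, 2)}

print(one_vs_one_predict(4.0, classes, pairwise))  # B wins two of three votes
```

With k classes this scheme evaluates k(k-1)/2 binary classifiers per sample. The other coupling strategies mentioned in the abstract replace only the vote-counting step.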

2.
Massive textual data management and mining usually rely on automatic text classification technology. Term weighting is a basic problem in text classification and directly affects the classification accuracy. Since the traditional TF-IDF (term frequency & inverse document frequency) is not fully effective for text classification, various alternatives have been proposed by researchers. In this paper we make comparative studies on different term weighting schemes and propose a new term weighting scheme, TF-IGM (term frequency & inverse gravity moment), as well as its variants. TF-IGM incorporates a new statistical model to precisely measure the class distinguishing power of a term. Particularly, it makes full use of the fine-grained term distribution across different classes of text. The effectiveness of TF-IGM is validated by extensive experiments of text classification using SVM (support vector machine) and kNN (k nearest neighbors) classifiers on three commonly used corpora. The experimental results show that TF-IGM outperforms the famous TF-IDF and the state-of-the-art supervised term weighting schemes. In addition, some new findings different from previous studies are obtained and analyzed in depth in the paper.
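A small sketch may help show why plain TF-IDF can fall short for classification, as this abstract argues. It uses one common TF-IDF variant (raw term count times log(N/df)) and a made-up four-document corpus; the corpus and the helper names are assumptions for illustration. The TF-IGM formula itself is not given in the abstract and is not reproduced here.

```python
import math

# Toy labelled corpus (hypothetical): (tokens, class label)
docs = [
    (["goal", "match", "new"], "sports"),
    (["goal", "team"], "sports"),
    (["vote", "new"], "politics"),
    (["vote", "law"], "politics"),
]

def idf(term, docs):
    """Inverse document frequency: log(N / df). Class labels are ignored."""
    df = sum(1 for tokens, _ in docs if term in tokens)
    return math.log(len(docs) / df)

def tf_idf(term, tokens, docs):
    """Raw term count in one document times the corpus-level IDF."""
    return tokens.count(term) * idf(term, docs)

def class_counts(term, docs):
    """Number of documents containing the term, per class."""
    counts = {}
    for tokens, label in docs:
        if term in tokens:
            counts[label] = counts.get(label, 0) + 1
    return counts

# "goal" occurs only in sports documents; "new" is split across classes.
# Both have document frequency 2, so IDF cannot tell them apart.
print(idf("goal", docs), class_counts("goal", docs))
print(idf("new", docs), class_counts("new", docs))
```

Supervised term weighting schemes such as the one proposed here use the per-class distribution that `class_counts` exposes, rather than the document frequency alone.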

3.

The class imbalance problem occurs when the distribution among classes is not balanced. It can bias classifier models toward classes with many training samples. The class imbalance problem is inherent in text classification. The abstract feature extraction method is a versatile term weighting scheme. It serves not only as a feature extractor that builds a structured representation from unorganized text data but also as a dimension reduction technique and classifier. In this study, we tackle the problem of class imbalance in abstract feature extraction. The proposed method utilizes the relative imbalance ratio as a factor to elevate the representation of minority classes. We also integrate relevant term factors to boost the general accuracy. Experiments conducted with three different data sets, one of which was collected for this study, show that the original abstract feature extraction method indeed suffers from the class imbalance problem and that the proposed methods demonstrate significant improvements in terms of f1-micro, f1-macro, and the Matthews correlation coefficient. The experimental results also suggest that the proposed method is a competitive classifier and term weighting scheme when compared to the well-known classifiers (KNN, SVM, and Nearest Centroid) and term weighting schemes (TF-IDF, TF-ICF, TF-ICSDF, TF-RF, TF-PROB, TF-IGM, and TF-MONO).
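The micro- and macro-averaged F1 scores mentioned in this abstract respond differently to class imbalance. The sketch below uses the standard definitions with made-up labels; the data and the function names are assumptions for illustration, not results from the study.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred):
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    tp_sum = fp_sum = fn_sum = 0
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        per_class.append(f1_from_counts(tp, fp, fn))
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    micro = f1_from_counts(tp_sum, fp_sum, fn_sum)  # pools counts over classes
    macro = sum(per_class) / len(per_class)         # averages per-class F1
    return micro, macro

# Hypothetical imbalanced data: 8 samples of "a", 2 of "b",
# and a classifier that always predicts the majority class.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10
micro, macro = micro_macro_f1(y_true, y_pred)
print(micro, macro)
```

Here the micro average stays at 0.8 while the macro average drops to 4/9, because the minority class contributes an F1 of zero. This is why reporting both is informative when classes are imbalanced.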

4.
With the rapid growth of textual content on the Internet, automatic text categorization is a comparatively more effective solution in information organization and knowledge management. Feature selection, one of the basic phases in statistical-based text categorization, crucially depends on the term weighting methods. In order to improve the performance of text categorization, this paper proposes four modified frequency-based term weighting schemes, namely mTF, mTFIDF, TFmIDF, and mTFmIDF. The proposed term weighting schemes take the amount of missing terms into account when calculating the weight of existing terms. The proposed schemes show the highest performance for an SVM classifier, with a micro-average F1 classification performance value of 97%. Moreover, benchmarking results on the Reuters-21578, 20Newsgroups, and WebKB text-classification datasets, using different classification algorithms such as SVM and KNN, show that the proposed schemes mTF, mTFIDF, and mTFmIDF outperform other weighting schemes such as TF, TFIDF, and Entropy. Additionally, the statistical significance tests show a significant enhancement of the classification performance based on the modified schemes.

5.
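After examining existing multi-class support vector machine (SVM) algorithms, this paper proposes a multi-classifier fusion approach based on a binary tree structure. The fusion process fully takes into account the separability between classes, yielding a relatively optimized binary-tree SVM multi-class classification algorithm. The improved multi-class SVM is then applied to intrusion detection to improve system performance. Experimental results on the KDDCUP1999 dataset demonstrate the effectiveness of the method.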

6.
Acoustic events produced in controlled environments may carry information useful for perceptually aware interfaces. In this paper we focus on the problem of classifying 16 types of meeting-room acoustic events. First of all, we have defined the events and gathered a sound database. Then, several classifiers based on support vector machines (SVM) are developed using confusion matrix based clustering schemes to deal with the multi-class problem. Also, several sets of acoustic features are defined and used in the classification tests. In the experiments, the developed SVM-based classifiers are compared with an already reported binary tree scheme and with their correlative Gaussian mixture model (GMM) classifiers. The best results are obtained with a tree SVM-based classifier that may use a different feature set at each node. With it, a 31.5% relative average error reduction is obtained with respect to the best result from a conventional binary tree scheme.

7.
The classification performance of nearest prototype classifiers largely relies on the prototype learning algorithm. The minimum classification error (MCE) method and the soft nearest prototype classifier (SNPC) method are two important algorithms using misclassification loss. This paper proposes a new prototype learning algorithm based on the conditional log-likelihood loss (CLL), which is based on the discriminative model called log-likelihood of margin (LOGM). A regularization term is added to avoid over-fitting in training as well as to maximize the hypothesis margin. The CLL in the LOGM algorithm is a convex function of margin, and so, shows better convergence than the MCE. In addition, we show the effects of distance metric learning with both prototype-dependent weighting and prototype-independent weighting. Our empirical study on the benchmark datasets demonstrates that the LOGM algorithm yields higher classification accuracies than the MCE, generalized learning vector quantization (GLVQ), soft nearest prototype classifier (SNPC) and the robust soft learning vector quantization (RSLVQ), and moreover, the LOGM with prototype-dependent weighting achieves comparable accuracies to the support vector machine (SVM) classifier.

8.
9.
The subprime mortgage crisis has triggered a significant economic decline worldwide. Credit rating forecasting has been a critical issue in global banking systems. The study trained a Gaussian process based multi-class classifier (GPC), a highly flexible probabilistic kernel machine, using variational Bayesian methods. GPC provides full predictive distributions and model selection simultaneously. During training, the input features are automatically weighted by their relevance to the output labels. Benefiting from the inherent feature scaling scheme, GPCs outperformed conventional multi-class classifiers and support vector machines (SVMs). In the second stage, conventional SVMs enhanced by feature selection and dimensionality reduction schemes were also compared with GPCs. Empirical results indicated that GPCs still performed the best.

10.
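Application of SVM to multi-source remote sensing image classification   Total citations: 7, self-citations: 1, citations by others: 7  
When remote sensing images are used for land use/cover classification, classification accuracy can be improved in two ways. The first is to add data sources that benefit classification, introducing auxiliary geographic data and the normalized difference vegetation index (NDVI) for multi-source information fusion. The second is to choose a better classification method, such as the support vector machine (SVM) learning method, which overcomes the weaknesses of the maximum likelihood method and neural networks and is therefore well suited to classifying high-dimensional, complex, small-sample multi-source data. To improve the accuracy of multi-source remote sensing image classification, the study also examines SVM model selection for remote sensing image classification, including the choice of multi-class model and kernel function. The classification results show that the SVM achieves higher accuracy than traditional classification methods, and that an SVM model based on the radial basis function kernel and the one-against-one multi-class method is particularly suitable for multi-source remote sensing image classification. SVM-based multi-source land use/cover classification can therefore greatly improve classification accuracy.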
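For readers unfamiliar with the radial basis function kernel that this abstract singles out, a minimal sketch of its standard form k(x, y) = exp(-gamma * ||x - y||^2) follows. The function name `rbf_kernel`, the default `gamma`, and the sample feature vectors are assumptions for illustration. The abstract does not state the parameter values used in the study.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical per-pixel feature vectors, e.g. spectral bands plus NDVI.
pixel_a = [0.20, 0.35, 0.60]
pixel_b = [0.22, 0.30, 0.58]

print(rbf_kernel(pixel_a, pixel_a))             # identical inputs give 1.0
print(rbf_kernel(pixel_a, pixel_b, gamma=2.0))  # similarity decays with distance
```

Larger values of `gamma` make the similarity fall off faster with distance, which is one of the choices that kernel selection has to settle.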
