Similar Literature
Found 20 similar documents; search time: 15 ms
1.
This paper extends previous work on the smooth support vector machine (SSVM) from binary to k-class classification using a single-machine approach, and calls the result the multi-class smooth SVM (MSSVM). The study implements MSSVM for a ternary classification problem and labels it TSSVM. For the case k > 3, it proposes a one-vs.-one-vs.-rest (OOR) scheme that decomposes the problem into k(k−1)/2 ternary classification subproblems, based on the assumption of ternary voting games. The k-class classification problem can thus be solved via a series of TSSVMs. Numerical experiments compare the classification accuracy of the TSSVM/OOR, one-vs.-one, and one-vs.-rest schemes on nine UCI datasets. The results show that TSSVM/OOR outperforms one-vs.-one and one-vs.-rest on all datasets. Further error analyses show that the prediction confidence of OOR is significantly higher than that of the one-vs.-one scheme. Owing to its design, OOR can detect a hidden (unknown) class directly. A "leave-one-class-out" experiment on the pendigits dataset demonstrates the ability of the proposed OOR method to detect hidden classes. The results show that OOR achieves a significantly higher hidden-class detection rate than one-vs.-one and one-vs.-rest.
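The OOR decomposition described in this abstract can be illustrated with a short sketch. It only enumerates the k(k−1)/2 ternary subproblems; the function name `oor_subproblems` and the tuple layout are assumptions made for illustration, and no TSSVM training or voting is shown.

```python
from itertools import combinations

def oor_subproblems(classes):
    """Enumerate one-vs.-one-vs.-rest ternary subproblems (hypothetical helper).

    Each subproblem picks two classes (i, j) and lumps all remaining
    classes together as a third "rest" group.
    """
    classes = list(classes)
    problems = []
    for i, j in combinations(classes, 2):
        rest = [c for c in classes if c not in (i, j)]
        problems.append((i, j, rest))
    return problems

subproblems = oor_subproblems(range(5))
print(len(subproblems))  # 5 * 4 / 2 = 10
print(subproblems[0])    # (0, 1, [2, 3, 4])
```

For k = 5 this yields 10 subproblems, the same count as one-vs.-one; the difference is that each subproblem also sees the remaining classes as a third group.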

2.
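For direct multi-class classification, a new fuzzy multi-class support vector machine algorithm, FCS-SVM, is proposed, based on directly constructing a multi-class SVM classifier. In the algorithm, the optimization problem and its constraints, as well as the Lagrangian formulation, are reconstructed and derived. Several experiments on standard datasets are used to compare and analyze these algorithms. The experimental results show that the proposed algorithm achieves satisfactory classification accuracy.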

3.
A novel fuzzy compensation multi-class support vector machine   Cited by: 6 (self-citations: 0, citations by others: 6)
This paper presents a novel fuzzy compensation multi-class support vector machine (FCM-SVM) to reduce the sensitivity of the traditional support vector machine (SVM) to outliers and noise in multi-class data classification. The basic idea is to give the penalty term a dual effect by treating every data point as belonging to both the positive and the negative class, but with different memberships. We fuzzify the penalty term, add a compensating weight to the classification, reconstruct the optimization problem and its constraints, reconstruct the Lagrangian formulation, and present the theoretical derivation. In this way, the new fuzzy compensation multi-class support vector machine is expected to generalize better while remaining insensitive to outliers. Experimental results on a benchmark data set and a real data set show that the proposed method reduces the effect of noisy data and yields a higher classification rate than the traditional multi-class SVM.

4.
Classification of agricultural data such as soil data and crop data is significant because it allows stakeholders to make meaningful farming decisions. Soil classification helps farmers decide which crop to sow in a particular type of soil. Similarly, wheat variety classification assists in selecting the right type of wheat for a particular product. Current methods for classifying agricultural data are mostly manual. These methods involve agricultural field visits and surveys and are labor-intensive, expensive, and prone to human error. Recently, data mining techniques such as decision trees, k-nearest neighbors (k-NN), support vector machine (SVM), and Naive Bayes (NB) have been used to classify agricultural data such as soil, crops, and land cover. The resulting classification aids the decision-making process of government organizations and agro-industries in the field of agriculture. SVM is a popular approach for data classification. A recent study on SVM highlighted that using multiple kernels instead of a single kernel would lead to better performance because of the greater learning and generalization power. In this work, a hybrid-kernel-based support vector machine (H-SVM) is proposed for classifying multi-class agricultural datasets with continuous attributes. Genetic algorithm (GA) or gradient descent (GD) methods are used to select the SVM parameters C and γ. The proposed kernel, called the quadratic-radial-basis-function kernel (QRK), combines the quadratic and radial basis function (RBF) kernels. The proposed classifier can classify all kinds of multi-class agricultural datasets with continuous features. Rigorous experiments using the proposed method are performed on standard benchmark and real-world agriculture datasets. The results reveal a significant performance improvement over state-of-the-art methods such as NB, k-NN, and SVM in terms of accuracy, sensitivity, specificity, precision, and F-score.
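The abstract does not say how the QRK combines its two component kernels, so the following is only a sketch of one plausible construction: a convex combination of a quadratic (degree-2 polynomial) kernel and an RBF kernel. The mixing parameter `weight`, the offset `coef0`, and all function names are assumptions, not the authors' definition. A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why this form is a common choice.

```python
import math

def quadratic_kernel(x, y, coef0=1.0):
    """Degree-2 polynomial kernel: (x . y + coef0) ** 2."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** 2

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def hybrid_kernel(x, y, weight=0.5, gamma=0.5, coef0=1.0):
    """Assumed hybrid: weight * quadratic + (1 - weight) * RBF."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * quadratic_kernel(x, y, coef0)
            + (1.0 - weight) * rbf_kernel(x, y, gamma))

def gram_matrix(points, **kernel_params):
    """Pairwise kernel values for a list of points."""
    return [[hybrid_kernel(p, q, **kernel_params) for q in points]
            for p in points]

points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
K = gram_matrix(points)
print(K[0][0])  # 0.5 * (1 + 1)**2 + 0.5 * exp(0) = 2.5
```

In this sketch, γ plays the role of the RBF width mentioned in the abstract; the penalty parameter C belongs to the SVM solver and does not appear in the kernel itself.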

5.
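A new fuzzy compensation multi-class support vector machine   Cited by: 1 (self-citations: 1, citations by others: 0)
张永, 迟忠先, 闫德勤. 《计算机科学》 (Computer Science), 2006, 33(12): 152-155
The support vector machine is a new machine learning method proposed by Vapnik and colleagues on the basis of statistical learning theory. To address the multi-class classification problem in SVM theory and the sensitivity to noisy data, this paper proposes a fuzzy compensation multi-class support vector machine algorithm, FC-SVM. The algorithm introduces a fuzzy compensation function into the direct construction method for multi-class SVM classifiers proposed by Weston et al. Considering the two-sided influence of each input data point on the classification result, the penalty term in the objective function is not only fuzzified but also given a weighted compensation for the classification. The optimization problem and its constraints are reconstructed, the Lagrangian formulation is then reconstructed, and the theoretical derivation is given. On the basis of extensive numerical experiments, the proposed method is applied to a credit evaluation system for personal housing loans at China Construction Bank (建设银行), with good experimental results.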

6.
This paper presents a new version of the support vector machine (SVM), named the l2-lp SVM (0 < p < 1), which introduces the lp-norm (0 < p < 1) of the normal vector of the decision plane into the standard linear SVM. To solve the nonconvex optimization problem in our model, an efficient algorithm is proposed using the constrained concave-convex procedure. Experiments with artificial data and real data demonstrate that our method is more effective than some popular methods in selecting relevant features and improving classification accuracy.

7.
An abdominal aortic aneurysm (AAA) is a localized abnormal enlargement of the abdominal aorta with fatal consequences if not treated in time. Endovascular aneurysm repair (EVAR) is a minimally invasive therapy that reduces recovery times and improves survival rates in AAA cases. Nevertheless, post-operative difficulties can appear and influence the evolution of treatment. The objective of this work is to develop a pilot computer-supported diagnosis system for automated characterization of EVAR progression from CTA images. The system is based on the extraction of texture features from post-EVAR thrombus aneurysm samples and on their subsequent classification. Three conventional texture-analysis methods, namely the gray level co-occurrence matrix (GLCM), the gray level run length matrix (GLRLM), and the gray level difference method (GLDM), together with a new method proposed by the authors, the run length matrix of local co-occurrence matrices (RLMLCM), were applied to each sample. Several classification schemes were experimentally evaluated. Ensembles of a k-nearest neighbor (k-NN), a multilayer perceptron neural network (MLP-NN), and a support vector machine (SVM) classifier fed with a reduced version of the texture features gave better performance (Az = 94.35 ± 0.30) than the other alternatives.

8.

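When handling multi-class problems, the support vector machine (SVM) must combine multiple binary SVMs to obtain a multi-class decision. Traditional multi-class extension methods use the hard outputs of the SVMs, which causes some loss of information. To make fuller use of the information, an SVM multi-class algorithm based on evidential reasoning and multi-attribute decision making is proposed: the multi-class problem is treated as a multi-attribute decision problem, and the fuzzy cautious ordered weighted averaging with evidential reasoning (FCOWA-ER) method is used to make the multi-class SVM decision. Experimental results show that the proposed method achieves higher classification accuracy.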

9.
In this paper, we propose a modified version of the k-nearest neighbor (kNN) algorithm. We first introduce a new affinity function, based on local learning, for measuring the distance between a test point and a training point. A new similarity function using this affinity function is then proposed for classifying the test patterns. The widely used convention k = [√N] is employed, where N is the number of training data points. The proposed modified kNN algorithm is applied to fifteen numerical datasets from the UCI machine learning data repository. Both 5-fold and 10-fold cross-validation are used. The average classification accuracy obtained with our method is found to exceed that of some well-known clustering algorithms.
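To make the k = [√N] convention concrete, here is a minimal sketch of a plain kNN classifier that picks k from the training set size. It uses ordinary Euclidean distance and majority voting, not the affinity and similarity functions proposed in the paper, and it assumes the bracket means rounding down (the abstract does not say). The names `choose_k` and `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter

def choose_k(n_train):
    """k = floor(sqrt(N)), at least 1; rounding down is an assumption."""
    return max(1, math.isqrt(n_train))

def knn_predict(train_X, train_y, query):
    """Plain Euclidean kNN with majority vote (not the paper's modified rule)."""
    k = choose_k(len(train_X))
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated groups, N = 9, so k = 3.
train_X = [(0, 0), (0, 1), (1, 0), (1, 1),
           (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
train_y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

print(choose_k(len(train_X)))                      # 3
print(knn_predict(train_X, train_y, (0.2, 0.1)))   # a
print(knn_predict(train_X, train_y, (5.2, 5.4)))   # b
```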

10.
This work addresses graph-based semi-supervised classification and betweenness computation in large, sparse networks (several million nodes). The objective of semi-supervised classification is to assign a label to unlabeled nodes using the whole topology of the graph and the labeling at our disposal. Two approaches are developed to avoid explicit computation of pairwise proximity between the nodes of the graph, which would be impractical for graphs containing millions of nodes. The first approach directly computes, for each class, the sum of the similarities between the nodes to classify and the labeled nodes of the class, as suggested initially in [1] and [2]. Following this approach, two algorithms exploiting different state-of-the-art kernels on a graph are developed. The same strategy can also be used to compute a betweenness measure. The second approach works on a trellis structure built from biased random walks on the graph, extending an idea introduced in [3]. These random walks allow us to define a biased bounded betweenness for the nodes of interest, defined separately for each class. All the proposed algorithms have a computing time linear in the number of edges while providing good results, and hence are applicable to large sparse networks. They are empirically validated on medium-size standard data sets and are shown to be competitive with state-of-the-art techniques. Finally, we processed a novel data set, which is made available for benchmarking, for multi-class classification in a large network: the U.S. patents citation network, containing 3M nodes (of six different classes) and 38M edges. The three proposed algorithms achieve competitive results (around 85% classification rate) on this large network; they classify the unlabeled nodes within a few minutes on a standard workstation.

11.
The number of training samples per class (n) required for accurate Maximum Likelihood (ML) classification is known to be affected by the number of bands (p) in the input image. However, the general rule that n should be 10p to 30p is often enforced universally in remote sensing without questioning its relevance to the complexity of the specific discrimination problem. Furthermore, identifying this many training samples is often problematic when many classes and/or many bands are used. It is important, then, to test how this generally accepted rule matches common remote sensing discrimination problems, because it could be unnecessarily restrictive for many applications. This study was conducted primarily to test whether the general rule defining the relationship between n and p is well suited to ML classification of a relatively simple remote sensing-based discrimination problem. To summarise the mean response of n to p for our study site, a Monte Carlo procedure was used to randomly stack various numbers of bands into thousands of separate image combinations that were then classified using an ML algorithm. The bands were randomly selected from a 119-band Enhanced Thematic Mapper-plus (ETM+) dataset comprising 17 images acquired during the 2001-2002 southern hemisphere summer agricultural growing season over an irrigation area in south-eastern Australia. Results showed that the number of training samples needed for accurate ML classification was much lower than the widely accepted rule suggests. Owing to the asymptotic nature of the relationship, we found that 95% of the accuracy attained using n = 30p samples could be achieved with approximately 2p to 4p samples, or ≤ 1/7 of the currently recommended value of n. Our findings show that the number of training samples needed for a simple discrimination problem is much smaller than that defined by the general rule, and therefore the rule should not be universally enforced; the number of training samples needed should also be determined by considering the complexity of the discrimination problem.
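The arithmetic behind the two sample-size ranges can be checked with a small sketch. The band count p = 6 is a hypothetical example value, and the helper name `training_sample_ranges` is an assumption; only the multipliers 10p to 30p and 2p to 4p come from the abstract.

```python
from fractions import Fraction

def training_sample_ranges(p):
    """Return (low, high) per-class training sample counts for p bands."""
    return {
        "general_rule": (10 * p, 30 * p),   # conventional 10p to 30p
        "study_finding": (2 * p, 4 * p),    # roughly 2p to 4p per the abstract
    }

ranges = training_sample_ranges(6)  # p = 6 is a hypothetical band count
print(ranges["general_rule"])   # (60, 180)
print(ranges["study_finding"])  # (12, 24)

# Upper end of the finding relative to n = 30p; the ratio does not depend on p.
ratio = Fraction(4, 30)
print(ratio <= Fraction(1, 7))  # True
```

The last check confirms that 4p is 2/15 of 30p, which is consistent with the "≤ 1/7" figure quoted in the abstract.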

12.
A vertex u in a digraph G = (V, A) is said to dominate itself and vertices v such that (u, v) ∈ A. For a positive integer k, a k-tuple dominating set of G is a subset D of vertices such that every vertex in G is dominated by at least k vertices in D. The k-tuple domination number of G is the minimum cardinality of a k-tuple dominating set of G. This paper deals with the k-tuple domination problem on generalized de Bruijn and Kautz digraphs. We establish bounds on the k-tuple domination number for the generalized de Bruijn and Kautz digraphs and we obtain some conditions for the k-tuple domination number attaining the bounds.
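The definitions in this abstract translate directly into a checker. The sketch below tests whether a vertex subset is a k-tuple dominating set and finds the k-tuple domination number by brute force; it is meant only for tiny digraphs, and the directed 3-cycle used as the example is an assumption chosen for illustration, not one of the generalized de Bruijn or Kautz digraphs studied in the paper.

```python
from itertools import combinations

def is_k_tuple_dominating(vertices, arcs, D, k):
    """True if every vertex is dominated by at least k vertices of D.

    A vertex u dominates itself and every v with (u, v) in arcs.
    """
    D = set(D)
    for v in vertices:
        dominators = {u for u in D if u == v or (u, v) in arcs}
        if len(dominators) < k:
            return False
    return True

def k_tuple_domination_number(vertices, arcs, k):
    """Smallest size of a k-tuple dominating set, or None if none exists.

    Brute force over all subsets; exponential, for illustration only.
    """
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        for D in combinations(vertices, size):
            if is_k_tuple_dominating(vertices, arcs, D, k):
                return size
    return None

# Directed 3-cycle 0 -> 1 -> 2 -> 0.
V = [0, 1, 2]
A = {(0, 1), (1, 2), (2, 0)}

print(is_k_tuple_dominating(V, A, {0, 1}, 1))  # True
print(is_k_tuple_dominating(V, A, {0}, 1))     # False: vertex 2 is not dominated
print(k_tuple_domination_number(V, A, 1))      # 2
print(k_tuple_domination_number(V, A, 2))      # 3
print(k_tuple_domination_number(V, A, 3))      # None
```

In the 3-cycle each vertex has exactly one in-neighbour, so it can be dominated by at most two vertices (itself and that in-neighbour), which is why no 3-tuple dominating set exists.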

13.
In this paper, a new algorithm is developed to reduce the computational complexity of Ward's method. The proposed approach uses a dynamic k-nearest-neighbor list to avoid the determination of a cluster's nearest neighbor at some steps of the cluster merge. The double linked algorithm (DLA) can significantly reduce the computing time of the fast pairwise nearest neighbor (FPNN) algorithm by obtaining an approximate solution of hierarchical agglomerative clustering. In this paper, we propose a method to resolve the problem of a non-optimal solution for DLA while keeping its advantage of low computational complexity. The computational complexity of the proposed method DKNNA + FS (dynamic k-nearest-neighbor algorithm with a fast search), in terms of the number of distance calculations, is O(N^2), where N is the number of data points. Compared to FPNN with a fast search (FPNN + FS), the proposed method using the same fast search algorithm (DKNNA + FS) can reduce the computing time by a factor of 1.90-2.18 for the data set from a real image. In comparison with FPNN + FS, DKNNA + FS can reduce the computing time by a factor of 1.92-2.02 using the data set generated from three images. Compared to DLA with a fast search (DLA + FS), DKNNA + FS can decrease the average mean square error by 1.26% for the same data set.

14.
Automatic text classification is usually based on models constructed through learning from training examples. However, as text document repositories grow rapidly, the storage requirements and computational cost of model learning are becoming ever higher. Instance selection is one solution for overcoming this limitation. The aim is to reduce the amount of data by filtering out noisy data from a given training dataset. A number of instance selection algorithms have been proposed in the literature, such as ENN, IB3, ICF, and DROP3. However, all of these methods were developed for the k-nearest neighbor (k-NN) classifier. In addition, their performance has not been examined in the text classification domain, where the dimensionality of the dataset is usually very high. Support vector machines (SVM) are core text classification techniques. In this study, a novel instance selection method, called Support Vector Oriented Instance Selection (SVOIS), is proposed. First, a regression plane in the original feature space is identified by using a threshold distance between the given training instances and their class centers. Then, another threshold distance, between the identified data (forming the regression plane) and the regression plane, is used to decide on the support vectors for the selected instances. The experimental results on the TechTC-100 dataset show the superior performance of SVOIS over other state-of-the-art algorithms. In particular, using SVOIS to select text documents allows the k-NN and SVM classifiers to perform better than without instance selection.

15.
This paper presents a thorough study of gender classification methodologies applied to neutral, expressive, and partially occluded faces, when they are used in all possible arrangements of training and testing roles. A comprehensive comparison of two representation approaches (global and local), three types of features (grey levels, PCA and LBP), three classifiers (1-NN, PCA + LDA and SVM) and two performance measures (CCR and d′) is provided over single- and cross-database experiments. The experiments revealed some interesting findings, which were supported by three non-parametric statistical tests: when training and test sets contain different types of faces, local models using the 1-NN rule outperform global approaches, even those using SVM classifiers; however, with the same type of faces, even if the acquisition conditions are diverse, the statistical tests could not reject the null hypothesis of equal performance of global SVMs and local 1-NNs.

16.
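A new binary-tree-based SVM multi-class classification method   Cited by: 25 (self-citations: 0, citations by others: 25)
孟媛媛, 刘希玉. 《计算机应用》 (Journal of Computer Applications), 2005, 25(11): 2653-2654
Several commonly used SVM multi-class classification methods are introduced, and their problems and shortcomings are analyzed. A binary-tree-based SVM multi-class classification method (BT SVM) is proposed, and kernel-based self-organizing maps are introduced for clustering. The results show that this method achieves higher multi-class classification accuracy than 1-v-r SVMs and 1-v-1 SVMs.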

17.
Effective and efficient texture feature extraction and classification is an important problem in image understanding and recognition. Recently, texton-learning-based texture classification approaches have been widely studied, where the textons are usually learned via K-means clustering or sparse coding methods. However, K-means clustering is too coarse to characterize the complex feature space of textures, while sparse texton learning/encoding is time-consuming due to the l0-norm or l1-norm minimization. Moreover, these methods mostly compute the texton histogram as the statistical features for classification, which may not be effective enough. This paper presents an effective and efficient texton learning and encoding scheme for texture classification. First, a regularized least-squares-based texton learning method is developed to learn the dictionary of textons class by class. Second, a fast two-step l2-norm texton encoding method is proposed to code the input texture feature over the concatenated dictionary of all classes. Third, two types of histogram features are defined and computed from the texton encoding outputs: coding coefficients and coding residuals. Finally, the two histogram features are combined for classification via a nearest subspace classifier. Experimental results on the CUReT, KTH_TIPS and UIUC datasets demonstrated that the proposed method is very promising, especially when the number of available training samples is limited.

18.
This paper develops a new methodology for pattern classification by concurrently determining k piecewise linear and convex discriminant functions. Toward this end, we design a new l1-norm distance metric for measuring misclassification errors and use it to develop a mixed 0-1 integer and linear program (MILP) for the k piecewise linear and convex separation of data. The proposed model has the merit that it considers the synergy as well as the individual role of the k hyperplanes in constructing a decision surface, and that its solution exploits advances in theory and algorithms and the advent of powerful software for MILP. With artificially created data, we illustrate the pros and cons of pattern classification by the proposed methodology. With six benchmark classification datasets, we demonstrate that the proposed approach is effective and competitive with well-established learning methods. In summary, the classifiers constructed by the proposed approach obtain the best prediction rates on three of the six datasets and the second-best on two of the remaining three datasets.

19.
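This paper studies a method for multi-class audio classification using support vector machines (SVM), in which the augmented binary classification method (AB method) is introduced to design the multi-class classifier. The algorithm divides audio into four classes: music, pure speech, speech with background sound, and typical environmental sound. It analyzes eight discriminative features of these audio classes, including two new features, the modified low energy ratio (MLER) and the modified pitch frequency (MPF), together with six other basic features such as total spectral power, sub-band power, and frequency centroid, and comprehensively examines the classification accuracy of different feature sets with the SVM-based classifier. Experimental results show that the extracted audio features are effective and that SVM-based multi-class audio classification performs well.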

20.
We describe a system that learns from examples to recognize persons in images taken indoors. Images of full-body persons are represented by color-based and shape-based features. Recognition is carried out through combinations of Support Vector Machine (SVM) classifiers. Different types of multi-class strategies based on SVMs are explored and compared to k-Nearest Neighbors classifiers. The experimental results show high recognition rates and indicate the strength of SVM-based classifiers to improve both generalization and run-time performance. The system works in real-time.
